Dabble DB

The Dabble Blog

Which Incremental Paths?

Incrementality. It’s been springing up all over the place in our posts, a trend I’m certain will continue. This being said, I think it’s useful to throw in a measure of caution: like all tools, incrementality is no silver bullet.

As a developer building software systems, I’m skeptical about any design approach that isn’t incremental. However, this is very different from being happy with any incremental path to a solution. Not all incremental paths are created equal. In fact, I’d claim that one of the more important skills possessed by expert designers in any field is the ability to distinguish good incremental paths from bad ones. How does one do this? Of course, there isn’t going to be a simple, comprehensive answer to this question, but I believe one necessary component is a strong awareness of the major decisions at play: in particular, it is very important to know which ones will be difficult to change later, the reasons why you might want to change them, and how likely it is that these reasons will come into play.

Let’s look at a dangerous example of this in the data management space: deciding on the amount of structure you’d like to impose on your data. By “structure” here I don’t mean presentation structure, but rather breaking the data down into smaller parts, with the same kind of data always broken down into the same kind of parts: for example, the items in a to-do list, or first, middle and last names for a person. Deciding on structure is particularly dangerous because totally valid short-term concerns can conflict with, and potentially overwhelm, totally valid long-term concerns. In the short term, there are some strong forces pushing against the imposition of structure. Doing so certainly requires more thought (what exactly are the boundaries between different data components?). If you have any legacy data that needs to be incorporated, especially un- or semi-structured data, this complicates things all the more. Being a good, lazy incremental citizen, you feel like you should probably just impose a minimum amount of structure now, and worry about increasing it later. Unfortunately, this is an extremely difficult decision to change further down the road. Once you are managing even a moderate amount of data, automating migration to a more structured format, which typically means trying to parse arbitrary text, is extremely difficult, if not impossible. (If it isn’t, the data probably wasn’t actually that unstructured to begin with.) Usually, you are left either to expend a huge amount of manual labor or to just take a pass on the benefits of structure.
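To make the migration problem concrete, here is a minimal sketch of what “parsing arbitrary text” tends to look like, using the person-name example from above. The function name and the sample names are made up for illustration; this is not code from any real system, just the kind of first attempt most of us would write.

```python
def naive_split_name(full_name):
    """Hypothetical helper: guess first/middle/last by splitting on whitespace."""
    parts = full_name.split()
    if not parts:
        return {"first": "", "middle": "", "last": ""}
    if len(parts) == 1:
        return {"first": parts[0], "middle": "", "last": ""}
    # Assume the first token is the first name, the last token is the last
    # name, and everything in between is the middle name.
    return {
        "first": parts[0],
        "middle": " ".join(parts[1:-1]),
        "last": parts[-1],
    }


# Illustrative free-text values of the sort that pile up in an unstructured field.
samples = [
    "Ada Lovelace",          # splits the way you would hope
    "Ludwig van Beethoven",  # "van" ends up as a middle name
    "Smith, John",           # "Smith," ends up as the first name
]

for text in samples:
    print(text, "->", naive_split_name(text))
```

The first sample comes out fine; the other two are silently wrong, and nothing in the text itself tells the program which rule applies. Every extra rule you add to handle one convention tends to break another, which is why the cleanup usually ends up being manual.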

Clearly, just “going incremental” isn’t enough here; some further help is required. For one thing, it’s helpful to be able to quickly and fairly painlessly distinguish cases where structure isn’t particularly useful from those where structure might be useful. I usually find that this has to do with the overall “scope” of the data: How many people will look at it? How long will it be of interest? Will people want to look at this data in multiple ways? Strictly speaking, it is largely an affirmative answer to this last question that calls for structure, but more people looking at the data over a longer period of time tends to make the desire for multiple views more likely. So, if structure isn’t necessary, great: paste it into an email to your Gmail account and it will be available and searchable for as long as you desire.

Even in cases where it seems structure might be useful, I don’t believe things have to be so bad, provided data management systems don’t make schema/structure definition too big a deal. There is usually more than one way to be incremental. Maybe you need structure, but that doesn’t mean you need to go off and do a requirements analysis and build a complete schema. If the system allows you to simply add a few useful slots for your data, stick some data in them, and repeat as necessary, you can get both the benefits of structure and incrementality.
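As a rough illustration of “add a few useful slots, stick some data in them, and repeat,” here is a small in-memory sketch. The `SlotTable` class and its method names are hypothetical, invented for this example; they are not the API of any particular data management system.

```python
class SlotTable:
    """Hypothetical table whose fields ("slots") can be added at any time."""

    def __init__(self):
        self.fields = []  # slot names, in the order they were added
        self.rows = []    # each row is a dict keyed by slot name

    def add_field(self, name):
        # Adding a slot later is cheap: existing rows just get an empty value.
        if name not in self.fields:
            self.fields.append(name)
            for row in self.rows:
                row.setdefault(name, None)

    def add_row(self, **values):
        # Data only goes into slots that exist, so it stays structured.
        unknown = set(values) - set(self.fields)
        if unknown:
            raise KeyError(f"unknown fields: {sorted(unknown)}")
        row = {field: values.get(field) for field in self.fields}
        self.rows.append(row)
        return row


todo = SlotTable()
todo.add_field("task")
todo.add_row(task="Write blog post")

# Later, a due date turns out to be useful, so add one more slot and carry on.
todo.add_field("due")
todo.add_row(task="Ship release", due="Friday")

for row in todo.rows:
    print(row)
```

The point of the sketch is the shape of the path: each step is small, but the data is never free text that has to be parsed afterwards. The older row simply has an empty `due` slot waiting to be filled in.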

What we don’t want data management systems doing is encouraging users to just throw in their data in an unstructured manner, with a vague promise that they can simply add structure later, when in fact doing so will require an incredible amount of work. Stepping back to the broader question of different kinds of incremental paths in general, those of us building data management systems need to try to subtly guide our users down the good, productive incremental routes, and nudge them away from the dangerous ones. Even expert designers appreciate time-tested patterns.

