Out of Vietnam, Part 1

posted by cjh, 21 October 2007

Object/Relational mapping has been called the Vietnam of Computer Science, meaning, I think, that it’s become an intractable problem that we never needed to get into in the first place. Actually, it was unavoidable, but there’s a way of hiding the problem, which is the subject of this series of articles.

The core of software design is expression: how do we express what we want a system to do, to be, and to achieve? It’s hard for software folk to think clearly about this. We’re conditioned by having problems handed out on a sheet of paper during our training. We’re taught to break them down, to decompose them, by various methods. We worry and argue about the right way to go about decomposing problems.

In reality, we never receive problems fully formed like this, and so we seldom have to decompose them. Instead, our clients witter on about how this should have one of those, and how a thing is on this list unless that condition holds… and we have to compose a system out of these fragmentary utterances. Composition and aggregation, not decomposition, are our main activities. In the process, we try to distil conceptual purity from the original communications.

In choosing how to aggregate things, we take various approaches. Object-Oriented practitioners group things mostly by shared behaviour. Database people struggle to avoid duplication while clustering things to maximise disk throughput and transactional reliability. In both cases, the attempt to maintain purity is moderated by the need to work within the bounds of physical computer hardware: main memory on the one hand, disk drives on the other. One is volatile, the other persistent. These two place very different constraints on the shape of an optimum solution. Both constraints are rooted in the real world, so the problem is to some extent unavoidable.
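To make the two shapes concrete, here is a minimal sketch in Python. The order-and-line-items domain and every name in it (`Order`, `OrderLine`, `orders`, `order_lines`) are hypothetical, invented purely for illustration, and the in-memory SQLite database merely stands in for a real persistent store.

```python
import sqlite3
from dataclasses import dataclass, field

# Object shape: facts are nested, and behaviour lives with the data.
# (Hypothetical example domain, not taken from any real system.)
@dataclass
class OrderLine:
    product: str
    quantity: int
    unit_price: float

@dataclass
class Order:
    customer: str
    lines: list[OrderLine] = field(default_factory=list)

    def total(self) -> float:
        return sum(l.quantity * l.unit_price for l in self.lines)

order = Order("Acme", [OrderLine("widget", 3, 2.50),
                       OrderLine("gadget", 1, 10.00)])

# Relational shape: the same facts, split into normalised tables so that
# nothing is stored twice; the nesting becomes a foreign key.
db = sqlite3.connect(":memory:")  # in-memory only for illustration
db.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT NOT NULL);
    CREATE TABLE order_lines (
        order_id   INTEGER NOT NULL REFERENCES orders(id),
        product    TEXT    NOT NULL,
        quantity   INTEGER NOT NULL,
        unit_price REAL    NOT NULL
    );
""")
cur = db.execute("INSERT INTO orders (customer) VALUES (?)", (order.customer,))
order_id = cur.lastrowid
db.executemany(
    "INSERT INTO order_lines VALUES (?, ?, ?, ?)",
    [(order_id, l.product, l.quantity, l.unit_price) for l in order.lines],
)

# The behaviour that was a method on the object is now a query over rows.
(db_total,) = db.execute(
    "SELECT SUM(quantity * unit_price) FROM order_lines WHERE order_id = ?",
    (order_id,),
).fetchone()
```

Both halves hold exactly the same facts and arrive at the same total, yet neither reads much like the sentence a client would actually say, such as "an order is for one customer and lists some products at a price".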

It gets worse though… neither solution is very close to the original problem statement, which shuts out the domain expert. We actually have not a two-way problem, but a three-way one, played out between three roles:

  1. The Business Analyst or domain expert
  2. The Software Designer or architect
  3. The Data Manager

In general, none of these wants quite the same things or talks the same language as the others, and none really accepts the others’ views on things. Whoever you ask will point to another group as the origin of the communication problem. So we have a stand-off, and rocks get thrown in all directions. This is the most pernicious and costly communication problem in the software industry.

To get out of Vietnam, we have to create a language in which all three groups can be equally fluent, and which gives each group what they need. We need a language of facts which is at once formal and accessible, and which can be automatically and efficiently mapped to objects and to normalised database designs. It must reflect natural verbalisations, yet have an unambiguous meaning. It’s not UML or Barker ER notation. Think it sounds too hard? Come back to read the next instalment.