Out of Vietnam, Part 1.
Posted by Clifford Heath on October 21, 2007 at 10:44 PM
Object/Relational mapping has been called the Vietnam of Computer Science, meaning, I think, that it’s become an intractable problem that we never needed to get into in the first place. Actually, it was unavoidable, but there’s a way of hiding the problem, which is the subject of this series of articles.
The core of software design is expression; how do we express what we want a system to do, to be, and to achieve? It’s hard for software folk to think clearly about this. We’re conditioned by having problems handed out on a sheet of paper during our training. We’re taught to break them down, decompose them, by various methods. We worry and argue about the right way to go about decomposing problems.
In reality, we never receive problems fully-formed like this, and so we seldom have to decompose them. Instead, our clients witter on about how this should have one of those, and how a thing is on this list unless that condition holds… and we have to compose a system out of these fragmentary utterances. Composition and aggregation, not decomposition, is our main activity. In the process, we try to distil and create conceptual purity from the original communications.
In choosing how to aggregate things, we take various approaches. Object-Oriented practitioners group things mostly by shared behaviour. Database people struggle to avoid duplication while clustering things to maximise disk throughput and transactional reliability. In both cases, the attempt to maintain purity is moderated by the need to work within the bounds of physical computer hardware - main memory on the one hand, disk drives on the other. One is volatile, the other persistent. These two place very different constraints on the shape of an optimum solution. Both are based in the real world, so the problem is to some extent unavoidable.
It gets worse though… neither solution is very close to the original problem statement, which shuts out the domain expert. We actually have not a two-way problem, but a three-way one, played by three roles:
- The Business Analyst or domain expert
- The Software Designer or Architect
- The Data Manager
In general, none of these wants quite the same things or talks the same language as the others, and none really accepts the others’ views on things. Whoever you ask will point to another group as the origin of the communication problem. So we have a stand-off, and rocks get thrown in all directions. This is the most pernicious and costly communication problem in the software industry.
To get out of Vietnam, we have to create a language in which all three groups can be equally fluent, and which gives each group what they need. We need a language of facts which is at once formal and accessible, and which can be automatically and efficiently mapped to objects and to normalised database designs. It must reflect natural verbalisations, yet have an unambiguous meaning. It’s not UML or Barker ER notation. Think it sounds too hard? Come back to read the next instalment.
How to ruin a Rails project
Posted by Clifford Heath on October 18, 2007 at 10:37 PM
There are lots of ways to ruin any project. I’ve seen most of them over the last few decades, but this year I’ve been called in to salvage a series of Rails projects that were, well, off the rails, in some ways that may be special to Rails. So I’ll try to steer clear of the ordinary foul-ups, and focus on the ones that Rails seems to attract.
We have four months before the website is needed, and Rails is so productive that we don’t need to get started yet. We can deliver the specifications in a couple of months or so, and everyone will be ready to knock out the website in two weeks. Right. Let me know how that goes, ok?
Databases suck, no-one wants to write SQL, and I can’t do all my validations in it anyhow, so why should I do any? We’ll do things the Rails Way and put all that stuff in the code where it’s easy. After all, who needs a uniqueness constraint if the code always checks for an existing record before inserting a new one, right? Nothing can go wrong with that, can it?
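For anyone who really does wonder what can go wrong, here is a minimal sketch in plain Ruby. It is not Rails or ActiveRecord code: `UncheckedTable`, `ConstrainedTable`, `UniqueViolation` and `simulate_race` are hypothetical in-memory stand-ins, and the interleaving of the two requests is written out by hand rather than produced by real concurrency. It only illustrates why a check in application code does not replace a constraint enforced by the store itself.

```ruby
# Hypothetical stand-in for a table with no uniqueness constraint.
class UncheckedTable
  attr_reader :rows

  def initialize
    @rows = []
  end

  # The application-level "does it exist already?" check.
  def exists?(email)
    @rows.include?(email)
  end

  def insert(email)
    @rows << email
  end
end

# Raised by the stand-in store when a duplicate is inserted.
class UniqueViolation < StandardError; end

# Hypothetical stand-in for a table whose store enforces uniqueness itself.
class ConstrainedTable < UncheckedTable
  def insert(email)
    raise UniqueViolation, "duplicate: #{email}" if @rows.include?(email)
    super
  end
end

# Two requests interleave: both run the check before either one inserts.
# Returns how many copies of the email ended up stored.
def simulate_race(table, email)
  a_sees_free = !table.exists?(email) # request A checks
  b_sees_free = !table.exists?(email) # request B checks
  table.insert(email) if a_sees_free  # request A inserts
  begin
    table.insert(email) if b_sees_free # request B inserts
  rescue UniqueViolation
    # Request B is refused by the store and can report the conflict.
  end
  table.rows.count(email)
end
```

With the unchecked table both inserts succeed and the duplicate is stored; with the constrained one the second insert is refused, whatever the application code believed a moment earlier.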
Indexes? Add them after users complain that the site is too slow - even if it was obvious after a moment’s thought that they were always going to be needed. MySQL is so bad at optimizing queries that it might as well be forced to do the full table scans it was probably going to do anyway. And besides, it worked just fine with the 5 test records I put in the test fixtures manually.
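To put a rough number on the difference between five fixture rows and real data, here is a small sketch in plain Ruby. It assumes a Hash as a stand-in for an index (a real database index is usually a B-tree and behaves differently in detail), and `Order`, `build_orders`, `full_scan`, `build_index` and `indexed_lookup` are hypothetical names invented for the illustration.

```ruby
# Hypothetical row type standing in for a table of orders.
Order = Struct.new(:id, :customer_id)

# Build `count` rows spread over 500 customers.
def build_orders(count = 10_000)
  (1..count).map { |i| Order.new(i, i % 500) }
end

# Full scan: every row is examined, however few of them match.
# Returns the matching rows and the number of rows examined.
def full_scan(orders, customer_id)
  examined = 0
  hits = orders.select do |order|
    examined += 1
    order.customer_id == customer_id
  end
  [hits, examined]
end

# "Index": built once, it maps each customer_id to its rows.
def build_index(orders)
  orders.group_by(&:customer_id)
end

# Indexed lookup: goes straight to the matching rows.
def indexed_lookup(index, customer_id)
  index.fetch(customer_id, [])
end
```

On 10,000 rows, `full_scan(orders, 42)` examines all 10,000 to find 20 matches, while `indexed_lookup` returns the same 20 rows without touching the rest. With five fixture rows the two are indistinguishable, which is the trap.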
Performance doesn’t matter, so if the site is too slow, well, at least it was quick to develop. And when the client urgently needs a report that should take five seconds to produce, but because it’s a five-way join and you didn’t add any indexes it times out in Apache’s mod_proxy after the regulation five minutes, well, that’s why you turn your mobile phone off at night and ensure you can never be found online, right? That way you can get a good night’s sleep while the client is tearing out his hair and losing his business.
Foreign keys. You don’t need the database to enforce them if you get the code right. No need to actually take a look at the database from time to time to see whether the invariants your code is supposed to enforce actually hold. So when you later make administrative changes and delete records that other ones refer to, well, ActiveRecord is good about providing a nil that should do nothing, and if not, well, there’s always an exception catcher to tell you your mistake.
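If the database is not going to enforce the references, someone has to go and look. The following is a minimal sketch of that kind of audit in plain Ruby, using arrays of hashes as hypothetical stand-ins for two tables; `orphaned_orders` and `customer_name_for` are invented names, not part of Rails or ActiveRecord.

```ruby
# Return the orders whose customer_id no longer matches any customer row.
def orphaned_orders(orders, customers)
  known_ids = customers.map { |customer| customer[:id] }
  orders.reject { |order| known_ids.include?(order[:customer_id]) }
end

# Look up the parent row the way application code typically would.
# Returns nil when the parent has been deleted.
def customer_name_for(order, customers)
  customer = customers.find { |c| c[:id] == order[:customer_id] }
  customer && customer[:name]
end

customers = [{ id: 1, name: "Ann" }, { id: 2, name: "Bob" }]
orders    = [{ id: 10, customer_id: 1 }, { id: 11, customer_id: 2 }]

# An "administrative change" removes a customer that an order still refers to.
customers.reject! { |customer| customer[:id] == 2 }

orphans = orphaned_orders(orders, customers)
```

After the deletion, `orphans` holds order 11 and `customer_name_for(orders[1], customers)` quietly returns nil. A foreign key constraint in the database would have refused the delete in the first place.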
Oh, yes, exceptions. The Rails log is full of them, but they’re mostly from Chinese hackers trying to find hidden features, or irrelevant little deadlocks or races that made some user redo their work. No big deal, it only happens occasionally. No need to deploy one of the nice plugins that send you email when you get an exception, of course. That would just mean you’d have to go and find out why it happened, and Rails exists to reduce boring work.
If it works for one user, it’ll work for hundreds, won’t it? Transactions and locks are for banks, not for websites. And two-phase commit, that’s engagement and marriage, isn’t it, not something you’d use in a payment protocol? Oh, and I sprinkled a few magic Model.transaction {} blocks around the place, and they must work, because people who should understand such things said they work.
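Here is a small sketch of what "works for one user" can hide. It is plain Ruby, not a database transaction: the `Account` class, `StaleWrite` error and the two driver methods are hypothetical, the interleaving is written out by hand, and the guard shown is a simple optimistic version check chosen for illustration, not a claim about how any particular Rails feature behaves.

```ruby
# Raised when a record changed between being read and being written.
class StaleWrite < StandardError; end

# Hypothetical in-memory record with a version counter.
class Account
  attr_reader :balance, :version

  def initialize(balance)
    @balance = balance
    @version = 0
  end

  # Unprotected write: the last writer wins.
  def write(balance)
    @balance = balance
    @version += 1
  end

  # Write only if nobody has changed the record since it was read.
  def write_if_unchanged(balance, read_version)
    unless read_version == @version
      raise StaleWrite, "record changed since version #{read_version}"
    end
    write(balance)
  end
end

# Two requests read the same balance, then both write: one update is lost.
def lost_update
  account = Account.new(100)
  a_read = account.balance   # request A reads 100
  b_read = account.balance   # request B reads 100
  account.write(a_read - 30) # A withdraws 30
  account.write(b_read - 50) # B withdraws 50 and overwrites A's write
  account.balance
end

# Same interleaving, but the stale write is detected and retried.
def guarded_update
  account = Account.new(100)
  a_read, a_version = account.balance, account.version
  b_read, b_version = account.balance, account.version
  account.write_if_unchanged(a_read - 30, a_version)
  begin
    account.write_if_unchanged(b_read - 50, b_version)
  rescue StaleWrite
    # B re-reads the current state and tries again.
    account.write_if_unchanged(account.balance - 50, account.version)
  end
  account.balance
end
```

`lost_update` leaves a balance of 50 after 80 has been withdrawn from 100; `guarded_update` leaves 20. With a single user clicking around the test server, neither interleaving ever happens.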
Release management is for wimps. Just use the SVN trunk, and when you check in code, check it out on the test server, let the client look it over, then deploy it to production. No need even to log in to do that, just cap deploy - you can do it without getting out of your pyjamas. All your developers are demigods who never make mistakes anyhow, so if one on one side of the city deploys the other one’s code into production without even Skyping or picking up the phone, there won’t be any unforeseen interactions, will there now?
It was so easy to write, any fool can see it’s correct. TDD is fine for some slow thinkers, and we’re glad Rails makes it easy for them, but seriously, do you expect me to write 100 lines of code to test 50, when I can see perfectly well that there aren’t any errors in it? And besides, if there is an error, it’ll be a one-line fix. Barely even need to finish my latte first, it’ll be fixed in a moment. Not necessarily the moment before it makes the site melt down, but that’s what backups are for, right?
Hmm, backups. That would have been a good idea. That would have helped when, after discovering we hadn’t planned far enough ahead to see the one feature that was going to make all the difference on the big day, we let folk type data directly into the database using an unvalidated, unlogged administration feature. Pity they deleted the entire contents of a critical table… And even then, we might have been able to cobble together a script to reconstruct the transactions that were lost, except that the Rails log only lists the form parameters, not the saved session variables that form the context in which those parameters were relevant.
Discipline? Who needs discipline or forethought when you’re agile?