Thursday, January 03, 2008

Data accessibility

As a Commons committee reports that the recent loss of 25 million child benefit records might be just the tip of the iceberg, I'm still struggling to come to terms with the fact that this data was stored on an Access database and then dumped onto CDs and bunged in the internal mail.

25M records on an Access database... The last time I benchmarked Access on Windows against MySQL on Linux, MySQL ran 200 times faster. (MySQL is a FREE download for Windows as well as Unixen, and a database engine that's good enough for Google.) The method was simple: install Windows 2000 (it was a while ago), without tweaking, on a machine, suck in a database dump, then run and time a set of queries. Then install Linux (Debian) on the same machine, suck in the same database dump and run and time the same queries.
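For anyone wanting to repeat that kind of comparison, the "run and time a set of queries" step can be sketched roughly as below. This is only an illustration, not the original test: it uses Python's built-in SQLite as a stand-in for either engine, and the `claims` table, its contents and the two queries are made up for the example.

```python
import sqlite3
import time


def time_queries(conn, queries, repeats=3):
    """Return the best wall-clock time in seconds observed for each query."""
    timings = {}
    for sql in queries:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            conn.execute(sql).fetchall()  # fetch every row so the work is really done
            best = min(best, time.perf_counter() - start)
        timings[sql] = best
    return timings


# Hypothetical stand-in data; a real test would load the same dump into each engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO claims (region, amount) VALUES (?, ?)",
    [("R%d" % (i % 10), float(i % 97)) for i in range(10000)],
)
conn.commit()

queries = [
    "SELECT COUNT(*) FROM claims",
    "SELECT region, SUM(amount) FROM claims GROUP BY region",
]
results = time_queries(conn, queries)
for sql, seconds in results.items():
    print(f"{seconds:.6f}s  {sql}")
```

Pointing the same function at a connection to each database in turn, with the same data and the same queries, gives figures that can be compared directly.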

Access is old technology. If you're on Windows and are budget-sensitive, you can now use MSDE (a hobbled version of SQL Server), SQL Server or MySQL, all at reasonable or no cost. It makes absolutely no sense to use Access. Even if the skillset of your staff makes you hesitate to install a proper SQL database management system, ODBC will let them USE Access to connect to a backend server, so they can stay within their comfort zones while using appropriate technology.

Of course, the correct solution would be to commission a front-end that runs in a web browser, although the MTAS farce suggests that even this task, which is within the range of a twenty-year-old startup entrepreneur, would be managed incompetently by the government. And that's the problem. When an organisation can commission software that is vulnerable to query string modification, a security issue so incredibly basic that it isn't even regularly discussed (unlike cross-site scripting and SQL injection), it's really hard to suggest any course of action with confidence.
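To show how basic the query string problem is, here is a rough sketch of the general class of bug and its fix. It is not a reconstruction of any particular site: the `OWNERS` table, the `id` parameter and the user names are all invented. The point is simply that the server must check the requested record against the logged-in user, rather than trusting whatever number appears in the URL.

```python
from urllib.parse import parse_qs

# Hypothetical data: which applicant owns which application record.
OWNERS = {"1001": "alice", "1002": "bob"}


def requested_id(query_string):
    """Pull the record id out of a query string such as 'id=1001'."""
    return parse_qs(query_string).get("id", [None])[0]


def can_view(session_user, query_string):
    """Allow access only if the logged-in user owns the requested record."""
    record_id = requested_id(query_string)
    return record_id is not None and OWNERS.get(record_id) == session_user


print(can_view("alice", "id=1001"))  # True: her own record
print(can_view("alice", "id=1002"))  # False: she edited the query string
```

A vulnerable system is one that skips the check in `can_view` and serves whatever record the query string names.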

But a proper DBMS will allow other forms of access to the system to be developed, such as online access. This can be highly secure.

If an audit of the child benefit database was needed, and I'm sure it was, the auditors didn't need all the data; they needed access to all the data, which is a very different thing. They would have been using sampling techniques, not examining every number, and they could have done this online, perhaps through a VPN to enhance security.
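Sampling of that sort needs very little from the database. The sketch below is an assumption about how it might look rather than a description of any real audit process: it uses Python's built-in SQLite as a stand-in for the server, with an invented `records` table, and draws a reproducible random sample of rows over an ordinary connection instead of copying the whole table anywhere.

```python
import random
import sqlite3

# Hypothetical stand-in for the server-side table the auditors would query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO records (amount) VALUES (?)",
    [(float(i % 50),) for i in range(1000)],
)
conn.commit()


def sample_records(conn, k, seed):
    """Fetch k randomly chosen rows; the seed makes the draw repeatable."""
    ids = [row[0] for row in conn.execute("SELECT id FROM records")]
    chosen = random.Random(seed).sample(ids, k)
    placeholders = ",".join("?" * k)
    sql = f"SELECT id, amount FROM records WHERE id IN ({placeholders})"
    return conn.execute(sql, chosen).fetchall()


sample = sample_records(conn, 20, seed=2008)
print(len(sample), "rows sampled out of 1000")
```

Only the sampled rows ever leave the server. On a table of millions of rows the list of ids would be better drawn on the server side, but the principle is the same.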

So far as cost is concerned, I have set up similar systems for a few thousand pounds, and I'm not especially cheap.

But there's more. Using a proper DBMS gives a scalable, resilient system. MySQL supports both replication and clustering, even in the free version. It supports transactions, essential where multiple and mutually dependent entries have to be made (commonplace with any system that handles financial data).
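What transactions buy you can be shown in a few lines. This is a minimal sketch using Python's built-in SQLite as a stand-in (the `ledger` table, account names and amounts are invented); the same pattern applies to any transactional engine. Two mutually dependent entries are made together, and if the second fails, the first is undone as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ledger ("
    "account TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO ledger VALUES (?, ?)", [("payer", 100), ("payee", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Credit dst and debit src as one unit; return False if it was rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE ledger SET balance = balance + ? WHERE account = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE ledger SET balance = balance - ? WHERE account = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the debit would overdraw src, so the credit is undone too


def balances(conn):
    return dict(conn.execute("SELECT account, balance FROM ledger"))


print(transfer(conn, "payer", "payee", 30), balances(conn))
print(transfer(conn, "payer", "payee", 500), balances(conn))
```

The second transfer fails on the debit, and the credit that had already been applied is rolled back with it, so the books never show money arriving that never left.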

But our government uses software that was designed for, and belongs to the era of, the stand-alone PCs of the 1980s. This is incredibly old, inappropriate technology. Making data loss a crime would in itself represent a massive injustice under these circumstances. It would be like making sinking a crime for the captain of the Titanic.

Instead, it's quite clear that a complete overhaul of the government's IT systems is needed.

Hire a handful of people from any proper technology business for a six-month secondment and get it sorted out. This isn't a matter of money or the criminal law, so neither of those will cure the problem. It's an issue of competence, pure and simple.

2 comments:

Unknown said...

MySQL is damned fast for largely read-only access, but transaction support was an afterthought bolted on some time later. So what's wrong with PostgreSQL instead?

Peter Risdon said...

Nothing. PostgreSQL is excellent.