Thursday, 7 November 2013

Software Engineering Meets Politics

It is a feature of our times that a software development project can become a major rolling news item in the mainstream media. In the US, the problem-ridden launch of the "Obamacare" website has inevitably been used as a stick with which to beat the Affordable Care Act specifically, the president personally, and government generally. In the UK, the more modest mess that is the Universal Credit system is variously held up as proof of the incompetence of Iain Duncan Smith, the coalition, or government of every stripe. There are plenty of bloggers poring over the political significance of each failure (trivial, in my view), but I want to look at the way that software engineering itself has been dragged reluctantly centre-stage.

A little background first. The US Patient Protection and Affordable Care Act extends private health insurance to potentially 24 million people not covered by existing public schemes, such as Medicare (for the old and disabled) and Medicaid (for the poorest). It doesn't create a state insurance body (the "single payer" approach advocated by dangerous radicals), but instead requires regulated private insurers to offer state-wide policies with a common rate ("community rating"). In return, they get government subsidies to make premiums affordable. Enrolment is mandated - i.e. young, healthy adults cross-subsidise older or chronically ill adults. As there are multiple insurers, the scheme operates as a state-level marketplace or exchange, preserving the principle (or illusion) that the individual selects a private provider. To complicate matters, 23 states are creating their own exchanges (or already have them in place, e.g. Massachusetts), while 27 are defaulting to the federal exchange. On top of that, the exchange needs to interface with multiple other federal systems for the purposes of ID and entitlement verification. That's one big mutha.

The Act was passed in March 2010. The prime contractor, CGI Federal, was awarded the contract in December 2011. This time-lag probably reflects a combination of turf-wars in Washington, a volatile spec due to continuing Republican attempts at sabotage, the difficulty of creating a brief for such a complex system, and the intrinsic dysfunction of the government procurement process. If March 2010 was the moment the starting pistol was fired, the next Presidential election in November 2016 is the finishing-tape. If Obamacare isn't fully bedded-in by then, an incoming Republican President could conceivably abandon or gradually choke it. To be irreversible, Obamacare will need to have completed at least one successful cycle - a calendar year of coverage and claims - without major problems. This implies at least two full years of operation (i.e. year one problems solved for year two), hence go-live for 2014 was an immovable object.

There are reports that political manoeuvring meant the spec could not be nailed down until after Obama's re-election in November 2012, which meant development of the core system didn't commence till spring this year. In effect, a massive and complex systems integration project was to be implemented inside about 7 months. The central site was launched on the 1st of October. Americans wanting health insurance cover from the 1st of January need to have enrolled (and made a first payment) by the 15th of December, though enrolment remains open till the end of next March. It is clear that the system was buggy at launch, a fact conceded by the US government after three weeks as "glitches". This has led to the promise of a "tech surge" to resolve all issues by the end of November.

A number of commentators have greeted the "tech surge" announcement by reference to Brooks's Law: "adding manpower to a late software project makes it later". This has been subtly misinterpreted by Clay Shirky: "adding more workers tends to delay things, as the complexity of additional communications swamps the boost from having extra help". Fred Brooks's point was about attempts to speed up development, usually in order to hit a management-dictated deadline, and how this becomes counter-productive after a certain point - i.e. you can absorb a modest increment in labour midway, but not a doubling of heads at the eleventh hour. In a post-deployment situation, where you are trouble-shooting a semi-stable system, bringing in fresh eyes and novel thinking is actually a good idea, as the developers may be too close to the code (or determined to cover up errors), so citing Brooks's Law in respect of the "tech surge" is not insightful.
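For anyone who wants the arithmetic behind the "modest increment versus doubling" distinction, here is a rough sketch. It assumes the simplest model of communication overhead, in which every pair of people on a team is a potential channel, giving n(n-1)/2 channels for n people. The team sizes are hypothetical and have nothing to do with the actual contractors; the function name is mine.

```python
def communication_paths(team_size: int) -> int:
    """Number of distinct pairs in a team of the given size: n(n-1)/2."""
    return team_size * (team_size - 1) // 2


# Hypothetical team sizes: a modest increment versus a doubling of heads.
for before, after in [(10, 12), (10, 20)]:
    old = communication_paths(before)
    new = communication_paths(after)
    print(f"{before} -> {after} people: {old} -> {new} paths ({new / old:.1f}x)")
```

On this crude model, going from 10 to 12 people raises the number of channels by about half, while doubling to 20 more than quadruples it. It says nothing about the post-deployment case, where the newcomers are there to find faults rather than to share out the building work.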

Shirky, as a techno-booster, sees government as the problem: "The business of government, from information-gathering to service delivery, will be increasingly mediated by the internet". In other words, government must change to suit the technology. This belief is founded on the assumption that technology is apolitical. The corollary of this is that government simply can't do technology: "Everything about the way the government builds large technical projects contrasts unfavorably [to non-government], from specification to procurement, to hiring, to management". Strangely, the recently-revealed competence of the NSA and GCHQ in delivering massive projects has not altered this view, any more than the precedent of the Manhattan or Apollo projects did.

It's worth noting that the limited evidence available points to integration and infrastructure issues as much as code bugs. This would support the suspicion that the tight window for development precluded adequate integration, regression and load testing. In other words, the problems are a consequence of the politics: a complex healthcare system, eleventh-hour compromises on operation that affected the software design, and a ridiculously small window for development and rollout. It's a wonder they got anything out the door. In this light, Shirky's criticism is not just ill-informed, it is clearly an attempt to create a false dichotomy between technology and politics, with the implication that the former is always and everywhere corrupted by the latter ("let my people go!"). In reality, they are inextricably intertwined.

A couple of weeks ago, Rusty Foster in the New Yorker pointed to the apparent lack of a decent spec: "building a complex software product without a clear, fixed set of specifications is impossible ... [it] is like being told to build a skyscraper without any blueprints, while the client keeps changing the desired location of things like plumbing and wiring". This suggests that in terms of software engineering, Rusty was stuck in a daydream of chisel-jawed architects straight out of Mad Men (perhaps he imagines this is New Yorker house-style). A client who keeps changing their mind is the premise of agile software development. To be fair to Foster, it's pretty clear that this was a waterfall project, rather than an agile one, which is to be expected when dealing with myriad contractors keen to milk government, so the absence of a spec ("big design") should have set alarm bells ringing. To judge from the testimony of the contractors before Congress, they were unable to cope with the government's late changes to the plan and the skimpy "big testing" at the end of development (two weeks is mentioned). Whichever way you look at it, this suggests that they were the wrong choice as contractors.

Amusingly, Foster appears to have discovered agile within a week of his earlier article (possibly because of some helpful comments below the line), though he doesn't appear to fully understand it: "An agile version of the project, for example, might have first released just the log-in component, or 'front door', for public use, before developing any of the tools to find and buy insurance plans". Yeah, that'll work. A population, a sizable proportion of whom still believe in "death panels" and think Obama is Nigerian, will have no trouble carrying out end-user testing and providing structured feedback. This is priceless: "An agile would have evolved, over time, in incremental steps that were always subjected to real-world use and evaluation". Foster appears to have missed the point that incremental evolution and real-world use are precisely what has been happening since the 1st of October.

One consequence of the US debacle may be a PR boost for agile, but that may be a mistake. Not only can agile be a disaster if treated as another commodity bought from a clueless consultancy, or if the customer is incapable of being agile themselves (customer involvement is the sine qua non of agile success), but it isn't necessarily appropriate for a systems integration project, or at least not without significant tailoring. It's pretty clear that the claims originally made by the DWP that the Universal Credit system would be "agile" were specious. Not only was this another spaghetti project - partially consolidating lots of legacy systems into one "simplified" whole - but the lead contractors were the usual waterfall-friendly suspects: HP, Accenture, CapGemini and IBM. This, combined with the traditional failings of "weak management, ineffective control and poor governance", inevitably led to by-the-numbers failure: malfunctioning software and financial waste (aka big profits for the aforementioned contractors).

The system was going to be a partial failure come what may, largely because of the political constraints. I'm actually surprised it is working as well as it is, and I suspect that it will be operating at an acceptable level for most users by the end of the month. That said, there will be a long tail of "glitches" into next year. Some problems will not be shaken out until the end of the first full annual cycle. But, as noted above, the watershed will be a largely trouble-free 2015. Obama will be able to point to Obamacare as his signal policy achievement, and may even point to the way "we sorted it out" as evidence that government can get stuff done. His opponents will stop attacking the policy because of its popularity (just as they did with Medicare and Medicaid) and will point to the launch problems as evidence that government screws stuff up. Plus ça change.

The mini-me of Universal Credit will probably grind on and be quietly shelved after the 2015 general election. The neoliberals in Labour probably like the idea, but will want to rebrand and relaunch, which means happy days again for the big consultancies. If the Tories get back in, IDS will be gone and a change in priorities will marginalise the project. The ultimate aim was never simplification or a better service for claimants, but a reduction in the total welfare bill. There will be plenty of other ways to achieve the same goal, and plenty of other software "solutions" to punt. Meanwhile, software engineering is going to have to get used to the limelight.
