Saturday, September 21, 2013

Riding the bow wave

From Chapter 2: The Mythical Man-Month

Excerpt
Failure to allow enough time for system test, in particular, is peculiarly disastrous. Since the delay comes at the end of the schedule, no one is aware of schedule trouble until almost the delivery date. (page 20)

From "Acquisition Archetypes: The Bow Wave Effect"1 

Excerpt
"We don't compromise on schedule delivery date... we just kept dropping functionality... A growing mass of work had to be done at the end."

Brooks will repeatedly admonish us to allocate 50% of the schedule to test.2 I'm assuming he has the waterfall model in mind. If so, there is more to the test phase than just testing; the waterfall test phase includes the integration of separately-developed, 'finished' pieces. Typically these are big pieces. For a space system this integration will include the ground system command development infrastructure with the on-board command infrastructure, the command system with the telemetry system, the downlinked telemetry with prediction analysis tools, the onboard fault protection software with the power, attitude control, thermal and propulsion software...and that's just a sample. In the waterfall model, if all the previous steps were done correctly, everything should fit together like so many Lego blocks. That's the theory--a theory that seldom, if ever, works. Only the most unwitting manager would insist that software integration is simply a matter of testing properly implemented pieces. (Unlike sasquatch, they exist and manage NASA budgets!)

In practice the test effort usually exposes important gaps in development. Why? The two principal culprits: 'the bow wave' and 'interface management.'

The bow wave
A development bow wave is the unfinished work that has been pushed toward the end of a development cycle, where it bites you from behind.

Bow waves build because development is the process of discovery. There are always unanticipated problems: A broken library, a compiler bug, a design mismatch, an incorrect requirement, an error condition that emerges from the blue... The possibilities are endless.

These discoveries happen in the context of programmatic commitments made in the form of budget and schedule. The definition of a good team (i.e. one that functions without disruptive 'help' from management) is one that meets its commitments. Once you add in procrustean schedule notions like Earned Value Management (EVM), you have a bow-wave-generating recipe for delaying or modifying features for the sake of sunny-day progress reports. Meanwhile, the bow wave builds as the unanticipated collides with the commitment. It resembles a Ponzi scheme predicated on schedule slips or subsequent releases. Just like the Ponzi scheme, the piper must be paid, and payment inevitably comes during 'Test.' Consequently, what was 90% complete prior to test suddenly becomes 50% complete, and that hefty 'test' schedule becomes synonymous with good planning, albeit at an exorbitant price.

Interface management
A large system, like a space system, is built by a dozen-or-two teams, each working on separate functional pieces that must be integrated to work as a whole.3 Just how these pieces will fit together is typically captured in a document called an Interface Control Document (ICD) that describes the inputs and outputs of each piece. The ICD may also describe the correct steps for functional pieces to interact (often called a protocol).

As a rule, ICDs are seldom accurate and interfaces are tough to manage. Here are a few of the reasons why:
  • Interface change is fundamentally a serial activity. While an intended change may be documented, the actual implementation may be different. Consequently, the documentation may not reflect the actual artifact.
  • Implementers are busy. In the rush to a deadline, a developer will be hard pressed to record each change.
  • Auto documentation by document generators is no panacea. Distribution needs to be timely. If developers from other teams implement immature designs, the code may be whipsawed and the stability of the build will be jeopardized.
  • Over time, a software system will develop undocumented interfaces that make the system brittle. An interface change will break the system in unanticipated ways that only show up during test.
  • Interface and protocol documentation methods are squishy, if not misleading, and easily misinterpreted. Rigorous definitions, like those provided through formal methods, require a specialist with an advanced degree, narrowly focused skills and plenty of time. Except for the you're-lucky-to-get, all-star programmer, something better is needed for people doing the real work.
  • Bureaucratic organizations guard their interfaces--with good reason. To lose ownership is to lose funding; an interface change may cause a functional piece to slip from the organizational grasp. Consequently, inter-organizational interface changes typically require painful committee work among distrustful managers and a grievously painful negotiation. Meanwhile, the necessities of project requirements, budget and schedule will drive developers to build solutions that then plant the seeds of system ossification.
  • Interface improvements will be vetoed if they weaken an organization's role.
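The first and fourth reasons (documentation that no longer matches the artifact, and interfaces nobody documented) are the kind of drift a machine can at least detect early. Here is a minimal sketch, assuming the ICD and the implementation can each be reduced to a list of named fields with a type and a unit. The names `InterfaceField` and `find_drift`, and the telemetry fields in the example, are hypothetical and not taken from any real ICD format or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceField:
    """One input or output of a functional piece (hypothetical schema)."""
    name: str
    type_name: str
    unit: str = ""


def find_drift(documented, implemented):
    """Return readable discrepancies between the ICD and the actual artifact."""
    doc = {f.name: f for f in documented}
    impl = {f.name: f for f in implemented}
    problems = []
    for name in sorted(doc.keys() - impl.keys()):
        problems.append(f"{name}: documented but not implemented")
    for name in sorted(impl.keys() - doc.keys()):
        problems.append(f"{name}: implemented but undocumented")
    for name in sorted(doc.keys() & impl.keys()):
        d, i = doc[name], impl[name]
        if d != i:
            problems.append(
                f"{name}: documented as {d.type_name} [{d.unit}], "
                f"implemented as {i.type_name} [{i.unit}]"
            )
    return problems


# Illustrative telemetry packet: what the ICD says versus what was built.
icd = [
    InterfaceField("battery_voltage", "float32", "V"),
    InterfaceField("bus_temp", "int16", "degC"),
]
built = [
    InterfaceField("battery_voltage", "float32", "mV"),  # unit changed quietly
    InterfaceField("bus_temp", "int16", "degC"),
    InterfaceField("heater_state", "uint8"),             # never documented
]

if __name__ == "__main__":
    for problem in find_drift(icd, built):
        print(problem)
```

A check like this only catches mismatches in what has been written down in a comparable form. It does nothing for the squishy protocol descriptions or the organizational reasons in the list above.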

The result: problems with interface management usually swell the bow wave during the so-called test phase. Voilà, developers will be found busily 'fixing' problems late in the cycle by fixing interfaces and building new features. Test is hardly just testing.

Experienced software engineers know how to cope with interface change and the bow wave. Most who build software for a living figure this is just how things are done and ride the wave. There are, however, techniques like early integration and iterative development that are in common practice and go a long way toward mitigating schedule and budget risk. Call me a curmudgeon, but I don't believe these techniques scale to large system developments that cross organizational boundaries. The problem is not just software; it's software in the context of large organizations with parochial interests. I don't believe we know how to manage that.


1. "Acquisition Archetypes: The Bow Wave Effect". SEI white paper, 2007. http://www.sei.cmu.edu/library/assets/bowwave.pdf

2. For a bit of skepticism on my part, see Orders of Complexity.

3. I discussed some of the challenges that stem from functional decomposition in Humpty-Dumpty Effect.
