Tuesday, June 25, 2013

The last bug

From Chapter 1:

Excerpt
...finding nitty little bugs is just work... So testing drags on and on, the last difficult bugs taking more time to find than the first. (page 9)

Brooks continues his discussion of developers' "inherent woes" with a nod to the drudgery of testing. In the "Mythical Man-Month" chapter he'll assert that testing is the most underestimated part of a project and recommend that testing consume half the schedule.

As a software development manager, I was never able to allocate anything close to that amount of schedule for testing. For tasks that developed new software, I usually laid out a schedule with the following guidelines (a short sketch after the list shows how the weeks fit together):
  • A 6-month (i.e., 26-week) delivery boundary. A shorter schedule precluded the introduction of significant new features. A longer schedule was vulnerable to funding uncertainty.
  • A coding period of 16 weeks. Coding time drives other aspects of the schedule. The more coding, the more planning and testing. Longer schedules are too difficult to plan; too many details are yet to be discovered. Shorter schedules are prone to schedule slip from the inevitable surprises.
  • A planning period of 4 weeks with additional preliminary planning occurring during the test period for the previous delivery. In practice, this is almost always too little, but a longer period often leads to a development stall. (Topic for another posting)
  • An informal test period of 4 weeks. During this period there is no new feature development. The only code changes are bug fixes. The goal is to prepare the system for "formal" testing.
  • A formal test period of 4 weeks concluding with a "run for the record" or "acceptance test". In theory, but never in practice, there is minimal re-coding.
  • If a product is developed incrementally (i.e. in stages), the schedule is adjusted so that earlier builds have more planning and later builds have more testing.
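
As an aside, those numbers only close if planning overlaps the previous cycle: 4 + 16 + 4 + 4 is 28 weeks, two more than the 26-week boundary. Here is a minimal Python sketch of the cadence; the phase lengths come from the list above, but the two-week overlap of planning with the prior cycle's test period is my assumption, made only to illustrate how the phases can fit the boundary.

    # Hypothetical sketch of the delivery cadence described above.
    # Phase lengths are from the post; the 2-week overlap of planning
    # with the previous cycle's test period is an assumption, chosen
    # so that 4 + 16 + 4 + 4 = 28 weeks fits a 26-week boundary.

    PHASES = [
        ("planning", 4),       # preliminary planning begins during prior test
        ("coding", 16),
        ("informal test", 4),  # bug fixes only, no new features
        ("formal test", 4),    # ends with the "run for the record"
    ]

    PLANNING_OVERLAP = 2  # weeks shared with the prior cycle (assumed)

    def schedule(start_week=0):
        """Return (phase, begin, end) week offsets for one delivery cycle."""
        week = start_week - PLANNING_OVERLAP
        phases = []
        for name, length in PHASES:
            phases.append((name, week, week + length))
            week += length
        return phases

    for name, begin, end in schedule():
        print(f"{name:13} weeks {begin:3} to {end:3}")
    # The cycle ends at week 26, the delivery boundary.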

In practice, this outline often equated to a forced march. There was no respite between releases. My teams were often pushed to the limits of their endurance. Burn-out was always a serious concern, especially for teams who worked on infrastructure systems like ground systems. Fundamentally, this was a consequence of tight budgets and oversold expectations. Sadly, this is the norm for teams working on products for missions. I came to accept as fact that realistic budgets and schedules do not sell. If faced with the choice of being overworked or being laid off, which would you choose?

For the sake of discussion, let's say that NASA software development managers were routinely able to follow Brooks's advice and allocate half of the schedule to testing. Would that be enough? It depends. An experienced manager will consider the maturity of the product, the complexity of the system, the skill of the team, and the risk inherent in the application. Whatever the case, it's simply impossible to fully test any software product. There are two principal reasons:
  • Testing all the possible conditions of the system is a practical impossibility; even a modest system with 50 independent binary conditions has 2^50 (roughly 10^15) possible states. An experienced test lead will have an instinct for the tests that matter.
  • A space system cannot be tested in the actual operational environment. Here's a point of major departure from the systems Brooks managed. A space system can only be tested in simulations or analogous environments.
    For example, the MSL landing code was tested by running millions of simulations using high-fidelity models of the spacecraft and of Mars. Consider that challenge! Both the simulation code and the control system code have to be right. The simulation must be an accurate depiction of the vehicle and its environment. The control system must accurately analyze the simulated sensor data. Complementary errors must not correct for one another. (A toy sketch of this closed loop follows the list.)
It is simply not enough for the flight code to be bug free; the simulation must also accurately capture the physics.
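
To make the closed-loop idea concrete, here is a toy sketch in the same spirit. It is emphatically not MSL's actual test framework; every name, gain, and number in it is invented for illustration. A simulated "plant" propagates a one-dimensional powered descent, a controller commands thrust from the simulated sensor data, and a bug on either side would corrupt the verdict.

    # Toy closed-loop harness: 1-D powered descent. Illustrative only;
    # the dynamics, gains, and limits are invented, not MSL's.

    MARS_GRAVITY = 3.71  # m/s^2
    DT = 0.1             # integration step, seconds

    def plant(altitude, velocity, thrust_accel):
        """Simulation side: propagate the vehicle one time step.
        A wrong constant here can mask a controller bug (the
        'complementary errors' problem)."""
        accel = thrust_accel - MARS_GRAVITY
        velocity += accel * DT
        altitude += velocity * DT
        return altitude, velocity

    def controller(sensed_altitude, sensed_velocity):
        """Control side: command thrust acceleration from simulated
        sensor data; a simple proportional law targeting -1 m/s."""
        error = -1.0 - sensed_velocity
        thrust = MARS_GRAVITY + 2.0 * error             # feed-forward + P gain
        return max(0.0, min(thrust, 3 * MARS_GRAVITY))  # actuator limits

    def run_case(altitude, velocity):
        """One Monte Carlo case: fly to touchdown, report impact speed."""
        while altitude > 0.0:
            thrust = controller(altitude, velocity)
            altitude, velocity = plant(altitude, velocity, thrust)
        return velocity

    # A real campaign disperses initial conditions over millions of runs.
    for h0, v0 in [(100.0, -20.0), (150.0, -30.0), (80.0, -10.0)]:
        print(f"start {h0:.0f} m at {v0:.0f} m/s -> "
              f"touchdown at {run_case(h0, v0):.2f} m/s")

Notice the trap: if both the plant and the controller used the same wrong gravity constant, the touchdown numbers could still look fine. That is exactly the complementary-error problem, and it is why the simulation and the flight code must be verified independently.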

Over the years, I came to believe that teams have a reliably intuitive sense of product readiness. You might say the same of an author and her novel or a visual artist and his painting. It's somewhat miraculous. Somehow you just know.

Regrettably, this sort of intuition is insufficient for a management cadre steeped in the engineering catechism of rigor. 'Engineering' requires objective evidence, data and proof of rigor. So, as a practical matter, an elaborate charade of test completeness is concocted on paper to demonstrate to management's satisfaction that everyone is acting responsibly. After all, we must not fail.
