Saturday, December 21, 2013

Turning the system inside out

From Rattus rattus: A digression from the previous post

Excerpt
"Sponsors and upper management should not be exposed to development details even when those details drive cost and schedule." 
Santi di Tito - Niccolo Machiavelli's portrait
Niccolo Machiavelli (1469-1527)

From The Prince1

Excerpt
There are three different kinds of brains, the one understands things unassisted, the other understands things when shown by others, the third understands neither alone nor with the explanations of others. The first kind is most excellent, the second also excellent, but the third useless. (Chapter 22, page 104)


In the Rattus rattus posting, I listed a few "frustrated utterances of immutable facts" that adversely impact the lives of the NASA software development community.

The single greatest challenge I faced as a NASA software development manager was finding ways to communicate key decisions to management without delving into technical details. That's right. It was inevitably a mistake to get technical with our management. If we did, we would likely be derailed by tangential questions or hostile interlocutors.

This was a fact of life I never fully accepted. After all, I was working for NASA, the home of advanced technologies, the best and the brightest, the nation's stake in the future.

It was 1998. I was fresh blood, full of ideas; the "ancestral pieties"2 didn't apply. What I saw was very smart people working with decade-old technologies. Time to put the past behind. I'd been hired into a group with forward thinkers in leadership roles. We wanted to use the new tools: object-oriented languages, real-time operating systems with protected memory and compilers that supported generic programming. I was about to take part in building the next-generation space system using state-of-the-art software.

Our goals were at odds with a lunch-time scuttlebutt that was punctuated with aphorisms like "software is an evil necessity," or "there's never time to do it right; there's always time to do it over." As yet, I had no appreciation of the machinery that preserved the established order. I would get my first exposure soon enough.

The implementation phase of our project was about to start. The first major review was around the corner.3 The team had decided to adopt C++ as our programming language. We needed funds for new tools, infrastructure and training. We needed management buy-in. Our manager asked me to present a rationale for using a new language and not selecting Ada or C, the programming languages used on the last missions.4 I prepared a balanced, 20-slide deck with code examples that illustrated the benefits and the pitfalls of C++.5

Five slides into the pre-review walk-through, my manager sent me to the showers. I had too much detail. I had highlighted potential difficulties. By discussing issues that would interest a responsible software developer, I had unwittingly painted a picture of a disaster in the making. My boss warned me that my material would freak the project management and we would surely get 'help' we did not want. If you happen to be a software engineer working in a hardware-centric, government-sponsored bureaucracy, the last thing you need is 'help.'

I was learning. Slowly. There were bigger surprises ahead.

Since I was the new guy, I often reached out to the team's most-experienced programmers for advice. We were about to start programming, but I did not yet have adequate requirements. One day, over coffee, I started grousing to one of the senior guys about the lack of requirements. He smiled knowingly. "Our requirements were useless," he said. By requirements he meant "shall" statements. Given what I'd seen, I had to agree. My personal favorite awful requirement was, "The software shall not harm the hardware." "The Systems Engineers haven't a clue," he said, "we just have to figure it out ourselves."

"What about testing?" I asked. You need requirements to know what to test. "We test it," he said. He meant the programming team. "We show the testers what we did and they just repeat it. No value added." Suffice to say it is NOT considered a best practice for programmers to test their own code. Then there was this clincher. "All that counts is the code. If it's in the code, it's on the mission."

It so happened that this particular programmer was an expert user of the system he was building. He knew what the system should do and could build a usable system without requirements. Still, I was skeptical that his code would pass the delivery review. Surely there would be a reckoning. There wasn't. That review went something like this: "Code delivered on time and tested." His delivery was a rip-roaring success. Management was satisfied--there was no apparent cause for worry.

During the past decade, our development practices became more rigorous. The agency adopted a set of required processes for software.6

In spirit, these process mandates are reasonable. In practice, they levy a significant, typically unfunded, burden that produces a mountain of paper that describes the code and how it was built. So much paper that only a small portion of the documents are carefully read. An even smaller portion are treated to thoughtful analysis by an engineer with sufficient expertise to render a useful opinion. Nevertheless, these documents become an official record of engineering thoroughness--a certification of the quality of the code. When reviews roll around, management can conveniently meet their obligations by ticking down a list of required documents to see if any of their number is missing.

It's a very practical arrangement that has become settled convention. Developers are free to do their work without exposing coding details or any risks that might be associated with design choices. Managers are assured by the process machinery that all is in order without taking the trouble to understand the software. For much of the schedule, the project purrs along with happy-sounding Earned Value metrics until the predictable budget overruns and schedule slips (which the required process did nothing to alleviate) light up the FEVER charts with red and yellow like a Christmas tree.

For years I failed to abide by this unwillingness to understand the details of the software. However, as I assumed greater management responsibility, I came to appreciate, even accept, why software engineering was the red-headed stepchild on NASA projects. In the conventional view, a spacecraft is fundamentally a very complicated piece of hardware; it just happens to have some software inside. Managing a $400-500M enterprise is hard enough without getting bogged down in the minutiae of software piece parts. The spacecraft-as-hardware is a cultural mindset with roots that reach back to the Apollo era. The vestiges from that time live on in the project WBS and major milestone reviews. For example: a typical WBS places the flight software under the avionics subsystem, a bureaucracy away from the ground system software. Similarly, a typical three-day, 36-hour gateway review allows but a couple of hours for the discussion of the flight and ground software efforts.

And yet, in project after project, Brooks' 40-year-old admonitions reign supreme--software remains a persistently vexing management problem. The code is late and over budget. It doesn't do what it is supposed to do. There are bugs. The maintenance costs a fortune. There is always a plague of technical gotchas that resist a simple fix. The tools that worked for the last system no longer work. The explanations from the software people are arcane and bewildering. Is it any wonder that project management is distrustful of software when such a small portion of the budget repeatedly causes so much trouble?

I have been on the receiving end of this management skepticism. There is no good answer to questions like: "Why are you reinventing the wheel?" Or, "Are those changes really needed?" Technical reasons, no matter how good, sound defensive or seem to obfuscate. I've heard it on reliable authority that in their corner offices senior managers confide to each other that the software problem stems from a lack of discipline and a lackadaisical attitude about commitment. So when a crisis of budget or schedule beckons and a management decision is required, it's usually rendered with the preamble, "I don't know anything about software, but..." True enough. It's a decision made on the basis of mistrust without knowledge of the details.

If you've read elsewhere in this blog, you'll know I believe that the next-generation space system must be a software-intensive system and not a spacecraft-as-hardware system. In other words, the design and implementation of the software would become an overarching project concern that links power, propulsion, mass, attitude control, navigation, operational concepts and fault protection. This means turning the system concept inside out so that project leadership makes a priority of understanding the software and how it connects across the system. Until that happens, the development of a smart, reliable, affordable system capable of complex operations, human or robotic, will remain beyond the reach of space system engineering.

Still, the underlying problem of managing a large, complicated development effort remains. The leadership must be able to understand the software and still orchestrate the work of the many engineering efforts by the collaborating disciplines. No one can master all.

Management will need to have an intuition about the software to grasp which details matter. Intuition that only comes from the experience of writing code under deadline pressure for an unknown user. Code that is designed for change, reuse and longevity. The kind of code Brooks called a "programming systems product." Only then will a manager have the gut-wrenching experiences needed to understand why software development is not like a music box. It is a discovery process that varies with the maturity of the team, the tools and the product. Failing that experience, it's very unlikely a manager will have a reliable intuition.

To the best of my knowledge, there is not, nor has there been, a single senior manager in NASA who has worked as a professional programmer. After all, NASA is a mature, hardware-centric, government bureaucracy with an entrenched culture. Cultural adjustments are disruptive. Of all the challenges that face the Agency, introducing a software-centric focus may be the most difficult.

Whitehead famously writes about the advance of civilization through the effect of certain ideas. "An idea is a prophecy which procures its own fulfillment."7 A reworking of NASA to prepare for the development of a next-generation space system is not beyond the realm of possibility. The transition could be realized in a single administration by an enlightened, determined leadership. It has happened in the past when the agency was born. It could happen again.


1. Machiavelli, N., The Prince. Translated by Ricci, L. Revised by Vincent, E.R.P., Oxford University Press, World Classics. Reprinted 1968.
2. Nifty phrase lifted from A.N. Whitehead. "Adventures of Ideas". Free Press Paperback. 1933.
3. A Preliminary Design Review (PDR) that occurred in the spring of 1998.
4. The Cassini flight software was written in Ada. The flight software for Pathfinder was written in C.
5. C++ is a very powerful but difficult programming language because it is easy to make subtle errors that lead to bugs and performance issues. I personally had a dozen well-read books that provided programming guidance.
6. For a sample of the required NASA processes see the NASA Process web site.
7. A.N. Whitehead. "Adventures of Ideas". Free Press Paperback. 1933. Page 42.