Tuesday, February 12, 2019

Book Review: Tracy Kidder - The Soul of a New Machine

There are many, many project management books that purport to reveal the ultimate system for surmounting the myriad challenges to releasing a product on time, on budget, and with all the promised features. It's a popular and useful genre, even if much of the material is just reshuffled and rebranded old bromides, but sometimes the most helpful and memorable way to offer project management advice is to pick a single case study and dive in deep to explore the group dynamics that result - or don't - in a successful product. I'd previously read Mountains Beyond Mountains, Kidder's excellent profile of Dr. Paul Farmer, and this much earlier work, which won him a Pulitzer, is just as detailed, thoughtful, and revealing. It's about the race from 1978 to 1980 by one team of computer engineers at Data General to develop and release a 32-bit "minicomputer" (one of the many charmingly antiquated terms that will give those who know their industry history a smile), code-named "Eagle", in competition with another, more prestigious team that had been given a more glamorous project in a shiny new office, with the fate of the company looming in the background. Heroes and villains are the keys to great drama, and the protagonists here are stuck building a 32-bit extension of the company's existing 16-bit Eclipse line instead of the brand-new computer of their dreams that they imagine their counterparts are gleefully assembling. As the narrative follows them, their struggles to design, build, test, debug, and actually finish a computer without more hacks, kludges, and shortcuts than are absolutely unavoidable in such a short time take on a mythic glow that anyone working on a big project in the tech industry under a tight deadline will immediately recognize, despite the passage of nearly 40 years.

If I had to pick a single part of the book that best represents why it would make a worthy addition to a computer engineering syllabus, it would be the chapter "The Case of the Missing NAND Gate". It's an almost self-contained episode towards the end of the book, where, late in the development cycle, several engineers are attempting to debug an erratic logic failure, which occurs just often enough to indicate a real problem but not so often as to be easily reproducible. Kidder relays the team's efforts to determine whether this diagnostic failure is at root a software or a hardware issue, with an amusing layer of "antagonistic camaraderie" on top of their troubleshooting, as all of them had a hand in designing the machine and each wants to solve the problem but none wants the root cause to bear their fingerprints. This was back in the era when computer design involved the frequent use of oscilloscopes and it was often a genuine question whether chips on a board were properly spaced for optimal signal timing, so fans of vintage computing will really enjoy following along as Kidder walks the reader through the finer points of system caches, assembly microcode, page faults, and logic gates while various engineers, working in shifts, propose and reject theories to explain the anomaly. It's a genuine puzzle, and Kidder does a great job explaining just what the problem is and why it's so difficult to diagnose and eventually solve, relating the arcane technical details of the fault to the various components of the system architecture until the account is not just lucid but even enthralling. Here's his rendition of one potential explanation from an engineer named Guyer:

"The diagnostic program originally puts the target instruction at address 21765, and then, sometime later on, it moves the target instruction to 21766. But the IP never gets word of the change, though the System Cache does. Now, sometime after the target instruction is switched from mailbox 21765 to 21766, the program directs Gollum to execute the instruction at 21766. The IP receives this command and looks through its cache. It says to itself, in effect, 'Mailbox 21766? I've got that address and there's an instruction in it. Let's run it.' But in the I-cache, the target instruction is still at 21765, and mailbox number 21766 contains an error message. In short, the I-cache contains an outdated piece of memory. Why didn't it get updated along with the other parts of the memory system? Maybe, Guyer writes, the System Cache is to blame. The System Cache is supposed to know exactly what is in the I-cache. If an instruction or data gets moved to a new address, the System Cache is supposed to tell the IP to throw away the outdated mailbox and get the new one, the one with the target instruction in it. Somewhere back in the program, Guyer figures, the System Cache lost track of what was in the I-cache. It forgot that the IP had the target instruction in mailbox 21765, and so, when the change was made in the location of the target instruction, it never told the IP to get rid of the old, outdated mailbox. Guyer likes this hypothesis. He records it with mounting enthusiasm; and describing it later, he repossesses the feeling, speaking rapidly, gesturing with both hands. Then he stops, puts his hands on the table, and says, 'Of course, it was completely wrong.'"

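For readers who think better in code, Guyer's (ultimately wrong) hypothesis can be sketched as a toy model. To be clear, this is my own illustrative Python sketch and not Data General's actual design: the class names, the track_ip_contents flag, and the "TARGET"/"ERROR" strings are all assumptions I made up for the example, and only the addresses 21765 and 21766 come from Kidder's passage.

```python
class SystemCache:
    """Toy stand-in for the System Cache (hypothetical model, not the real hardware).

    It owns memory and keeps a record of which addresses it believes the
    instruction processor (IP) currently holds in its I-cache.
    """

    def __init__(self, track_ip_contents=True):
        self.memory = {}
        self.ip_holds = set()  # addresses the System Cache thinks the I-cache holds
        self.track_ip_contents = track_ip_contents  # False simulates "lost track"
        self.ip = None

    def read_for_ip(self, address):
        # Remember that the IP now has a copy, unless we are simulating the bug.
        if self.track_ip_contents:
            self.ip_holds.add(address)
        return self.memory[address]

    def write(self, address, value):
        self.memory[address] = value
        # Tell the IP to throw away its outdated mailbox, if we know it has one.
        if address in self.ip_holds:
            self.ip.invalidate(address)
            self.ip_holds.discard(address)


class InstructionProcessor:
    """Toy IP with a private I-cache that trusts whatever it already holds."""

    def __init__(self, system_cache):
        self.i_cache = {}
        self.system_cache = system_cache
        system_cache.ip = self

    def invalidate(self, address):
        self.i_cache.pop(address, None)

    def fetch(self, address):
        if address not in self.i_cache:
            self.i_cache[address] = self.system_cache.read_for_ip(address)
        return self.i_cache[address]


def run_diagnostic(track_ip_contents):
    sc = SystemCache(track_ip_contents)
    ip = InstructionProcessor(sc)
    sc.write(21765, "TARGET")  # target instruction starts at 21765
    sc.write(21766, "ERROR")   # 21766 initially holds an error message
    ip.fetch(21765)            # the IP caches both mailboxes
    ip.fetch(21766)
    sc.write(21766, "TARGET")  # the program moves the target to 21766
    return ip.fetch(21766)     # what does the IP actually execute?
```

Calling run_diagnostic(True) models a System Cache that does its bookkeeping, so the IP refetches mailbox 21766 and gets the target instruction; run_diagnostic(False) models the lost-track scenario Guyer imagined, where the IP happily runs the stale error message. The real fault, as Guyer admits, turned out to be something else entirely.
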
The book is also notable for broader reasons. Massachusetts was a much larger center of the technology industry in the 1970s and 80s than it is today, and the "Route 128" cluster competed directly with Silicon Valley for talent and prestige. However, the Eagle team's main antagonists were not in California but in North Carolina, giving the modern reader a glimpse in embryo of the "flight to the Sunbelt" that has helped the Research Triangle, among other places, at Massachusetts' relative expense. Data General was founded by former employees of Digital Equipment Corporation; I've read articles arguing that Massachusetts' relatively strict enforcement of noncompete agreements was a major force that drove tech firms to less strict jurisdictions, but that doesn't seem to have been as large an issue here as the typical lure of lower taxes. Prospective MBAs, meanwhile, should closely scrutinize the decision by corporate management to have two different teams working on overlapping products, as ultimately the highly regarded North Carolina team working on the prestigious brand-new 32-bit machine (dubbed "the Fountainhead Project", with hilarious irony) was upstaged by the "Eagle" team, whose less ambitious 32-bit extension of the 16-bit Eclipse became a huge moneymaker for the company. Now, hindsight is 20/20, and it's obviously impossible to consistently tell ex ante whether internal competition, which is often positive, will in the end have wasted resources. After all, the Eagle team did produce an extremely successful product, although we don't know how much was spent on the Fountainhead team. But lack of clear focus is always risky, and corporate politics can have damaging downstream effects on teams of even very smart people.

But any look into the subtleties of nerd psychology has to account for the fact that the drive to create cool technology is often far more powerful than any corporate folly, even and perhaps especially if that involves extremely long hours of hard work. Occasionally the concept of "mushroom management" is invoked, which turns out to mean "put 'em in the dark, feed 'em shit, and watch 'em grow", and one paradoxical upside of not being the top brass' favorite project is that, with protective leadership, it can actually mean more opportunity to produce. There's an interesting detail in the life story of Tom West, the top manager for the Eagle project: "He went to Amherst College, in western Massachusetts, where he studied the natural sciences. He did so without academic distinction, and it happened that Amherst was just then embracing a new Calvinist fad called the underachiever program: young men whose brains seemed much better than their grades were expelled for a year, so that they might improve their characters. At Amherst, certainly, and possibly in the entire nation, West became the first officially branded underachiever. It was something he'd always remember." This story takes place after the end of the naive cyberhippie movement of the Whole Earth Catalog/"All watched over by machines of loving grace" era, so technoromance had been firmly replaced by a more modern engineering sensibility. Even so, the poignancy of the ceremony at the end of the project, where the team members come to grips with how much of themselves they've put into what would be released as the Data General Eclipse MV/8000, elevates what could have been just an unusually lengthy product diary into an account of creation that justly deserved its Pulitzer. One of the engineers had a typical complaint:

"What a way to design a computer! 'There's no grand design,' thinks Rosen. 'People are just reaching out in the dark, touching hands.' Rosen is having some problems with his own piece of the design. He knows he can solve them, if he's just given the time. But the managers keep saying, 'There's no time.' Okay, Sure. It's a rush job. But this is ridiculous. No one seems to be in control; nothing's ever explained. Foul up, however, and the managers come at you from all sides. 'The whole management structure,' said Rosen. 'Anyone in Harvard Business School would have barfed.'"

Maybe, but the reason his project shipped and his rival's didn't was not that it had superior consultants from Harvard. As Kidder recounts from attending a trade conference: "It seemed to me that computers have been used in ways that are salutary, in ways that are dangerous, banal and cruel, and in ways that seem harmless if a little silly. But what fun making them can be!"
