Specifications as DNA, Code as RNA

Even in the paleolithic period of the PDP-7, Unix programmers had always been more prone than their counterparts elsewhere to treat old code as disposable. This was doubtless a product of the Unix tradition's emphasis on modularity, which makes it easier to discard and replace small pieces of systems without losing everything. Unix programmers have learned by experience that trying to salvage bad code or a bad design is often more work than rebooting the project. Where in other programming cultures the instinct would be to patch the monster monolith because you have so much work invested in it, the Unix instinct is usually to scrap and rebuild.

The IETF tradition reinforced this by teaching us to think of code as secondary to standards. Standards are what enable programs to cooperate; they knit our technologies into wholes that are more than the sum of the parts. The IETF showed us that careful standardization, aimed at capturing the best of existing practice, is a powerful form of humility that achieves more than grandiose attempts to remake the world around a never-implemented ideal.

After 1980, the impact of that lesson was increasingly widely felt in the Unix community. Thus, while the ANSI/ISO C standard from 1989 is not completely without flaws, it is exceptionally clean and practical for a standard of its size and importance. The Single Unix Specification contains fossils from three decades of experimentation and false starts in a more complicated domain, and is therefore messier than ANSI C. But the component standards it was composed from are quite good; strong evidence for this is the fact that Linus Torvalds successfully built a Unix from scratch by reading them. The IETF's quiet but powerful example created one of the critical pieces of context that made Linus Torvalds's feat possible.

Respect for published standards and the IETF process has become deeply ingrained in the Unix culture; deliberately violating Internet STDs is simply Not Done. This can sometimes create chasms of mutual incomprehension between people with a Unix background and others prone to assume that the most popular or widely deployed implementation of a protocol is by definition correct — even if it breaks the standard so severely that it will not interoperate with properly conforming software.

The Unix programmer's respect for published standards is more interesting because he is likely to be rather hostile to a priori specifications of other kinds. By the time the ‘waterfall model’ (specify exhaustively first, then implement, then debug, with no reverse motion at any stage) fell out of favor in the software-engineering literature, it had been an object of derision among Unix programmers for years. Experience, and a strong tradition of collaborative development, had already taught them that prototyping and repeated cycles of test and evolution are a better way.

The Unix tradition clearly recognizes that there can be great value in good specifications, but it demands that they be treated as provisional and subject to revision through field experience in the way that Internet-Drafts and Proposed Standards are. In best Unix practice, the documentation of the program is used as a specification subject to revision analogously to an Internet Proposed Standard.


In contrast with other environments, in Unix development the documentation is often written before the program, or at least in conjunction with it. For X11, the core X standards were finished before the first release of X and have remained essentially unchanged since that date. Compatibility among different X systems is improved further by rigorous specification-driven testing.

The existence of a well-written specification made the development of the X test suite much easier. Each statement in the X specification was translated into code to test the implementation; a few minor inconsistencies were uncovered in the specification during this process, but the result is a test suite that covers a significant fraction of the code paths within the sample X library and server, and all without referring to the source code of that implementation.

-- Keith Packard  
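
To make the technique Keith describes concrete, here is a minimal sketch of a specification-driven check in the same spirit. It is not taken from the real X test suite; it simply takes one documented guarantee (paraphrasing the Xlib descriptions of XInternAtom and XGetAtomName: interning a name yields a valid atom, interning the same name again yields the same atom, and XGetAtomName maps the atom back to its name) and turns it into a pass/fail test that never consults the library or server source.

    /* spec_check.c -- illustrative only, not part of the real X test suite.
     * One specification statement turned into a conformance check:
     * interning an atom name twice must yield the same Atom, and
     * XGetAtomName must map that Atom back to the original string.
     * Typical build: cc spec_check.c -o spec_check -lX11
     */
    #include <X11/Xlib.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        Display *dpy = XOpenDisplay(NULL);   /* spec: returns NULL on failure */
        if (dpy == NULL) {
            fprintf(stderr, "UNTESTED: no X display available\n");
            return 77;                       /* a common "test skipped" status */
        }

        Atom a1 = XInternAtom(dpy, "WM_NAME", False);
        Atom a2 = XInternAtom(dpy, "WM_NAME", False);
        char *name = XGetAtomName(dpy, a1);

        int ok = (a1 != None) && (a1 == a2) &&
                 name != NULL && strcmp(name, "WM_NAME") == 0;
        printf("%s: XInternAtom/XGetAtomName behave as specified\n",
               ok ? "PASS" : "FAIL");

        if (name != NULL)
            XFree(name);
        XCloseDisplay(dpy);
        return ok ? 0 : 1;
    }

The check is written entirely from the specification's point of view; any X library and server pair that honors the documented protocol should pass it unmodified.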

Semiautomation of the test-suite generation proved to be a major advantage. While field experience and advances in the state of the graphics art led many to criticize X on design grounds, and various portions of it (such as the security and user-resource models) came to seem clumsy and over-engineered, the X implementation achieved a remarkable level of stability and cross-vendor interoperation.

In Chapter 9 we discussed the value of pushing coding up to the highest possible level to minimize the effects of constant defect density. Implicit in Keith Packard's account is the idea that the X documentation constituted no mere wish-list but a form of high-level code. Another key X developer confirms this:


In X, the specification has always ruled. Sometimes specs have bugs that need to be fixed too, but code is usually buggier than specs (for any spec worth its ink, anyway).

-- Jim Gettys  

Jim goes on to observe that X's process is actually quite similar to the IETF's. Nor is its utility limited to constructing good test suites; it means that arguments about the system's behavior can be conducted at a functional level with respect to the specification, avoiding too much entanglement in implementation issues.


Having a well-considered specification driving development allows for little argument about bug vs. feature; a system which incorrectly implements the specification is broken and should be fixed.

I suspect this is so ingrained into most of us that we lose sight of its power.

A friend of mine who worked for a small software firm east of Bellevue wondered how Linux applications developers could get OS changes synchronized with application releases. In that company, major system-level APIs change frequently to accommodate application whims, and so essential OS functionality must often be released along with each application.

I described the power held by the specifications and how the implementation was subservient to them, and then went on to assert that an application which got an unexpected result from a documented interface was either broken or had discovered a bug. He found this concept startling.

Discerning such bugs is a simple matter of verifying the implementation of the interface against the specification. Of course, having source for the implementation makes that a bit easier.

-- Keith Packard  
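
The same verification can be applied to any documented interface, not just X. As a minimal illustration (not drawn from any official conformance suite), C99 and the Single Unix Specification document that snprintf returns the number of characters the complete output would have contained, even when the buffer is too small; an implementation that returns the truncated length instead is broken, and no amount of market share changes that verdict.

    /* Illustrative check of one documented guarantee of snprintf():
     * the return value is the length the complete output would have had,
     * not the number of bytes actually stored in the truncated buffer.
     */
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char buf[4];
        int n = snprintf(buf, sizeof buf, "%s", "hello");

        /* Per the spec: n == 5, and buf holds "hel" plus the terminator. */
        if (n == 5 && strcmp(buf, "hel") == 0) {
            printf("PASS: snprintf matches its specification\n");
            return 0;
        }
        printf("FAIL: got return value %d, buffer \"%s\"\n", n, buf);
        return 1;
    }

(Some older C libraries notoriously returned -1 on truncation; that is exactly the kind of divergence such a check exposes.)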

This standards-come-first attitude has benefits for end users as well. While that no-longer-small company east of Bellevue has trouble keeping its office suite compatible with its own previous releases, GUI applications written for X11 in 1988 still run without change on today's X implementations. In the Unix world, this sort of longevity is normal — and the standards-as-DNA attitude is the reason why.

Thus, experience shows that the standards-respecting, scrap-and-rebuild culture of Unix tends to yield better interoperability over extended time than perpetual patching of a code base without a standard to provide guidance and continuity. This may, indeed, be one of the most important Unix lessons.

Keith's last comment brings us directly to an issue that the success of open-source Unixes has brought to the forefront — the relationship between open standards and open source. We'll address this at the end of the chapter — but before doing that, it's time to address the practical question of how Unix programmers can actually use the tremendous body of accumulated standards and lore to achieve software portability.