Two traps to avoid

Some of the IPC methods we've discussed in this chapter are historical fossils. While BSD-style sockets over TCP/IP have become something like a universal IPC method, there are still live controversies over the right way to partition by multiprogramming. We'll take a brief look at two techniques, remote procedure calls and threads, that have been imported to the Unix world but, for good reasons, don't flourish here.

Despite occasional exceptions such as NFS and the GNOME project, attempts to import CORBA, ASN.1, and other forms of remote-procedure-call interface have largely failed — these technologies have not been naturalized into the Unix culture.

There seem to be several underlying reasons for this. One is that RPC interfaces are not readily discoverable; that is, it is difficult to query these interfaces for their capabilities, and difficult to monitor them in action without building one-off tools as complex as the programs being monitored (we examined some of the reasons for this in Chapter 6 (Transparency)). Another is that RPC interfaces have the same version-skew problems as libraries, but those problems are harder to track because they're distributed and not generally obvious at link time.

The usual argument for RPC is that it permits “richer” interfaces than methods like text streams; that is, interfaces with a more elaborate and application-specific ontology of data types. But the Rule of Simplicity applies! We observed in Chapter 4 (Modularity) that one of the functions of interfaces is as choke points that prevent the implementation details of modules from leaking into each other. A richer interface is a wider choke point, through which more of each module's internal data model can leak. Therefore, the argument in favor of RPC is also an argument that it increases global complexity rather than minimizing it.

As a related issue, interfaces that have richer type signatures also tend to be more complex, therefore more brittle. Structs are more likely to mismatch than strings, and if the type ontologies of the programs on each side don't exactly match it can be very hard to teach them to communicate at all — and fiendishly difficult to resolve bugs.
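The mismatch problem can be sketched in a few lines of Python. Everything here is a hypothetical illustration: the record fields, the two binary layouts, and the `key=value` text format are assumptions made up for the example, not part of any real protocol. The sender and receiver disagree about the layout of one record; the binary decode succeeds but silently yields a wrong value, while the text decode still recovers the fields the receiver understands.

```python
import struct

# Hypothetical wire layouts. The sender's build has a 16-bit flags field
# followed by a 16-bit size; the receiver's older build expects only a
# 32-bit size after the id. Both layouts happen to be 8 bytes long, so
# unpacking raises no error.
SENDER_LAYOUT = "<iHH"    # id, flags, size
RECEIVER_LAYOUT = "<ii"   # id, size


def send_binary(rec_id, flags, size):
    return struct.pack(SENDER_LAYOUT, rec_id, flags, size)


def receive_binary(payload):
    rec_id, size = struct.unpack(RECEIVER_LAYOUT, payload)
    return {"id": rec_id, "size": size}


def send_text(rec_id, flags, size):
    return f"id={rec_id} flags={flags} size={size}\n"


def receive_text(line):
    fields = dict(pair.split("=", 1) for pair in line.split())
    # The older receiver only knows about id and size; unknown keys are ignored.
    return {"id": int(fields["id"]), "size": int(fields["size"])}


if __name__ == "__main__":
    print(receive_binary(send_binary(7, 1, 512)))  # size is silently wrong
    print(receive_text(send_text(7, 1, 512)))      # size is read as 512
```

The text form is also what makes the exchange easy to inspect: the line on the wire can be read by eye or produced by hand when debugging.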

With classical RPC, it's too easy to do things in a complicated and obscure way instead of keeping them simple. RPC seems to encourage the production of large, baroque, over-engineered systems with obfuscated interfaces, high global complexity, and serious version-skew and reliability problems — a perfect example of thick glue layers run amok.

Windows DCOM and COM+ are perhaps the archetypal examples of how bad this can get, but there are plenty of others. Apple abandoned OpenDoc, and both CORBA and the once wildly hyped Java RMI have receded from view in the Unix world as people have gained field experience with them. This may well be because these methods don't actually solve more problems than they cause.

Andrew S. Tanenbaum and R. van Renesse have given us a detailed analysis of the general problem in A Critique of the Remote Procedure Call Paradigm [Tanenbaum&vanRenesse], a paper which should serve as a strong cautionary note to anyone considering an architecture based on RPC.

All these problems may predict long-term difficulties for the relatively few Unix projects that use RPC. Of these projects, perhaps the best known is the GNOME desktop effort[57]. These problems also contribute to the notorious security vulnerabilities of exposed NFS servers.

Unix tradition, on the other hand, strongly favors transparent and discoverable interfaces. This is one of the forces behind the Unix culture's continuing attachment to IPC via textual protocols. It is often argued that the parsing overhead of textual protocols is a performance problem relative to binary RPCs — but RPC interfaces tend to have latency problems that are far worse, because (a) you can't readily anticipate how much data marshaling and unmarshaling a given call will involve, and (b) the RPC model tends to encourage programmers to treat network transactions as cost-free. Adding even one additional round trip to a transaction interface tends to add network latency that swamps any overhead from parsing or marshaling.

Even if text streams were less efficient than RPC, the performance loss would be marginal and linear, the kind better addressed by upgrading your hardware than by expending development time or adding architectural complexity. Anything you might lose in performance by using text streams, you gain back in the ability to design systems that are simpler: easier to monitor, to model, and to understand.

Today, RPC and the Unix attachment to text streams are converging in an interesting way, through protocols like XML-RPC and SOAP. While these don't solve all of the more general problems pointed out by Tanenbaum and van Renesse, they do in some ways combine the advantages of both text-stream and RPC worlds.

Though Unix developers have long been comfortable with computation by multiple cooperating processes, they do not have a native tradition of using threads (processes that share their entire address spaces). These are a recent import from elsewhere, and the fact that Unix programmers generally dislike them is not merely accident or historical contingency.

From a complexity-control point of view, threads are a bad substitute for lightweight processes with their own address spaces; the idea of threads is native to operating systems with expensive process-spawning and weak IPC facilities.

By definition, though daughter threads of a process typically have separate local-variable stacks, they share the same global memory. The task of managing contentions and critical regions in this shared address space is quite difficult and a fertile source of global complexity and bugs. It can be done, but as the complexity of one's locking regime rises, the chance of races and deadlocks due to unanticipated interactions rises correspondingly.
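As a minimal sketch of what managing a critical region looks like, assuming Python's standard `threading` module, the following guards a single shared counter with one lock. The names `counter`, `add_many`, and `run` are invented for the illustration. Even this smallest case needs explicit locking around the read-modify-write; a real program multiplies this across every shared structure.

```python
import threading

counter = 0                      # global shared by every thread in the process
counter_lock = threading.Lock()  # guards the read-modify-write on counter


def add_many(times):
    global counter
    for _ in range(times):
        # Without the lock, two threads could read the same old value and
        # one of the increments would be lost.
        with counter_lock:
            counter += 1


def run(workers=4, times=10_000):
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(times,))
               for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(run())
```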

Threads are a fertile source of bugs because they can too easily know too much about each other's internal states. There is no automatic encapsulation, as there would be between processes with separate address spaces that must do explicit IPC to communicate.
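The encapsulation that separate address spaces provide can be seen in a small sketch, assuming a Unix system where Python's `os.fork` and `os.pipe` are available. The `state` dictionary and the `count=` message format are hypothetical. The child changes its own copy of the data; the parent's copy is untouched, and the only way the parent learns the result is by reading what the child explicitly wrote to the pipe.

```python
import os

state = {"count": 0}  # ordinary module-level data


def child_increments_and_reports():
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: works on its own copy of the address space.
        try:
            os.close(read_fd)
            state["count"] += 1
            os.write(write_fd, f"count={state['count']}\n".encode())
            os.close(write_fd)
        finally:
            os._exit(0)  # never fall through into the parent's code path
    # Parent: the child's change is invisible here except through the pipe.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        message = reader.readline().strip()
    os.waitpid(pid, 0)
    return state["count"], message


if __name__ == "__main__":
    print(child_increments_and_reports())
```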

Thread developers have been waking up to this problem; recent thread implementations and standards show an increasing concern with providing thread-local storage, which is intended to limit problems due to the shared global address space. As threading APIs move in this direction, thread programming starts to look more and more like a controlled use of shared memory.
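Thread-local storage can be illustrated with Python's `threading.local`, used here as a stand-in for the facilities the text describes; the names `shared`, `per_thread`, and `worker` are made up for the example. Each thread gets a private view of `per_thread`, while the ordinary `shared` dictionary is overwritten by whichever thread runs last.

```python
import threading

shared = {"name": None}          # visible to, and writable by, every thread
per_thread = threading.local()   # each thread sees only its own attributes


def worker(name, results):
    shared["name"] = name        # last writer wins; threads overwrite each other
    per_thread.name = name       # private to this thread
    # results is deliberately shared; each thread writes only its own key.
    results[name] = per_thread.name


def run(names=("a", "b", "c")):
    results = {}
    threads = [threading.Thread(target=worker, args=(n, results))
               for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The calling thread never set per_thread.name, so it has no such attribute.
    return results, hasattr(per_thread, "name")


if __name__ == "__main__":
    print(run())
```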

 

Threads often prevent abstraction. In order to prevent deadlock, you often need to know how and if the library you are using uses threads in order to avoid deadlock problems. Similarly, the use of threads in a library could be affected by the use of threads at the application layer.

 
--David Korn  

To add insult to injury, threading has performance costs that erode its advantages over conventional process partitioning. While threading can get rid of the overhead of rapidly switching process contexts, locking shared data structures so threads won't step on each other can be just as expensive.

 

The X server, able to execute literally millions of ops/second, is not threaded; it uses a poll/select loop. Various efforts to make a multithreaded implementation have come to no good result. The costs of locking and unlocking get too high for something as performance-sensitive as graphics servers.

 
--Jim Gettys 

This problem is fundamental; it has also been a continuing issue in the design of Unix kernels for symmetric multiprocessing. As your resource-locking gets finer-grained, latency due to locking overhead can increase fast enough to swamp the gains from locking less core memory.

Accordingly, while we should seek ways to break up large programs into simpler cooperating processes, the use of threads within processes should be a last resort rather than a first. Often, you may find you can avoid them with techniques like asynchronous I/O using SIGIO, or via shared memory and System V semaphores.

Keep it simple. If you can use limited shared memory, SIGIO, or poll(2)/select(2) rather than threading, do it that way.
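As a rough sketch of the poll(2)/select(2) style, the following uses Python's standard `selectors` module to serve several connections from one thread with no locks. The upper-casing "service", the `serve` and `demo` names, and the use of `socket.socketpair` in place of real network clients are all assumptions chosen to keep the example self-contained.

```python
import selectors
import socket


def serve(server_ends):
    """Echo upper-cased data back on every connection: one thread, no locks."""
    sel = selectors.DefaultSelector()
    for sock in server_ends:
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ)
    open_count = len(server_ends)
    while open_count:
        for key, _ in sel.select():
            sock = key.fileobj
            data = sock.recv(4096)
            if data:
                # Fine for tiny replies; a real server would buffer writes
                # and wait for the socket to become writable.
                sock.sendall(data.upper())
            else:                      # peer closed its sending side
                sel.unregister(sock)
                sock.close()
                open_count -= 1
    sel.close()


def demo():
    pairs = [socket.socketpair() for _ in range(2)]
    clients = [c for c, _ in pairs]
    servers = [s for _, s in pairs]
    clients[0].sendall(b"hello\n")
    clients[1].sendall(b"world\n")
    for c in clients:
        c.shutdown(socket.SHUT_WR)     # signal end of requests
    serve(servers)
    replies = [c.recv(4096) for c in clients]
    for c in clients:
        c.close()
    return replies


if __name__ == "__main__":
    print(demo())
```

Because only one flow of control ever touches the connection state, there is nothing to lock; the cost is that each handler must return quickly so the loop can get back to `select`.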

One final difficulty with threads is that threading standards still tend to be weak and underspecified (as of mid-2003). Theoretically conforming libraries for Unix standards such as POSIX threads (1003.1c) can nevertheless exhibit alarming differences in behavior across platforms, especially with respect to signals, interactions with other IPC methods, and resource cleanup times. Windows and classic MacOS have native threading models and interrupt facilities quite different from Unix's and will often require considerable porting effort even for simple threading cases. The upshot is that you cannot count on threaded programs to be portable.

The combination of threads, remote-procedure-call interfaces and heavyweight object-oriented design is especially dangerous. Used sparingly and tastefully, any of these techniques can be valuable — but if you are ever invited onto a project that is supposed to feature all three, fleeing in terror might well be an appropriate reaction.

We have previously observed that programming in the real world is all about managing complexity. Tools to manage complexity are good things. But when the effect of those tools is to proliferate complexity rather than controlling it, we would be better off throwing them away and starting from zero. An important part of the Unix wisdom is to never forget this.



[57] GNOME's main competitor, KDE, started with CORBA but abandoned it in their 2.0 release. They have been on a quest for lighter-weight IPC methods ever since.