Friday, July 29, 2005 

Observability

Software design is about -ilities: extendibility, reusability, scalability, etc. As designers, we recognize that software is not simply about function but is also about form. We shape our software to obtain the specific -ilities that we consider important for the project. This is, by the way, one of the reasons why I’m highly skeptical about some XP precepts like building the easiest thing and then refactoring: software architecture is much more than piecemeal improvements over simplistic code.
A property that is seldom considered in OOP/OOD is observability: by this, I mean the possibility of (easily) understanding the system state by observing, for instance, a memory dump. Object-oriented programs are often structurally more complex than procedural programs. More exactly, they have a more complex static and dynamic structure. Their run-time state is frequently shaped as a relatively complex network of interconnected, fine-grained objects. Directly observing a memory dump is not trivial; changing data on the fly, on a running system, is even harder.
Observability is not a fundamental requirement for most systems. However, high-availability, critical systems may benefit from observability. A very common strategy when building those systems is to move all the data to a well-known, static global area. Not only is dynamic allocation banned (as it often is on critical systems), but in extreme cases local, stack-based data is avoided as well, so that all the program state is easily observable. Diversion: it has also been suggested (although I definitely do not agree :-) that this strong separation of code and data leads to easier maintainability: see "The Separation Principle: A Programming Paradigm" by Yasushi Kambayashi and Henry F. Ledgard, IEEE Software, March/April 2004.
Recently, I've been dealing with a system where observability was considered important. The team was willing to use object-oriented techniques (C++), and in theory they could even reuse several portions of code from a larger system. However, in the large system observability wasn't considered important, and the original developers used dynamic allocation in many places. The team was then confronted with a choice: reuse existing, debugged, known code, but give up observability (and accept dynamic allocation as well), or copy/paste the code and change it to use only global variables, possibly introducing bugs.
Here is where the power of C++ shows up immediately (assuming we don't give up too easily and we know the language): we can have observability and reusability, sharing code with systems doing dynamic allocations. All we have to do in the small, observable system is:
- create a structure for the global state: this allows easy observability.
- in this structure, provide enough space for all the entities you need at run-time.
- overload operator new for the different classes that are dynamically instantiated. Operator new should just return the address of one existing slot in the global state. You know in advance how many objects you need (otherwise, you won't have an observable system anyway), so you’re sure that there will be a free slot when new is called.
That's pretty much it, since most embedded systems never release memory: they build their structures at startup, and keep them alive forever.
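The three steps above can be sketched roughly as follows. This is only a minimal illustration, not the code from the project I mentioned: the class Sensor, the structure GlobalState, the variable g_state and the limit kMaxSensors are all hypothetical names, and I assume that objects are never released and that the maximum number of instances is known at build time. In a class shared with the large system, the operator new declaration would have to be enabled only in the observable build (for instance through conditional compilation), which is not shown here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class shared with the large system. Only the small,
// observable build would provide the class-specific operator new below.
class Sensor {
public:
    explicit Sensor(int id) : id_(id), lastReading_(0) {}
    int id() const { return id_; }
    void record(int value) { lastReading_ = value; }
    int lastReading() const { return lastReading_; }

    // Class-specific allocation: returns a slot inside the global state.
    static void* operator new(std::size_t size);
    // Objects live forever in this sketch, so releasing is a no-op.
    static void operator delete(void*) noexcept {}

private:
    int id_;
    int lastReading_;
};

// Assumption: the maximum number of objects is known in advance.
constexpr std::size_t kMaxSensors = 4;

// Step 1 and 2: one structure for the whole run-time state, with enough
// room for every entity needed at run-time.
struct GlobalState {
    alignas(Sensor) unsigned char sensorSlots[kMaxSensors][sizeof(Sensor)];
    std::size_t sensorsInUse;
};

// Static storage: zero-initialized, and at a fixed, well-known address.
GlobalState g_state;

// Step 3: operator new just hands out the next free slot.
void* Sensor::operator new(std::size_t size) {
    // Debug-only checks: the request must fit a slot, and a slot must be free.
    assert(size == sizeof(Sensor));
    assert(g_state.sensorsInUse < kMaxSensors);
    return g_state.sensorSlots[g_state.sensorsInUse++];
}
```

Client code keeps writing new Sensor(1), exactly as it would in the large system, but the object now lands in the first slot of g_state, where a memory dump will show it at a known offset. A production version would need a policy for the "no free slot" case that does not rely on assert.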
Note that the large system won’t redefine operator new, and it will probably allocate and free a number of objects during its lifetime. Still, by redefining operator new in the small observable system, we don’t have to link the standard allocator code, and we obtain an important design property without compromising reuse.
For a real-world example of the importance of observability, see this report on the Mars rover. Last diversion :-) : over the years, I've found that too many programmers don’t know about priority inversion (mentioned in the article). If you build multithreaded programs, especially real-time stuff, you gotta know it!