Sunday, January 10, 2010 

Delaying Decisions

Since microblogging is not my thing, I decided to start 2010 by writing my longest post ever :-). It will start with a light review of a well-known principle and end up with a new design concept. Fasten your seatbelt :-).

The Last Responsible Moment
When we develop a software product, we make decisions. We decide about individual features, we make design decisions, we make coding decisions, we even decide which bugs we really want to fix before going public. Some decisions are taken on the fly; some, at least in the old school, are somewhat planned.

A key principle of Lean Development is to delay decisions, so that:
a) decisions can be based on (yet-to-discover) facts, not on speculation
b) you exercise the wait option (more on this below) and avoid early commitment

The principle is often spelled as "Delay decisions until the last responsible moment", but a quick look at Mary Poppendieck's website (Mary co-created the Lean Development approach) shows a more interesting nuance: "Schedule Irreversible Decisions at the Last Responsible Moment".

Defining "Irreversible" and "Last Responsible" is not trivial. In a sense, there is nothing in software that is truly irreversible, because you can always start over. I haven't found a good definition for "irreversible decision" in literature, but I would define it as follows: if you make an irreversible decision at time T, undoing the decision at a later time will entail a complete (or almost complete) waste of everything that has been created after time T.

There are some documented definitions for "last responsible moment". A popular one is "The point when failing to decide eliminates an important option", which I found rather unsatisfactory. I've also seen some attempts to quantify that better, as in this funny story, except that in the real world you never have a problem which is that simple (very few ramifications in the decision graph) and that detailed (you know the schedule beforehand). I would probably define the Last Responsible Moment as follows: time T is the last responsible moment to make a decision D if, by postponing D beyond T, the probability of completing on schedule/budget (even when you factor in the hypothetical learning effect of postponing) decreases below an acceptable threshold. That, of course, allows us to scrap everything and restart, if schedule and budget allow for it, and in this sense it's kinda coupled with the definition of irreversible.
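To make that definition a bit more tangible, here is a minimal sketch in Python. It assumes we somehow have, for each time step, an estimate of the probability of completing on schedule/budget if the decision is made at that step (learning effect already factored in). The function name, the estimates and the threshold are all hypothetical; getting those estimates in a real project is the hard part, and this sketch says nothing about it.

```python
from typing import Optional, Sequence


def last_responsible_moment(p_on_time: Sequence[float], threshold: float) -> Optional[int]:
    """Return the last time step T at which deciding still keeps the
    estimated probability of completing on schedule/budget at or above
    the threshold, scanning forward until the first drop below it.

    p_on_time[t] is a (hypothetical) estimate of that probability if the
    decision is made at step t. Returns None if even step 0 is too late.
    """
    last_ok = None
    for t, p in enumerate(p_on_time):
        if p < threshold:
            break  # postponing beyond last_ok drops us below the threshold
        last_ok = t
    return last_ok


# Hypothetical estimates: waiting helps a little at first (learning), then hurts.
estimates = [0.80, 0.85, 0.82, 0.70, 0.55]
lrm = last_responsible_moment(estimates, threshold=0.75)
```

With these made-up numbers, postponing past step 2 pushes the estimate below the threshold, so step 2 would be the last responsible moment.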

Now, irreversibility is bad. We don't want to make irreversible decisions. We certainly don't want to make them too soon. Is there anything we can do? I've got a few important things to say about modularity vs. irreversibility and passive vs. proactive option thinking, but right now, it's useful to recap the major decision areas within a software project, so that we can clearly understand what we can actually delay, and what is usually suggested that we delay.

Major Decision Areas
I'll skip a few very-high-level, strategic decisions here (scope, strategy, business model, etc). It's not that they can't be postponed, but I need to give some focus to this post :-). So I'll get down to the more ordinarily taken decisions.

People
Choosing the right people for the project is a well-known ingredient for success.

Process
Are we going XP, Waterfall, something in between? :-).

Feature Set
Are we going to include this feature or not?

Design
What is the internal shape (form) of our product?

Coding
Much like design, at a finer granularity level.

Now, "design" is an overly general concept. Too general to be useful. Therefore, I'll split it into a few major decisions.

Architectural Style
Is this going to be an embedded application, a rich client, a web application? This is a rather irreversible decision.

Platform
Goes hand in hand with Architectural Style. Are we going with an embedded application burnt into an FPGA? Do you want to target a PIC? Perhaps an embedded PC? Is the client a Windows machine, or do you want to support Mac/Linux? A .NET server side, or maybe Java? It's all rather irreversible, although not completely irreversible.

3rd-Party Libraries/Components/Etc
Are we going to use some existing component (of various scale)? Unless you plan on wrapping everything (which may not even be possible), this often ends up being an irreversible decision. For instance, once you commit yourself to using Hibernate for persistence, it's not trivial to move away.

Programming Language
This is the quintessential irreversible decision, unless you want to play with language converters. Note that this is not a coding decision: coding decisions are made after the language has been chosen.

Structure / Shape / Form
This is what we usually call "design": the shape we want to impose to our material (or, if you live in the "emergent design" side, the shape that our material will take as the final result of several incremental decisions).

So, what are we going to delay? We can't delay all decisions, or we'll be stuck. Sure, we can delay something in each and every area, but truth is, every popular method has been focusing on just a few of them. Of course, different methods tried to delay different choices.

A Little Historical Perspective
Experience brings perspective; at least, true experience does :-). Perspective allows us to look at something and see more than is usually seen. For instance, perspective allows us to look at the old, outdated, obsolete waterfall approach and see that it (too) was meant to delay decisions, just different decisions.

Waterfall was meant to delay people decisions, design decisions (which include platform, library, component decisions) and coding decisions. People decisions were delayed by specialization: you only have to pick the analyst first, everyone else can be chosen later, when you know what you gotta do (it even makes sense :-)). Design decisions were delayed because platforms, including languages, OS, etc, were way more balkanized than today. Also, architectural styles and patterns were much less understood, and it made sense to look at a larger picture before committing to an overall architecture.
Although this may seem rather ridiculous from the perspective of a 2010 programmer working on Java corporate web applications, most of this stuff is still relevant for (e.g.) mass-produced embedded systems, where choosing the right platform may radically change the total development and production cost, yet choosing the wrong platform may over-constrain the feature set.

Indeed, open systems (another legacy term from late '80s - early '90s) were born exactly to lighten up that choice. Choose the *nix world, and forget about it. Of course, the decision was still irreversible, but it granted you some latitude in choosing the exact hw/sw. The entire multi-platform industry (from multi-OS libraries to Java) is basically built on the same foundations. Well, that's the bright side, of course :-).

Looking beyond platform independence, the entire concept of "standard" allows us to delay some decisions. TCP/IP, for instance, allows me to choose modularly (a concept I'll elaborate later). I can choose TCP/IP as the transport mechanism, and then delay the choice of (e.g.) the client side, and focus on the server side. Of course, a choice is still made (the client must have TCP/IP support), so let's say that widely adopted standards allow for some modularity in the decision process, and therefore allow us to delay some decisions, mostly design decisions, but perhaps some others as well (like people).

It's already going to be a long post, so I won't look at each and every method/principle/tool ever conceived, but if you do your homework, you'll find that a lot of what has been proposed in the last 40 years or so (from code generators to MDA, from spiral development to XP, from stepwise refinement to OOP) includes some magic ingredient that allows us to postpone some kind of decision.

It's 2010, guys
So, if you ain't agile, you are clumsy :-)) and c'mon, you don't wanna be clumsy :-). So, seriously, which kind of decisions are usually delayed in (e.g.) XP?

People? I must say I haven't seen much on this. Most literature on XP seems based on the concept that team members are mostly programmers with a wide set of skills, so there should be no particular reason to delay decisions about who's gonna work on what. I may have missed some particularly relevant work, however.

Feature Set? Sure. Every incremental approach allows us to delay decisions about features. This can be very advantageous if we can play the learning game, which includes rapid/frequent delivery, or we won't learn enough to actually steer the feature set.
Of course, delaying some decisions on feature set can make some design options viable now, and totally bogus later. Here is where you really have to understand the concept of irreversible and last responsible moment. Of course, if you work on a settled platform, things get simpler, which is one more reason why people get religiously attached to a platform.

Design? Sure, but let's take a deeper look.

Architectural Style: not much. Quoting Booch, "agile projects often start out assuming a given platform and environmental context together with a set of proven design patterns for that domain, all of which represent architectural decisions in a very real sense". See my post Architecture as Tradition in the Unselfconscious Process for more.
Seriously, nobody ever expected to start with a monolithic client and end up with a three-tier web application built around a MVC pattern just by coding and refactoring. The architectural style is pretty much a given in many contemporary projects.

Platform: sorry guys, but if you want to start coding now, you gotta choose your platform now. Another irreversible decision made right at the beginning.

3rd-Party Libraries/Components/Etc: some delay is possible for modularized decisions. If you wanna use Hibernate, you gotta choose pretty soon. If you wanna use Seam, you gotta choose pretty soon. Pervasive libraries are so entangled with architectural styles that it's relatively hard to delay some decisions here. Modularized components (e.g. the choice of a PDF rendering library) are simple to delay, and can be proactively delayed (see later).

Programming Language: no way guys, you have to choose right here, right now.

Structure / Shape / Form: of course!!! Here we are. This is it :-). You can delay a lot of detailed design choices. Of course, we always postpone some design decision, even when we design before coding. But let's say that this is where I see a lot of suggestions to delay decisions in the agile literature, often using the dreaded Big Upfront Design as a straw man argument. Of course, the emergent design (or accidental architecture) may or may not be good. If I had to compare the design and code coming out of the XP Episode with my own, I would say that a little upfront design can do wonders, but hey, you know me :-).

OK guys, what follows may sound a little odd, but in the end it will prove useful. Have faith :-).
You can get better at everything by doing anything :-), so why not get better at delaying decisions by playing Windows Solitaire? All you have to do is set the options in the hardest possible way:

now, play a little, until you have to make some decision, like here:

I could move the 9 of spades or the 9 of clubs over the 10 of hearts. It's an irreversible decision (well, not if you use the undo, but that's lame :-). There are some ramifications for both choices.
If I move the 9 of clubs, I can later move the king of clubs and uncover a new card. After that, it's all unknown, and no further speculation is possible. Here, learning requires an irreversible decision; this is very common in real-world projects, but seldom discussed in literature.
If I move the 9 of spades, I uncover the 6 of clubs, which I can move over the 7 of diamonds. Then, it's kinda unknown, meaning: if you're a serious player (I'm not) you'll remember the previous cards, which would allow you to speculate a little better. Otherwise, it's just as above, you have to make an irreversible decision to learn the outcome.

But wait: what about the last responsible moment? Maybe we can delay this decision! Now, if you delay the decision by clicking on the deck and moving further, you're not delaying the decision: you're wasting a chance. In order to delay this decision, there must be something else you can do.
Well, indeed, there is something you can do. You can move the 8 of diamonds above the 9 of clubs. This will uncover a new card (learning) without wasting any present opportunity (it could still waste a future opportunity; life is tough). Maybe you'll get a 10 of diamonds under that 8, at which point there won't be any choice to be made about the 9. Or you might get a black 7, at which point you'll have a different way to move the king of clubs, so moving the 9 of spades would be a more attractive option. So, delay the 9 and move the 8 :-). Add some luck, and it works:

and you get some money too (total at decision time Vs. total at the end)

Novice solitaire players are also known to make irreversible decisions without necessity. For instance, in similar cases:

I've seen people eagerly moving the 6 of diamonds (actually, whatever they got) over the 7 of spades, because "that will free up a slot". Which is true, but irrelevant. This is a decision you can easily delay. Actually, it's a decision you must delay, because:
- if you happen to uncover a king, you can always move the 6. It's not the last responsible moment yet: if you do nothing now, nothing bad will happen.
- you may uncover a 6 of hearts before you uncover a king. And moving that 6 might be more advantageous than moving the 6 of diamonds. So, don't do it :-). If you want to look good, quote Option Theory, call this a Deferral Option and write a paper about it :-).

Proactive Option Thinking
I've recently read an interesting paper in IEEE TSE ("An Integrative Economic Optimization Approach to Systems Development Risk Management", by Michel Benaroch and James Goldstein). Although the real meat starts in chapter 4, chapters 1-3 are probably more interesting for the casual reader (including myself).
There, the authors recap some literature about Real Options in Software Engineering, including the popular argument that delaying decisions is akin to a deferral option. They also make important distinctions, like the one between passive learning through deferral of decisions and proactive learning, but also between responsiveness to change (a central theme in agility literature) and manipulation of change (relatively less explored), and so on. There is a lot of food for thought in those 3 chapters, so if you can get a copy, I suggest that you spend a little time pondering over it.
Now, I'm a strong supporter of Proactive Option Thinking. Waiting for opportunities (and then reacting quickly) is not enough. I believe that options should be "implanted" in our project, and that can be done by applying the right design techniques. How? Keep reading :-).

The Invariant Decision
If you look back at those pictures of Solitaire, you'll see that I wasn't really delaying irreversible decisions. All decisions in solitaire are irreversible (real men don't use CTRL-Z). Many decisions in software development are irreversible as well, especially when you are on a tight budget/schedule, so starting over is not an option. Therefore, irreversibility can't really be the key here. Indeed, I was trying to delay Invariant Decisions: decisions that I can take now, or I can take later, with little or no impact on the outcomes. The concept itself may seem like a minor change from "irreversible", but it allows me to do some magic:
- I can get rid of the "last responsible moment" part, which is poorly defined anyway. I can just say: delay invariant decisions. Period. You can delay them as much as you want, provided they are still invariant. No ambiguity here. That's much better.
- I can proactively make some decisions invariant. This is so important I'll have to say it again, this time in bold: I can proactively make some decisions invariant.

Invariance, Design, Modularity
If you go back to the Historical Perspective paragraph, you can now read it under a different... perspective :-). Several tools, techniques, methods can be adopted not just to delay some decision, but to create the option to delay the decision. How? Through careful design, of course!

Consider the strong modularity you get from service-oriented architecture, and the platform independence that comes through (well-designed) web services. This is a powerful weapon to delay a lot of decisions on one side or another (client or server).

Consider standard protocols: they are a way to make some decision invariant, and to modularize the impact of some choices.

Consider encapsulation, abstraction and interfaces: they allow you to delay quite a few low-level decisions, and to modularize the impact of change as well. If your choice turns out to be wrong, but it's highly localized (modularized), you may afford undoing your decision, therefore turning irreversible into reversible. A bare-bones example can be found in my old post (2005!) Builder [pattern] as an option.
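Just to give a flavor of what "implanting an option" through an interface may look like, here is a minimal sketch in Python, built around the PDF rendering library mentioned earlier. Every name here (PdfRenderer, PlaceholderRenderer, InvoiceService) is hypothetical and does not come from any real library; the placeholder output is obviously not a real PDF. The point is only that the rest of the code depends on a small interface, so the choice of the actual library can wait.

```python
from typing import Protocol


class PdfRenderer(Protocol):
    """Hypothetical interface: the only thing client code knows about PDF output."""

    def render(self, text: str) -> bytes: ...


class PlaceholderRenderer:
    """Stand-in used while the real PDF library has not been chosen yet.
    It does NOT produce a valid PDF; it just marks the text."""

    def render(self, text: str) -> bytes:
        return b"%PDF-placeholder\n" + text.encode("utf-8")


class InvoiceService:
    """Client code: depends on the interface, not on a specific library."""

    def __init__(self, renderer: PdfRenderer) -> None:
        self._renderer = renderer

    def export(self, customer: str, total: float) -> bytes:
        body = f"Invoice for {customer}: {total:.2f}"
        return self._renderer.render(body)


service = InvoiceService(PlaceholderRenderer())
document = service.export("ACME", 10)
```

When a library is finally chosen, only a new class implementing render has to be written; InvoiceService stays as it is. If the choice turns out to be wrong, the rework is confined to that one class.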

Consider a very old OOA/OOD principle, now somehow resurrected under the "ubiquitous language" umbrella. It states that you should try to reflect the real-world entities that you're dealing with in your design, and then in your code. That includes avoiding primitive types like integer, and creating meaningful classes instead. Of course, you have to understand what you're doing (that is, you gotta be a good designer) to avoid useless overengineering. See part 4 of my digression on the XP Episode for a discussion about adding a seemingly useless Ball class (that is: implanting a low cost - high premium option).
Names alter the forcefield. A named concept stands apart. My next post on the forcefield theme, by the way, will explore this issue in depth :-).
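As a small illustration of replacing a primitive type with a meaningful class, here is a sketch in Python. The Money class is a hypothetical example of mine (it is not the Ball class from the XP Episode): instead of passing a bare integer around, we give the concept a name, and the name becomes a natural place for rules that would otherwise be scattered through the code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    """A named concept instead of a bare integer (hypothetical example).
    Amounts are kept in cents to avoid floating-point rounding."""

    cents: int
    currency: str

    def __add__(self, other: "Money") -> "Money":
        if not isinstance(other, Money):
            return NotImplemented
        # A rule that would be easy to forget with plain integers.
        if other.currency != self.currency:
            raise ValueError("cannot add amounts in different currencies")
        return Money(self.cents + other.cents, self.currency)


subtotal = Money(150, "EUR") + Money(250, "EUR")
```

The class costs a few lines today; later decisions (rounding policy, formatting, currency conversion) now have an obvious home, and can be delayed until we know more.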

And so on. I could go on forever, but the point is: you can make many (but not all, of course!) decisions invariant, if you apply the right design techniques. Most of those techniques will also modularize the cost of rework if you make the wrong decision. And sure, you can try to do this on the fly as you code. Or you may want to do some upfront design. You know what I'm thinking.

OK guys, it took quite a while, but now we have a new concept to play with, so more on this will follow, randomly as usual. Stay tuned.


Tuesday, December 15, 2009 

A little more on DSM and Gravity

In a recent paper ("The Golden Age of Software Architecture" Revisited, IEEE Software, July/August 2009), Paul Clements and Mary Shaw conclude by talking about Conformance Checking. Indeed, although many would say that the real design/architecture is represented by code, a few :-) of us still think that code should reflect design, and that conformance of code to design should be automatically checked when possible (not necessarily in any given project; not all projects are equal).
Conformance checking is not always simple; quoting Clements and Shaw: "Many architectural patterns, fundamental to the system's design taken forward into code, are undetectable once programmed. Layers, for instance, usually compile right out of existence."

The good news is that layers can be easily encoded in a DSM. While doing so, I would use an extension of the traditional yes/no DSM, as I've anticipated in a comment to the previous post. While the traditional DSM is basically binary (yes/no), in many cases we are better off with a ternary DSM. That way, we can encode three different decisions:
Yes-now: there is a dependency, and it's here, right now.
Not-now: there is no dependency right now, but it wouldn't be wrong to have one.
Never: adding this dependency would violate a fundamental design rule.
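A ternary DSM for a layered system, and a trivial conformance check against it, can be sketched in a few lines of Python. The layer names and the matrix content below are hypothetical, and so is the policy of treating any pair missing from the matrix as Never; a real tool would extract the actual dependencies from code or binaries, which is not shown here.

```python
from enum import Enum


class Dep(Enum):
    YES_NOW = "yes-now"   # there is a dependency, right now
    NOT_NOW = "not-now"   # no dependency now, but one would be acceptable
    NEVER = "never"       # a dependency would violate a design rule


# Hypothetical three-layer system; keys are (source, target) pairs.
DSM = {
    ("ui", "domain"): Dep.YES_NOW,
    ("ui", "persistence"): Dep.NEVER,       # the UI must not skip a layer
    ("domain", "persistence"): Dep.YES_NOW,
    ("domain", "ui"): Dep.NEVER,
    ("persistence", "domain"): Dep.NOT_NOW,
    ("persistence", "ui"): Dep.NEVER,
}


def check_conformance(actual, dsm):
    """Compare actual (source, target) dependencies with the DSM.

    Returns two sorted lists: dependencies that violate a Never cell
    (pairs absent from the DSM are treated as Never, by assumption), and
    dependencies that landed on a Not-now cell (allowed, but the DSM
    should be updated to Yes-now).
    """
    violations = sorted(d for d in actual if dsm.get(d, Dep.NEVER) is Dep.NEVER)
    newly_used = sorted(d for d in actual if dsm.get(d) is Dep.NOT_NOW)
    return violations, newly_used


# Hypothetical dependencies found in the code base.
actual = {("ui", "domain"), ("ui", "persistence"), ("persistence", "domain")}
violations, newly_used = check_conformance(actual, DSM)
```

Here the check would flag ("ui", "persistence") as a violation, and report ("persistence", "domain") as a dependency that is allowed but not yet recorded as present.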

A strong layered system requires some kind of isolation between layers. Remember gravity: new things are naturally attracted to existing things.
Attraction is stronger in the direction of simplicity and lack of effort: if no effort is required to violate architectural integrity, sooner or later it will be violated. Sure, conformance checking may help, but it would be better to set up the gravitational field so that things are naturally attracted to the right place.

The real issue, therefore, is the granularity of the DSM for a layered system. Given the fractal nature of software, a DSM can be applied at any granularity level: between functions, classes, "logical" components, "physical" components. Unless your system is quite small, you probably want to apply the DSM at the component level, which also means your layers should appear at the component level.

Note the distinction between logical and physical component. If you're working in a modern language/environment (like .NET or Java), creating a physical component is just a snap. Older languages, like C++, never got the idea of component into the standard, for a number of reasons; in fact, today this is one of the most limiting factors when working on large C++ systems. In that case, I've often seen designers/programmers creating "logical" components out of namespaces and discipline. I've done that myself too, and it kinda works.

Here is the catch: binary separation between physical components is stronger than the logical separation granted from using different namespaces, which in turn is stronger than the separation between two classes in the same namespace, which is much stronger than the separation between two members of the same class.
More exactly, as we'll see in a forthcoming post, a binary component may act as a better shield and provide stronger isolation.

If a binary component A uses binary component B, and B uses binary component C but does not reveal so in its interface (that is, public/protected members of public classes in B do not mention types defined in C), A knows precious little about C.
Using C from A requires that you discover C's existence, then the existence of some useful class inside C. Most likely, to do so, you have to look inside B. At that point, adding a new service inside B might just be more convenient. This is especially true if your environment does not provide you with free indirect references (that is, importing B does not inject a reference to C "for free").
Here is again the interplay between good software design and properly designed languages: a better understanding of software forces could eventually help to design better languages as well, where violating a design rule should be harder than following the rule.

Now, if A and B are logical components (inside a larger, physical component D), then B won't usually act as a shield, mostly because the real (physical) dependency will be placed between D and C, not between B and C. Whatever B can access, A can access as well, without any additional effort. The gravitational field of B is weaker, and some code might be attracted to A, which is not what the designer wanted.

Therefore, inasmuch as your language allows you to, a physical component is always the preferred way to truly isolate one system from another.

OK, this was quite simple :-). Next time, I'll go back to the concept of frequency and then move to isolation!


Sunday, February 22, 2009 

Notes on Software Design, Chapter 4: Gravity and Architecture

In my previous posts, I described gravity and inertia. At first, gravity may seem to have a negative connotation, like a force we constantly have to fight. In a sense, that's true; in a sense, it's also true for its physical counterpart: every day, we spend a lot of energy fighting earth gravity. However, without gravity, life as we know it would never exist. There is always a bright side :-).

In the software realm, gravity can be exploited by setting up a favorable force field. Remember that gravity is a rather dumb :-) force, merely attracting things. Therefore, if we come up with the right gravitational centers early on, they will keep attracting the right things. This is the role of architecture: to provide an initial, balanced set of centers.

Consider the little thorny problem I described back in October. Introducing Stage 1, I said: "the critical choice [...] was to choose where to put the display logic: in the existing process, in a new process connected via IPC, in a new process connected to a [RT] database".
We can now review that decision within the framework of gravitational centers.

Adding the display logic into the existing process is the path of least resistance: we have only one process, and gravity is pulling new code into that process. Where is the downside? A bloated process, sure, but also the practical impossibility of sharing the display logic with other processes.
Reuse requires separation. This, however, is just the tip of the iceberg: reuse is just an instance of a much more general force, which I'll cover in the forthcoming posts.

Moving the display logic inside a separate component is a necessary step toward [independent] reusability, and also toward the rarely understood concept of a scaled-down architecture.
A frequently quoted paper from David Parnas (one of the most gifted software designers of all time) is properly titled "Designing Software for Ease of Extension and Contraction" (IEEE Transactions on Software Engineering, Vol. 5 No. 2, March 1979). Somehow, people often forget the contraction part.
Indeed, I've often seen systems where the only chance to provide a scaled-down version to customers is to hide the portion of user interface that is exposing the "optional" functionality, often with questionable aesthetics, and always with more trouble than one could possibly want.

Note how, once we have a separate module for display, new display models are naturally attracted into that module, leaving the acquisition system alone. This is gravity working for us, not against us, because we have provided the right center. That's also the bright side of the thorny problem, exactly because (at that point, that is, stage 2) we [still] have the right centers.

Is the choice of using an RTDB to further decouple the data acquisition system and the display system any better than having just two layers?
I encourage you to think about it: it is not necessarily trivial to understand what is going on at the forcefield level. Sure, the RTDB becomes a new gravitational center, but is a 3-pole system any better in this case? Why? I'll get back to this in my next post.

Architecture and Gravity
Within the right architecture, features are naturally attracted to the "best" gravitational center.
The "right" architecture, therefore, must provide the right gravitational centers, so that features are naturally attracted to the right place, where (if necessary) they will be kept apart from other features at a finer granularity level, through careful design and/or careful refactoring.
Therefore, the right architecture is not just helping us cope with gravity: it's helping us exploit gravity to our own advantage.

The wrong architecture, however, will often conspire with gravity to preserve itself.
As part of my consulting activity, I've seen several systems where the initial partitioning of responsibility wasn't right. The development team didn't have enough experience (with software design and/or with the problem domain) to find out the core concepts, the core issues, the core centers.
The system was partitioned along the wrong lines, and as mass increased, gravity kicked in. The system grew with the wrong form, which was not in frictionless contact with the context.
At some point, people considered refactoring, but it was too costly, because mass brings Inertia, and inertia affects any attempt to change direction. Inertia keeps a bad system in a bad state. In a properly partitioned system, instead, we have many options for change: small subsystems won't put up much of a fight. That's the dream behind the SOA concept.
I already said this, but it is worth repeating: gravity is working at all granularity levels, from distributed computing down to the smallest function. That's why we have to keep both design and code constantly clean. Architecture alone is not enough. Good programmers are always essential for quality development.

What about patterns? Patterns can lower the amount of energy we have to spend to create the right architecture. Of course, they can do so because someone else spent some energy re-discovering good ideas, cleaning them up, going through shepherding and publishing, and because we spent some time learning about them. That said, patterns often provide an initial set of centers, balancing out some forces (not restricted to gravity).
Of course, we can't just throw patterns against a problem: the form must be in effortless contact with the real problem we're facing. I've seen too many well-intentioned (and not so experienced :-) software designers start with patterns. But we have to understand forces first, and adopt the right patterns later.

Enough with mass and gravity. Next time, we're gonna talk about another primordial force, pushing things apart.

See you soon, I hope!


Wednesday, January 14, 2009 

Notes on Software Design, Chapter 3: Mass, Gravity and Inertia

I thought I could discuss the whole concept of Gravity and its implications in 2 or 3 (long) posts. While writing, I realized I'll need at least 4 or 5. So, this time I'll talk a little about how we can cope with gravity, and about the concept of Inertia. Next time, I'll discuss how we can exploit gravity, and why (despite the obvious cost) it is important that we do not surrender to (or ignore) gravity.

How do we cope with gravity? Needless to say, we have to spend some energy to move away from the amorphous big blob. As usual, we can also borrow some of that energy from someone (or something) else. Here are a few well-proven ideas:

- Architecture. I used to define architecture as "an overall structure, providing a natural place for features and concepts". I could now say that architecture must provide the right centers, or (from the viewpoint of mass and gravity) the right gravitational centers, so that the system can grow harmoniously. The right architecture is also the key to exploit gravity. More about this (and about the role of design patterns) next time.

- Refactoring. While architecture requires some kind of upfront investment, refactoring fights gravity in a more piecemeal, continuous fashion.
Although Refactoring and Emergent Design are often seen as the arch-enemies of Architecture, they are not. Experienced developers know that both are needed, as they work at different scales.
No amount of architecture, for instance, will ever prevent small-scale gravity from attracting more code into existing functions. When we add a new feature (maybe under a tight deadline) gravity suggests adding that feature in place, often without even breaking the smallest separation unit: the function.
Conversely, gravity (and even more so Inertia) does not allow refactoring to scale economically beyond some (hard to identify) threshold.

- Measurement and Correction. While refactoring is often performed on-the-fly by programmers, fixing bad smells as they go, we can also use automatic tools to help us keep the code within some quality bounds. See Simple Metrics and More on Code Clones for a few ideas. Of course, measures provide guidance, but then the usual refactoring techniques must be applied.

- Visualization. More on this another time.

- Better Languages and Technologies. At some granularity level, technology becomes either a boon or a hindrance. Consider components: creating binary, release-to-release compatible components in C++ is a nightmare. .NET, for instance, does a much better job. Languages with a simple grammar, like Java and C#, or with strong support for reflection, also allow better tools to be built (see next point).

- Better Tools. Consider web services. They provide a relatively painless way to create a distributed system. The lack of pain doesn't really come from SOAP (which isn't exactly a stroke of genius), but from the underlying HTTP/XML infrastructure and from the widely available, easily interoperable WSDL tools. Consider also refactoring: without good tools, it's a relatively error-prone activity. Refactoring tools make it much easier to fight gravity, moving code around with relatively little effort.
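
To give the Measurement and Correction point above a concrete shape, here is a minimal sketch in Python of the kind of check an automatic tool could run. It is only an illustration: the metric (lines per function), the default threshold, and the names function_lengths and too_long are my own assumptions, not something taken from the Simple Metrics post. It relies on the standard ast module to find functions that have grown "in place".

```python
import ast


def function_lengths(source):
    """Map each function name to the number of source lines it spans."""
    tree = ast.parse(source)
    lengths = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # end_lineno is filled in by ast.parse on Python 3.8 and later
            lengths[node.name] = node.end_lineno - node.lineno + 1
    return lengths


def too_long(source, max_lines=20):
    """Names of the functions longer than max_lines (an arbitrary bound)."""
    lengths = function_lengths(source)
    return sorted(name for name, n in lengths.items() if n > max_lines)


# Hypothetical code under inspection.
SAMPLE = '''
def small():
    return 1

def grown_in_place(x):
    a = x + 1
    b = a * 2
    c = b - 3
    d = c / 4
    return d
'''

if __name__ == "__main__":
    print(function_lengths(SAMPLE))
    print(too_long(SAMPLE, max_lines=4))
```

As said above, a measure like this only provides guidance: once a function is flagged, the usual refactoring techniques still have to be applied by hand (or with a refactoring tool).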

On Inertia
Mass brings gravity. Gravitational attraction works to preserve the existing structure (at the fractal levels I discussed in Chapter 1). In the physical world, however, we have another interesting manifestation of mass, called Inertia. There are many formulations of the concept (see the Wikipedia page for details), but what is most interesting here is the simple F=m*a equation. We apply external forces (human work) to a system, but systems with a large mass won't easily change their state of rest or motion (including their current direction).

What is, then, the state of rest/motion for a software system? We could provide several analogies. To find the best analogy for acceleration, we need the best analogy for speed. To find the best analogy for speed, we need the best analogy for space.

The underlying idea must be that we apply some effort to move our software through space. What is the nature of that space? A few real-world examples are needed. Consider a C++/MFC application; we want to migrate the GUI layer to C#/.NET (interestingly, "migration" is commonly used to indicate motion in space). Consider a monolithic, legacy application that must be exposed as a service; or a web application that requires some performance improvement. Sure, all this may require some change in mass too (as some code will be added, some removed), but what is required is to move the software to a different place. What is that place, or, inside which kind of space do we want to move? I encourage you to think about this on your own for a while, before reading further.

My answer is rather simple: that space is the decision space. Software is built by making a number of decisions: we choose languages, technologies, architectural styles, coding styles (e.g. error handling styles, readability/efficiency trade offs, etc.), and so on. We also choose a development process, a team, etc.
Some of those decisions are explicit and carefully worked out. Some are taken on the fly as we code. At any given time, our software is located in a specific (albeit difficult to define) place inside a huge, multi-dimensional decision space. Each decision affects some portion of code. Some are clearly separated. Some are pervasive or cross-cutting.

Software development is a learning process; therefore, some of those decisions will be wrong. Some will be right for a while, but since real-world software does not live in a vacuum, we'll have to change them anyway later.
Changing a decision requires moving our software through the decision space: every decomposition unit affected by that decision will be touched, therefore adding to the mass to be moved (hence the deadly cost of cross-cutting, pervasive concerns).

Inertia explains why some decisions are so hard to change. Any decision we change is bound to require a change in the state of rest, or motion, of our software, because we want to move it into another place.
Some of those decisions impact a large mass of software, and therefore a strong force must be applied. Experience shows that after a critical mass is reached, it becomes so hard to even understand what to do, that software becomes an immovable object (therefore requiring an irresistible force :-).
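
To make the "mass to be moved" idea a bit more tangible, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the unit names, the line counts, and the choice of measuring mass as the sum of the lines in the units touched by a decision. It is not a validated metric, just a way to see why changing a cross-cutting decision (here, a hypothetical error handling style) is so much heavier than changing a well-contained one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A decomposition unit (function, class, module) and what it depends on."""
    name: str
    lines: int            # crude stand-in for mass
    decisions: frozenset  # the decisions this unit is built upon


def mass_to_move(units, decision):
    """Total size of the units that must be touched to change `decision`."""
    return sum(u.lines for u in units if decision in u.decisions)


# Hypothetical system: sizes and decisions are made up.
UNITS = [
    Unit("gui", 4000, frozenset({"MFC", "error-codes"})),
    Unit("business", 6000, frozenset({"error-codes"})),
    Unit("persistence", 3000, frozenset({"ODBC", "error-codes"})),
]

if __name__ == "__main__":
    for decision in ("ODBC", "MFC", "error-codes"):
        print(decision, mass_to_move(UNITS, decision))
```

In this made-up system, replacing the data access technology touches one unit, while changing the error handling style touches all of them: same force available, much more mass.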

Of course, small systems won't show much inertia, which explains why the dynamics of programming in the small are different from the dynamics of programming in the large.

Speed and acceleration also depend on time. I'll save this for a later time, as I still have to understand a few things better :-)

Enough for today. See you guys soon!


Wednesday, September 03, 2008 

Horizontal scaling

I finally took the time to read BASE: An ACID Alternative by Dan Pritchett in ACM Queue (the paper is free also for non-members). It's short and simple, but highlights a frequent problem in horizontal scaling: when you start partitioning databases (or services, for that matter) you step into distributed transactions, and that's kinda slow, man.

Of course, we get into distributed transactions because we want to preserve transactional thinking and the ACID properties. However, as Dan notes, we can trade some (short-term) consistency for higher availability and performance. As you'll see, that raises the bar for infrastructure, requiring at least a persistent queuing mechanism.

The author comes from eBay, so we can assume he knows a thing or two about scalability :-). Also, the paper is very readable, and the same techniques can be successfully used in very complex systems (I remember I used similar techniques in quite a few banking applications). The trick is to let go of some of the orthodoxy about transactions, and design your data layer for scalability.
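
Here is a minimal sketch, in Python, of the general idea of trading short-term consistency for a persistent queue. It is my own illustration, not code from the paper: the table names (txn, outbox, balance, applied), the function names, and the use of two in-memory SQLite databases as stand-ins for two partitions are all assumptions. The sale and the queued message are written in one local transaction; a separate step later applies the message to the other partition, and a table of already-applied messages makes redelivery harmless.

```python
import sqlite3


def make_partitions():
    """Two independent databases standing in for two partitions."""
    orders = sqlite3.connect(":memory:")
    orders.executescript("""
        CREATE TABLE txn (id INTEGER PRIMARY KEY, seller TEXT, amount INTEGER);
        CREATE TABLE outbox (txn_id INTEGER PRIMARY KEY, seller TEXT, amount INTEGER);
    """)
    users = sqlite3.connect(":memory:")
    users.executescript("""
        CREATE TABLE balance (seller TEXT PRIMARY KEY, total INTEGER NOT NULL);
        CREATE TABLE applied (txn_id INTEGER PRIMARY KEY);
    """)
    return orders, users


def record_sale(orders, seller, amount):
    """One local transaction: store the sale and enqueue the balance update."""
    with orders:  # both inserts commit together, or neither does
        cur = orders.execute(
            "INSERT INTO txn (seller, amount) VALUES (?, ?)", (seller, amount))
        orders.execute(
            "INSERT INTO outbox (txn_id, seller, amount) VALUES (?, ?, ?)",
            (cur.lastrowid, seller, amount))


def drain_outbox(orders, users):
    """Apply queued updates to the other partition; safe to run again."""
    pending = orders.execute(
        "SELECT txn_id, seller, amount FROM outbox ORDER BY txn_id").fetchall()
    for txn_id, seller, amount in pending:
        with users:  # the 'applied' marker and the balance change commit together
            seen = users.execute(
                "SELECT 1 FROM applied WHERE txn_id = ?", (txn_id,)).fetchone()
            if not seen:
                users.execute("INSERT INTO applied (txn_id) VALUES (?)", (txn_id,))
                users.execute(
                    "INSERT OR IGNORE INTO balance (seller, total) VALUES (?, 0)",
                    (seller,))
                users.execute(
                    "UPDATE balance SET total = total + ? WHERE seller = ?",
                    (amount, seller))
        with orders:  # dequeue only after the other side has committed
            orders.execute("DELETE FROM outbox WHERE txn_id = ?", (txn_id,))


def balance(users, seller):
    """Current (possibly stale) balance for a seller."""
    row = users.execute(
        "SELECT total FROM balance WHERE seller = ?", (seller,)).fetchone()
    return row[0] if row else 0
```

Between record_sale and drain_outbox the balance is stale, and that is exactly the consistency we are giving up. No transaction ever spans both databases.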

Still on the issue of scalability, I can also recommend another article from ACM Queue (again, free to non-members): Learning from THE WEB by Adam Bosworth (VP of engineering at Google).
I especially like point 3 because it's so damn unorthodox: It is acceptable to be stale much of the time. Again, it's not easy to accept this, and to adapt our applications (and requirements!) accordingly. In many cases, we just can't do it. In many (many more than people are inclined to accept) we can.
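
A small sketch in Python of what accepting staleness can look like in code. It is my own illustration, not something from Bosworth's article: the class name StaleTolerantCache, the loader callback, and the max_age_seconds bound are all hypothetical. The point is that the acceptable staleness becomes an explicit, negotiable parameter instead of an implicit "always fresh" requirement.

```python
import time


class StaleTolerantCache:
    """Serve a cached value as long as it is not older than an agreed bound."""

    def __init__(self, loader, max_age_seconds, clock=time.monotonic):
        self._loader = loader          # the expensive call (database, remote service)
        self._max_age = max_age_seconds
        self._clock = clock            # injectable, so the behavior can be tested
        self._entries = {}             # key -> (value, time it was loaded)

    def get(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[1] <= self._max_age:
            return hit[0]              # possibly stale, but within the bound
        value = self._loader(key)      # too old or missing: reload
        self._entries[key] = (value, now)
        return value
```

With a bound of 30 seconds, a value loaded at time 0 is still served at time 10 without touching the loader, and is reloaded on the first request after time 30. How large that bound can be is a requirements question, not a technology question.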

There is also a strong connection with the BASE approach above, and both require a little out-of-the-box thinking to be applied. We may have to tweak requirements a little, to move closer to the sweet spot between technology and business. More on this another time :-).


Tuesday, May 27, 2008 

On the concept of Form (3): the Force Field

Warning: :-) this post is going to be somewhat conceptual. I'll soon move to some real-world, software-based examples, but I really need to introduce some concepts first.

The notion of force field might be unfamiliar to some, so I'll borrow a great example from Alexander himself. Consider your first "requirement" (for a system yet to be built) as a permanent magnet of some size and shape. If you place a flat glass over that magnet, and drop some iron filings on it, the iron will naturally dispose along the magnetic field lines. That gives us an image of [a section of] the force field. Now add another magnet: the shape of the field will change, as the magnets are interacting, thereby shaping a more complex force field.
We can change the [shape of the] field in many ways: moving magnets around, changing their shape, their magnetization, or even adding some shields around magnets.

The great thing about the magnetic field is that we can somehow observe its shape. Indeed, if our goal were to create a form that can be put into effortless contact with the field, we would just have to replicate the same form that the magnetic field is giving to the iron filings. As Alexander says (NoTSoF, page 21), "once we have a diagram of forces [...] this will in essence also describe the form as a complementary diagram of forces".
In the real world, and even more so in the software world, we are never so lucky: the force field is invisible and tends also to be highly unstable.

Usually, the force field of a software project starts with Requirements. Requirements are often categorized in some way, like "functional" and "nonfunctional", or "user requirements" and "system requirements". However, requirements of any kind are just like magnets: they contribute to shape the overall field.

Requirements are just one kind of force, that is, they are not alone in shaping the field. Many technological choices we make, sometimes very (or too) early, are also shaping the force field.
Consider a simple business application. Once you decide that you'll build a web application, you have added quite a few powerful magnets. If you're familiar with JSP and EJB, you are naturally tempted to choose those technologies early on. That's like adding quite a few powerful magnets again. Or maybe it's like adding a magnetic shield: it really depends on context.

Sometimes, technology makes the field simpler: the right infrastructure should simplify the field, that is, it should act more like a shield than like a magnet. In this sense, infrastructure should be chosen when the dominant forces are known, unlike what happens in many projects, where infrastructure (usually a superstructure in disguise) is chosen too early, thereby making the overall field even more complex.

We also shape the field, so to speak, by choosing what to ignore and what to postpone in any given release. Anything we ignore, like anything we postpone, won't be allowed to shape the field right now.
This is fine, as long as the corresponding magnets will be placed somewhat distant from the others (good modularity), possibly with some kind of magnetic shielding in between (stable interfaces). It's also fine if we can ignore it forever. Any attempt to temporarily ignore a closely interacting force will wreak havoc later on, as our form will no longer match the resulting force field. Refactoring can accommodate minor misfits with the ideal form, but won't help much when the force field changes radically (see also my notes on refactoring here).

Here lies one of the architect's fundamental abilities: the intuitive understanding that something can be beneficially postponed, while something else must be dealt with immediately, because its influence on the force field is so strong that doing otherwise will shift us toward the wrong kind of form.

It is important to understand the role of choice in exploiting instability. Too often, software developers tend to see requirements as "fixed". They don't like to negotiate: it's much easier to fight the compiler than the marketing guys.
A good architect, however, can't miss the opportunity to simplify the field by moving some magnets around. That requires the ability to see the overall picture and the fine details at the same time. Here is Alexander again (page 18): "this ability to deal with several layers of form-context boundaries in concert is an important part of what we often refer to as the designer's sense of organization. The internal coherence of an ensemble depends on a whole net of such adaptations".
That ain't an easy feat. It requires an understanding of the business, the users, and the technology. And even more important, it requires a willingness to act on that knowledge. The power of choice extends to the infrastructure: sometimes, by willingly postponing a technological choice until the force field takes shape, we can make a better, more "natural" choice.

This can be hard for some developers: they want certainty, and they want it now. In my experience, that goes hand in hand with the willingness to adopt a sub-optimal, but repetitive and context-free solution for a wide class of problems, instead of adopting several optimal, but reasoned and context-dependent solutions for smaller classes of problems.

Unfortunately, choosing the "wrong" technology is very much like choosing the wrong shape or orientation for a building. To quote Alexander once again (page 29): "Instead of orienting the house carefully for sun and wind, the builder conceives its organization without concern for orientation, and light, heat, and ventilation are taken care of by fans, lamps, and other kinds of peripheral devices. Bedrooms are not separated from living rooms in plan, but are placed next to one another and the walls between them stuffed with acoustic insulation".
I think we can easily see a parallel with software here: a misfit technology is chosen early on. As a consequence, you find yourself adding more and more technology (fans, lamps, insulation) to satisfy the end-user needs. "Modern" web applications seem to have taken this path: faced with a difficult field, they're layering one technology on top of the other, desperately trying to overcome the problems of the previous layer.

Next time, in no particular order: agility, unstable requirements, early coding, TDD, "seeing" the field, internal and external representations, whether UML is of any use, order within chaos (dominant forces), constructive force field and systematic techniques, and whatever else will come to my mind :-).


Sunday, January 27, 2008 

Being 10 Years Behind (part 2)

Do you remember "Windows DNA"? If you can't, don't blame yourself, because the MSDN doesn't remember either :-).
Indeed, it seems like Microsoft took good care of removing most of the material on Windows DNA from its developer-oriented website.
However, here comes the TechNet website to the rescue (well, at least till they realize it :-). As you see, the much touted "Distributed interNet Applications Architecture" was the usual 3-tier blurb. There is no date on that web page, but the "Windows DNA" stuff is about ten years old.

Sure, Windows DNA was all based on COM+ components, most likely implemented in Visual Basic, maybe glued to a presentation logic written in old-style ASP (VBScript all around). But look at that architecture again. Does it look familiar?

Let's take a look at some recent Microsoft-oriented paper on application architecture. For instance, in Microsoft .NET Pet Shop 3.x: Design Patterns and Architecture of the .NET Pet Shop the Microsoft-flavored "Data Access Layer" is introduced, along with a general architecture (see fig. 3) which looks absolutely identical to the old "Windows DNA" stuff.

Dig deeper (fig. 5, 6, 8, 9) and you'll also realize that the DAL structure is a mirror of the DB structure (that is, basically one class for each table). Looks really like the decade-old, fragile architecture I described in my previous post, except this paper is "just" 5 years old. Particularly dreadful is the "business entities" yellow box in fig. 8, spanning the 3 tiers with a set of hard-coded structures (which end up being a mirror of the database tables).

Fast forward to the present (sort of), and you get introductory papers like Creating a Data Access Layer where again the same basic architecture is rehashed under the .NET 2.0 newfangled classes and wizards.

And oh yeah, if you really wanna feel up-to-date, LINQ will take care of the DB, no more SQL, thank you. Except they've just embedded SQL in C#, thereby exposing your code to the same fragility WRT changes in the database schema.

Now, why is Microsoft pushing (through authors and evangelists) old stuff like that? I've partially answered in a comment to a previous post, but I'll add a little more. It's not that they're not smart enough to do better. It's that they think we are not smart enough to do better (Sun doesn't think much differently either).
Indeed, the architecture they're selling is easy to explain, easy to understand, easy to implement piecemeal, without much thinking. It's almost a Marketecture (short for Marketing Architecture, contrast with Technical Architecture).

Here are a few half-baked thoughts for those of you with a little time to spare :-) and a sincere interest about creating modern (or post-modern) architectures:

A) Information Hiding is about hiding likely changes. Likely changes in a database-oriented architecture are:
1) the database engine itself (oracle, sql server, etc). That includes the SQL-dialect of the database, so don't rely entirely on odbc, ado and the like.
2) the data access technology (remember odbc, rdo, dao, ado, ado.net, ado.net 2.0, linq: all have been sold as the ultimate technology, yet every 2 years or so we get a new one).
3) the database schema itself.
The old-style architecture may do something about 1 and 2, but precious little for 3 (which is going to consume most of your time anyway).

Now, the database schema may change for several reasons. Over time, you will:
- normalize
- denormalize
- add/drop fields
- add/drop tables
- re-route relationships
- change cardinality in relationships
You need to understand the most likely changes, as these are shaping your context (and therefore influence the best form).

B) The interface between a (well-designed) Data Layer and the Business Layer must be loose. It shouldn't break when the database schema changes because you added a field. Therefore, if the interface is based on strongly typed entities which mirror the database schema, you're doomed.

C) The interface between a (well-designed) Business Layer and the UI Layer or Service Layer must be loose. See above.

D) Don't lock the architecture on the worst case. We all know that a lot of code behind the UI is not that smart.
In many cases, given a robust validation layer, which can be designed to be very flexible and dynamic, the business layer won't do much except routing data to / from the data layer.
Don't make the business layer a necessary burden. Make it an important, yet optional component that kicks in only when important business logic is needed.

E) Reflection is the key to flexible DB applications.

F) You can only get so far with language-based reflection at the Data Layer level, because SQL is too old/primitive. Sooner or later, you'll need to attach more semantics to each field than your DB wants you to (especially if you don't want to tie yourself to a single DB vendor). Be creative :-), as this would take too much space for a single post.

G) Static typing is great inside each layer. It's also great at the interface level when the structure we're talking about is stable. It's truly bad when you want to expose a flexible or changing structure.
Remember why we conceived XML in the first place? Data are fluid!
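
To make points B and E a little less abstract, here is a minimal sketch in Python (with SQLite as a stand-in database) of a data layer whose interface does not mirror the schema. It is just an illustration under my own assumptions: the function names fetch_records and describe, the dictionary-based record shape, and the customer table are hypothetical, and this is not the architecture of any specific product. The data layer discovers the columns at run time, so adding a field to a table does not break the interface with the layers above.

```python
import sqlite3


def _check_table(conn, table):
    """Table names cannot be bound as parameters, so validate them first."""
    known = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    if table not in known:
        raise ValueError(f"unknown table: {table}")


def fetch_records(conn, table):
    """Return rows as dicts keyed by whatever columns the table has today."""
    _check_table(conn, table)
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]
    return [dict(zip(columns, row)) for row in cur]


def describe(conn, table):
    """Field metadata read from the schema instead of being hard-coded."""
    _check_table(conn, table)
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk)
    info = conn.execute(f'PRAGMA table_info("{table}")')
    return {name: {"type": ctype, "required": bool(notnull)}
            for _, name, ctype, notnull, _, _ in info}
```

A validation or UI layer can iterate over describe() instead of relying on a class with one property per column. As point F says, this only gets you as far as what the database knows about each field; richer semantics need metadata kept somewhere else, and SQLite's PRAGMA is of course vendor-specific.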

Ok, there would be more to say about semistructured data, service-oriented architectures and the like, but that will have to wait.

I'll just repeat my caveat: be wary about buying an architecture from your vendor. Apply a good dose of critical thinking and look for the real value in your specific context.
You wouldn't buy the architectural blueprint of your house from a bricks or pipes vendor, no matter the quality of those bricks and pipes. You normally shouldn't buy your application architecture from your language, tools, or operating system vendor either.


Monday, January 14, 2008 

Being 10 Years Behind (part 1)

In the last two years I've been working quite closely with a company, designing the fourth generation of a successful product. Indeed, a few of my posts have been inspired by the work I did on that project.
What we have now is a small, flexible, fast web application where we definitely pushed the envelope using AOP-like techniques pervasively, although in .NET/C#.

Compared with the previous generation, our application has more features, is much easier to customize (a must), is much easier to use thanks to the task-oriented design of the HCI (the previous generation was more slanted toward the useless computer approach) and also about 10 times faster (thanks to better database design and a smarter business layer).
Guess what, the source code size is just about 1/30 of what we had before (yeap, 30 times less), excluding generated code to read/write some XML files. The previous application was written in a mixture of C++, VB6, Perl, Python, C#.

Now, the company is considering a close partnership with an Asian corporation. They have a similar product, in ASP.NET / C# as well. It took them something like 30 times our man-months to write it, so the general feeling was that it should have been "more powerful". Time to look at the features, but hey, features are basically the same, although their product is not task-oriented. If anything, they lack a lot of our customization interface.

Time to look at the code, as the code never lies.
The code is probably 50 times bigger, with no flexibility whatsoever. If you need to attach one more field to a business concept, just to track some custom information, you probably have to change the database, 5-8 classes between the database and the GUI, and the GUI itself.
In most cases, in our application we just need to open the administration console, add the field, specify validation rules, and that's it.
If you have special needs, you write a custom class, decorate the class with attributes, and we take care of all the plumbing to instantiate / call your class in the critical moments (that's one of the several places where the AOP-like stuff kicks in).
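
The real thing is in .NET/C# and I can't show it here, but the general mechanism can be sketched in a few lines of Python, using a class decorator as a stand-in for .NET attributes. Take it as an illustration only: the names on, fire, and NormalizeVatCode, the "before_save" moment, and the dictionary-based record are all hypothetical, not our actual API.

```python
_HOOKS = {}  # (entity, moment) -> handler classes, in registration order


def on(entity, moment):
    """Class decorator: declare when a custom class must be called."""
    def register(cls):
        _HOOKS.setdefault((entity, moment), []).append(cls)
        return cls
    return register


def fire(entity, moment, record):
    """The plumbing: instantiate and call every handler declared for this moment."""
    for cls in _HOOKS.get((entity, moment), []):
        record = cls().handle(record)
    return record


@on("customer", "before_save")
class NormalizeVatCode:
    """Custom logic for a field that was added without touching core code."""

    def handle(self, record):
        code = record.get("vat_code")
        if not code:
            return record
        return {**record, "vat_code": code.strip().upper()}
```

The core code only calls fire("customer", "before_save", record) at the critical moment; it knows nothing about the custom class, which is found through its declaration.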

What struck me when I looked at that code was the (depressing :-) similarity with a lot of old-style Java code I've been seeing over the years, especially in banking environments.

There is an EntityDAO (data access object) package with basically one class for each business entity. That class is quite stupid, basically dealing with persistence (CRUD) and exposing properties. Those classes are used "at the bottom" of the system, that is, near to the database.

Then there is an Entity package where (again!) there is basically one class for each (business) entity. The class is completely stupid, offering only get/set methods. Those classes are used "at the top" of the system, that is, near to the GUI or external services.

There is a BusinessLogic package where Entities get passed to various classes as parameters, and EntityDAO objects are used to carry out persistency-related tasks.
Actually, inside the BusinessLogic lies a lot of database-related code, sometimes with explicit manipulation of DataRow classes. The alternative would have been to create many more EntityDAO classes.

Here and there, the coupling between the BusinessLogic and the database must have seemed too strong, so an EntityReader package has been created, where more sophisticated (yet still stupid) entities (or collections) are built using EntityDAO classes.

Finally, you just need :-) something to call from your service or GUI layer. The ubiquitous BusinessFacade package is therefore introduced, implementing a very large, use-case driven interface (put in yet another package), taking Entity instances as parameters and using the BusinessLogic.

At that point, people invariably realize that services need much more logic than what is provided by the BusinessLogic package, and so go ahead and create a (very sad) BusinessHelper package, where they complement all the missing parts in the BusinessLogic, most often by direct database access.

Then we have other subsystems (cache, SQL, and so on) all built around XXXManager classes, which we can ignore.

Of course, in the end everything is so coupled with the database schema that just adding a field results in a nightmare. And you get a lot of code to maintain as well. Good luck. Meanwhile, the Ruby On Rails guys are creating (simple) applications faster than the other guys can spell BusinessHelper. Say good-bye to productivity.

We can blame it on static typing, but reality is much simpler. That architecture is wrong. It is at least 10 years behind the state of practice, which means it is probably 15 to 20 years behind the state of the art.
The problem is, that ancient architecture was popularized years ago mostly by language and tools vendors, or by people who think Architects don't have to understand code or the real problem being solved, just to replicate a trivial-yet-humongous structure everywhere. It's basically a decontextualized architecture (more on this next time).

Indeed, if you look at the Java literature, you can find good books dating back to the early years of the decade (like "EJB Design Patterns" by Floyd Marinescu, year 2002), where the pros and cons of several choices adopted in that overblown yet fragile architectural model are discussed. When a Patterns book appears, a few years of practice have gone by. That was 2002; now, ten years have gone by, and yet developers still fall into the same trap.

It gets worse. While even the most outdated Java applications are gradually moving away from that model (see Untangling Enterprise Java by Chris Richardson for a few ideas), Microsoft evangelists are so excited about it. They happily go around (re)selling an architecture that is remarkably similar to the 10-years-behind behemoth above.

Which brings me to "Being 10 Years Behind (part 2)". Stay tuned :-).


Wednesday, November 28, 2007 

Architecture as Tradition in the Unselfconscious Process

In my previous post, On the concept of Form (1), I mentioned how Architecture is providing viscosity, and therefore playing the role Alexander ascribed to tradition.

I've also proposed that the unselfconscious design process, which is very similar to the emergent design concept held so dearly by many agilists, requires some degree of tradition, and therefore, an underlying architecture. I've also gone so far as to propose the idea that many agile projects begin with a "traditional" architecture in mind:
"Now, although some people in the XP/agile camp might disagree, refactoring is a viable solution only when the desired rate of change is slow, and only when the gap to fill is small. In other words, only when the overall architecture (or plain structure) is not challenged: maybe it's dictated by the J2EE way of doing things, or by the Company One True Way of doing things, or by the Model View Controller police, and so on. Truth is, without an overall architecture resisting change, a neverending sequence of small-scale refactoring may even have a negative large-scale impact."

In the past few days, I've been reading "The Economics of Architecture-First," by Grady Booch, IEEE Software, Sept/Oct, 2007. Here is an interesting excerpt:
"Now, strict agilists might counter that an architecture-first approach is undesirable because we should allow a system's architecture to emerge over time. On the one hand, they're absolutely correct: a system's architecture is simply the name we give to the artifact that results from the many local design decisions made over a software-intensive system's lifetime. On the other hand, they're wrong: agile projects often start out assuming a given platform and environmental context together with a set of proven design patterns for that domain, all of which represent architectural decisions in a very real sense."

I could almost call this synchronicity :-).

For more on emergent architecture (or structure), see my now-old post Infrastructure and Superstructure.
