Sunday, October 12, 2008 

Some Small Design Issues (part 1)

In a previous post, I talked about some small, yet thorny design problems I was facing. As I started writing about them, it became clear that real-world design problems are never so small: there is always a large context that is somehow needed to fully understand the issues.

Trying to distill the problem to fit a blog post is a nightmare: it takes forever, it slows me down to the point I'm not blogging anymore, and is exactly the opposite of what I meant when I wrote Blogging as Destructuring a few years ago. On a related note, Ed Yourdon (at his venerable age :-) is moving toward microblogging for similar reasons.

Still, there is little sensible design talk around, so what does a [good] man gotta do? Simplify the problem even more, split the tale in a few episodes, and so on.
I said "tale" because I'll frame the design problem as a story. I don't mean to imply that things went exactly this way. Actually, I wasn't even there at the time. However, looking at the existing artifacts, it seems reasonable that they somehow evolved that way.
Also, an incremental story is an interesting narrative device for a design problem, as it allows to put every single decision in perspective, and to reason about the non-linear impact of some choices.

Stage 1 - the beginning
We have some kind of industrial process control system. We want to show a few process variables on a large-size numeric display, like in the picture below:

At this point the problem is quite simple, yet we have one crucial choice to make: where do we put the new logic?
We have basically three alternatives:
1) inside an existing module/process, that is, "close to the data"
2) in a new/external process, connected via IPC to the existing process[es]. Connection might be operating in a push or pull mode, depending on update rate and so on (we'll ignore this for sake of simplicity).
3) in a new/external process, obtaining data through a [real-time] database or a similar middleware. The existing processes would have to publish the process variables on the database. The new process might pull data or be pushed data, depending on the data source.
Even at this early stage, we need to make an important architectural decision. It's interesting to see that in very small systems, where all the data is stored inside one process, alternative (1) is simpler. Everything we need "is just there", so why don't we add the display logic too?
This is how simple, lean systems get to be complex, fragile, bulky systems: it's just easier to add code where the critical mass is.
So, let's say our guys went for alternative (3). We have one data source where all the relevant variables are periodically stored.

Now, we just need to know which variables we want to show, and in which row. For simplicity, the configuration could be stored inside the database itself, like this (through an OO perspective):

Using an ugly convention, "-1" as a row value indicates that the process variable isn't shown at all.

Stage 2 - into the real world
Customers may already have a display, or the preferred supplier may discontinue a model, or sell a better/cheaper/more reliable one, and so on. Different displays have different protocols, but they're just multi-line displays nonetheless.
Polymorphism is just what the doctor ordered: looking inside the Display component, we might find something like this:

It's trivial to keep most of the logic to get and format data unchanged. Only the protocol needs to become an extension point. Depending on the definition of protocol (does it extend to the physical layer too?) we may have a slightly more complex, layered design, but let's keep it simple - there are no challenges here anyway.

Stage 3 - more data, more data!
Processes gets more and more complex, and customers want to see more data. More than one display is needed. Well, it's basically trivial to modify the database to store the display number as well.
A "display number" field is then added to the Process Variable Descriptor. Note that at this point, we need a better physical display, as the one in the picture above has hard-coded labels. We may want to add one more field to the descriptor (a user-readable name), and our protocol class may or may not need some restyling to account for this [maybe optional] information. The multiplicity between "Everything Else" and "Display Protocol" is no longer 1. Actually, we have a qualified association, using the display number as a key (diagram not shown). No big deal.
Note: at this stage, a constraint has been added, I guess by software engineers, not process engineers: the same process variable can't be shown on two different displays. Of course, a different database design could easily handle this limitation, but it wasn't free, and it wasn't done.

Hmmm, OK, so far, so good. No thorny issues. See you soon for part 2 :-).

Labels: , ,