Monday, January 14, 2008 

Being 10 Years Behind (part 1)

In the last two years I've been working quite closely with a company, designing the fourth generation of a successful product. Indeed, a few of my posts have been inspired by the work I did on that project.
What we have now is a small, flexible, fast web application where we definitely pushed the envelope using AOP-like techniques pervasively, although in .NET/C#.

Compared with the previous generation, our application has more features, is much easier to customize (a must), is much easier to use thanks to the task-oriented design of the HCI (the previous generation was more slanted toward the useless-computer approach), and is also about 10 times faster (thanks to better database design and a smarter business layer).
Guess what: the source code is just about 1/30 the size of what we had before (yep, 30 times smaller), excluding generated code to read/write some XML files. The previous application was written in a mixture of C++, VB6, Perl, Python, and C#.

Now, the company is considering a close partnership with an Asian corporation. They have a similar product, in ASP.NET / C# as well. It took them something like 30 times our man-months to write it, so the general feeling was that it must be "more powerful". Time to look at the features, but hey, the features are basically the same, although their product is not task-oriented. If anything, they lack most of our customization interface.

Time to look at the code, as the code never lies.
The code is probably 50 times bigger, with no flexibility whatsoever. If you need to attach one more field to a business concept, just to track some custom information, you probably have to change the database, 5-8 classes between the database and the GUI, and the GUI itself.
In most cases, in our application we just need to open the administration console, add the field, specify validation rules, and that's it.
If you have special needs, you write a custom class, decorate it with attributes, and we take care of all the plumbing to instantiate / call your class at the critical moments (that's one of the several places where the AOP-like stuff kicks in).
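As a rough sketch of the technique (all names here are hypothetical, and I'm using Java annotations in place of .NET attributes -- this is the idea, not our actual code), the custom class is decorated with metadata, and a small broker uses reflection to find it and invoke it at the right moment:

```java
import java.lang.annotation.*;
import java.util.*;

// Hypothetical marker: placed on a customer-written class to hook it
// into the validation pipeline for one field of one business entity.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ValidatesField {
    String entity();
    String field();
}

interface FieldValidator {
    boolean isValid(Object value);
}

// A customer-written rule; the framework discovers it via the annotation.
@ValidatesField(entity = "Order", field = "priorityCode")
class PriorityCodeValidator implements FieldValidator {
    public boolean isValid(Object value) {
        return value instanceof String && ((String) value).matches("P[0-9]");
    }
}

// The "plumbing": reads the annotation, instantiates the validator, and
// calls it at the critical moment, without the caller naming any concrete class.
class ValidationBroker {
    private final Map<String, FieldValidator> validators = new HashMap<>();

    void register(Class<? extends FieldValidator> c) {
        ValidatesField meta = c.getAnnotation(ValidatesField.class);
        if (meta == null) return;
        try {
            validators.put(meta.entity() + "." + meta.field(),
                           c.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot instantiate " + c, e);
        }
    }

    boolean validate(String entity, String field, Object value) {
        FieldValidator v = validators.get(entity + "." + field);
        return v == null || v.isValid(value); // no rule registered => accept
    }
}
```

The point is that the GUI and business layers only ever talk to the broker, so adding a rule means adding one decorated class, not touching the plumbing.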

What struck me when I looked at that code was the (depressing :-) similarity with a lot of old-style Java code I've been seeing over the years, especially in banking environments.

There is an EntityDAO (data access object) package with basically one class for each business entity. That class is quite stupid, basically dealing with persistence (CRUD) and exposing properties. Those classes are used "at the bottom" of the system, that is, near the database.

Then there is an Entity package where (again!) there is basically one class for each (business) entity. The class is completely stupid, offering only get/set methods. Those classes are used "at the top" of the system, that is, near the GUI or external services.

There is a BusinessLogic package where Entities get passed to various classes as parameters, and EntityDAO objects are used to carry out persistence-related tasks.
Actually, inside the BusinessLogic lies a lot of database-related code, sometimes with explicit manipulation of DataRow objects. The alternative would have been to create many more EntityDAO classes.

Here and there, the coupling between the BusinessLogic and the database must have seemed too strong, so an EntityReader package was created, where more sophisticated (yet still stupid) entities (or collections) are built using EntityDAO classes.

Finally, you just need :-) something to call from your service or GUI layer. The ubiquitous BusinessFacade package is therefore introduced, implementing a very large, use-case driven interface (put in yet another package), taking Entity instances as parameters and using the BusinessLogic.

At that point, people invariably realize that services need much more logic than what the BusinessLogic package provides, and so they go ahead and create a (very sad) BusinessHelper package, where they fill in all the missing parts of the BusinessLogic, most often through direct database access.
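To make the redundancy concrete, here is a minimal sketch (hypothetical names, boiled down to a single field) of how the same business concept gets restated layer after layer; in Java, but the C# version would look just the same:

```java
// EntityDAO layer: persistence-only, "at the bottom", near the database.
class CustomerDAO {
    String name;
    // CRUD methods elided; a new field means a new column mapping here...
    String getName() { return name; }
}

// Entity layer: get/set only, "at the top", near the GUI / external services.
class Customer {
    private String name;
    String getName() { return name; }            // ...and a new get/set pair here...
    void setName(String n) { name = n; }
}

// BusinessLogic layer: shuffles data between the two representations.
class CustomerLogic {
    Customer load(CustomerDAO dao) {             // ...and new copy code here...
        Customer c = new Customer();
        c.setName(dao.getName());
        return c;
    }
}

// BusinessFacade layer: the use-case driven surface over BusinessLogic.
class CustomerFacade {
    private final CustomerLogic logic = new CustomerLogic();
    Customer findCustomer(CustomerDAO dao) {     // ...and often a signature change here.
        return logic.load(dao);
    }
}
```

Four classes (plus the schema, the GUI, and usually the facade interface) all encode the same structure, so every added field ripples through all of them.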

Then we have other subsystems (cache, SQL, and so on) all built around XXXManager classes, which we can ignore.

Of course, in the end everything is so coupled with the database schema that just adding a field results in a nightmare. And you get a lot of code to maintain as well. Good luck. Meanwhile, the Ruby on Rails guys are creating (simple) applications faster than the other guys can spell BusinessHelper. Say good-bye to productivity.

We can blame it on static typing, but the reality is much simpler. That architecture is wrong. It is at least 10 years behind the state of practice, which means it is probably 15 to 20 years behind the state of the art.
The problem is, that ancient architecture was popularized years ago mostly by language and tool vendors, or by people who think Architects don't have to understand code or the real problem being solved, just replicate a trivial-yet-humongous structure everywhere. It's basically a decontextualized architecture (more on this next time).

Indeed, if you look at the Java literature, you can find good books dating back to the early part of the decade (like "EJB Design Patterns" by Floyd Marinescu, 2002), where the pros and cons of several choices adopted in that overblown yet fragile architectural model are discussed. When a Patterns book appears, a few years of practice have already gone by. That was 2002; by now, ten years of practice are gone, and yet developers still fall into the same trap.

It gets worse. While even the most outdated Java applications are gradually moving away from that model (see "Untangling Enterprise Java" by Chris Richardson for a few ideas), Microsoft evangelists are still excited about it. They happily go around (re)selling an architecture that is remarkably similar to the 10-years-behind behemoth above.

Which brings me to "Being 10 Years Behind (part 2)". Stay tuned :-).


All this stuff reminds me of your "Record Oriented Architecture" articles. I think the two architectures are basically very similar. Is that true?
Very very interesting, I can't wait to see what you write next.
A new layer can solve a problem that a really smart architect at MS or SUN has discovered in some quite general domain.
The diligent programmer is in trouble when such a problem is "generated" by the architecture itself... it is not a real (domain-specific) problem, but exists only in a specific architecture... he gets a sub-optimal solution and pays an extra cost (for free ;)). So are the smart guys at SUN or MS really so smart? Maybe they have another goal: let the dummy-but-diligent programmer do reasonably good and predictable work. Maybe reasonably good and predictable work can be achieved only by reasonably good programmers ;)
Fulvio: the old-style architecture I described in this post and the architecture I described 10 years ago in those papers share the same domain, but are quite different.

The record-oriented architecture (ROA) I described was designed around the needs of simple, yet highly dynamic, 2-tier rich-client applications.

It was, I would say, relatively "modern" back then, although to keep the total length and complexity down to a reasonable size I neglected to explain what later turned out to be a central component (the statement + statement broker).

Curiously enough, my "modern" approach to record-oriented applications still has some similarities, but scales well to N-tier applications and is more slanted toward that missing (statement broker) component :-).

That said, a central theme in the ROA papers was that creating rigid entities at the data layer, sending those entities to the business layer, and then sending the same (or similarly rigid) entities to the GUI/Service layer was based on a complete misunderstanding of the "information hiding" concept, and that an inherent fragility would follow. Not to mention code redundancy, which is why so many programmers involved in similarly bad architectures are looking for code generation or AOP-like techniques to partially circumvent the issue.

These are basically the same problems I've found in the outdated architecture I described here. That's why it's 10 years behind :-).
Andrea: I was going to write something similar in part 2.

There are many reasons why vendor-driven architectures are seldom good. Among those, the idea that the dummy, hard-working programmer should be able to create predictable (albeit redundant and fragile) code without much critical thinking.

It's not that engineers at SUN and Microsoft don't understand the issues (although Microsoft is not making any serious database-centric application). It's that they're selling an application architecture to the masses, and they believe the average programmer is, well, below-average :-)).

Of course, everyone reading my blog is (by necessity :-) above average, so the naked truth is: your vendor is not thinking about you :-).
Since it will take a few more days before I get back to the blog, I'll add something for the hungry: layers must be loosely coupled.

Strongly typed entities are a very strong form of coupling. Oh, did I say form? :-).
- I've spent some time in the last few days searching for "the statement broker" on the internet. Unfortunately I still haven't found any reference. What should be the responsibility of that component?

- Your statement "Strongly typed entities are a very strong form of coupling" leads me to this: POJO + AOP (Hibernate, Spring and the like) is not the "right" architecture (I can see what the problems could be, even if I have never had experience with large-scale business systems) because POJOs are strongly typed entities. The only technique I can identify as a candidate solution is based on reflection, like the one in the ROA series. On the behavioural side we can decouple entities through interfaces, but the problem being discussed is decoupling the structure, and we can achieve this through self-describing entities. Am I right?
Fulvio: I guess that's because I've kinda invented the concept :-). Actually, it's likely I didn't invent it, I just don't know who (else) did.

POJO + AOP is a powerful combination in a (relatively) specific context. It's not a panacea, as usual. And it's just a part of the picture (toward the bottom, that is, persistence). What about the other side (toward the UI/Service)? Do we want to tie everything together with rigid structures?
As you correctly understood, at some point there is a need to "go meta" and rely more on reflective / dynamic data types (although language-based reflection may not be enough).
When you look at it, it's curious how the real benefits of XML have been forgotten so quickly (by providing overly restrictive schema definition languages). It's also weird that those benefits have been tied so closely with XML, while they are a property of dynamic, reflective, semi-structured data types.
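As a tiny sketch of what a self-describing entity might look like (hypothetical API, in Java just to make the idea concrete): fields become data, not compile-time structure, so the layers passing these entities around don't break when a field is added.

```java
import java.util.*;

// Hypothetical sketch of a dynamic, self-describing entity.
class FieldDescriptor {
    final String name;
    final Class<?> type;
    FieldDescriptor(String name, Class<?> type) { this.name = name; this.type = type; }
}

class DynamicEntity {
    private final Map<String, FieldDescriptor> schema = new LinkedHashMap<>();
    private final Map<String, Object> values = new HashMap<>();

    // Structure is defined at runtime (e.g. from the administration console).
    void defineField(String name, Class<?> type) {
        schema.put(name, new FieldDescriptor(name, type));
    }

    void set(String name, Object value) {
        FieldDescriptor d = schema.get(name);
        if (d == null) throw new IllegalArgumentException("unknown field: " + name);
        if (!d.type.isInstance(value)) throw new IllegalArgumentException("bad type for: " + name);
        values.put(name, value);
    }

    Object get(String name) { return values.get(name); }

    // Upper layers enumerate the structure instead of hard-coding it.
    Collection<FieldDescriptor> describe() { return schema.values(); }
}
```

A generic GUI or service layer can iterate over describe() to render editors or serialize the entity, instead of compiling against a rigid set of getters.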

As a side note, the biggest issue with all those proponents of infrastructural layers is that they (almost) never document the forces surrounding their design.
They claim to have solved, once and forever, in any given context, a difficult problem. Sure :-).
LINQ, for instance, is being sold as the next best thing, but it too is useful only in a rather narrow context.

Understanding the contexts where POJO+AOP is a winning solution, and the contexts where LINQ is a winning solution, and so on, is a somewhat difficult but extremely useful exercise for whoever is involved in making technology decisions on database-oriented applications...
