Tuesday, May 13, 2008 

Natural language

Some (most :-) of my clients are challenging. Sometimes the challenge comes from the difficult technical problems they face. That's the best kind of challenge.
Sometimes the challenge comes from people: that's the worst kind of challenge, and one that right now is better left alone.
Sometimes the challenge comes from the organization, which means it also comes from people, but with a different twist. Challenges coming from the organization are always tough, but overcoming those challenges can really make a difference.

One of my challenging clients is a rather large company in the financial domain. They are definitely old-school, and although upper management can perfectly see how software is permeating and enabling their business, middle management tend to see software as a liability. In their eternal search for lower costs, they moved most of the development offshore, keeping only an handful of designers and all the analysts in-house. Most often, design is done offshore as well, for lack of available designers on this side of the world.

Analysts have a tough job there. On one side, they have to face the rest of the company, which is not software-friendly. On the other side, they have to communicate clear requirements to the offshore team, especially to the designers, who tend to be very technology-oriented.
To make things more complicated, the analysts often find themselves working on unfamiliar sub-domains, with precise regulations but also with large gray areas that must be somehow understood and communicated.
Icing on the cake: some of those financial instruments do not even exist in the local culture of the offshore team, making communication as difficult as ever.

Given this overall picture, I've often recommended analysts to spend some time creating a good domain model (usually, a UML class diagram, occasionally complemented by some activity diagrams).
The model, with unambiguous associations, dependencies, multiplicities, and so on, will force them to ask the right questions, and will make it easier for the offshore designer to acquaint himself with the problem. Over time, this suggestion has been quite helpful.
However, as I said, the organization is challenging. Some of the analysts complained that their boss is not satisfied by a few diagrams. He wants a lengthy, wordy explanation, so he can read it over and see if they got it right (well, that's his theory anyway). The poor analyst can't possibly do everything in the allotted time.

Now, I always keep an eye on software engineering research. I've seen countless attempts to create UML diagrams from natural language specifications. The results are usually unimpressive.
In this case, however, I would need exactly the opposite: a tool to generate a precise, yet verbose domain description out of a formal domain model. The problem is much easier to solve, especially because analysts can help the tool, by using the appropriate wording.

Guess what, the problem must be considered unworthy, because there is a dearth of works in that area. In practice, the only relevant paper I've been able to find is Generating Natural Language specifications from UML class diagrams by Farid Meziane, Nikos Athanasakis and Sophia Ananiadou. There is also Nikos' thesis online, with a few more details.
The downside is that (as usual) the tool they describe does not seem to be generally available. I've yet to contact the authors: I just hope it doesn't turn out to be one of those Re$earch Tool$ that never get to be used.

From the paper above, I've also learnt about ModelExplainer , a similar tool from a commercial company. Again, the tool doesn't seem to be generally available, but I'll get in touch with people there and see.

Overall, the problem doesn't seem so hard, especially if we accept the idea that the analyst will help the tool, choosing appropriate wording. An XMI-to-NL (Natural Language) would make for a perfect open source project. Any takers? :-)

Labels: , , , ,

Friday, March 07, 2008 

Problem frames and the DNC

If you've read any of my posts before, you probably know I'm not a fan of pre-packaged, one-size-fits-all ideas. Methods, technology, languages, models, even specific values (design values; business values; etc.) must always be considered in context.

With this background, it's hardly surprising that I've always found the notion of a Domain Neutral Component quite uncomfortable. It really sounded like an attempt to shoehorn the world into a predefined model, while we should carefully look for the relevant portion of the world we want to represent into our model.

Still, in many cases (especially for a junior analyst) starting with the DNC might be better than starting with a blank page. How could this be? Does it work all the time? If not, when? Honestly, in the past years I haven't spent too much time trying to answer those questions. The DNC was part of my bag of tricks, but I didn't use it often.

Recently, however, I was thinking (once more :-) about colors and UML, and while looking into some of Peter Coad's works for a specific reference, I stumbled on the DNC again. So I thought, maybe I've learnt something in the past few years that could shed some light on the inner quality of the DNC, and its suitability in any (?) given context.

The DNC can be considered as an overengineered "standard" model representing something (an event / moment / interval) happening somewhere (a place) involving one party or more (originally an actor), usually exchanging or dealing with some good (a thing). The party plays a role, hence the later shift from actor to party + role. Indeed, you can start with a very simple model, and "derive" the DNC by following a very reasonable line of reasoning: see From Associations To Domain Neutral Component for the full story.

Of course, in many cases, the DNC might be overengineered. But you can always simplify the unnecessary parts. The real question, however, is when the DNC can give you a head start, and when it won't (context, context, context :-).

That's where Problem Frames Patterns can shed some light. I recommend that you keep the PFP paper at hand while reading what follows.

Consider, for instance, the Commanded Behavior problem frame. Shortly, the problem is stated as:
There is some part of the world whose behavior is to be controlled in accordance with commands issued by
an operator. The problem is to build a machine that will accept the operator's commands and impose the
control accordingly.

and the frame concerns are:
1. When the Operator issues a Command
2. AND the Machine rejects invalid Commands
3. AND the Machine either ignores it if unviable, OR issues Control Events
4. AND the Control Events change the Controlled Domain
5. ENSURE the changed state meets the Commanded Behavior in every case.

That's not at odds with the DNC. Control Events map nicely to moment-interval; moreover, the text above suggests that multiple events might be issued for a single command (MomentInterval-MomentIntervalDetails). The Control Events change the Controlled Domain. Therefore, they must describe external entities (each probably having a Role) that must somehow influence internal entities (Party, having a Role, or Places, having a Role).
Using the Email Client example, "Email Retrieval" is an event, composed of individual retrieval events (one for each email message). Each Message is a Thing, although with a dubious Role. Retrieval needs (at least) an Account, which is a Party playing a specific Role (Receiver). Retrieval takes place on a specific Server, playing a Role (POP3 or IMAP server). Not so bad.

What if we look into another problem frame? Let's try Transformation:
There are some given inputs which must be transformed to give certain required outputs. The output data
must be in a particular format, and it must be derived from the input data according to certain rules. The
problem is to build a machine that will produce the required outputs from the inputs.

Concerns:
1. BY traversing the input in sequence, and simultaneously traversing the outputs in sequence
2. AND finding values in the input domain, and creating values in the output domain
3. AND that the input values produce the correct output values
4. ENSURES the I/O relation is satisfied.

Hmmm. Doesn't map so nicely, but the text is really too abstract. Let's try the actual problem: an HTML email to be converted in plain text to be shown on a limited device (I added some context to the equally abstract problem described in the original paper). Well, there is hardly a dominant MomentInterval here. Hardly a party, place, thing triad gravitating around the central MI. Hardly any value in adopting the DNC as a starting point. What can be helpful here? Concepts from grammars, taxonomies, language theory. We're basically modeling a translator, and language theory will give you the head start.

So here it is. The DNC is an interesting concept because it maps nicely to some recurring problem frames (if you got time to spare, you may want to investigate which frames are a good fit for the DNC). Some problem frames, however, just don't match with the DNC. It's not just about individual problems: it's about a whole class of problems, all those within the mismatched problem frames.

For me, this is actually good news. Once we know which problem frames map nicely to the DNC, I would say the DNC itself becomes a more powerful tool, one that can be applied wisely and not blindly.

Winding down: since it all began with colors, it's interesting to see how people used the DNC to reason about more general issues. For instance, in Whole Part Relationships in Color Models, David J. Anderson starts with the DNC and ends up recommending that we avoid some whole-part colors, like a green whole with a yellow part. There is probably more to investigate along those lines. Next time I get back to my idea of coloring associations and dependencies, I'll give it a deeper thought.

Labels: ,

Monday, March 03, 2008 

Domain Neutral Component

I mentioned the Domain Neutral Component when answering a comment to my post on the Cognitive Dimensions of Notations. That reminded me I've never been a fan of the DNC, exactly because of its context-independent ambitions. Recently, however, I've reconsidered the DNC in terms of Problem Frames.
It's kinda late, so more on this soon, I hope...

Labels: ,

Tuesday, December 18, 2007 

Problem Frames

I became aware of problem frames in the late 90s, while reading an excellent book from Michael Jackson (Software Requirements and Specifications; my Italian readers may want to read my review of the book, straight from 1998).

I found the concept interesting, and a few years later I decided to dig deeper by reading another book from Jackson (Problem Frames: Analyzing and structuring software development problems). Quite interesting, although definitely not a light reading. I cannot say it changed my life or heavily influenced the way I work while I analyze requirements, but it added a few interesting concepts to my bag of tricks.

Lately, I've found an interesting paper by Rebecca Wirfs-Brock, Paul Taylor and James Noble: Problem Frame Patterns: An Exploration of Patterns in the Problem Space. I encourage you to read the paper, even if you're not familiar with the concept of problem frame: in fact, it's probably the best introduction on the subject you can get anywhere.

In the final section (Assessment and Conclusions) the authors compare Problem Frames and Patterns. I'll quote a few lines here:

A problem frame is a template that arranges and describes
phenomena in the problem space, whereas a pattern maps forces to a solution in the solution space.


Patterns are about designing things. The fact that we put problem frames into pattern form demonstrates
that when people write specifications, they are designing too—they are designing the overall system, not its
internal structure. And while problem frames are firmly rooted in the problem space, to us they also suggest
solutions.


If you read that in light of what I've discussed in my latest post on Form, you may recognize that Problem Frames are about structuring and discovering context, while Design Patterns helps us structure a fitting form.
When Problem Frames suggests solutions, there is a good chance that they're helping us in the elusive game of (to quote Alexander again) bringing harmony between two intangibles: a form which we have not yet designed, and a context which we cannot properly describe.

Back to the concept of Problem Frames, I certainly hope that restating them in a pattern form will foster their adoption. Indeed, the paper above describes what is probably the closest thing to true Analysis Patterns, and may help analysts look more closely at the problem before jumping into use cases and describe the external behaviour of the system.

Labels: , , , , ,

Sunday, October 09, 2005 

Simple ideas on measuring the business value of a feature

In an earlier comment, I've mentioned the need to make a clear case for the business value of features and design choices. Since features, and in general user-level choices, are easier to discuss with management than lower-level design choices, they could make for a good training :-) about thinking at the edge between software and business.
Here is an easy to read article offering some simple, yet practical suggestions: Identifying the Business Value of What We Do. On the same website, you'll also find some nice papers on usability, and a few more dealing with business issues and software design.

Labels: ,