Wednesday, September 03, 2008 

Horizontal scaling

I finally took the time to read BASE: An ACID Alternative by Dan Pritchett in ACM Queue (the paper is free for non-members too). It's short and simple, but highlights a frequent problem in horizontal scaling: when you start partitioning databases (or services, for that matter) you step into distributed transactions, and that's kinda slow, man.

Of course, we get into distributed transactions because we want to preserve transactional thinking and the ACID properties. However, as Dan notes, we can trade some (short-term) consistency for higher availability and performance. As you'll see, that raises the bar for infrastructure, requiring at least a persistent queuing mechanism.
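To make the idea a bit more concrete, here is a minimal sketch of the pattern, assuming two SQLite in-memory databases standing in for two partitions and a plain table standing in for the persistent queue. This is my own illustration, not code from the paper: the table names (orders, outbox, users, applied) and the functions (place_order, drain_outbox, amount_sold) are all hypothetical. The write to the first partition and the queue message commit in one local transaction; the second partition catches up later, and a record of already-applied messages makes the catch-up safe to retry.

```python
import json
import sqlite3

# Two independent databases stand in for two partitions (hypothetical schema).
orders_db = sqlite3.connect(":memory:")
users_db = sqlite3.connect(":memory:")

orders_db.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, seller TEXT, amount INTEGER);
    CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT);
""")
users_db.executescript("""
    CREATE TABLE users (name TEXT PRIMARY KEY, amount_sold INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE applied (msg_id INTEGER PRIMARY KEY);
    INSERT INTO users (name) VALUES ('alice');
""")


def place_order(seller, amount):
    # One local ACID transaction: the order and its queue message commit together,
    # so no distributed transaction across partitions is needed.
    with orders_db:
        orders_db.execute(
            "INSERT INTO orders (seller, amount) VALUES (?, ?)", (seller, amount)
        )
        orders_db.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"seller": seller, "amount": amount}),),
        )


def drain_outbox():
    # Runs later, possibly in another process. Safe to re-run after a crash,
    # because each message id is recorded in the same transaction as its effect.
    rows = orders_db.execute("SELECT id, payload FROM outbox ORDER BY id").fetchall()
    for msg_id, payload in rows:
        msg = json.loads(payload)
        with users_db:
            seen = users_db.execute(
                "SELECT 1 FROM applied WHERE msg_id = ?", (msg_id,)
            ).fetchone()
            if seen is None:
                users_db.execute("INSERT INTO applied (msg_id) VALUES (?)", (msg_id,))
                users_db.execute(
                    "UPDATE users SET amount_sold = amount_sold + ? WHERE name = ?",
                    (msg["amount"], msg["seller"]),
                )
        # Only remove the message once the other partition has committed.
        with orders_db:
            orders_db.execute("DELETE FROM outbox WHERE id = ?", (msg_id,))


def amount_sold(name):
    row = users_db.execute(
        "SELECT amount_sold FROM users WHERE name = ?", (name,)
    ).fetchone()
    return row[0]
```

Between a call to place_order and the next drain_outbox, amount_sold returns the old total: that window is the short-term inconsistency we are trading for not having to coordinate the two databases.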

The author comes from eBay, so we can assume he knows a thing or two about scalability :-). Also, the paper is very readable, and the same techniques can be successfully used in very complex systems (I remember I used similar techniques in quite a few banking applications). The trick is to let go of some of the orthodoxy about transactions, and design your data layer for scalability.

Still on the issue of scalability, I can also recommend another article from ACM Queue (again, free to non-members): Learning from THE WEB by Adam Bosworth (VP of engineering at Google).
I especially like point 3 because it's so damn unorthodox: It is acceptable to be stale much of the time. Again, it's not easy to accept this, and to adapt our applications (and requirements!) accordingly. In many cases, we just can't do it. In many others (many more than people are inclined to accept), we can.
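One simple way to put "acceptable to be stale" into practice is a cache that serves a value until it reaches an agreed maximum age. The sketch below is only an illustration under my own assumptions: the class name StaleTolerantCache, the loader callback, and the max_age_seconds parameter are hypothetical, and the maximum age is exactly the kind of number you would negotiate with the business.

```python
import time


class StaleTolerantCache:
    """Serves cached values until they are older than max_age_seconds."""

    def __init__(self, loader, max_age_seconds, clock=time.monotonic):
        self._loader = loader          # function that fetches the fresh value
        self._max_age = max_age_seconds
        self._clock = clock            # injectable, so the cache can be tested
        self._entries = {}             # key -> (value, time it was loaded)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._max_age:
            # Possibly stale, but within the age we agreed to tolerate.
            return entry[0]
        value = self._loader(key)
        self._entries[key] = (value, now)
        return value
```

Every read that hits the cache is a read that never reaches the database, which is where the scalability comes from.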

There is also a strong connection with the BASE approach above, and both require a little out-of-the-box thinking to be applied. We may have to tweak requirements a little, to move closer to the sweet spot between technology and business. More on this another time :-).
