Tuesday, August 09, 2005 

Overloading + Template = Reuse

I subscribe to (and try to read :-) a number of magazines, both on paper and online. One of those is ACM Queue, which I usually just skim. Unlike most refereed publications, it is not unusual to find some strongly opinionated article in Queue. Yesterday I read a new paper, a critique of overloading in programming languages under the funny title of Syntactic Heroin.
The author is strongly against the concept of overloading, and he tries to explain why by using C++ as an example. Unfortunately, there are a number of flaws and omissions in the article. I posted a short comment on what I consider the major fault directly on the publisher's site, but here is the long version.

- Not seeing the forest for the trees
Near the beginning of his argument, the author says "The best you can say for programmer-defined overloading is that it is only syntactic sugar - that is, it makes things look nicer, but it adds no significant programming capabilities at all."
Now, we have to agree that everything more than a Turing machine is syntactic sugar at some level. There is no need for data abstraction, functional abstraction, symbolic names for functions and variables, etc. All Turing-complete languages are basically equivalent in what they can compute (the idea behind the Church-Turing thesis). However, we usually consider several factors as "useful" or "significant". Among those, fostering reusability is normally considered beneficial.
Now, overloading per se does not foster reusability. However, overloading + templates encourage reusability. The major fault of the author here is to consider a language feature in isolation, and not as part of an organic language. Since the article seems particularly concerned with C++ bashing, here is some C++ code:
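#include <vector>

// Sums the elements of v; T needs a default constructor and operator +=.
template< class T > T Sum( const std::vector< T >& v )
{
    T sum = T() ;
    // typename is required: const_iterator is a dependent type name
    for( typename std::vector< T >::const_iterator i = v.begin(); i != v.end(); ++i )
        sum += *i ;
    return sum ;
}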
This will sum all the items in a vector< T >, for any type T with a default constructor and a += operator defined. Of course, further generalization is possible (there is no need for v to be a std::vector; any container whose iterators can be walked from begin to end would do). Try that without overloading, and you end up passing a function object, making Sum harder to write and call, and harder to read both where it is defined and where it is used. Obviously, this is just a small example, but the technique in itself is quite useful, and indeed widely used (just look at the C++ standard library).
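To make the reuse argument concrete, here is a minimal sketch of the generalization mentioned above. SumRange and the three small helper functions are names I made up for illustration; the only assumption on the element type is a default constructor and an overloaded +=.

```cpp
#include <complex>
#include <iterator>
#include <list>
#include <string>
#include <vector>

// Hypothetical generalization of Sum: works on any iterator range whose
// value type is default-constructible and supports operator +=.
template <class Iter>
typename std::iterator_traits<Iter>::value_type SumRange(Iter first, Iter last)
{
    typedef typename std::iterator_traits<Iter>::value_type T;
    T sum = T();
    for (; first != last; ++first)
        sum += *first;
    return sum;
}

// The same template, reused with three unrelated element types.
int SumInts()
{
    std::vector<int> v;
    v.push_back(1); v.push_back(2); v.push_back(3);
    return SumRange(v.begin(), v.end());
}

std::string SumStrings()
{
    std::list<std::string> words;
    words.push_back("over"); words.push_back("load"); words.push_back("ing");
    return SumRange(words.begin(), words.end());  // += concatenates
}

std::complex<double> SumComplex()
{
    std::vector<std::complex<double> > zs;
    zs.push_back(std::complex<double>(1.0, 2.0));
    zs.push_back(std::complex<double>(3.0, -2.0));
    return SumRange(zs.begin(), zs.end());
}
```

Nothing in SumRange knows about int, std::string or std::complex; the overloaded += is what lets one body serve all three.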
So, Overloading + Templates = Reuse. How's that for "adding no significant programming capabilities at all"? :-)).

- Confusing features with languages
As other readers pointed out, there is a constant confusion in the article between overloading and the intricacies of built-in conversions and promotions in C++. Note that unlike other readers, I do not consider C++ to be an ugly language because of that. And here comes another flaw:

- Not recognizing the forces and rationale behind a language
C++ inherits most of the built-in conversions from C. We may or may not agree with Stroustrup's decision to be as compatible with C as possible, but criticizing C++ without an appreciation of this issue is pointless.

- Intentionally (?) making the picture worse than it is
The author goes on to say "the constructors with one parameter, will usually have been written for another purpose (constructing), and so are likely to get into the overload resolution stew inadvertently". He probably knows (being a "language lawyer" as he defines himself) that in C++ we have the explicit keyword exactly to prevent that, but he chose to forget.
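For readers who have not met the keyword, a small sketch of what explicit buys you. Buffer, Capacity and CapacityOfSixteen are hypothetical names used only for this example.

```cpp
#include <cstddef>

// A class with a one-parameter constructor. Marking it explicit keeps it
// out of implicit conversions, and so out of overload resolution for
// arguments that are not already a Buffer.
class Buffer {
public:
    explicit Buffer(std::size_t size) : size_(size) {}
    std::size_t size() const { return size_; }
private:
    std::size_t size_;
};

std::size_t Capacity(const Buffer& b) { return b.size(); }

// Capacity(16);   // does not compile: no implicit conversion from int to Buffer

// The conversion has to be spelled out at the call site.
std::size_t CapacityOfSixteen() { return Capacity(Buffer(16)); }
```

Without explicit, the commented-out call would silently build a temporary Buffer, which is exactly the kind of accident the article complains about.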

- Doomsayer without statistics
Overall, the author talks a lot about the woes of overloading without providing any reasonable statistics of errors that we can trace back to overloading. Not even the usual experiment with graduate students. He goes on and on saying stuff like "Language designers, compiler writers, developers, and users all suffer" but provides no experimental evidence for his claims.

- Using scary numbers without examples and explanations
Those 29 built-in conversions are not as pervasive as he wants the readers to believe. They apply between built-in types (e.g. integral types and pointers) and are usually quite harmless and very natural, as an int can be converted to a double, a pointer to a pointer-to-const, and stuff like that. Providing clear examples of how this can harm the programmer would have made the article much better.
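As an illustration of the "harmless and natural" kind, here is a short sketch with two of those conversions at work. All function names are invented for the example.

```cpp
// Takes a double; an int argument is converted implicitly.
double Half(double x) { return x / 2.0; }

// Takes a pointer to const int; a plain int* is accepted through a
// qualification conversion (int* -> const int*).
int FirstElement(const int* p) { return *p; }

// int -> double at the call site.
double HalfOfSeven() { return Half(7); }

// int* -> const int* at the call site.
int FirstOfSample()
{
    int sample[] = { 42, 7 };
    int* p = sample;
    return FirstElement(p);
}
```

Both calls do what a reader would expect, and neither needs a cast.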

- Using scary formulas without a context
The author says "If there are n of these, there are (n*(n-1))/2 pairs" but fails to say that usually you don't have all those viable functions! In most cases you have 1 viable function, so that formula gives you 0 pairs (not so scary anymore, right? :-). Say that we have 3. We get 3 pairs. So what?
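A quick sketch of both points: the arithmetic of the formula, and the fact that having several overloads of a name does not mean several viable functions for a given call. PairCount, Describe and DescribeFortyTwo are hypothetical names for this example.

```cpp
#include <string>

// n viable functions give n*(n-1)/2 pairs to compare.
unsigned PairCount(unsigned n) { return n * (n - 1) / 2; }

// Three overloads share the name Describe...
std::string Describe(int) { return "int"; }
std::string Describe(const std::string&) { return "string"; }
std::string Describe(double, double) { return "two doubles"; }

// ...but for Describe(42) only the first is viable: an int does not
// convert implicitly to std::string, and the third takes two arguments.
std::string DescribeFortyTwo() { return Describe(42); }
```

With one viable function there is nothing to compare, and PairCount(1) is 0; with three it is 3.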

- Evangelist in disguise
All that C++ bashing seems more and more like Java evangelism (Java does not have user-defined operator overloading) when you consider the closing statement "Those with real courage will break the addiction by refusing to use programming languages that push this drug" immediately followed by a bibliography where (out of nowhere) we find a reference to the Java Language Specification...

I could go on, but there is no need to. Actually, the first fault is probably so huge that nothing more was needed anyway. Instead, it is interesting to see how other readers kept criticizing C++, which is obviously not as "pure" as most people (apparently) want languages to be. In my old Interview with Bjarne Stroustrup, he said "C++ has the advantage that its use scales to real world problems in many diverse application areas. Much of the ease of learning cleaner/newer languages comes by simplifications that force its users to abandon the language when they hit an application outside the domain where the 'clean' language is a reasonable choice." Try to use Java for scientific computing, and see how nice it is to have a Quaternion, Vector, or Matrix class without operator + defined :-).
The comment on COBOL is also obviously misguided, as in C++ you cannot change the meaning of existing operators for existing types; you can only overload them for new types. There is only one exception, that is, overloading operator ",", which in fact Stroustrup now thinks he shouldn't have permitted. I would say this is a statistically insignificant problem, but hey, I don't have experimental data to back this up :-)))).
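To close, a minimal sketch of overloading an operator for a new type, the case that matters for numeric code. Vec3 and AddThree are hypothetical names; a real Quaternion or Matrix class would follow the same pattern.

```cpp
// A small value type for which + has an obvious meaning.
struct Vec3 {
    double x, y, z;
    Vec3(double x_, double y_, double z_) : x(x_), y(y_), z(z_) {}
};

// Overloading + for the new type is allowed.
Vec3 operator+(const Vec3& a, const Vec3& b)
{
    return Vec3(a.x + b.x, a.y + b.y, a.z + b.z);
}

// int operator+(int, int);   // rejected by the compiler: an overloaded
//                            // operator needs at least one parameter of
//                            // class or enumeration type

// Reads like the math it implements.
Vec3 AddThree()
{
    Vec3 a(1.0, 2.0, 3.0), b(4.0, 5.0, 6.0), c(7.0, 8.0, 9.0);
    return a + b + c;
}
```

Compare a + b + c with the nested add(add(a, b), c) calls you would write without operator +.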