Multiple Dispatch

Dr. Carlo Pescio
Multiple Dispatch
A new approach using templates and RTTI

Published in C++ Report, June 1998.

Virtual functions allow polymorphism on a single argument; however, sometimes there is a need for multi-argument polymorphism. Double dispatch, commonly used in C++ to implement multi-methods, does not lend easily extensible code. Solutions based on function tables are difficult to implement and prevent repeated derivation. This article focuses on two new techniques based on templates and Run Time Type Identification (RTTI). The first is faster but less flexible, the second is slower, but allows new classes to be added without any need to change existing ones.

Why multiple dispatch?
Consider an extensible hierarchy of geometrical objects. It should be easy to add new kinds of shapes or specialize existing ones. It should also support several calculations, i.e., the area of the intersection of two shapes, for example:

class Shape
  {
  public :
    // ...
    virtual double 
    Intersect( const Shape& s ) = 0 ; 
  } ;

class Circle : public Shape
  {
   public :
    virtual double 
    Intersect( const Shape& s ) ; 
  } ;

class Rectangle : public Shape
  // ...

However, you cannot implement Circle::Intersect for a generic Shape object. It may be tempting to use an if/else sequence based on the dynamic type of the formal argument, as in:

double Circle : Intersect( const Shape& s )
  {
  const type_info& t = typeid( s ) ;
  if( t == typeid( Circle ) )
    // circle-circle
  else if( t == typeid( Rectangle ) )
    // circle-rectangle
  else ...
  }

Yet this is unsatisfactory because you have to update the Intersect method of each Shape-derived class every time a new class is derived. This is the kind of problem that virtual functions are supposed to eliminate, but they are limited to one-argument polymorphism. Far from being an academic curiosity, the need for multiple polymorphism arises naturally in several real-world problems. In general, the selection of the most efficient algorithm may depend on the type of more than one parameter.

Double dispatch
The simplest known solution was introduced more than 10 years ago for Smalltalk, and is usually called double dispatch¹. You can implement Intersect using double dispatch as follows:

class Circle ;
class Rectangle ;
class Shape
  {
  public :
    // ...
    virtual double 
    Intersect( const Shape& s ) = 0 ; 
    virtual double 
    Intersect( const Circle& s ) = 0 ; 
    virtual double 
    Intersect( const Rectangle& s ) = 0 ; 
  } ;

class Circle : public Shape
  {
   public :
     virtual double 
     Intersect( const Shape& s ) ; 
       {
        return( s.Intersect( *this ) ) ;
        }
     virtual double 
     Intersect( const Circle& s )  ; 
       {
       // circle-circle
       }
     virtual double 
     Intersect( const Rectangle& s )  ; 
       {
       // circle-rectangle
       }
  } ;

Note that in Circle::Intersect(const Shape&) the type of *this is Circle, so s.Intersect(*this) calls T::Intersect(const Circle&) where T is the dynamic type of s. Basically, double dispatch uses two calls to resolve on the type of both arguments. The technique is very effective in a language without static typing like Smalltalk, but in C++ it does not come without problems. The base class Shape must know about all the derived classes, resulting in circular dependencies. If you derive a new class from Shape (say Triangle), you must update the interface of Shape and the interface/implementation of all the other derived classes. In some cases this is not even an option: you may not have the source code for Shape, or not be willing or permitted to modify it.
A viable alternative consists in storing pointers to functions into an associative table, using the type_info of the classes as indexes as suggested by Bjarne Stroustrup² or the name of the classes as discussed by Scott Meyers³. Meyers goes into the details of a table-driven implementation, but the solution is rather complex and does not allow to further derive a class from ,e.g., Triangle, while still enjoying polymorphic behavior.

[ Note: If you are interested in the continuous improvement of existing solutions, made possible from language extensions, you may want to consider the interplay of a currently unsupported feature (virtual member template functions) and the good-old double dispatch. Being able to write:

class Shape
  {
  public :
    // this won't compile today...
    template< class T > virtual double 
    Intersect( const T& s ) = 0 ; 
  } ;

could be the easiest way to break the circular dependency between the base class and the derived classes.]

Desirable properties
While searching for a better solution, I've found the following properties highly desirable:

It should be possible to derive a new class without any changes to the base class.
It should be possible to derive a new class and add explicit support for it in some of the sibling classes.
It should be possible to further derive from derived classes and still enjoy polymorphic behavior. This is especially important in light of requirement 2.
It should be simple to use; ideally, as simple as virtual functions. In any case, I would rather prefer to avoid complex initialization code and the correlated problems with the order of initialization for static objects.
It should be reasonably fast. Ideally, it should be as fast as double dispatch, which is the simplest available technique for multiple dispatch in C++.

Moreover, it should be directly implemented in C++, without any need for a custom preprocessor or any extra-linguistic feature. A preprocessor-based solution has been implemented by Chatterton and Conway⁴to show the viability of multimethods in C++, but whenever is possible, I prefer to stay within the language for real-world programming.
Double dispatch fails to meet requirements 1 and 2. Property 4 is subjective, but I consider the short boiler-plate code necessary for double dispatch to be "simple".
I have devised two alternative solutions, based on templates, multiple inheritance, and RTTI. The simplest technique has all the desirable properties above except (2). Therefore, extension is still problematic and the technique is mostly useful when you can't modify the base class, for instance because it is in a shared library or you don't have the source code. The second, more complex solution satisfies all the properties above except (5), as it is considerably slower than double dispatch. Flexibility and speed are often antithetical: A wise software engineer will choose the best trade-off from a variety of techniques. Sometimes, higher flexibility and maintainability are well worth some extra microseconds in a function call.

A small step forward
If you don't plan to add several derived classes, a dependency between base and derived classes may be accepted in exchange for speed. What is normally unacceptable, however, is to have a dependency between a base class used across several projects and some project-specific derived classes.
In that case, a simple solution is to move the circular dependency inside your project-specific code, making the base class reusable in different contexts. That's the strategy behind the first technique, exemplified in Listing 1; Figure 1 shows the corresponding class diagram. I've abandoned the Shape example to gain some liberty from any specific domain.
The underlying idea is simple: the class Base (corresponding to Shape in the previous example) is expected to have a stable interface, since it will be reused in several projects. Therefore, it declares only the pure virtual function f(Base&), that will be subjected to multiple dispatch.
Now you need a place for all the project-specific versions of f that have been removed from Base. This role is played by another base class, Target, which defines a virtual function f_impl(D&) for each (project-specific) derived class D. These are the functions were the multiple polymorphism will ultimately land on. In the Shape example, here is where you implement each Intersect case: Circle::Intersect_impl(Rectangle&) would ultimately handle the intersection between a circle and a rectangle. Note that only the implementors of derived classes need to know about f_impl()and users only need to know about f().
Concrete classes (like Circle and Rectangle) should derive from both Base and Target, and implement Base::f() so that the proper f_impl() is called. This requires casting from a Base& (the parameter of f) to a Target& (where f_impl is declared), a case known as cross-casting, as it crosses the boundaries of a single inheritance hierarchy. Usually cross-casting requires a dynamic_cast, but we can dispense with that by providing a common derived class (Middle). Since Middle derives from Base and from Target, any reference to Base (that we know is really a reference to a derived class like A and B) can be converted into a reference to Target using two static_cast: from Base& to Middle&, then from Middle& to Target&. Note that static_cast is resolved at compile-time, while dynamic_cast would incur a run-time performance penalty.
The final step is to avoid the manual implementation of f (where the multiple dispatch takes place) in each concrete class. A general implementation of f requires type information about either the parameter (that we only know being a Base&) or about the receiver itself. Inside each derived class, we know the actual type of *this, but how can we move the implementation of f outside the derived classes without losing type information?
In C++, type-safe reuse is often accomplished through templates. In this case, we need to derive a template class F from Middle (so to redefine Base::f) and the parameter of the template must provide the necessary type information. Therefore, each concrete class is expected to derive from F (to inherit the implementation of f) and also to provide its own type to F as a parameter. This lead to a somewhat unusual, "recursive" declaration for all the derived classes:

class D : public F< D >
  {
  public :
   // implements all the functions
   // f_impl declared in Target
  } ;

The template class F plays two roles in this scheme: It statically encodes the dynamic type of one parameter, and implements the multiple dispatch using two static_cast (see Listing 1), avoiding the replication of boiler-plate code typical of double dispatch.
Consider now what happens when the following code is executed:<

A a ; 
B b ; 
Base& theA = a ; 
Base& theB = b ; 
theA.f( theB ) ;

First, A::f(theB) is called. This function call is dispatched to F<A>::f(theB) through normal polymorphism. Inside f(), theB is cast to a reference to Middle, then Middle::f_impl(A&)is called. Since f_impl is virtual, that means calling B::f_impl(A&), fully resolving our call via multiple polymorphism.
To conclude, note the use of virtual inheritance from Middle: this is necessary to allow further derivation from a derived class. Without virtual inheritance, a class C, as follows, would have two subobjects of class Base, and this does not correspond to the concept of IS-A:

class C : public F< C >, public A
  {
  public :
    // implements all the functions
    // f_impl declared in Target
  } ;

A quantum leap

The problem with the previous solution is that Target still depends on all the derived classes: we have factored the dependency out of Base, but is still there.
In fact, the dependency is preserved by the existence of a single class Target, which knows about all the derived classes. However, that's not mandatory: you can have a different target class for each derived class. Then you derive from TargetA if you want to handle objects of class A, and so on. Obviously, having to define a different target class for each derived class is cumbersome, so we can think of using templates again to provide reusability without losing type information. A parameterized Target class should only define a single virtual function to handle objects of class T:

template< class T > class Target 
  {
  public :
    virtual void f_impl( T& ) = 0 ;
  } ;

However, breaking a dependency is not that easy. If you remove the one-knows-all Target class, you have to remove Middle as well. Otherwise, Middle would have to be derived from all the instantiations of Target, resulting in a circular dependency again (this time, between Middle and all the derived classes). Remember that we were relying on Middle to perform cross-casting inside F: Now F must be updated to cast directly to the "right" kind of target:

template< class T > class F : 
virtual public Base
  {
  public :
    void f( Base& b )
      {
      typedef Target< T > TargetT ;
      dynamic_cast< TargetT& >(b).
      f_impl( static_cast< T& >(*this) ) ;
      }
  } ;

Note the use of dynamic_cast: this is the only way to safely perform a cross-cast in absence of a common class. Also, as we'll see, now the cast may also fail at run-time.
Under this scheme, the previous example becomes:

class A : public F< A >, 
          public Target< A >, 
          public Target< B >
  {
  public :
    void f_impl( A& ) ; 
    void f_impl( B& ) ; 
  } ;

class B : public F< B >,
          public Target< A >,
          public Target< B >
  {
  public :
    void f_impl( A& ) ;
    void f_impl( B& ) ; 
  } ;

and using the same call:

A a ;
B b ;
Base& theA = a ;
Base& theB = b ;
theA.f( theB ) ;

F<A>::f(theB) is called. Internally, theB is cast to a reference to Target<A> and *this is cast to a reference to A. Finally theB's f_impl()call is dispatched to B::f_impl(A&), fully resolving both types.
Although there is no longer a circular dependency between base and derived classes, note that the solution does not yet satisfy requirement 2.

Dealing with Inheritance
Consider this case:

class A : public F< A >, 
          public Target< A >, 
          public Target< B >,
          public Target< C >
  {
  public :
    void f_impl( A& ) ;
    void f_impl( B& ) ;
    void f_impl( C& )  ;
  } ;

class B : public F< B >,
          public Target< A >, 
          public Target< B >
  {
  public :
    void f_impl( A& ) ;
    void f_impl( B& ) ;
  } ;

class C : public F< C >,  
          public A
  {
  public :
    void f_impl( A& ) ;
    void f_impl( C& )  ;
  } ;

Here, class A knows about A, B and C. Class B only knows about A and B. Class C is derived from A and it only knows about A and C.
Although B and C don't know about each other, you want B to treat C as a kind of A. You also want C to inherit A's treatment of B.
This is beyond the capabilities of double dispatch, table-driven techniques as proposed by Meyers³, and also the code just presented.

B b ;
C c ;
Base& theC = c ;
Base& theB = b ;
theC.f( theB ) ;

As shown, F<C>::f would try to cast the actual parameter to Target<C>. However, B is not derived from Target<C>, so a bad_cast exception is thrown. Ideally, F<C>::f should be smart enough to try casting to Target<C>, and if that fails, try casting to Target<A> (since C is derived from A). If that also fails, it should propagate the exception, as it does not know how to handle a Base. Doing so requires some more work inside F and a little support from the programmer as well.
If F is supposed to know about the parent class of its parameter, this information must be somewhat provided by the parameter itself. A simple and effective technique is to provide a Parent typedef in each derived class; Moreover, Base can provide a default value (Base itself) so that if you derive your class directly from Base, you don't have any extra work. If you further derive your class from a derived class, then you have to typedef Parent to be the parent class.
Once you have the parent class information, F<T>::f can be modified to try a conversion of its parameter to Target<T>*, and if that does not work, to walk up in the class hierarchy (see listing 2). The strategy used to walk the hierarchy, namely the use of the Parent typedef, assumes single inheritance. You cannot derive a class D from A and B and expect the strategy to work; however, in most practical cases this does not represent a major concern. You can still derive a class D from (e.g.) A and from another class X not rooted in Base.
If F<T>::f reaches Base while walking the hierarchy, then the second parameter does not handle the first, or any base class of it. In that case, a bad_cast exception is thrown; I haven't provided an informative string for the exception, but this can be easily done. Note that a first-cut implementation of the hierarchy walking inside f could be:

template< class T > class F : virtual public Base
  {
  public :
    void f( Base& b )
      {
      typedef Target< T > TargetT ;
      TargetT* t = 
      dynamic_cast< TargetT* >(&b) ;
      if( t )
        t-> f_impl( static_cast<T&>
                    (*this) ) ;
      else
        {
        typedef T::Parent TParent ;
        if( typeid( TParent ) == 
            typeid( Base ) )
          throw( bad_cast("") );
        else
          {
          typedef F< TParent > FTParent ;
          static_cast<FTParent*>(this)          
          ->F<TParent>::f( b ) ;
          // navigate hierarchy +
          // static bind (f is virtual)
          }
        }
      }
  } ;

This code was indeed allowed by old compilers, but is not legal. Consider the case where T::Parent is Base and the dynamic cast fails. Then the "else" branch is taken, and since TParent is Base, an exception should be thrown. Unfortunately the code won't compile because the last portion (the second "else"), when compiled, would lead to code as:
static_cast< F<Base> *>(this)->F<Base>::f( b ) ;
But this is not related to F<Base> and the static cast fails. To make the code legal, we must prevent the code in the second "else" to be instantiated when TParent is Base. This is easily obtained using an helper class AuxF (handling the general case of navigation) with a specialization AuxF< Base > throwing an exception (see Listing 2 for details).
Note that in listing 2 class C has to define its own Parent. Also, in class C you have to override f(Base&) to overcome the ambiguity introduced by multiple inheritance: a simple forwarding to F<C>::f is all what is needed.
Finally, it is necessary to explicitly specialize F<Base>; it would be instantiated anyway, since (e.g.) F<A> requires the instantiation of F<Base>. But F<Base> would try to static_cast a Base& to F<Base>&, which is not valid. Actually, if F<Base>::f is called, the second parameter is not handling the first, and an exception must be thrown.

Compilers, compilers...
[Note: the comments on compilers reported here were valid for the code published in C++ Report. I haven't updated this part for the new code and modern compilers.]

The code in listing 2 has been compiled successfully on the Intel family using Microsoft Visual C++ 4.0 and 5.0, and on a Sun Sparc using GNU C++ 2.7.2.2 and KAI C++ 3.2b. If you use GNU C++, or any up-to-date compiler, you have to change <typeinfo.h> into <typeinfo>, since the ".h" is now deprecated, and also add a using namespace std; after the inclusion of headers. Also, the syntax for template specialization has been changed months ago, and Visual C++ 5.0 already wants a template<> in front of the explicit specialization for F<Base>.
Unfortunately, Borland's compilers (Borland C++ 4.02 and C++ Builder 1.0) seems unable to compile the code. Apparently, since Parent is in the scope of F, T::Parent is not accepted in F::f. I've found a workaround, but is rather convoluted and too long to present it here. Interested readers may send me an email, and I'll provide a sample that works on Borland's compilers.

Performance
[Note: the measures reported here have been made before publication in C++ Report, using Visual C++ 4.0 and a slightly different version of the code. I haven't updated this part for the new code and modern compilers.]

The first technique requires 21 clock ticks for each call to f on a Pentium processor using Visual C++ 4.0: this compares well with double dispatch, which requires 19 ticks on the same machine.
The second technique is considerably slower: it takes 675 ticks on the same machine, so it is about 35 times slower than double dispatch. Also, if it is necessary to walk the hierarchy (which doesn't occur frequently), the overhead increases linearly with the number of classes that are tried. Please note that 675 cycles are about 3 microseconds on a Pentium 200: in many cases, this overhead is acceptable in exchange for the higher flexibility. Also, I think there is still a lot of room for improvements in the dynamic_cast code. For instance, if you derive Target from Base using virtual inheritance (which does not change the semantics of derived classes, since they already inherit virtually from Base through F), dynamic_cast works faster and the dispatch time drops to 650 ticks. For some reason, doing that helps the implementation of dynamic_cast provided with Visual C++. I guess that as RTTI becomes more widely used, compilers will implement dynamic_cast using faster strategies, and the gap between the two techniques will narrow.

Related works
Double dispatch is at the heart of the Visitor pattern, and several authors have tried to overcome the cyclic nature of Visitor. Martin's Acyclic Visitor⁵ uses a technique similar to mine, based on multiple inheritance and RTTI. However, his approach does not support repeated inheritance and forces the programmer to implement an abstract visitor class for each concrete element (while I use templates to avoid hand-written code). Nordberg's Extrinsinc Visitor⁶ uses templates, but still force the programmer to write a Dispatch function for each concrete visitor and does not support repeated inheritance. In my approach, the Dispatch function is built through multiple inheritance from the template class F, at a purely declarative level.

Conclusions
It is always interesting to look back at old problems as new features are added to the language: in this case, using RTTI, templates, and multiple inheritance, I've provided two alternative solutions to double dispatch that are better suited to a statically typed language. Programmers can choose the one that best fit their needs for flexibility and performance. Also, though the solution is in no way simple, all the complexity is hidden to the users of your code: in fact, the interface of Base is even simpler than in the case of double dispatch.

Bibliography

Daniel H. H. Ingalls, A Simple Technique for Handling Multiple Polymorphism, Proceedings of ACM OOPSLA Conference, Portland, OR. November 1986.
Bjarne Stroustrup, The Design and Evolution of C++, Addison-Wesley, 1994.
Scott Meyers, More Effective C++, Addison-Wesley, 1996.
David F. Chatterton, Damian M. Conway, Multiple Dispatch in C++ and Java, Proceedings of TOOLS 21, Melbourne, November 1996.
Robert C. Martin, Acyclic Visitor, PLoP'96 writers workshop.
Martin E. Nordberg III, Variations on the Visitor Pattern, PLoP'96 writers workshop.

Biography
Carlo Pescio has a doctoral degree in Computer Science and is a consultant and mentor for various European companies and corporations, including the Directorate of the European Commission. He specializes in object oriented technologies and is a member of IEEE Computer Society, the ACM, and the New York Academy of Sciences. He lives in Savona, Italy and can be contacted at pescio@eptacom.net.

Listing 1.

// Shared header file

class Base
  {
  public :
    virtual ~Base() {} ;
    virtual void f( Base& b ) = 0 ;
  } ;

 
// Project-specific header file

class A ; 
class B ;
class Target
  {
  public :
    virtual void f_impl( A& ) = 0 ;
    virtual void f_impl( B& ) = 0 ;
  } ;

class Middle : public Target, public Base
  {
  } ;

template< class T > class F : virtual public Middle
  {
  protected :
    void f( Base& b )
      {
      static_cast< Middle& >( b )
      .f_impl( static_cast< T& >(*this) ) ;
      }
  } ;

 
// Project-specific implementation file

#include <iostream>
using namespace std ;

class A : public F< A >
  {
  public :
    void f_impl( A& ) 
      { cout << "A-A" << endl ; }
    void f_impl( B& ) 
      { cout << "A-B" << endl ; }
  } ;

class B : public F< B >
  {
  public :
    void f_impl( A& ) 
      { cout << "B-A" << endl ; }
    void f_impl( B& ) 
      { cout << "B-B" << endl ; }
  } ;

Listing 2.

// Shared header file

#include <typeinfo>
using namespace std ;

class Base
  {
  public :
    typedef Base Parent ;
    virtual ~Base() {} ;
    virtual void f( Base& b ) = 0 ;
  } ;

 
template< class T > class Target 
  {
  public :
    virtual void f_impl( T& ) = 0 ;
  } ;

 
template< class P > class AuxF
  {
  public :
    template< class T > static void   
    Dispatch( T* t, Base& b )
      {
      typedef F< P > FTParent ;
      static_cast< FTParent* >(t)->
      // navigate hierarchy
        F< P >::f( b ) ;
        // static bind (f is virtual)
      } 
  } ;

 
template<> class AuxF< Base >
  {
  public :
    template< class T > static void 
    Dispatch( T* t, Base& b )
      {
      throw( bad_cast("") );
      }
  } ;

template< class T > class F : virtual public Base
  {
  public :
    void f( Base& b )
      {
      typedef Target< T > TargetT ;
      TargetT* t = 
      Dynamic_cast< TargetT* >(&b) ;
      if( t )
        t->f_impl( static_cast<T&>
                   (*this) ) ;
      else
        {
        typedef T::Parent TParent ;
        AuxF< TParent >::Dispatch(
          static_cast< T* >( this ), b ) ;
        }
      }
  } ;

template<> class F< Base >
  {
  public :
    void f( Base& )
      {
      throw( bad_cast("") );
      }
  } ;

// Project-specific file

#include <iostream>
using namespace std ;

class B ;  class C ;
class A : public F< A >, 
          public Target< A >,
          public Target< B >,
          public Target< C >
  {
  public :
    void f_impl( A& ) 
      { cout << "A-A" << endl ; }
    void f_impl( B& ) 
      { cout << "A-B" << endl ; }
    void f_impl( C& ) 
      { cout << "A-C" << endl ; }
  } ;

class B : public F< B >,
          public Target< A >,
          public Target< B >
  {
  public :
    void f_impl( A& ) 
      { cout << "B-A" << endl ; }
    void f_impl( B& ) 
      { cout << "B-B" << endl ; }  
  } ;

class C : public F< C >, public A
  {
  public :
    typedef A Parent  ;
    void f( Base& b ) 
      { F<C>::f(b) ; } 
    void f_impl( A& ) 
      { cout << "C-A" << endl ; }
    void f_impl( C& ) 
      { cout << "C-C" << endl ; }  
  } ;

 
 
int main()
  {
  A a ;
  B b ;
  C c ;
  Base& x1 = a ;
  Base& x2 = b ;
  Base& x3 = c ;

  x1.f( x1 ) ; // A-A
  x2.f( x1 ) ; // A-B
  x3.f( x1 ) ; // A-C
  x1.f( x2 ) ; // B-A
  x2.f( x2 ) ; // B-B
  x3.f( x2 ) ; // B-C  => B-A
  x1.f( x3 ) ; // C-A
  x2.f( x3 ) ; // C-B  => A-B
  x3.f( x3 ) ; // C-C

  return( 0 ) ;
  }