Monday, October 10, 2005

 

Curiously Recurring Template in C#? No way...

The Curiously Recurring Template Pattern, first presented by James Coplien in the homonymous paper in C++ Report, Feb. 1995, and independently discovered by several other programmers (including myself :-) is an interesting use of C++ Templates. It is now widely used in commercial libraries (e.g. ATL), open source (e.g. Boost) and basically whenever a clever use of templates is involved (see, for instance, my Multiple Dispatch: a new approach using templates and RTTI).
Basically, it involves a class that is derived by a template instantiated on the derived class itself. Here is a simple example of factoring out into a base class a common implementation of Clone for all the derived classes:

template< class T > class Base
{
public :
T* Clone() const
{
return new T( static_cast< const T& >( *this ) ) ;
}
} ;

class A : public Base< A >
{
// I have a clone for free, based on
// the copy constructor of A
} ;

If you're unfamiliar with it, you may want to read it twice :-).
While playing a little with generics in .NET 2.0, I tried something like that in C#. In the simplest incarnation, that would be:

class Base< T >
{
void g()
{
((T)(this)).f() ;
}
}

class Derived : Base< Derived >
{
void f() { }
}

As you may guess by now, it didn't work. You get an error message like "Cannot convert type Base< T > to T". This message alone tells you a lot about the compile-time model of generics in C# compared with the compile-time model of templates in C++.
Note for .NET fans :-) : of course, Clone() in .NET does not need the CRTP. In fact, in more than a few cases, the CRTP can be successfully substituted with a solution based on reflection (but not viceversa, although reflection is obviously slower). Still, the CRTP is a valuable code factoring technique, which is not available in .NET.
For More on the CRTP, see also this wiki. Coplien's article has also been reprinted in Stanley B. Lippman, editor, "C++ Gems", Cambridge University Press & Sigs Books, 1996. Old stuff, but still valuable; if you missed it, try to get a copy...

Comments:
Non credo sia l'unica tecnica non utilizzabile. In realtà sembra che i generics siano stati introdotti come in java solo per migliorare le classi collezioni. Tra l'altro, proprio uno degli usi che avrebbe potuto essere molto utile (tipo alcune computazioni compile-time che potevano far risparmiare un po' di JITC) non è possibile, inoltre trovo limitato il fatto che i constraint siano dati solo tramite interfacce implementate.
 
Dal punto di vista della "programmazione compile time", piu' che adottare la metaprogrammazione template (che in C++ rimane tuttora un paradigma non ufficiale e non privo di problemi) sarebbe forse carino prendere qualche spunto da D. Per chi non lo conosce, D e' un linguaggio C++ like ideato da Walter Bright, uno dei primi implementatori di un compilatore C++ commerciale. In D la keyword "static" puo' trasformare un costrutto da run-time a compile-time, es. "static if" e' un if valutato a compile time. L'idea funziona molto bene in tanti casi concreti ed e' particolarmente semplice. Trovate un esempio sul sito ufficiale di D. Incidentalmente :-) Walter usa un mio articolo apparso anni fa sul C++ Report per mostrare come in D sia molto piu' semplice ottenere lo stesso risultato.
Questi concetti si porterebbero in C# molto piu' facilmente, visto che il loro modello di compilazione dei template non puo' reggere nemmeno da lontano la metaprogrammazione, e sarebbero comunque piu' semplici per il programmatore.
 
Actually, CRTP does work, with a little bit of tweaking. If you change you code to :
class Base<T> where T:class
{
void g() { (this as T).f();
}
}

You get passed the error you had. You do, however, hit a new error (T does not has an f method). To get passed this, both Derived must implement an interface with is a cndition on T. The following code compiles:

interface IF { void f(); }

class Base<T> where T:class, IF
{
void g() { (this as T).f();
}
}
class Derived : Base<Derived>, IF
{
public void f() { }
}
 
Yes, this works, thanks!

Of course, compared with the "original" CRTP as conceived in C++, we still miss two points:

1) in C++, it's all a compile-time game: a static cast and a regular function call, no virtual dispatch. In C#, we are forced to use a dynamic cast and a virtual dispatch (through IF, actually). No big deal in most cases, except when you need to squeeze clock cycles. Still, that was one of the most frequent reasons to use the CRTP.

2) we need to define and implement an explicit interface, whereas in C++ the interface can be kept implicit (although this is not always good).

About point (2), in most cases an adapter to the explicit interface can still be synthesized on the fly (www.eptacom.net/blog/2006/08/structural-conformance-in-c.html).

So, I guess, your solution can be quite useful in practice!
 
This post has been removed by a blog administrator.
 
It seems you may still be adjusting to some of the semantic differences of the type-safe system in .NET, but I assure you what you speak of, to some degree is quite possible. Indeed there are some things that can be done with C++ templates that can't be done with Generics (including multiple inheritance), but most of those differences are mitigated with proper OO design. I never knew of this as a named pattern, but I have implemented similar patterns before, but I think that what you're missing is that there is a more elegant manner to accomplish this particular effect. The code below can actually be more elegant, but it encompasses a pattern similar to what you were trying to accomplish:

using System;
using System.Collections.Generic;
using System.Text;

namespace BlahTester
{
public abstract class UniqueClone<T> : ICloneable where T : UniqueClone<T>
{
private Guid uniqueIdentifier;

public UniqueClone() { uniqueIdentifier = Guid.NewGuid(); }

protected UniqueClone(Guid uniqueIdentifier) { this.uniqueIdentifier = uniqueIdentifier; }

public Guid UniqueIdentifier { get { return uniqueIdentifier; } }

protected abstract T GetClone();

#region ICloneable Members

object ICloneable.Clone()
{
return GetClone();
}

#endregion

public T GetIdenticalClone()
{
return GetClone();
}

public T GetUniqueClone()
{
T ret = GetClone();
ret.uniqueIdentifier = Guid.NewGuid();
return ret;
}
}

public class UniquePerson : UniqueClone<UniquePerson>
{
private string name;

public UniquePerson(string name) { this.name = name; }
protected UniquePerson(string name, Guid uniqueIdentifier) : base(uniqueIdentifier) { this.name = name; }

public string Name
{
get { return name; }
set { name = value; }
}

protected override UniquePerson GetClone()
{
return new UniquePerson(name, this.UniqueIdentifier);
}
}
}
 
Oh, hah, I thought, for some reason the date said 2007 until after I posted that. Oh well, perhaps this will still help someone.
 
What you're missing (besides the difference between 2005 and 2008 :->, a sense of perspective, and a few more important things) is that the CRTP, a.k.a. compile-time polymorphism, is used in C++ to achieve internal polymorphism without the overhead of virtual function calls.

What you propose, in the end, resolves to plain old run-time polymorphism. And yeah, dude, we know OOP pretty well, don't you worry, thanks :-).

In case you're missing it :-), just try the second example above, which in proper c++ would be:

template< class T > class Base
{
public :
void g()
{
(static_cast< T* >(this))->f() ;
}
} ;

class Derived : public Base< Derived >
{
public :
void f()
{
std::cout << "missing something? :-))" << std::endl ;
}
} ;

int main()
{
Derived d ;
d.g() ;
}

You'll soon see that using your allegedly "elegant" c# solution, you end up having no use for the generic part, and simply invoking f() as a virtual method. Duh. Great job.

There would be more to say, but I don't know if it's because of your nick, your attitude, or both, I think I'll leave that... missing : )))).
 
I apologize for misinterpreting your standing, I wasn't trying to be offensive, but it seems to me we're both misunderstanding each other. I didn't mean to sound arrogant, if that's how it was perceived, but your response really doesn't show a good example; even if I was being arrogant, stooping to "my level" doesn't really benefit anything but your ego and only momentarily for that (please don't interpret this as me trying to start a war, I enjoy discussion for the sake of personal education, I just prefer it to be civil). I made the assumption that you were experienced in C++ but new to .NET (which obviously is not likely the case now even if you were then), which I understand can be taken as offensive, but you could also see that I was actually just a person who misunderstood some of the purpose of the pattern and was just trying to help. Circumstance like that would warrant a reply that would have improved my understanding better like "Thanks for the comment, but I think that you missed the importance of this allowing polymorphic runtime extensibility as opposed to generics making use of self referencing." You also don't make note of the fact that "where T : UniqueClone<T>" mitigates the compile-time errors like "Cannot convert type Base< T > to T"

Also, I realize the generics in this particular demo are of little value, but that doesn't mean similar factoring patterns don't exist. It also guarantees more type-safety than simply implementing ICloneable, since it specifies a specific return type rather than Object. I've written a type-safe generics based configuration layer to sit on top of System.Configuration that would be a much better example, but it's quite a bit of code to post in a blog post. My configuration layer is designed to improve configuration for my dynamic extensibility framework, since type-safe dynamic configuration using the base classes can be a bit difficult. Ultimately, your example doesn't really seem to show much accomplishment either, other than the "ability" to call a method that is not technically required by the compiler, but the only "advantage" I see here is that it will allow you to compile code that doesn't specifically indicate in it's type definition a requirement to implement f() without receiving compile-time errors (if, indeed, the code compiles, sorry no compiler on this PC)... that's not to say the pattern doesn't hold value, just that the particular implementation demonstrates little value, similar to mine. Generics and templating aside, it has always been and always will be against the compilers requirements to call a method that the compiler can not guarantee exists for the type you are calling it on.

A good reread was worthwhile since I apparently missed some small details in the other referenced resources, but there is still a really obvious point to be made here: the differences between type-safe compile-time single-inheritance generics and the template-based multiple inheritance of C++. The differences are quite intentional and explicitly defined and the cons you speak of were sacrifices for pros of what is generally regarded to be a greater value: readability, maintainability and less runtime errors, which are, even in the simplest cases, much more time consuming to fix than compile time errors. You seem experienced so I assume you know well the concept of selecting the best suited environment based on the problem at hand, and I suppose to some it may have been worthwhile to mention that this won't work in .NET, but it seems to me that anybody who understands .NET generics and the code above should implicitly know it's not possible (assuming no reflection is used, which is not only slower, but also requires full-trust permission and is therefore only suitable for specific install environments). And being experienced with selecting the right environment for the job, if your goal is to avoid overhead for the sake of performance(?), you should have already known .NET isn't the right choice. I guess the real point-at-hand here is that it's obvious this won't work in .NET because it's not type-safe which is unimportant because there are other ways to accomplish the same level of extensibility *without* sacrificing maintainability. In your own wiki link, you might want to make note of the mentioning of the similar idiom in Java, or the fact that, apart from the arithmetic operator example (which has little added value/saved time) the rest of the examples are a moot point in the .NET runtime anyway. Perhaps if you could point out something that would benefit greatly from being able to implement some derivative of this pattern in .NET I might be able to understand your perspective. And I don't know what important things you think I'm missing, but that sounds like a lifelong process and the person who believes he "isn't" missing important things is a person who has ceased growth and will stagnate; as such, I'm sure there are plenty of things I don't know, but do you assume that there is nothing important that I understand that you might not?

Sorry if this comes off as another jab, reading it over I realize my response might be a bit heated, but please know that I do at least appreciate this exchange of knowledge. Thank you for humoring me thus far, emotional involvement aside I do find this fun and potentially enlightening (the missing details are now particularly intriguing ;p).

Just to ensure I respond to all of your points, I'm not sure what the concern is over my alias, but if you haven't read Orson Scott Card's books of the Ender's Game lineage you would probably find them to be very enjoyable as well as better improve your understanding of why I selected the name I have. If you would prefer, my name is Jason. With that said, I leave only an apology for how long winded I am :)
 
See Jason, sometimes the super-heroes way (they meet, they fight, they get friends) simply works better. To that end, I'll ignore the part where you (of all people :-)) try to teach me soft skills, and get to the more technical points. I'll quote you out of order (sorry) because that allows me to address the most important points first.

there is still a really obvious point to be made here: the differences between type-safe compile-time single-inheritance generics and the template-based multiple inheritance of C++.
Since you said that twice, I really have to clear this out. I don't know where you got the idea that templates in C++ are not statically type checked and hence not type safe, or that they are somehow based or related to multiple inheritance, but that's just incorrect.
For instance:

template< class T > T add( T a, T b )
{
return a + b ;
}

add( 3, 4 ) ; // compiles just fine
add( "a", "b" ) ; // compile time error

There are, of course, quite a few differences between C# generics and C++ templates, mostly:

- C++ templates do not define an explicit contract on types. They define an implicit contract. The contract is checked at instantiation time, which takes place at compile time. This has nothing to do with [multiple] inheritance. The same strategy can be adopted in a language without inheritance, or with single [implementation] inheritance.

- C# generics define an explicit contract on types, which (today) is mostly limited to the IS-A notion. This has nothing to do with single inheritance either. The same strategy could be easily adopted in a language with multiple inheritance. It's just very limiting.

Now, I must add that in the late 90s I wrote extensively against the implicit model of C++. In a language with all the implicit type conversions, promotions, and so on that C++ inherited from C, plus the one it added before explicit constructors were supported, the implicit model can easily backfire. In fact, moving C++ toward a more explicit model has been an active research field for quite a while, culminating in the adoption of concepts in the next round of standardization.

On the other hand, the C# approach is terribly limiting. Is not the explicit contract that is limiting, is the idea that IS-A is a sufficiently expressive notion to define contracts. You can't even define a parameter T and say "I want T to have + and - defined" without introducing a dummy interface, write an adapter class for all arithmetic types, and so on.

In a widely known interview, Hejlsberg talks about this idea of implicit/explicit contracts, and he says that C++ templates are more "loosely typed", which must not be read as "not strongly typed", but as "you don't have to define one or more explicit [possibly dummy] interfaces for T". He seems not to like the "loose typing" of templates.
Funny thing is, in the same interview (another section), Hejlsberg makes a similar statement when comparing C# delegates and Java-like listeners. He says delegates are more friendly and lead to better code, exactly because they are more "loosely typed", that is, you don't have to define one or more artificial interfaces just to give a type to the event listeners. He seems to like the "loose typing" of delegates quite a bit.
That's quite an inconsistency, and using C# generics is in fact a limiting experience. That said, it's just a matter of introducing a better way to specify the contract (as Hejlsberg points out in the same interview as well). They could learn a thing or two from Concepts. Anyway, it has nothing to do with single inheritance vs. multiple inheritance.

There are other features in C++ templates that are missing in C# generics, most notably partial and full specialization. Again, the concept has nothing to do with inheritance, but quite a bit to do with modularity and modular typing, the exact instantiation time, and so on. Still, without specialization we can't implement quite a few important patterns, like traits, and that leads again to limited expressive power and slower code.

it seems to me that anybody who understands .NET generics and the code above should implicitly know it's not possible
I think you read my post like "poor me, I tried this and it didn't work, and I can't even see why", while it was, like everything I write, a beacon for the rare reader, a pointer to something upon which I deemed useful to ponder. Like I said, "This [...] tells you a lot about the compile-time model of generics in C# compared with the compile-time model of templates in C++". I'm not convinced, after all the above, that I should have considered all that stuff obvious, and therefore implicit for everyone.

And being experienced with selecting the right environment for the job, if your goal is to avoid overhead for the sake of performance(?), you should have already known .NET isn't the right choice.
This is a myth. Actually, two myths. The first myth is that .NET is not the right choice if you care about performance. My experience is that we can get very good performance out of .NET in many cases. Unfortunaly, because of some limits in the language/tun-time, sometimes you can't get code that is both reusable and fast. Still, it doesn't happen all so often.
The second myth is that you can always select the right environment for the job. Sure, in many cases, you can. In others you can't, just because there is NO right environment. Take a GUI-intensive, performance-critical, domain-rich application. You may want to write the GUI in C# and the domain classes in C++, but then you'll find yourself writing a ton of bridging/glue code. Most people would accept the slightly worse performance of C#, unless things get ugly. Both choices are sub-optimal. There are also many other factors, like programmers' skills and so on. Technology is not the ultimate parameter.

I guess the real point-at-hand here is that it's obvious this won't work in .NET because it's not type-safe
C'mon. The CRTP is unsafe because of a cast. Are you saying that, as a rule, code with a cast inside doesn't compile in .NET? : ))).

Ultimately, your example doesn't really seem to show much accomplishment either, other than the "ability" to call a method that is not technically required by the compiler, but the only "advantage" I see here is that it will allow you to compile code that doesn't specifically indicate in it's type definition a requirement to implement f() without receiving compile-time errors
Well, the idea was to help you move past the Clone thing, which was just another example. The CRTP solves a very simple problem. You are surely familiar with the "template method" pattern. You have an algorithm in the base class, and the derived classes can redefine one or more steps of the algorithm. The CRTP allows you to do exactly the same thing, without the overhead of virtual function calls. Hence the (possibly better) name "compile-time polymorphism". Again, note that the CRTP does not require the implicit contract of C++, although it's arguably hard to define the contract explicitely. It also relies on a cast, which makes it unsafe, so in C++ is only used when performance AND reusability are needed.
The idea behind my short example was to help you see that unfortunately, using your approach, not only we would obviously get the virtual call overhead, but the generic parameter wouldn't get used at all (just try). Incidentally, by the way, in the same interview above Hejlsberg reminds us the performance hit of using virtual. In practice, in most cases we can ignore that overhead, but not all the time (numerics is the obvious example). For instance, the CRTP can be used in numeric algorithms to let a subclass redefine a step, even in the inner loop, without any performance hit. In fact, it has also been known for a while as "the Barton-Nackman trick", after the name of two authors who used that pattern quite a bit in "Engineering and Scientific C++". I realize that the wiki I linked is not choke-full of information :-)). Unfortunately I'm not much into wikis and I'm not going to fill the gap any time soon :-).

It also guarantees more type-safety than simply implementing ICloneable, since it specifies a specific return type rather than Object.
Interestingly, C++ is once again much better :-)), because of covariant return types, which improves type safety in several creational patterns. It's quite hard to dumb down C++ without losing a lot (that reminds me a what Hoare said about Algol 60. That said, C++ is dying, so, why bother?
 
First let me say: Awesome response. Thanks very much for your breakdown and responses. It seems in the future I may want to read your posts/comments at home instead of jumping back and forth at work, as we've definitely wasted some time on misreading/misunderstanding (clearly, until now, I've not been a regular reader and didn't know of your past; it was a google journey, but rest assured you have a convert :p).

I actually feel a bit remiss for having used templates all the while not realizing that they, unlike other platforms like generics, function more like complex macros than loose base type definitions. It's a pitty I didn't come across this pattern while I was doing more C++.

While I didn't properly understand how templating works I have had some experience with the difference of implicit and explicit contracts so it becomes a bit more clear to me now as to how this can be beneficial, though I'll say I still (probably like many) prefer the explicit model, though at the same time my opinion is likely biased due to the fact that I prefer framework design over application development, etc. I can see (now) how (static_cast< T* >(this))->f(); will raise a compiler error in the event that T doesn't implement f() but of course, without a more explicit contract it's more difficult to see what needs to be implemented; if I might share a quote from Tony that I prefer: "There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult."

In reference to your response on using .NET for solutions that demand performance, I agree entirely that the thought that .NET carries too much overhead for high performance is typically a misconception, but to be more clear I was saying that someone who is worried about the overhead of a virtual method probably shouldn't be using .NET. I feel .NET is underappreciated in this aspect, as I've done a fair bit of playing with unsafe code using pointers (like when performing intensive graphics operations) and I think it's great, especially since you still have the freedom to utilize memory operations while still being able to leverage the pre-written plumbing that I do hate having to write/debug over and over (like reference tracking).

Similarly (and you may/probably already know this), in the event that an application can really benefit from a more native interface, I might recommend a mixed-mode C++ assembly, though I'm not so sure that mono has it implemented and there are still many pains with interacting across the native/clr boundary. Regardless, I can agree that there are times where there is no best-suited match; if it was that simple conversations like this would never take place ;) (I personally think that would be a loss for engineers :p).

Until a managed OS take the reigns, I think there will remain a good space for C++ and so I hope that people continue to strive to improve it. Also, it's a bit stagnant at the moment, but I played around with (and had high hopes for) D. If you haven't checked it out, it's possible it may interest you. It seems to me that the ideals and design goals are very interesting, but the implementation needs a bit of work.

All in all, thanks for a good *ss-whoopin; it's good to be humbled every now and then; I'm not much for updating wiki's either, but perhaps a simple link to this conversation is due? I hope we can continue to converse as we move forwards, this was quite fun.

P.S. My superhero (sidekick?) name is TheXenocide; what's yours? Wanna get matching tights? lmao
 
Hey Jason,
I knew you were one of the good guys!

A few short comments:

if I might share a quote from Tony that I prefer: "There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult."
Totally agree! There are also a few quotes from Einstein which points in the same direction...


in the event that an application can really benefit from a more native interface, I might recommend a mixed-mode C++ assembly
Yeap, I tend to use C++/CLI too, mostly as a glue language for legacy C++ code, or when we need to go down to the metal. It's still a lot of work when the native object model is large, but there is no free lunch anyway...

but I played around with (and had high hopes for) D. If you haven't checked it out, it's possible it may interest you. It seems to me that the ideals and design goals are very interesting, but the implementation needs a bit of work.
D is indeed quite interesting, although I've got little optimism about its success on the market. Still, being quite interested in language design, some time ago I exchanged a few emails with Walter Bright on the design of D. Although he did things very differently from what I would have, he had some excellent points on why things are better done a certain way.
Incidentally, he references an old article of mine when explaining metaprogramming in D :-). I really like the way he used "static" to simplify some metaprogramming stuff.

My superhero (sidekick?) name is TheXenocide; what's yours?
Dunno, but I've got many superpowers, like boring people to death, being invisible to women, picking the slow queue at the supermarket, and stuff like that : ))).

See you in something more... 2008! :-)
 
Post a Comment

<< Home

This page is powered by Blogger. Isn't yours?