C vs. C++ Test

The template efficiency test was sort of my own idea one day, when I had nothing better to do (what a nerd, huh?). Basically, I wanted to see just how efficient the template collection classes in C++ were, and how the would compare to C. I went through several versions of tests, but finally decided on what I think was a fair test, which is documented here. I make no claims about it's accuracy, except for the items that I point out, and I acknowledge that this is by no means a complete test. I also do not intend this to be a 'which is better' issue. As with all things in life, everything has a place, C, C++, VB, you name it. I do not subscribe to language snobbery, and I have used professionally Assembly (Z80 and Intel), C, C++, VB, Delphi, and VBA.

What is efficiency?

An important question. Efficiency can be measured as either the size of the code, or, how fast it executes. If a program is large, then it could be called inefficient because it uses too much memory. If a program is slow, then it can be called inefficient because it takes to long to execute. Most of the time, but not all of the time, these two measurements are a sort of see-saw. Larger programs are actually usually faster, and smaller programs are usually smaller. For example, a macro is faster than a function, because there is no call and return overhead, but, since the macro is replicated every time you use it, it makes larger code. For this reason, many compilers have options to favor either small programs, or fast programs, but not both.

There is yet another measurement for efficiency that is often overlooked: Programmer Time. Let's face it, if we all wanted the absolute fasted and smallest code, we would be using assembly language, and hand coding and optimizing every single section of code. But, this isn't really practical. A typical Windows program can be from 500K to 1.5M in size, not due entirely to 'bloated' code, but to the simple fact that it is a large, complex program with many features. Word, is about 5M, and I'd like to see someone write that entirely in assembly language, optimizing each line of code. Maybe they would get it down to 2.5M, but does that really make a difference to me, on my 13G hard drive and 450MHz machine?

All too often, I've come across some incredibly complex and fast code in a program, that may have been rewarding for the author, but I have to think to myself, "Didn't they have anything better to do?". Who can read it? Who can modify it when they get hit by a truck? These are all legitimate business questions. I once had a valuable lesson, when as a consultant after having my code reviewed, I was told it was "too slick", and they didn't want that. They wanted something they could manage after I left the project.

And of course, all things must be put into perspective. Not everything needs to be optimized. For example, there's no need to optimize text output. Since the days of the 4.77Mhz 8088 and even before, computers have been outputting text much faster than any human can possibly read. Or, sorting strings. If you are sorting as little as several hundred strings, just about any method will do it in the blink of an eye. And if you've got a zillion strings, then you probably shouldn't be sorting it, you should be creating an index of some sort, or linked list.

Why I didn't write any C code

This may seem strange in a C vs. C++ comparison, to have no C program to compare against, but I did not write a C test program, only several C++ programs. My logic for this, was that if I had written a linked list in C that would have been as flexible as the C++ templated list class, that I most likely would have written something that would have close in efficiency to a single C++ list class. In other words, after thinking about it for awhile I decided that for a single class or C structure, there would be no appreciable difference. So, instead, I decided to test the efficiency of the C++ list class with the argument that most C programmers have about it: it's bloated.

The reason for the claim, is the correct understanding that for each templated class type you create (not instance, but type), the compiler generates a new class for that type. So, if you had 3 list objects, all maintaining integers, then it would generate one list class, for managing integers. If however, you were to declare 3 list objects one for integers, one for floats, and one for strings, then the C++ compiler would create 3 new class types, and the code for the class would be replicated 3 times.

So, after some reasoning, it became apparent that there was no need to write a C program to test against, the real issue was: how efficient are the template classes when used with multiple types.

The Test

The test I designed, was to create list classes. Lot's of them. I created 6 different programs (actually, I'm smart, I created a program to create the programs, because the source code for all the programs was over 80K) that would test the C++ list template class. The test involved 2 variables: The number of types of classes, and the number of instances of the class. The tests can be broken into two sections:

Size

The size test would test a program that would have 250 instances of the same template class against another program that would have 250 instances of the same class. The goal here was to determine, roughly, how much space each new template class took up, to address the issue of 'code bloat'. The 6 programs actually also tested 125 class types as well, just for averaging.

The size of a program that had 250 instances of one class, was 57,344 bytes in size. The program that had 250 different classes, was 69,632 bytes in size. This is a difference of 12,288 bytes. If we divide that by 250, that means that each additional list class, only added about 50 bytes to the size of the program. The program that had 125 data types, was 61,440 bytes in size. If we determine the size difference between the 1 and 125 program, and the 125 and 250 program, we get 32 bytes per class, and 66 bytes per class.

After reviewing the numbers, I must admit that I could understand why the averages came out to 50, 32, and 66 bytes per class, but it appears they did. I re-checked everything I could, but those numbers came up. It is possible that some optimization feature of the compiler handled templates differently as their numbers increased, and if I were to do more classes that the average size would continue to go up. But, again, I wanted to be practical. I chose the maximum number of 250, because the entire MFC framework is only about 243 classes.

The cost of adding the first templated class, beings in at most 4,096 bytes. Part of this may just be compiler page space, and it may be less. Portion of what the templates bring into the program, also may be brought into the program by other functions. I tested this be making a completly empty program and comparing it against a program that had one list class in it. This isn't really a very legitmate test, because of how various libraries reuse code. In otherwords, of that 4096 bytes, it's possible that some other function I needed for something else may have also brought in that 4,96 bytes, or anything from zero to 4096 bytes. (Actually, I believe part of this overhead is do to the exception handling mechanism).

Summary: Adding a templated class costs roughly 50 bytes per class, and possibly a 4096 minimum.

Speed

The templated classes that were created, should have been faster than their C counterparts, because the template class generates native code specific to the data type. For example, a list that contains integers would use the built-in '<' operator to compare the integers, like if( a < b ), where as C would have to invoke a seperate compare function. This is an example of the see-saw description above where the code is larger, but faster.

I did not test speed however. The reason being, that I wrote a program that executed 3 functions against 250 template classes, 10,000 times. This means, that I invoked 7,500,000 template functions. On my machine (450MHz), the code executed in less than 1 second! After seeing this result, I made a personal decision that even if C could do better (which I don't believe it would because of the template's inherant speed gains), I couldn't imagine it being a practical benefit.

Summary: Untested, but most likely C++ would do better.

My Time

Let's not forget about the programmer time issue as well. I would have had to of written a linked list in C from scratch, because the library does not have one. The template class however, was already written, and needed no coding or testing. All in all, it took me about 20 seconds to add a linked list to my program, where in C, it would have meant at least a couple hours of typing and testing (I have written a zillion lists, and already had one in my own personal library, so this should be considered as well).

 

Something I didn't test

C++ has still one more benefit, that I did not test. Because of it's OO abilities, I could have created all of my 250 data types derived from a single, similar base class. This means, that I would have only had to make a single templated class, and the size increases noted above, would have been eliminated. By doing this though, I would have slowed down the template class, and introduced some variables in the test that may or may not apply to a real-world application (depending on the data types and their uses). Basically, what this would do, is make the template class behave more like it's C counterpart. Whether this is a good thing, or a bad thing, depends on the task at hand, but at least you have that option.

 

Summary

So, what did I learn? I learned that the template classes are pre-defined, pre-coded, and pre-tested. They are a standard, that work on any platform that has C++. They are also (in my opinion) extremely efficient from both a space and time perspective. They may be slightly larger than a C version (only about 50 bytes per class), but I think that is well within an acceptable range. And, I believe (though this test did not prove it), they are faster than C.

You can email me if you like at mario@openroad.org and tell me what an idiot I am, or how I have no clue as to what I'm doing. But, if you do, then please supply a legitamte test to back up the claim. I'm trying to be as objective as possible. I was a C programmer for a lot longer than I was (am) a C++ programmer.

The program

The following program is a C++ program, that makes my 6 test programs. You can try it out yourself. My results were on Visual C++ 6.0, with service pack 2 applied, and the only compiler switches I used were /Zm200 and /GX. Once you compile this program, if you run it, it will make the source code to create the other 6 programs.

#include <iostream>
#include <iomanip>
#include <fstream>
#include <strstream>
using namespace std;
#pragma warning( disable:4768 )
char Code[]=
	"#include <list>\n"
	"using namespace std;\n\n"
	"#pragma warning( disable:4768 )\n\n";
void MakeSource( int Types, int Items )
{
	ofstream Dest;
	strstream File;
	int i;
	File << "listpp" << setw(3) << setfill('0') << Types << "_" << setw(3) << setfill('0') << Items << ".cpp" << '\0';
	Dest.open( File.str(), ios::trunc );
	if( !Dest )
		cout << "Error creating/trunacating " << File.str() << endl;
	else
	{
		Dest << "// This version has " << Types << " Data type(s) and " << Items << " instances" << endl;
		Dest << Code << endl;
		for( i=0; i < Types; i++ )
			Dest << "typedef struct _tagSTRUCT" << i << " { char SomeData["<<i+1<<"]; } STRUCT" << i << ";" << endl;
		Dest << "\nint main()\n{" << endl;
		for( i=0; i < Items; i++ )
			Dest << "\tlist<STRUCT" << (Types>1?i%Types:0) << "> Data" << i << ";"<<endl;
		Dest << endl;
		for( i=0; i < Items; i++ )
			Dest << "\tData" << i << ".erase( Data" << i <<".begin(), Data" << i << ".end() );"<<endl;
		Dest << "\treturn(0);\n}\n" << endl;
	}
}
int main()
{
	MakeSource( 1, 1 );
	MakeSource( 1, 125 );
	MakeSource( 1, 250 );
	MakeSource( 1, 1 );
	MakeSource( 125, 250 );
	MakeSource( 250, 250 );
		
	return(0);
}