Some reflections on programming in D, and why it kicks serious ass over C++, leaving it dead and broken on the sidewalk

23 Oct 2007 // programming

[some typos fixed, thanks comments]

I've finally found a worthy replacement for C++. I've used C++ a lot in the last 10 years. I've written a protein viewer in C++. I've done research for scientific papers in C++. Yet even though I've tried to take the moral high road with the STL, C++ just feels ugly and unwieldy.

I remember the horrendous learning curve from C to C++. There was so much to get right – C++ throws out the compactness of C for a baroque welding of different programming paradigms and backwards-compatibility.

So it was with some trepidation that I tried out D, yet another saviour for C++ programmers. Now that I've been using it for a few months, I have to say, do believe the hype.

Quoting from an article on D, D is a multiparadigm statically-typed compiled language that achieves conceptual integrity. That is, the author of D, Walter Bright, is willing to say no to features, ultimately leading to cleaner and prettier code.

Here's a small but illuminating example. In D, objects are always allocated on the heap. You always create objects with "new". You can't create static objects. What good does that do you? By sacrificing the ability to statically declare objects, we know that objects are always heap-allocated and reached through a reference. Hence, there will never be any ambiguity between obj.property and obj_pointer->property. So you can eliminate the -> operator. It's always obj.property and your code becomes that much easier to read.
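To make that concrete, here's a minimal sketch, assuming a current D compiler (D2). The class name Point and its fields are made up for illustration; they are not from any library.

```d
import std.stdio;

// Hypothetical class, used only for illustration.
class Point
{
    int x;
    int y;

    this(int x, int y)
    {
        this.x = x;
        this.y = y;
    }
}

void main()
{
    auto a = new Point(1, 2); // heap-allocated; a is a reference to the object
    auto b = a;               // copies the reference, not the object
    b.x = 10;                 // member access is always '.', never '->'
    writeln(a.x);             // prints 10: a and b refer to the same object
}
```

Because a and b are both references to the same object, the write through b is visible through a, and the dot is the only member-access operator you need.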

While on the subject of objects, here's another call. In D, all object methods are virtual. No ifs or buts: sorry, you don't get to choose. What you get is the elimination of a whole swathe of potential bugs that arise when you try to override a method that was never declared virtual. Objects are objects, after all, because of inheritance. Walter Bright argues that the place to optimize is in the compiler: because every method is virtual by definition, a compiler can analyze where it would be faster to automatically substitute non-virtual calls for virtual ones. Hey, even when you think you lose, you win.
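A small sketch of virtual-by-default dispatch, assuming a current D compiler (D2), where the overriding method must be marked "override". The classes Shape and Circle are hypothetical names chosen for the example.

```d
import std.stdio;

// Hypothetical base class: note there is no "virtual" keyword anywhere.
class Shape
{
    string name() { return "shape"; }
}

// Hypothetical derived class overriding the method.
class Circle : Shape
{
    override string name() { return "circle"; }
}

void main()
{
    Shape s = new Circle(); // static type Shape, dynamic type Circle
    writeln(s.name());      // prints circle: the call is dispatched virtually
}
```

In the equivalent C++, forgetting "virtual" on the base method would quietly give you "shape" here, which is exactly the class of bug the paragraph above is talking about.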

Indeed, Walter Bright comes from a very unusual place compared to most designers of computer languages. He has spent most of his career implementing C++ compilers (plus writing the odd bestselling computer game), and has designed D, in large part, from the point of view of a compiler writer. A lot of the features are designed to make life easier for the compiler writer.

Dynamic Arrays

The biggest gain in D is built-in dynamic arrays, with a spare and elegant notation. They clean up the horrendous eyesore that is the STL in C++. I don't know about you, but one of the biggest reasons I switched to C++ was the Standard Template Library. The STL would whisper sweetly in your ear, promising that you would never have to roll your own linked list again, or dynamic array, or hash table. But at what cost? For such largesse, my code got a whole lot uglier. Ever tried using iterators?

#include <vector>
using namespace std;

vector<int> int_list;
// some kind of initialization
vector<int>::iterator iter;

for (iter = int_list.begin(); iter != int_list.end(); ++iter)
{
  // do something with *iter
}

Butt ugly, and a pain to type out. In contrast, dynamic arrays are built into D, so that you can get the same functionality with very clean syntax.

int[] int_list;
// initialization

foreach (i; int_list)
{
  // do something with i
}

Walter Bright, like Guido in Python, and Larry in Perl, realized that most programmers use dynamic arrays all the time, and so made it part of the language. The syntax is so clean that it is, dare I say it, almost pythonic.

Half the power of dynamic typing

Don't get me wrong though, I don't use C++ now unless I have to. I do most of my day-job work in Python. In Python, I really like dynamic typing. Whilst there are many advantages to dynamic typing, perhaps one of the greatest is that you can blissfully avoid dealing with the return type of a function, and just keep on plugging.

In D, the "auto" keyword gives you this same advantage. If you're declaring a variable that receives a value from a function, you declare its type as "auto" and D will work out the type for you. I've saved so much time because of this little feature. And my program is still statically typed and correct.
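Here's a rough sketch of "auto" at work, assuming a current D compiler (D2). The helper wordLengths is a hypothetical function invented for this example; the static assert is only there to show that the inferred type is a real compile-time type.

```d
import std.stdio;

// Hypothetical helper: maps each word to its length.
size_t[string] wordLengths(string[] words)
{
    size_t[string] table;
    foreach (w; words)
        table[w] = w.length;
    return table;
}

void main()
{
    // No need to spell out size_t[string]; the compiler infers it.
    auto table = wordLengths(["alpha", "be"]);

    // Still statically typed: this check happens at compile time.
    static assert(is(typeof(table) == size_t[string]));

    writeln(table["alpha"]); // prints 5
}
```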

Strings that fit in the language

It seems sad to say this, but C++ really should have had a decent string class a long time ago. The one that comes with the STL feels bolted on: you always have to keep in mind that you will need to interface with C strings at some point. Well, in D, there is a beautiful built-in string type, with lots of useful functions. Even better, since D has built-in dynamic arrays, a string is conceived as a dynamic array of char, "char[]", and you have the same powerful syntax available to you as with all other dynamic arrays, like, for instance, the "foreach" loop.

One of the more interesting developments in D is the introduction of the concatenation operator ~. At first I was puzzled by this, because in C++ you concatenate strings with "string1 + string2", so why the need for this new operator? But addition (+) is a fundamentally different operation from concatenation (~). You don't notice this in C++ because there are no built-in dynamic arrays in C++, and thus concatenation of arrays in the STL is done with methods. However, from a conceptual point of view, concatenating strings should be like concatenating any dynamic array:

int[] vals1 = [1, 2, 3];
int[] vals2 = [5, 6, 7];
int[] vals3 = vals1 ~ 4 ~ vals2; // [1, 2, 3, 4, 5, 6, 7]

char[] string1 = "Hello";
char[] string2 = "world!";
char[] string3 = string1 ~ " " ~ string2;  // "Hello world!"

There is also a slicing command for dynamic arrays:

char[] feeling = "smile";
char[] length = feeling[1..5]; // "mile"

Ever tried declaring a hash-table in C++? It's not pretty. In D, to create a dictionary of numbers with strings as a key, simply declare:

int[char[]] ages;
ages["me"] = 33;

The end result is that strings and dictionaries are just as expressive in D, as they are in Python or Perl.
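As a rough sketch of what working with such a dictionary looks like, assuming a current D compiler (D2): here the key type is written as "string", which in D2 is an alias for an immutable array of char. The second entry is made-up data for the example.

```d
import std.stdio;

void main()
{
    int[string] ages;
    ages["me"] = 33;
    ages["you"] = 27; // hypothetical entry, for illustration only

    // 'in' yields a pointer to the stored value, or null if the key is absent.
    if (auto p = "me" in ages)
        writeln("me is ", *p);

    // foreach can bind both key and value; iteration order is unspecified.
    foreach (name, age; ages)
        writeln(name, " -> ", age);
}
```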

No more header files, or not repeating yourself twice all the time

C was created in the days of kilobyte memory, and precious compiling cycles. Hence the header file was created out of necessity for sane compiling times, as the compiler can make assumptions just from reading the header files. C++ inherited this. The result is that for any decently sized project, you type a lot of things twice. This, obviously, violates the DRY (Don't Repeat Yourself) principle of maintainable code.

What better way to eliminate this painful redundancy in typing than to eliminate the preprocessor altogether? That's exactly what Walter Bright has done. The D compiler scans the source of all modules to reconstruct the available functions in order to check the typing. You only have to type one thing once.

But there's more. One of the interesting consequences of not having header files is that objects can't be declared over multiple source files. If you write huge-ass objects with a massive declaration and multiple source files, you're asking for trouble anyway. By forcing you to write everything in one source file, you are forced to program better. Another side-effect is that since objects are only ever declared in one file, in D, all object methods and properties are implemented inside the declaration braces:

class MyObject
{
  this()
  {
    // constructor
  }

  void my_method()
  {
  }
}

In D, you will never have to type the awfully verbose my_object::my_method again. Constructors are called "this" instead of the ridiculous notation in C++ of my_object::my_object. That's a lot of wins for readability.

Still, a lot of people complained about the loss of templates with the loss of the pre-processor, so Walter Bright caved in, and introduced templates in D, except he made them better. Better, that is, for the compiler writer. In C++, template arguments are wrapped in the characters "<" and ">", which causes huge headaches for C++ parsers, as the parsers have to untangle these from genuine less-than and greater-than operators. In D, a template is instantiated with "!(" and ")", which cause no problems for D parsers.
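A minimal sketch of the "!(" and ")" notation, assuming a current D compiler (D2). The function largest is a hypothetical example, and it assumes the array passed in is not empty.

```d
import std.stdio;

// Hypothetical generic function: T is the template parameter.
// Assumes values has at least one element.
T largest(T)(T[] values)
{
    T best = values[0];
    foreach (v; values[1 .. $])
    {
        if (v > best)
            best = v;
    }
    return best;
}

void main()
{
    writeln(largest!(int)([3, 9, 4])); // explicit instantiation; prints 9
    writeln(largest([1.5, 0.5]));      // T inferred as double; prints 1.5
}
```

Note that there is not a single angle bracket in sight, so the only "<" or ">" a parser ever meets is a genuine comparison.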

Syntactic Sugar

There's also plenty of syntactic sugar in D that gives you a lot of wriggle room when designing objects. For instance, if you have an object method with no parameters that returns a value:

class MyClass
{
  int my_func()
  {
    return 42; // placeholder value
  }
}

You can call the function as:

auto my_object = new MyClass();
auto val = my_object.my_func;

You now have a choice of changing the function into a property without changing the code that calls the function.

With parameters in a function, the keyword "inout" tells D that it's a call-by-reference, and D will let you modify the caller's variable directly. No more juggling with pointers to pass values out of a function.
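A short sketch of call-by-reference, with one caveat: it assumes a current D compiler (D2), where the keyword that plays this role is spelled "ref" rather than "inout". The function doubleIt is a hypothetical name for the example.

```d
import std.stdio;

// Hypothetical function: 'ref' makes value an alias for the caller's variable.
void doubleIt(ref int value)
{
    value *= 2;
}

void main()
{
    int n = 21;
    doubleIt(n); // no '&' at the call site, no pointer in the signature
    writeln(n);  // prints 42
}
```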

Optional Garbage collection

If anything marks the philosophy of D, it is this: you get garbage collection and manual memory management. D lets you play with pointers, but keeps them in the gun cabinet when not in use. This feature above all marks Walter Bright as the über-pragmatist of language designers. I don't know of any other language that allows this. If you let your variables dangle, D will clean up after you, but if you really do need to control every bit and byte, you can, perhaps shooting off your foot in the process.
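Here's a rough sketch of the two styles living side by side, assuming a current D compiler (D2). It uses C's malloc and free through the core.stdc.stdlib module for the manual half; this is just one way to manage memory by hand, not the only one.

```d
import core.stdc.stdlib : malloc, free;
import std.stdio;

void main()
{
    // Garbage-collected: allocate and forget, the collector cleans up.
    int[] managed = new int[](4);
    managed[0] = 1;

    // Manual: allocate with C's malloc and release it yourself.
    int* raw = cast(int*) malloc(4 * int.sizeof);
    if (raw is null)
        return;
    scope (exit) free(raw); // runs when main exits, even on an exception

    int[] view = raw[0 .. 4]; // a slice over the manually managed block
    view[] = 7;               // set every element to 7

    writeln(managed[0] + view[3]); // prints 8
}
```

The slice gives you the usual array syntax over the raw block, but nothing stops you from using it after the free, which is where the foot-shooting comes in.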

But that's the weird thing about D: although you can conceivably write operating systems and device drivers in it, you can also hack out a quick-and-dirty text-mangling script. Now that's conceptual integrity.