The Polyglot Experiment

, ,

One of the traditional introductions to learning a programming language has been (since the time of K & R1) to write a program that prints "Hello, World!" or some suitable variation to the screen. In the case of C, it makes some sense to use this as a learning tool, as it introduced the concepts of the main function, data types, printf, escape sequences, the preprocessor and the standard library.

#include <stdio.h> int main(void){ printf("Hello, World!\n"); return 0; } Hello, World in C

The venerable 'Hello World' is still serving proudly as a sanity check2 on many thousands of newly set up development environments worldwide. But it has long since lost it's status as a learning tool. The reason being that modern languages tend almost always to reduce all this fun to a laconic one liner3. For example,

print "Hello, World!\n"; perl

puts "Hello, World!" ruby, or maybe tcl

print "Hello, World!" python, but could be most kinds of BASIC or possibly lua

"Hello, world!" print the io language likes to mix things up a bit

SELECT 'Hello, World!'; this should work in most dialects of SQL, and it isn't even turing-complete!

You get the picture. In unix environments people are always trying to print various strings to an output stream, and many languages earn their keep by making this easy and intuitive; so much so that there's nothing to learn. People have been taking the piss4. Frankly they're just right to do so - "Hello World" was really only intended as a minimal check, to test that the compiler, I/O and so on were operating correctly.

So "what's your alternative you smart-ass?", I hear your all chorus. "How else are we to compare programming languages and illustrate their main features?". Well, one of the other main programs that follows up Hello World is something like the factorial function, or the fibonacci sequence. Both these demonstrate the way recursion works, but they omit all the I/O necessary for a language to be useful.

The Trabb Pardo-Knuth algorithm covers data structures (arrays/lists), mathematical operations, subroutines, IO, iteration, program structure (conditionals). It is (I think) the algorithm Missy Elliot rapped about in "Work It" - read 11 numbers in, apply a function (traditionally 5x³+abs(x)½) and (in reverse order) print out the number if the result (is less than, say, 400) or "TOO BIG" if it is, well, too big. Missy glossed over a few details. In fact it was designed specifically to demonstrate the differences between imperative languages. And that's a problem with it - it's a very imperatively focused challenge.

So what else is there, something perhaps a bit more "real world" but still containing theoretical content? Something we can use to get a feel 'at a glance' for how the language looks at the world?

If you're of the opinion that real programmers don't eat quiche, or need know whether the processor is big- or little-endian before sitting down to code, you may think this is all a bit namby-pamby or 'high level'. How good is the language for real work, like implementing crypto routines? The Tiny Encryption Algorithm is nice and simple.

And surely, in this age of the intertubes, a routing algorithm like Dijkstra's algorithm should be able to satisfy the criteria of usefulness and theoretical rigor, as it's both a classic, beautiful, algorithm and in use in actual (OSPF) routers today?

And surely one of the most definitive features of computers is that they are machines which operate on symbols, converting from one representation to another? This is where the whole notion of a programming language comes from after all... What about something like the Shunting algorithm for converting between infix and reverse polish notation?

I propose to try and implement all the algorithms mentioned in this article in a set of programming languages that I would like to know more about. I hope this will be a useful learning experience and help to provide a more substantial set of trials for my own use in trying out new languages with. I hope to manage one language a month.

It may be that the tests I have chosen are inadequate - for example none of them seriously address the extent of libraries for, say, matrix operations, database abstraction, GUI programming, regular expressions or XML or whatever. This could be said to undermine the advantages of languages like PHP, Perl, Objective C, VB.Net etc.

The quality of libraries is undoubtably of the utmost importantance to getting anything done with software. The problems with trying to evaluate a language by issuing library-driven challenges (such as, say, editing MP3 metadata or retrieving emails over POP3) are twofold. First is that we are testing the language libraries' suitability for that purpose, rather than for implementing arbitrary algorithms. Secondly, the better the library, the less we learn - we could end up back with the same problem that we started with; life just isn't hard enough for us!


1. K & R: The first reference known is in fact Kernighans' earlier tutorial on the 'B' language, but I think it's safe to say that that's not what made it traditional.

2. sanity. See the C2 wiki.

3. most. Java is keeping the faith - it is necessary to understand what a 'static public void' method is before attempting to design Hello World.

4. Hello
21:49 05 Oct 2007 /code/ polyglot_expt

quick links




(RSS) (atom) whole site
(RSS) (atom) this bit


Email 'steve' at this domain.
My public key (gpg)