Retracing the ghost of Research Past

I recently suffered a bout of scientific nostalgia.

A couple of weeks ago, someone asked me for some results from my very first paper. I wrote this paper some ten years ago, and it was the fruit of 5 1/2 years of research for my PhD thesis. It was an analysis of beta-sheets in protein structures.

Unfortunately, I didn't have any of the code working. It was probably lost somewhere on my hard drive, and it was written for Windoze. But undeterred, I promised to regenerate the results for him. After all, how hard could it be to recreate 5 1/2 years of work?

So I rolled up my sleeves and went about re-implementing a large chunk of my thesis from 10 years ago. It turned out to be a remarkably fun and pain-free process.

First, it only took me 2 days!

I am a much better programmer now, so the analysis I wrote was shorter, cleaner, and much more robust than the code I wrote years ago.

What was the difference?

In two words: scripting languages. When I did my PhD thesis, everything was done in an ugly mix of C and shell scripts. C, as anyone knows, can't handle text gracefully. Now I use Python, which lets me drop into the OS, gobble text like honey, and use hash tables and dynamic lists as to the manner born. It also helps that I've since developed some robust libraries to lean on.
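To give a flavour of the difference, here is a minimal sketch of that style of text handling. It is not the thesis code: the function names `residues_by_chain` and `scan_directory` are made up for illustration, and it assumes the input files follow the standard fixed-column PDB layout for ATOM records.

```python
import os
from collections import defaultdict


def residues_by_chain(pdb_lines):
    """Group residue names by chain ID, using only C-alpha ATOM records."""
    chains = defaultdict(list)  # a hash table whose values are dynamic lists
    for line in pdb_lines:
        # PDB is a fixed-column format: atom name in columns 13-16,
        # residue name in columns 18-20, chain ID in column 22.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chains[line[21]].append(line[17:20])
    return dict(chains)


def scan_directory(path):
    """Return {filename: residues_by_chain(...)} for every .pdb file in path."""
    results = {}
    for name in sorted(os.listdir(path)):  # dropping into the OS
        if name.lower().endswith(".pdb"):
            with open(os.path.join(path, name)) as handle:
                results[name] = residues_by_chain(handle)
    return results
```

Slicing strings, listing a directory, and growing a dictionary of lists each take a line or two here, where the equivalent C would need manual buffers and memory management.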

The other surprising thing was how the minute details of the original analysis came flooding back. Whilst hammering on my code, I'd make a mistake and then get stuck. But lo and behold, from somewhere in the depths of my subconscious, the subtle error would be revealed, and without further ado, I would be on my way again.

I was very pleased with my code: a short 300-line Python program that calculated the twist and shear of the beta-sheets in a directory of arbitrary proteins. Like the smell of a childhood toy, or the play of light outside the family home, my thesis analysis is engraved in the deep recesses of my mind.