The Minimal Set of Equipment to do Computational Biology

08 Nov 2010 // science

In the last few months, I've been soaking up a bunch of writings by minimalists such as Leo Babauta and Everett Bogue. I had a great opportunity to put some of this advice into practice when I got an invitation to relocate to Berlin for the summer to work on some simulations of DNA interacting with histones.

I gave up my apartment, sold a bunch of stuff, and gave away even more. I managed to fit everything into a backpack and a messenger bag, plus a beat-up guitar that I carried around without a case. I had succeeded in whittling my possessions down to 100 things or less and happily hopped onto a transatlantic flight to Berlin via Paris (the airplane people were always happy to accommodate my guitar in the cabin).

Still, it took me several more months to figure out how to work, as a computational biologist, with as little paraphernalia as possible. I have finally got it down to:

13-inch MacBook Pro
small Bluetooth 3-button Logitech mouse
2 external USB-powered WD drives
iPhone 4

This is enough to do research and publish articles in PLoS ONE.

The paper problem

Before I cleared out my desk of 4 years in my old lab I had to figure out what to do with the collection of papers that I'd accumulated over 10 years of research.

The first thing I did was learn to use the app Papers from Mekentosj. It is undeniably the best program on a Mac for reading PDFs. Its PDF reader is so smooth and fast that, in full-screen mode, it even beats the pants off the native Mac OS X Preview app. I am happy to report that the digitization of science articles is moving right along, as I managed to find PDFs of every paper article in my filing cabinet, even some old Feynman papers from the 1950s. I imported all of them right into Papers.

Nevertheless, reading PDFs on a computer is a new thing for me and it took one piece of technology to make it click in my brain. That technology was trackpad gestures on my new MacBook Pro. My main gripe with reading PDFs on the computer before was that it was always fiddly to navigate through an article. Scrollbars are finicky, arrow keys are a pain in the finger, whilst pressing the spacebar is a guessing game. But the swiping gesture on the new MacBook trackpad is just the most natural motion in the world. I get a wonderful kinesthetic sense swiping through the PDF of Feynman's electrodynamics paper. It certainly beats flipping through a badly-stapled bunch of old moldy pages.

And better yet, with Papers, I now have text search to find that elusive paper even if I've forgotten who wrote it.

You don't need no beautiful external monitor

I got spoilt silly in Berlin when my boss got me both an Apple 24" Cinema Display and a Zalman 3D display for displaying proteins in sick stereo. However, I found that an over-reliance on these great toys cramped my minimalist mobile lifestyle.

Of course, it's been shown that a large screen boosts your typical programmer's productivity no end. On further reflection, the main reason is that a large screen lets you see two or more files (code or graphs or web pages) simultaneously.

Then I discovered this little utility called SizeUp for Mac OS X, which is really a bit of a throwback to ye olde window managers. All SizeUp does is resize your program windows to exactly a half or a quarter of your screen. By choosing which edge of the screen to size the windows to, I found that I could easily display different files and images simultaneously on my little 13" MacBook, something that would have been exceedingly laborious by dragging the little sizing icon on Mac OS X windows. Before long I got so used to the SizeUp hotkeys that I became as productive as when I used a big-ass cinema display.
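The half-and-quarter tiling idea is simple enough to sketch in a few lines of Python. This is only an illustration of the geometry, not how SizeUp itself is implemented: the `Rect` type, the `tile` function and the position names are all made up for the example, and the origin is assumed to be the top-left corner of the screen.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """A window or screen frame; (x, y) is the top-left corner (an assumption)."""
    x: int
    y: int
    width: int
    height: int


def tile(screen: Rect, position: str) -> Rect:
    """Return the frame for a window snapped to a half or a quarter of the screen."""
    half_w, half_h = screen.width // 2, screen.height // 2
    # The right and bottom pieces take whatever is left over, so odd screen
    # sizes are still covered completely.
    rest_w, rest_h = screen.width - half_w, screen.height - half_h
    mid_x, mid_y = screen.x + half_w, screen.y + half_h
    frames = {
        "left": Rect(screen.x, screen.y, half_w, screen.height),
        "right": Rect(mid_x, screen.y, rest_w, screen.height),
        "top": Rect(screen.x, screen.y, screen.width, half_h),
        "bottom": Rect(screen.x, mid_y, screen.width, rest_h),
        "top-left": Rect(screen.x, screen.y, half_w, half_h),
        "top-right": Rect(mid_x, screen.y, rest_w, half_h),
        "bottom-left": Rect(screen.x, mid_y, half_w, rest_h),
        "bottom-right": Rect(mid_x, mid_y, rest_w, rest_h),
    }
    return frames[position]


# The 13" MacBook Pro screen is 1280 x 800 pixels.
macbook = Rect(0, 0, 1280, 800)
print(tile(macbook, "left"))          # code editor on the left half
print(tile(macbook, "bottom-right"))  # a graph in the bottom-right quarter
```

On a 1280 x 800 screen each half is 640 pixels wide, which is still enough for a source file on one side and a plot on the other.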

As a computational structural biologist, I have to look at molecular structures. Unfortunately, none of the standard molecular programs can be used with a trackpad on the MacBook. All of them map vital 3D manipulations onto all 3 mouse buttons (including the scroll wheel). I had bravely tried using an Apple Magic Mouse but its emulation of the scroll button was very patchy. Also, the Magic Mouse eats up battery like nothing else, so I've switched back to a little Bluetooth mouse specifically to look at protein structures.

Taking notes

Research is a chaotic process, and for a long time I needed a paper journal to jot down idea fragments, numbers, and bits of code. I still carry around a medium-sized notebook, but more and more I've started to take notes on the computer. When I first tried to take notes on a computer, I used simple text editors but it didn't work. The problem was that the trivial necessity of naming files and saving turned out to be a significant barrier to quick note taking.

Then I discovered note-taking programs, which eliminate any thinking about how you want to save the notes by providing a frictionless interface. I now swear by Notational Velocity, but I've heard good things about Yojimbo and Evernote. Still, Notational Velocity's automatic syncing with Simplenote on my iPhone 4 is a killer feature.

As for saving data in general, I've had people suggest that I use cloud storage for my data. Alas, my simulations produce data in the gigabyte range. Saving the simulation data with Dropbox or any other kind of cloud storage would choke whatever internet bandwidth I had. To stay mobile on commonly available wifi connections, I have to keep my bandwidth use down, and so I save simulation data separately on an external drive. In that case, I want two copies of any simulation data, and my solution is to roam around with two external drives, saving every so often.
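The "saving every so often" step can be scripted. Here is a minimal sketch, assuming the simulation output lives in a single directory and both drives are mounted as ordinary folders. The `mirror` function and the paths in the usage comment are made up for illustration, and a dedicated tool such as rsync would do the same job more robustly.

```python
import shutil
from pathlib import Path


def mirror(source: Path, drives: list[Path]) -> list[Path]:
    """Copy every file under `source` onto each drive, skipping up-to-date copies.

    Returns the list of files that were actually written.
    """
    written = []
    for drive in drives:
        for src in sorted(source.rglob("*")):
            if not src.is_file():
                continue
            dst = drive / source.name / src.relative_to(source)
            if dst.exists():
                s, d = src.stat(), dst.stat()
                # Same size and not older than the source: treat as up to date.
                # The 2-second slack allows for FAT-formatted drives, which
                # store modification times with 2-second resolution.
                if s.st_size == d.st_size and d.st_mtime >= s.st_mtime - 2:
                    continue
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also preserves the modification time
            written.append(dst)
    return written


# Hypothetical usage; the paths are placeholders, not my actual setup:
# mirror(Path.home() / "simulations",
#        [Path("/Volumes/WD-1"), Path("/Volumes/WD-2")])
```

Because files that are already on a drive are skipped, running this after each batch of simulations only copies what is new or has changed.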

Textbooks

After college, I looked upon my textbooks as trophies of all that learning I had done. Although I sold several that I hated, I kept some just in case I had to look things up in a textbook. But I've found that over the last few years, I have tended to go straight to Wikipedia to look up basic textbook knowledge. In fact, I prefer to look up Wikipedia articles through the gorgeous iPhone app Articles even when I'm working on my MacBook.

Nevertheless, for advanced grad school material, there may not be good Wikipedia articles. However, I've found that the same texts float around in most labs, so that information is normally at hand. But we are in the infancy of tablet computing and I fully expect that textbooks of the future will come in electronic form with gorgeous interactive graphics where the data will be continuously updated to the latest and greatest. Paper textbooks are doomed.

Strategic Connectivity

When you make yourself mobile to work, you are restricted by one important thing: internet availability. It takes a while to scope out cafes in your neighborhood that have 1) good wifi, 2) power outlets and 3) comfortable chairs. If any one of these is missing, it will not be a long productive visit. It's important to live in a neighborhood where you can find public places (including libraries) that provide these features.

There are other internet alternatives. I've seen friends with MiFi, a gadget that turns a 3G data connection into a wifi hotspot. I've also had some experience using modem sticks, but I didn't pay for a lot of data, so it was quite frustrating. Another possibility is tethering from our smartphones, but given the carrier oligopoly in the US, I wouldn't hold my breath.

To do effective research, you need access to the latest papers. For a while I could VPN to my old campus and then access various journal websites. But this is unreliable. Currently, I've scoped out various campus cafes with guest wifi. I save the URLs of papers I want to read in my browser. I make time to visit a campus cafe, and download these PDFs in batches.
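The batch download itself is easy to automate. The sketch below assumes the saved URLs have been exported to a plain text file, one URL per line, and that each URL points directly at a PDF. Both are assumptions for the example; journal sites that sit behind a login page would need more work. The `download_batch` function and the file names are made up for illustration.

```python
import shutil
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def download_batch(url_file: Path, dest_dir: Path) -> list[Path]:
    """Download every URL listed in `url_file` (one per line) into `dest_dir`.

    Returns the list of files that were newly saved.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for number, line in enumerate(url_file.read_text().splitlines(), start=1):
        url = line.strip()
        if not url or url.startswith("#"):
            continue  # skip blank lines and comments
        # Name the file after the last part of the URL path, with a fallback
        # based on the line number when the URL has no usable file name.
        name = Path(urlparse(url).path).name or f"paper_{number}.pdf"
        target = dest_dir / name
        if target.exists():
            continue  # already fetched on an earlier cafe visit
        try:
            with urlopen(url, timeout=30) as response, open(target, "wb") as out:
                shutil.copyfileobj(response, out)
        except OSError as err:  # urllib's URLError is a subclass of OSError
            target.unlink(missing_ok=True)  # do not leave a partial file behind
            print(f"skipped {url}: {err}")
            continue
        saved.append(target)
    return saved


# Hypothetical usage; both paths are placeholders:
# download_batch(Path("to_read.txt"), Path.home() / "Papers" / "inbox")
```

Files that already exist are skipped, so the same list can be re-run on the next visit to pick up anything that failed the first time.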

The most important reason for finding internet is that you want to connect to a remote computing cluster. No one runs production calculations on their own computer anymore. I've found that Coda is a really neat program for interacting with a remote computer.

More importantly, I think the future is that we will all do our calculations on remote elastic computing clusters like Amazon EC2. This is the minimal future of computational research where we don't even care what machines do our computing for us.