Fetching PDB files remotely in pure Python code

10 Aug 2007 // protein

What could be more elementary for a structural biologist than getting an arbitrary PDB file from the interwebs?

In the old days, this was a rather difficult task that involved ordering a CD from far, far away. Even recently, I had to connect to the FTP site, download a compressed .z file, and decompress it. I did this with Python. Unfortunately, I had to shell out to the operating system to do the decompression.

But now I've found a better way, listed here as a two-line Python function that uses only the standard library.

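import urllib

def fetch_pdb(pdb_id):
    # Build the download URL for the given PDB ID and return the file's contents.
    # (urllib.urlopen is the Python 2 API; the argument is named pdb_id so it
    # does not shadow the built-in id().)
    url = 'http://www.rcsb.org/pdb/files/%s.pdb' % pdb_id
    return urllib.urlopen(url).read()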

Thanks to the new rcsb.org site, one can view a PDB file directly in the browser. We can thus pull it straight off the website, avoiding the FTP server and skipping the decompression step.
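
For anyone on Python 3, where urlopen lives in urllib.request rather than in urllib itself, here is a rough sketch of the same idea. The URL pattern is the one from this post and may have changed since it was written. The pdb_url helper and the opener parameter are my own illustrative additions (the latter just makes it possible to try the function without touching the network). I also assume the file decodes as UTF-8 text.

```python
import urllib.request

# URL pattern taken from the post; treat it as an assumption, since the
# site layout may have changed since the post was written.
PDB_URL_TEMPLATE = 'http://www.rcsb.org/pdb/files/%s.pdb'


def pdb_url(pdb_id):
    """Return the download URL for a PDB ID (hypothetical helper)."""
    return PDB_URL_TEMPLATE % pdb_id


def fetch_pdb(pdb_id, opener=urllib.request.urlopen):
    """Fetch a PDB file over HTTP and return its contents as text.

    `opener` is an illustrative hook: any callable that takes a URL and
    returns a context manager with a read() method that yields bytes.
    """
    with opener(pdb_url(pdb_id)) as response:
        raw = response.read()
    # Assumption: the file is plain text; undecodable bytes are replaced.
    return raw.decode('utf-8', errors='replace')
```

Calling fetch_pdb with a four-character ID such as '1ubq' should return the whole file as one string, provided the server still answers at that address.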