An Annotated History of Molecular Dynamics

25 Feb 2009 // protein

So I've been doing a lot of reading lately, and I think I'm getting a feel for Molecular Dynamics (MD) simulations. In brief, I'd define MD simulations as simulations where you:

  1. replace all chemical bonds with springs that never break
  2. pretend that the electron density resides only at the nucleus of an atom and never changes
  3. treat the atoms as soft balls
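
To make that concrete, here's a minimal sketch of those three approximations written as energy terms. It's a toy illustration of my own, not any particular force field: the function names, the 12-6 Lennard-Jones form for the "soft balls", and the Coulomb constant (in kcal·Å/(mol·e²)) are all assumptions.

```python
def bond_energy(r, r0, k):
    """Harmonic spring: the bond can stretch, but it can never break."""
    return 0.5 * k * (r - r0) ** 2


def coulomb_energy(r, q1, q2, ke=332.06):
    """Fixed point charges sitting on the nuclei (assumed units: Å, e, kcal/mol)."""
    return ke * q1 * q2 / r


def lennard_jones_energy(r, epsilon, sigma):
    """Soft-ball repulsion plus a weak attraction (assumed 12-6 form)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Hypothetical numbers, purely for illustration.
print(bond_energy(1.6, 1.5, 300.0))
print(coulomb_energy(3.0, 0.4, -0.4))
print(lennard_jones_energy(3.8, 0.2, 3.4))
```

Real force fields add angle and torsion terms on top of these, but the spirit is the same.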

Even so, MD simulations may take many forms (here I've taken the liberty of including Monte-Carlo and Brownian dynamics as MD variants). They have been used profitably to study many biological and chemical problems in the field of protein chemistry. Here are some of the seminal MD papers:

  • Equation of State Calculations by Fast Computing Machines (1953)
    N Metropolis, AW Rosenbluth, MN Rosenbluth, AH Teller, Edward Teller
    The Journal of Chemical Physics 21:1087 {pdf}

    This paper introduced the Monte-Carlo technique to the solving of physical equations. It described the idea of using random numbers to sample a representative subset of conformational space, whilst using the Boltzmann factor, exp(-E/kT), as a probability filter. To use this method properly, you also need a statistically sound random number generator, but that is a rather complicated story. One of the authors of this paper was the inspiration for Stanley Kubrick's Doctor Strangelove.
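
    The probability filter fits in a few lines. Here's a minimal sketch, assuming a single particle in a one-dimensional harmonic well rather than the hard disks of the original paper; the function names, step size and seed are my own choices.

```python
import math
import random


def metropolis_step(x, energy, beta, step_size, rng):
    """Propose a random move; accept it with probability min(1, exp(-beta * dE))."""
    x_new = x + rng.uniform(-step_size, step_size)
    d_e = energy(x_new) - energy(x)
    if d_e <= 0.0 or rng.random() < math.exp(-beta * d_e):
        return x_new  # move accepted
    return x  # move rejected: the old state is counted again


def sample(energy, beta=1.0, n_steps=20000, step_size=1.0, seed=0):
    """Run a Metropolis chain starting from x = 0 and return every visited state."""
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_steps):
        x = metropolis_step(x, energy, beta, step_size, rng)
        samples.append(x)
    return samples


def harmonic(x):
    """Toy energy function: E(x) = x^2 / 2."""
    return 0.5 * x * x


samples = sample(harmonic)
# For this well, <x^2> should come out near 1/beta if the sampling is working.
print(sum(s * s for s in samples) / len(samples))
```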

  • Computer simulation of protein folding (1975)
    M Levitt, A Warshel
    Nature 253:94. {pdf}

    Is this the first published MD simulation of a protein? Levitt and Warshel simulated the folding of the Bovine Pancreatic Trypsin Inhibitor. There are so many firsts here: Langevin term for dynamics, contact maps, simplified models. What surprises me is that after 35 years of simplified models, coarse-grained models don't really seem to do much better than the results presented here.

  • Thermodynamic fluctuations in protein molecules (1976)
    A Cooper
    PNAS 73:2740 {pdf}

    It's only in the last decade or so that protein theorists have really grappled with protein fluctuations through ideas such as the conformational ensemble, free-energy funnel, single-domain allostery, and intrinsically unstructured loops. All these ideas look back to this prescient paper, which used broad thermodynamic arguments to argue that proteins must undergo large fluctuations. This paper was a bit before its time, written well before techniques that measure fluctuations were invented.

  • Dynamics of ligand binding to heme protein (1979)
    DA Case, M Karplus
    J Mol Biol 132:343

    This is arguably the first simulation of a ligand moving through a protein. At this early stage in the game, they could either fix the protein and watch the oxygen bounce around, or let individual sidechains get hit by the oxygen. Each oxygen atom was simulated for 3.75 ps. For these pioneers it was a surprise to see that the oxygen bounces around the inside of the myoglobin, without getting too far. Nevertheless, they identified 2 different pathways for the oxygen to travel into the binding site.

  • A geometric approach to macromolecule-ligand interaction (1982)
    ID Kuntz, JM Blaney, SJ Oatley, R Langridge, TE Ferrin
    J Mol Biol 161:269

    The grand-daddy of ligand-binding screening papers. This is the paper that laid down the basis for DOCK, and serves as the starting point of the entire pharmaceutical computation industry.

  • Dynamical theory of activated processes in globular proteins (1982)
    S H Northrup, M R Pear, C Y Lee, J A McCammon, M Karplus
    PNAS 79:4035 {pdf}

    Umbrella sampling is the most popular method of exploring large conformational changes in MD. In this paper, Karplus and friends modeled a rather more modest conformational change: the swinging of an aromatic residue sidechain. From the simulations, they generated a free-energy surface, from which they calculated a sidechain flipping rate. This paper is important not just for simulating the first sidechain flip but also for introducing proteins to "umbrella sampling".

  • Harmonic dynamics of proteins: Normal modes and fluctuations in bovine pancreatic trypsin inhibitor (1983)
    B Brooks, M Karplus
    PNAS 80:6571 {pdf}

    First application of normal modes to identify low-frequency oscillations using the energy minimum of the molecular mechanics force-field of a protein. This is the basic technique to identify domain-level motions in a protein.

  • Accurate simulation of protein dynamics in solution (1988)
    M Levitt and R Sharon
    PNAS 85:7557 {pdf}

    First simulation of a protein in explicit waters. Suddenly, acceptable computer resources got a whole lot more expensive.

  • A method to explore transition paths in macromolecules. Applications to hemoglobin and phosphoglycerate kinase (1995)
    Christophe Guilbert, David Perahia, Liliane Mouawad
    Computer Physics Communications 91:263 {pdf}

    First description of the RMSD potential, a powerful method for identifying low-energy pathways in the neighborhood of a given static structure.

  • Role of hydration and water structure in biological and colloidal interactions (1996)
    J. Israelachvili and H. Wennerström
    Nature 379:219 {link}

    This review shows that water molecules can have structuring effects of several Ångstroms. Flags the importance of using explicit water molecules.

  • Ligand Binding: Molecular Mechanics Calculation of the Streptavidin-Biotin Rupture Force (1996)
    Helmut Grubmüller, Berthold Heymann, Paul Tavan
    Science 271:997 {link}

    First paper to use Steered Molecular Dynamics to pull ligands out of proteins, in this case, the classic example of pulling biotin out of the streptavidin protein. The beauty of this simulation is that unlike most simulations, it closely models an actual experiment - the Atomic Force Microscopy experiments that measure the rupture force of the pulling of biotin.

  • Assembly of Protein Tertiary Structures from Fragments with Similar Local Sequences using Simulated Annealing and Bayesian Scoring Functions (1997)
    Kim T. Simons, Charles Kooperberg, Enoch Huang and David Baker
    JMB 268:209 {pdf}

    This paper describes the heart of Rosetta, the most successful ab initio protein folding program ever. This program is remarkable precisely because it has succeeded where so many others have failed. In hindsight, there are two principal insights contained in this paper. First, Baker found that certain fragments have well-defined structure, and used these as starting points in the search. This was kind of a heretical idea at the time because no one else had been able to simulate these fragments in MD. They still can't. Instead Baker trawled the PDB database to find suitable fragments. The second insight was that you should use proper statistical techniques in scoring functions. Baker was the first to use Bayesian statistics to integrate the salad of empirical scoring terms in protein-folding (of course this would imply that rather ham-fisted statistical reasoning had been used in the past, but then you'd probably be right).

  • Nonequilibrium Equality for Free Energy Differences (1997)
    C. Jarzynski
    Physical Review Letters 78:2690 {pdf}

    This paper introduced a remarkable thermodynamic result known as Jarzynski's Equality, namely <exp(-βW)> = exp(-βΔF). In classic thermodynamics, you're taught that free-energies F and the like can only be measured if you do reversible work δW, i.e. forces applied very lightly. However, Jarzynski's Equality says that you can actually get equilibrium free-energies from any type of irreversible work on a system, with the proviso that you repeat the work, and average the results properly. The devil, of course, is in the averaging. But still, this epochal result has permitted a whole generation of theorists to perform simulations at non-equilibrium, but then derive equilibrium values from them.
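
    The averaging itself is short. Here's a minimal sketch, assuming you already have a list of work values W from repeated runs and β = 1/kT in matching units; the function name and the log-sum-exp shift (there to avoid numerical underflow) are my additions, not anything from the paper.

```python
import math


def jarzynski_free_energy(work_values, beta):
    """Estimate dF = -(1/beta) * ln <exp(-beta * W)> from repeated work measurements."""
    exponents = [-beta * w for w in work_values]
    shift = max(exponents)  # subtract the largest exponent before exponentiating
    mean_exp = sum(math.exp(e - shift) for e in exponents) / len(exponents)
    return -(shift + math.log(mean_exp)) / beta


# Hypothetical work values, purely for illustration.
works = [1.0, 3.0, 2.5, 4.0]
print(jarzynski_free_energy(works, beta=1.0))
```

    Because the exponential average is dominated by the rare low-work runs, the estimate converges slowly when the spread of work values is large compared to kT.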

  • Contact Order, Transition State Placement and the Refolding Rates of Single Domain Proteins (1998)
    Kevin W. Plaxco, Kim T. Simons and David Baker
    JMB 277:985 {pdf}

    There aren't many simple models of protein folding, but this is one relationship that seems to hold. This paper has spawned a whole industry of papers explaining why the "contact order" correlates with the folding rate for a bunch of small single domain proteins.
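
    Relative contact order is the average sequence separation between contacting residues, divided by the chain length. Here's a rough sketch, assuming one coordinate per residue (say, the Cα atom) and a 6 Å cutoff. As I understand it, the paper counts contacts between atoms, so treat this residue-level version, along with its function name and parameters, as a simplification of my own.

```python
import math


def relative_contact_order(coords, cutoff=6.0, min_separation=1):
    """Average sequence separation of contacting residue pairs, divided by chain length."""
    n_res = len(coords)
    total_separation = 0
    n_contacts = 0
    for i in range(n_res):
        for j in range(i + min_separation, n_res):
            if math.dist(coords[i], coords[j]) <= cutoff:
                total_separation += j - i
                n_contacts += 1
    if n_contacts == 0:
        return 0.0
    return total_separation / (n_res * n_contacts)


# A fully extended toy chain: only sequence neighbours are in contact.
extended = [(3.8 * i, 0.0, 0.0) for i in range(5)]
print(relative_contact_order(extended))
```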

  • Unfolding of Titin Immunoglobulin Domains by Steered Molecular Dynamics Simulation (1998)
    H Lu, B Isralewitz, A Krammer, V Vogel, Klaus Schulten
    Biophysical J 75:662 {pdf}

    In this classic study, Steered Molecular Dynamics is used to pull apart the two ends of a protein that makes up the elastic material in our muscle cells. The simulations show, in exquisite detail, how the protein unfolds and the different stages of the unfolding. The simulations correlate impressively with the elasticity of the protein as measured by Atomic Force Microscopy.

  • Pathways to a Protein Folding Intermediate Observed in a 1-Microsecond Simulation in Aqueous Solution (1998)
    Yong Duan, Peter A. Kollman
    Science 282:740 {link}

    This paper is as much cited for how long it took as for what it did. The first reported 1 microsecond MD simulation, it was a mammoth effort for the late 90s, really pushing the technology of parallel clusters, a technology which we all pretty much take for granted now. They tried to fold a tiny protein, the villin headpiece subdomain, and got some of the way.

  • Replica-exchange molecular dynamics method for protein folding (1999)
    Yuji Sugita and Yuko Okamoto
    Chemical Physics Letters 314:141 {link}

    This brilliant technique allows you to sample proteins at high temperatures and low temperatures on parallel clusters, thus giving a far better sample of the equilibrium ensemble than you might get if you ran an MD simulation for a very long time. It was the perfect technique to put your new Beowulf cluster to use.
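
    The trick that keeps each temperature's ensemble correct is the rule for swapping configurations between neighbouring replicas. Here's a minimal sketch of that acceptance test, assuming you have the current potential energy and β = 1/kT for each of the two replicas; the function names are hypothetical.

```python
import math
import random


def swap_probability(beta_i, energy_i, beta_j, energy_j):
    """Metropolis criterion for exchanging the configurations of two replicas."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(beta_i, energy_i, beta_j, energy_j, rng=random):
    """Return True if the two replicas should exchange configurations."""
    return rng.random() < swap_probability(beta_i, energy_i, beta_j, energy_j)


# Cold replica (beta = 1.0) stuck at high energy, hot replica (beta = 0.5) at low energy:
# swapping hands the low-energy structure to the cold replica, so it is always accepted.
print(swap_probability(1.0, 5.0, 0.5, 2.0))
```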

  • Entropy production fluctuation theorem and the nonequilibrium work relation for free energy differences (1999)
    Gavin E. Crooks
    Physical Review E 60:2721 {pdf}

    This paper generalizes Jarzynski's Equality, not just to a thermodynamic ensemble, but to any stochastic, microscopically reversible dynamical system. It's known as the Crooks fluctuation theorem: P(+σ)/P(-σ) ~= exp(τσ). With this theorem, you might be able to derive the thermodynamics from a single trajectory that hasn't yet run for the age of the universe.

  • Water activity as the determinant for homogeneous ice nucleation in aqueous solutions (2000)
    Thomas Koop, Beiping Luo, Athanasios Tsias & Thomas Peter
    Nature 406:611 {link}

    Just to give you an idea of how hard it is to simulate water molecules, this paper is the first ever MD simulation of water freezing into ice, and given the work chemists have done on water, this is a long overdue result. Indeed, this result brings to mind one of the limitations of explicit waters used in standard MD packages: most common force-fields are optimized for room temperature.

  • Energetics of ion conduction through the K+ channel (2001)
    S Bernèche, B Roux
    Nature 414:73

    This beautiful study uses umbrella sampling to identify all the positions of the K+ ion along the KcsA K+ membrane channel. They show that the channel for K+ is virtually barrierless, hence conduction is a diffusion-controlled process. More impressively, they identify two K+ sites just outside the channel, which were subsequently confirmed by electron density in a high-resolution structure.

  • The weighted histogram analysis method for free-energy calculations on biomolecules. I. The method (1992)
    Shankar Kumar, John M. Rosenberg, Djamal Bouzida, Robert H. Swendsen, Peter A. Kollman
    Journal of Computational Chemistry 13:1011 {link}

    Although Jarzynski's Equality and Crooks' Fluctuation Theorem argue that you can, in principle, calculate equilibrium values from non-equilibrium simulations, actually doing it is hard. This paper provides one of the most popular recipes for performing this transformation, and shows you when you can't. Essential for replica-exchange. It is also one of the most opaquely written papers I've ever read.

  • Design of a Novel Globular Protein Fold with Atomic-Level Accuracy (2003)
    Brian Kuhlman, Gautam Dantas, Gregory C. Ireton, Gabriele Varani, Barry L. Stoddard, David Baker
    Science 302:1364 {link}

    First computational design of an artificial protein fold. AN ARTIFICIAL FUCKING FOLD. No more needs to be said.

  • Domain swapping is a consequence of minimal frustration (2004)
    Sichun Yang, Samuel S. Cho, Yaakov Levy, Margaret S. Cheung, Herbert Levine, Peter G. Wolynes, and José N. Onuchic
    PNAS 101:13786 {pdf}

    I am not normally a fan of the Go model, but this is a particularly clever application where you can use it to find something new. Take the structure of a monomer, define the Go potential. Throw two copies of the monomer into the simulation and watch it dimerize.

  • Mechanism of Na+/H+ Antiporting (2007)
    Isaiah T. Arkin, Huafeng Xu, Morten Ø. Jensen, Eyal Arbely, Estelle R. Bennett, Kevin J. Bowers, Edmond Chow, Ron O. Dror, Michael P. Eastwood, Ravenna Flitman-Tene, Brent A. Gregersen, John L. Klepeis, István Kolossváry, Yibing Shan, David E. Shaw
    Science 317:799 {link}

    Here is a tough problem in MD: given that by definition MD won't let you break chemical bonds, what do you do when you want to study something trivial like proton hopping, which involves the making and breaking of bonds? Well, the guys at DE Shaw got round the problem by simulating the gatekeeping residues of a membrane antiporter in every possible protonation state. This involved ridiculously long simulations, maybe even the longest simulations I've seen. Collecting the stats for all the different states gave a pretty concrete picture of the pumping mechanism of Na+ that involved a proton hop. Oh, but don't use the word "pump" around channel guys because they hate that word. Call it an antiporter instead.

  • Motifs for molecular recognition exploiting hydrophobic enclosure in protein-ligand binding (2007)
    Tom Young, Robert Abel, Byungchan Kim, Bruce J. Berne, and Richard A. Friesner
    PNAS 104:808 {pdf}

    Explicit waters are simulated all the time, but they often seem like that relative that you have to invite to family gatherings but whom everybody finds annoying once they're there. In contrast, the waters are front-and-center in this study. Friesner and friends found an efficient way to identify ordered waters in binding sites. What's great about this paper is that ordered waters provide a completely new mechanism to explain the tight binding affinity of biotin to streptavidin. A chunk of the binding affinity is due to the displacement by biotin of an unfavorable ordered water.

  • Kemp elimination catalysts by computational enzyme design (2008)
    Daniela Röthlisberger, Olga Khersonsky, Andrew M. Wollacott, Lin Jiang, Jason DeChancie, Jamie Betker, Jasmine L. Gallaher, Eric A. Althoff, Alexandre Zanghellini, Orly Dym, Shira Albeck, Kendall N. Houk, Dan S. Tawfik & David Baker
    Nature 453:190 {link}

    Does this paper have Nobel Prize written all over it? If this paper turns out to be the harbinger of a reliable technique, then David Baker is off to Stockholm. Leveraging his incredible success with the protein folding system Rosetta, Baker and coworkers design a completely novel enzymatic reaction using a TIM barrel scaffold. The enzymatic rate is weak but definitely enhanced.