update: fixed some typos, thanks to the comments

One of the most spectacular flameouts in science happened last year. In a short letter (barely over 300 words long) published in Science in the very last issue of 2006, Geoffrey Chang, a crystallographer, retracted 3 Science articles, a Nature article, a PNAS article and a JMB article. The sum of 5 years of work was destroyed, apparently, over a single sign error in a data-processing program.

Geoffrey Chang was one of the youngest faculty members at The Scripps Research Institute in La Jolla. Winner of a slew of prestigious awards, he had made his name solving the crystal structure of MsbA, a membrane protein. Membrane proteins are arguably the most difficult type of protein to crystallize, due to the greasy nature of their trans-membrane surface.

The story reads like a pulp science novel. A rising star, Geoffrey Chang, solves a number of immensely difficult problems. A former labmate (from the days when Chang worked in the Rees lab), Kaspar Locher, now a scientist at ETH Zurich in Switzerland, solves a similar crystal structure, except that his structure has a helix flipped over by 180 degrees. Locher published his own paper in Nature in 2006, setting alarm bells ringing. Just by looking at the flip, Locher realized that his former labmate’s structure contained a phasing error, an error that was “in the category of monumental blunders,” said Locher in an interview with Science.

From what I’ve read, it seems to be a genuinely honest mistake on Chang’s part. Although some commentators think that the reviewers at Science should have caught the error, I don’t think the reviewers should be held accountable. Reviewers can catch relatively straightforward calculation mistakes, but the truth of the matter is that when we review articles, we take a lot of things on faith. What we can do as reviewers is look for methodological soundness and apply our hard-earned intuition to the results.

Still, it’s instructive to look at the nature of the error, as quoted in the retraction:

An in-house data reduction program introduced a change in sign for anomalous differences. This program, which was not part of a conventional data processing package, converted the anomalous pairs (I+ and I-) to (F- and F+), thereby introducing a sign change.

A single sign change in the data-processing program. And an in-house program at that. This should send chills up the spine of every computational biologist out there.

Don’t get me wrong, I am not questioning the use of in-house software. After all, in-house software is the bread and butter of all working computational biologists. Still, Chang explained that the in-house program was legacy software inherited from a neighboring lab.

What this should flag is the need to aggressively test all the software that you write. One method is to apply a unit-testing methodology to your software, even to programs written by someone else (a clever grad student in my last lab found many bugs in AMBER, a huge third-party molecular dynamics program). Eeeep!
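
To make that concrete, here is a minimal sketch in Python of what such a unit test might look like. This is not Chang’s program, which I have never seen: the function names are hypothetical, and I am assuming the simplest possible conversion, where the amplitude F is the square root of the intensity I. The test checks the one property the retraction says was broken, namely that the sign of the anomalous difference survives the conversion.

```python
import math


def intensities_to_amplitudes(i_plus, i_minus):
    """Convert an anomalous intensity pair (I+, I-) to amplitudes (F+, F-).

    Hypothetical helper for illustration only: assumes F = sqrt(I)
    and non-negative intensities.
    """
    if i_plus < 0 or i_minus < 0:
        raise ValueError("this sketch only handles non-negative intensities")
    # The order matters: returning (F-, F+) here would silently flip the sign.
    return math.sqrt(i_plus), math.sqrt(i_minus)


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def test_anomalous_sign_is_preserved():
    # Made-up intensity pairs covering I+ > I-, I+ < I- and I+ == I-.
    pairs = [(100.0, 81.0), (81.0, 100.0), (50.0, 50.0)]
    for i_plus, i_minus in pairs:
        f_plus, f_minus = intensities_to_amplitudes(i_plus, i_minus)
        assert sign(f_plus - f_minus) == sign(i_plus - i_minus)


if __name__ == "__main__":
    test_anomalous_sign_is_preserved()
    print("sign convention preserved")
```

If someone swapped the two return values, the first pair would fail immediately. A test like this takes minutes to write, and it is the kind of check worth adding before trusting any inherited code.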

In a later interview with The Scientist, Chang said, “I deeply apologize to those that used the old structures to come up with results that are incorrect,” adding, “I feel pretty bad about that.” This is a severe understatement. Due to the influential nature of Chang’s 2001 paper, major grants in the study of the MsbA protein have been awarded and denied based on their compatibility with the crystal structure. From the Science news article:

David Clarke of the University of Toronto in Canada says his team had a hard time persuading journals to accept their biochemical studies that contradicted Chang’s MsbA structure. Clarke also served on grant panels on which he says Chang’s work was influential. “Those applications providing preliminary results that were not in agreement with the retracted papers were given a rough time,” he says.

I’ve heard on the grapevine that Chang is a somewhat aggressive character, and he had dismissed the work of biochemists working in the field of MsbA. Maybe he should not have been so dismissive after all.

The moral to be drawn from the story is that crystallographers ought to pay very careful attention to what biochemists say about their proteins. What comes to mind is ATP synthase, work on which won the 1997 Nobel Prize in Chemistry. One of the winners was John E. Walker, the crystallographer who solved this massive structure.

What shouldn’t be forgotten is that one of the other winners was Paul D. Boyer, who figured out from biochemical analyses, a full 20 years earlier, much of the mechanism of how the ATPase works: the existence of 6 subunits, made up of alternating domains, and how the binding site must be in the interface between the subunits. This mechanism was subsequently confirmed by Walker’s crystal structure.

Crystallography, like any other scientific discipline, does not operate in a vacuum, and only in the interstices between disciplines can the truth be found.

Mark on 03/21 said:

Amen to unit testing.

In my experience it’s crucially important to developing anything larger than a few classes or screens of code.

To return to the overworn metaphor of programming as building, you can think of unit testing as scaffolding. You put it in place as you build each part of your system and you put it back up whenever you want to renovate to make sure nothing falls down.

joe on 03/26 said:

I read an essay recently arguing that computer science papers should always provide a link to complete source code, because otherwise how can you truly evaluate and replicate the results? Sounds like people in a few other fields should do the same…

Rolando Garza on 03/27 said:

I agree with Joe; Open Source and Science should go hand in hand… perhaps the ultimate form of ‘peer reviewing’?

Crispy on 03/27 said:

How many software people understand the needs and perils, let alone biologists…

Matt on 03/27 said:

Is the name “Loch” or “Locher”?

Alex on 03/27 said:

Richard Feynman said in a commencement lecture he gave at Caltech in 1974:

“We have learned a lot from experience about how to handle some of the ways we fool ourselves.

One example: Millikan measured the charge on an electron by an experiment with falling oil drops, and got an answer which we now know not to be quite right. It’s a little bit off because he had the incorrect value for the viscosity of air. It’s interesting to look at the history of measurements of the charge of an electron, after Millikan. If you plot them as a function of time, you find that one is a little bit bigger than Millikan’s, and the next one’s a little bit bigger than that, and the next one’s a little bit bigger than that, until finally they settle down to a number which is higher.

Why didn’t they discover the new number was higher right away? It’s a thing that scientists are ashamed of – this history – because it’s apparent that people did things like this: When they got a number that was too high above Millikan’s, they thought something must be wrong – and they would look for and find a reason why something might be wrong. When they got a number close to Millikan’s value they didn’t look so hard. And so they eliminated the numbers that were too far off, and did other things like that. We’ve learned those tricks nowadays, and now we don’t have that kind of a disease.”

Perhaps the disease is now attacking science again.

asdf on 03/27 said:

One of his very own postdocs

Locher was one of Rees’ postdocs along with Chang. He wasn’t Chang’s postdoc.

See:

http://www.sciencemag.org/cgi/content/full/314/5807/1856

Paulo on 03/27 said:

Not only unit testing. All software has to be validated against already known results. In the absence of known good results to validate against, test sets need to be created.

anonymous scientist on 03/27 said:

scientists have always been poor programmers, but modern programming systems have become MUCH harder to work with. scientists should pay professional programmers to write the code they rely on. they don’t build their own microscopes and pipettes, why are they writing their own software?

Naveen on 04/03 said:

To anonymous scientist:

The reason is that, unlike microscopes or other lab equipment, the requirements put on software by scientists are fluid, and often the current crop of programs cannot perform the type of calculations that scientists desire (especially in the field of computational biology, maybe not so much in crystallography, where the mathematical techniques for converting scattering data to electron density are well defined).

C on 07/19 said:

Actually, John Walker won the Nobel Prize for his contributions towards understanding the biochemical and structural mechanism of the ATP synthase. Andrew Leslie is the crystallographer who collaborated with him and solved the structure.
