Just how recently did we share a common ancestor with chimps? Five new super-fast computers are now crunching numbers at the University of Rochester to answer that question, which is part of one of biology's trickier problems: How closely related is any one species to another?
The wealth of DNA sequence data has given biologists an unprecedented chance to peek behind evolution's curtain. Teasing out the secrets of relationships among species requires analyzing the blueprints of life--the sequences of chemical bases of DNA, represented by thousands or millions of As, Cs, Gs, and Ts. John Huelsenbeck, assistant professor of biology, devised a way to use 200-year-old math to help make sense of rapidly accumulating DNA sequence data, and now he's got the power of five near-supercomputers to crank out some answers.
"Each of these machines is many times faster than your average desktop," says Huelsenbeck. "We need them to perform large phylogenetic analyses. Right now it's very difficult to perform such analyses on hundreds of DNA sequences."
Most biologists have to resort to shortcuts, either performing simpler searches or making unsatisfactory assumptions about the process of mutation, but Huelsenbeck should be able to determine the relationships between branches of the evolutionary tree much more accurately. The formula he's devised uses Bayesian math, a method of weighing new information against old that was first developed by Thomas Bayes in the 18th century. The ability to assign "weight" to each piece of information should keep small, erroneous findings from leading scientists astray. The program, called "MrBayes 1.0," is now up and running on the new computers.
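To give a feel for what "weighing new information against old" means, here is a minimal Python sketch of Bayes' rule applied to two candidate evolutionary trees. It is only an illustration of the general idea, not the MrBayes program or Huelsenbeck's formula. The tree names, the starting probabilities, and the likelihood values are all hypothetical numbers chosen for the example.

```python
def bayes_update(prior, likelihood):
    """Combine old beliefs (prior) with new evidence (likelihood) via Bayes' rule.

    Both arguments map a hypothesis name to a number; the result maps each
    hypothesis to its updated (posterior) probability.
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}


# Hypothetical starting point: two candidate trees, equally plausible.
prior = {"tree_A": 0.5, "tree_B": 0.5}

# Hypothetical strong evidence: the new sequences are three times as
# probable under tree_A as under tree_B.
strong_evidence = {"tree_A": 0.03, "tree_B": 0.01}
after_strong = bayes_update(prior, strong_evidence)

# Hypothetical weak, conflicting evidence that slightly favors tree_B.
weak_evidence = {"tree_A": 0.4, "tree_B": 0.5}
after_weak = bayes_update(after_strong, weak_evidence)

print(after_strong)
print(after_weak)
```

With these made-up numbers, the strong evidence raises the probability of tree_A from 0.5 to about 0.75, and the weak conflicting evidence then lowers it only slightly, to a bit above 0.70. That is the sense in which a small, possibly erroneous finding carries little weight against a larger body of earlier evidence.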
"I'm already having some success performing large phylogenetic analyses of hundreds of sequences," says Huelsenbeck, who has compared hundreds of DNA sequences and inferred what evolutionary path each followed. He recently compared the genes of 357 plant species at once. "Before it would have taken months, or even years to work it out," he says, "but now I can do it in a week or two."
The five computers, which cost $75,000, were paid for by the National Science Foundation. Each machine has a gigabyte of memory and splits its tasks between two 667-megahertz Alpha 21264 processors.