Biofrontiers computer scientist Aaron Clauset uses the power of computing to unlock biological mysteries. (Photo: Patrick Campbell, University of Colorado)
Chasing the elegant solution
Stereotypes tell us that computer scientists are all about hardware, software and servers, sifting through crowded lines of code in the dim basement of the engineering school. If this is what you believe about computer scientists, Aaron Clauset is about to dispel that misconception. An assistant professor in computer science and a faculty member of the Biofrontiers Institute, he is more interested in using computational tools to understand how complex biological and social systems work.
After graduating with a bachelor’s degree in physics, Clauset was impressed by the computer’s ability to simulate the real world and make predictions that could be tested in the laboratory. In addition, computers could model things that couldn’t be done on live subjects, or over impossibly long periods of time. He went on to get his Ph.D. in computer science and now develops computational tools for modeling phenomena in biological, technological and social systems.
“I saw an opportunity to use the computer as a virtual laboratory,” he says. “Although I came out of the natural sciences, I was fascinated by the complexity of messy systems like biological evolution and human behavior.”
Clauset’s timing couldn’t have been better. The complexity of biology came into focus in 2001, when the Human Genome Project published its first draft of the human genome. More than 1,000 scientists around the world sifted through the roughly three billion base pairs of DNA in human cells to map the ordering of all human genes. This breakthrough was just the beginning of our understanding of how genomes actually build life.
“In many areas, we’re practically swimming in data and it can be difficult to turn this mountain of information into actual scientific understanding,” says Clauset. “The traditional approach is to drill down and isolate things from each other. But, this leaves out the interactions between those pieces that make a complex system work. So, I try to ‘drill up’ to get a wider view of how the pieces fit together. This often requires developing completely new mathematical and computational techniques to figure out what’s important and what’s not.”
Clauset looks at data on a macro level, seeking patterns. For example, biologists have extensively studied how species change size as they evolve. The typical fish species is relatively small. But the largest fish, such as the whale shark, are large to an extreme, sometimes thousands or millions of times bigger than a typical fish species.
Clauset developed a deceptively simple computational model that could predict this seemingly quantum leap of evolution. First, he put a strict limit on how small a species could become and still survive, and then made the extinction rate rise slowly with the size of the species. Otherwise, Darwin’s rules held: a species inherits its size from its parent, but with a small amount of variability. Surprisingly, he ignored some of the more traditional rules of evolution and ecology, filtering out species interactions and the dynamics of growing populations.
Then he let evolution unfold in his computer, over millions of years, to show that a species’ tendency to grow larger is offset by its tendency to become extinct more quickly. In other words, living large as a species is risky and tends to earn you a shorter time on Earth. By recreating evolution in his computer, he was able to identify patterns in species size over time.
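For readers who want to see the flavor of such a model, here is a minimal Python sketch of the three ingredients described above: a hard lower limit on size, an extinction risk that creeps upward with size, and descendants that inherit their parent's size with a little random variation. It is an illustration only, not Clauset's actual model. The function name, every parameter value, the cap on the number of species, and the choice of a lognormal size change are all assumptions made for this example.

```python
import math
import random


def simulate_body_sizes(steps=20000, min_size=2.0, start_size=40.0,
                        base_extinction=0.45, size_penalty=0.02,
                        spread=0.3, max_species=2000, seed=0):
    """Toy simulation of species body sizes (all parameters are hypothetical)."""
    rng = random.Random(seed)
    species = [start_size]  # one ancestral species to begin with

    for _ in range(steps):
        # Pick one living species at random.
        i = rng.randrange(len(species))
        size = species[i]

        # Extinction risk rises slowly with (log) size.
        p_extinct = min(1.0, base_extinction
                        + size_penalty * math.log10(size / min_size))

        if rng.random() < p_extinct and len(species) > 1:
            # The species goes extinct (swap with the last entry, then drop it).
            species[i] = species[-1]
            species.pop()
        elif len(species) < max_species:
            # The species gives rise to a descendant that inherits its size
            # with a small multiplicative variation, never below the limit.
            child = size * rng.lognormvariate(0.0, spread)
            species.append(max(child, min_size))

    return species


if __name__ == "__main__":
    sizes = sorted(simulate_body_sizes())
    print("species alive:", len(sizes))
    print("median size:", sizes[len(sizes) // 2])
    print("largest size:", sizes[-1])
```

Running the script prints the number of surviving species along with the median and largest sizes, which can be compared across different values of `size_penalty` to see how a size-dependent extinction risk shapes the spread of sizes in this toy setting.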
To check his work, Clauset used fossil data from extinct mammal species going back 90 million years. He was able to show that his “virtual evolution” calculations accurately reproduced both the diversity of 4,000 living species and the fossil record patterns over the past 60 million years. Clauset’s solution was an elegant one: it stripped away enough complexity to keep the computational model very compact, but was able to accurately predict something as large as global diversity of mammal sizes.
His work on this elegant solution crossed into several academic areas: biology, paleontology, mathematics and physics, in addition to computer science. A multidisciplinary approach suits Clauset’s need to roam across academic subjects, and his ability to grasp the larger picture.
“You have to be interested in the synthesis of ideas to truly be interdisciplinary,” says Clauset. “To build elegant solutions around how parts interact to create a working complex system almost always requires combining ideas from multiple disciplines.”