A Supercomputing Wulf in PC Clothing

It has become a ritual at businesses and colleges across the world: get rid of computers that have become obsolete in just two or three years and replace them with new machines.


But would the purging be done so often if the system administrators knew that these so-called obsolete machines could easily be linked together to create a supercomputer that would otherwise cost hundreds of thousands, if not millions of dollars?


That’s the idea that a Wesleyan professor of physics and two of his students pursued. What began as an experiment using computers headed for the nearest dumpster has blossomed into a dynamic and growing collection of PCs with varying individual computing power that have been connected to create one of the 100 most powerful computer clusters in the world. Perhaps best of all, the cluster is doing the work of a $1-million-plus supercomputer for about one-twentieth of what one of those machines would cost.


Wesleyan’s cluster, known as “WesWulf,” had its beginnings in late 1998 when Vasilios Hoffman ’02, then a physics undergrad, read about so-called “Beowulf clusters” touted by NASA. The concept was to get supercomputing power out of PCs by linking them physically and reprogramming them in UNIX to act as a single, multiprocessor computer. Hoffman also heard that Wesleyan had just hired a new physics professor who had a generous start-up budget.


“I was still living in Germany and had barely accepted the faculty position at Wesleyan when I got an e-mail from a very excited student named Hoffman who proposed that I use a percentage of my start-up funds to build something called a Beowulf cluster,” says Reinhold Blümel, professor of physics. “I remember thinking, ‘Pushy, these Wesleyan students.’ And I soon forgot all about his e-mail.”


A few months later, when Blümel began settling into his new position, he realized that his work in computational physics would require him either to rent expensive time on a supercomputer or somehow come up with the funds to buy one. Suddenly his generous start-up budget was looking very tight.


“It was then that I remembered Vasilios’s e-mail,” he says.


Blümel contacted the undergraduate student and began to ask more questions about computer clusters. Hoffman spoke excitedly about NASA’s experience with the clusters and how PCs, properly linked, could produce the same computing power as a single supercomputer. He also produced documentation supporting his claims. Blümel was genuinely intrigued.


“Maybe it was the NASA endorsement or the torrent of UNIX jargon and enthusiasm that Vasilios bubbled over with,” Blümel says. “In any case I thought I’d give it a whirl.”


A qualified whirl, that is.


Blümel wanted tangible proof that a cluster would work as advertised, before he committed any part of his budget. He asked Hoffman to build a proof-of-concept system. Hoffman told him it wouldn’t be a problem. All they had to do was a little dumpster diving.


“It could be done with castoffs, computers that had recently been replaced and slated for disposal,” says Hoffman.


Luckily, Henk Meij, Wesleyan’s social science computing manager, saved Hoffman and Blümel the trouble of picking through the trash. Meij had just replaced several 486-class PCs. Blümel and Hoffman had their pick. They grabbed 12 of the machines and in the summer of 1999 built the test cluster.


“We were very proud of it,” Blümel says. “What would have otherwise ended up in a landfill was now a 12-node cluster that was as powerful as the top-of-the-line desktop that I had bought in January 1999, for $10,000. And the cluster didn’t cost us a single cent.”


They decided to call their creation “WesWulf I.”


Convinced that the idea would work, Blümel took $20,000 of his start-up money and bought 40 Pentium III computers in the early fall of 1999. He also brought on board his graduate student, Thomas Clausen, a UNIX whiz who had trained at the prestigious Niels Bohr Institute in Denmark. Clausen used open source software and wrote original computer code.


“It took me about three months to set the cluster up and get it to run the way I wanted,” says Clausen, who is now a postdoctoral fellow in the physics department. “If we were to do this today, however, it would be much easier. There is more open source (i.e., free) code available on the Internet. Or one can buy the software on a single disc and install it much like Microsoft products are installed.”


Clausen stops and smiles.


“This works much better than Windows, however.”


According to Blümel, the 40-node cluster dubbed “WesWulf II” that he, Clausen, and Hoffman built in late 1999 for $20,000 gave them the computing power of a supercomputer that would have cost 50 times that amount. It was also completely scalable. Blümel and Clausen have periodically added computers so that today it is a 90-node cluster that can perform about 80 billion operations per second.


“This is enough computing power to rank us among the top 100 cluster computers in the world,” Blümel says.
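
The figures quoted here lend themselves to a quick back-of-the-envelope check. The sketch below simply restates the article's numbers in Python; the variable names are illustrative, and dividing the total throughput evenly across nodes is a simplifying assumption, since the machines in the cluster vary in individual power.

```python
# Figures quoted in the article
initial_nodes = 40                # WesWulf II as built in late 1999
initial_cost_usd = 20_000         # start-up money spent on those nodes
supercomputer_cost_factor = 50    # "50 times that amount"
current_nodes = 90                # size of the cluster today
current_ops_per_second = 80e9     # "about 80 billion operations per second"

# Derived values
cost_per_node_usd = initial_cost_usd / initial_nodes
equivalent_supercomputer_usd = initial_cost_usd * supercomputer_cost_factor

# Assumption: throughput is spread evenly across nodes (real nodes differ).
avg_ops_per_node = current_ops_per_second / current_nodes

print(f"Cost per node: ${cost_per_node_usd:,.0f}")
print(f"Equivalent supercomputer: ${equivalent_supercomputer_usd:,.0f}")
print(f"Average per node: {avg_ops_per_node / 1e9:.2f} billion ops/s")
```

On these numbers, each of the original nodes cost about $500, and the comparison machine would have run to roughly $1 million.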


Along with Blümel’s own work in computational physics and other physics department computing, the cluster has been used by faculty in the astronomy department who are modeling galaxies, and by biology professors who are doing molecular modeling. Beowulf clusters, however, are suited to any large data-crunching task.


“Stock calculations, geological surveys typically done by oil companies, genetics research—these are all areas that could use these clusters,” Hoffman, who is now Wesleyan’s UNIX administrator, says. “I even read about a guitar string company that switched over from desktops to a cluster and saw immediate savings and increases in productivity.”


Blümel says that the relatively low cost of Beowulf clusters makes them perfect for universities and small businesses. Best of all, adding machines to the cluster is easy. All that’s needed is a working air conditioning system to keep the machines cool and a dedicated space for the PCs. WesWulf II is kept in a windowless 8-by-12-foot room.


“If you figure that recently decommissioned PCs are about three generations behind the top of the line,” Blümel says, “then 10 castoffs would make a cluster as powerful as a new machine. Fifty castoffs outstrip a brand new top-of-the-line computer by a very respectable factor of five.” Blümel smiles as he closes the door to WesWulf II’s room.


“What could be better?” he says. “This is such an attractive option for economically and ecologically forward-thinking folks like, well, all of us!”
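
Blümel's rule of thumb about castoffs can be written out as a short calculation. The sketch below assumes ideal linear scaling, meaning every castoff contributes an equal share with no communication overhead; a real cluster would fall somewhat short of this. The function name and the per-generation estimate are illustrative and do not come from the article.

```python
def cluster_speedup(n_castoffs: int, castoffs_per_new_machine: int = 10) -> float:
    """Estimated power of a castoff cluster relative to one new top-of-the-line PC.

    Assumes ideal linear scaling: 10 castoffs equal one new machine,
    as in Blümel's rule of thumb.
    """
    return n_castoffs / castoffs_per_new_machine


# Inference, not stated in the article: if a machine three generations old
# has one-tenth the power, each generation is about 10 ** (1/3) times faster.
GENERATIONS_BEHIND = 3
per_generation_factor = 10 ** (1 / GENERATIONS_BEHIND)

print(cluster_speedup(10))               # 1.0
print(cluster_speedup(50))               # 5.0
print(round(per_generation_factor, 2))   # 2.15
```

Under these assumptions, 50 castoffs give the factor of five that Blümel cites.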

