Clustered Systems Compete With Supercomputers

By Arik Hesseldahl

New York – How do you put the power of a supercomputer into a design that can be built with standard off-the-shelf personal computer components?

The latest addition to the National Technology Grid, the nationwide arsenal of supercomputing sites overseen by the National Computational Science Alliance, is a 128-processor supercomputing cluster known as Roadrunner based at the Albuquerque High Performance Computing Center at the University of New Mexico.

The cluster machine came not from a major OEM like SGI or Compaq, but from a small Utah-based firm called Alta Technology.

Using off-the-shelf Intel Pentium II processors running at 450 megahertz, Alta custom-built the cluster for the alliance, and plans are afoot to grow the cluster to 512 processors by the end of the year.

Founded in 1989, Alta started out building board-level components for the embedded market centered on the Alpha processor architecture, said Clark Roundy, Alta’s director of marketing.

About two years ago, the company met with some researchers at a Linux trade show.

“We basically asked them what they wanted in a clustered computing system, and they told us they wanted a system that put as much power in as small a space as possible, but using off-the-shelf components,” Roundy said.

The result was the Alta cluster, which comes in standard eight- and 16-processor models that range in price from $15,000 to $35,000, and run on the Linux operating system.

So far, most of Alta’s clients have been academic institutions and laboratories, but commercial and government clients are beginning to show an interest, Roundy said.

“We sold some systems to Boeing, which they are using for some modeling applications,” Roundy said. Another client is Japan’s Ministry of Industry and Trade, which just bought a system containing 256 Alpha 21264 processors running at 500MHz, he said.

However, designing the box was not without serious engineering challenges. The more processors you put in a single box, the more heat you have to dissipate.

“We have had problems with the temperature in the past, but we solved them,” Roundy said. “We put fans on the front and the back.” The company has also developed a software package that lets the user remotely monitor temperatures, control power sequencing within the cluster and shut down single processors as needed.

“If one node is shut down or fails, the system automatically reconfigures the node out of the system and continues on with the nodes that remain,” he said.

Larry Smarr, director of the National Computational Science Alliance, said that researchers in both the academic and private sectors are increasingly looking for inexpensive ways to do the complex calculations that require a supercomputer.

“The industry is constantly driving down the costs and up the performance of the PC in the consumer market, as opposed to supercomputers which are typically built of very specialized components and are therefore typically more expensive,” Smarr said.

Smarr had previously been involved with research efforts to connect 64 two-processor PCs together for supercomputer applications.

“What we wanted to do was build something that was a cross between a supercomputer and a stack of PCs,” Smarr said.

David A. Bader
Distinguished Professor and Director of the Institute for Data Science

David A. Bader is a Distinguished Professor in the Department of Computer Science at New Jersey Institute of Technology.