Phylogenies derived from gene order data may prove crucial in answering some fundamental questions in biomolecular evolution. Yet very few techniques are available for phylogenetic reconstruction based upon gene order and content, and these are (for the most part) computationally expensive. High-performance algorithm engineering offers a battery of tools that can reduce, sometimes spectacularly, the running time of existing approaches. We discuss one such such application, in which we started with the method known as ``breakpoint analysis’’ (developed by Sankoff and his colleagues) and produced a software suite, GRAPPA, that demonstrated a million-fold speedup in running time (on a variety of real and simulated datasets), by combining low-level algorithmic improvements, cache-aware programming, careful performance tuning, and massive parallelism. The phylogeny reconstruction now can be performed in parallel and attain a linear speedup with the number of processors. We show how these techniques are directly applicable to a large variety of problems in computational biology. (Supported in part by NSF Grants CAREER 00-93039, ITR 00-81404 and DEB 99-10123.)
David A. Bader is an Assistant Professor and Regents’ Lecturer in the Electrical & Computer Engineering Department at The University of New Mexico. He received his Ph.D. in Electrical Engineering in 1996 from The University of Maryland, and held an NSF CISE Postdoctoral Research Associate in Experimental Computer Science before joining the University of New Mexico. David is an NSF CAREER Award recipient, and his research focuses on developing high-performance algorithms for uniform-memory-access shared-memory machines and SMP clusters with applications in the areas of computational biology, genomics, and ecology.