Phylogenies derived from gene order data may prove crucial in answering some fundamental questions in biomolecular evolution. Yet very few techniques are available for phylogenetic reconstruction based upon gene order and content, and these are (for the most part) computationally expensive. High-performance algorithm engineering offers a battery of tools that can reduce, sometimes spectacularly, the running time of existing approaches. We discuss one such such application, in which we started with the method known as ``breakpoint analysis’’ (developed by Sankoff and his colleagues) and produced a software suite, GRAPPA, that demonstrated over a billion-fold speedup in running time (on a variety of real and simulated datasets), by combining low-level algorithmic improvements, cache-aware programming, careful performance tuning, and massive parallelism. The phylogeny reconstruction now can be performed in parallel and attain a linear speedup with the number of processors. We show how these techniques are directly applicable to a large variety of problems in computational biology.
David A. Bader is an Associate Professor and Regents’ Lecturer in the Departments of Electrical and Computer Engineering and Computer Science at The University of New Mexico (UNM). He received his Ph.D. in Electrical Engineering in 1996 from The University of Maryland, and was awarded a National Science Foundation (NSF) Postdoctoral Research Associateship in Experimental Computer Science before joining UNM in 1998. He is an NSF CAREER Award recipient, an investigator on several NSF awards, a distinguished speaker in the IEEE Computer Society Distinguished Visitors Program, and is a member of the IBM PERCS team for the DARPA High Productivity Computing Systems program. Dr. Bader serves on the Steering Committees of the IPDPS and HiPC conferences, and is the General co-Chair for IPDPS (2004–2005), and Vice General Chair for HiPC (2002–2004). David has previously chaired several conference program committees, is the Program Chair for HiPC 2005, and a Program Vice-Chair for IPDPS 2006. He has served on numerous conference program committees related to parallel processing, is an associate editor for the IEEE Transactions on Parallel and Distributed Systems and the ACM Journal of Experimental Algorithmics, a Senior Member of the IEEE Computer Society, and a Member of the ACM. Dr. Bader has been a pioneer the field of high-performance computing for problems in bioinformatics and computational genomics. He has co-chaired a series of meetings, the IEEE International Workshop on High-Performance Computational Biology (HiCOMB), written several book chapters, and co-edited a special issue of the Journal of Parallel and Distributed Computing on High-Performance Computational Biology. He has co-authored over 65 articles in peer-reviewed journals and conferences, and his main areas of research are in parallel algorithms, combinatorial optimization, and computational biology and genomics.