High-Performance Combinatorial Techniques for Analyzing Massive Dynamic Interaction Networks

Abstract

Graph-theoretic abstractions are extensively used to analyze massive data sets. Temporal data streams from socio-economic interactions, the world-wide web, communication networks, and scientific computing can be intuitively modeled as graphs. In this paper, we discuss high performance combinatorial techniques for analyzing large-scale information networks, encapsulating dynamic interaction data in the order of billions of entities. For tractable analysis of massive temporal data sets, we need holistic techniques that supplement existing static graph algorithms with relevant ideas from dynamic graph algorithms, social network analysis, and parallel algorithms for combinatorial problems. For instance, in order to design scalable parallel algorithms, it is crucial to estimate and exploit network characteristics such as the degree distribution and the graph diameter. We present a computational framework for the topological analysis of dynamic interaction data: we experiment with several graph representations, identify key analysis kernels to be optimized, and discuss parallel algorithms for large-scale graph analysis. In recent work, we have designed efficient parallel techniques for graph traversal, connectivity and centrality problems that process static graphs with billions of vertices and edges. We intend to extend these algorithms for studying temporal data, and our current research focus is on a prototype open-source toolkit for large-scale dynamic network analysis.

Publication
DIMACS Workshop on Computational Methods for Dynamic Interaction Networks, DIMACS Center, Rutgers University, Piscataway, NJ, September 24-25, 2007

http://dimacs.rutgers.edu/archive/Workshops/Dynamic/program.html

David A. Bader
David A. Bader
Distinguished Professor and Director of the Institute for Data Science

David A. Bader is a Distinguished Professor in the Department of Computer Science at New Jersey Institute of Technology.