A new parallel algorithm for connected components in dynamic graphs

Abstract

Social networks, communication networks, business intelligence databases, and large scientific data sources now contain hundreds of millions elements with billions of relationships. The relationships in these massive datasets are changing at ever-faster rates. Through representing these datasets as dynamic and semantic graphs of vertices and edges, it is possible to characterize the structure of the relationships and to quickly respond to queries about how the elements in the set are connected. Statically computing analytics on snapshots of these dynamic graphs is frequently not fast enough to provide current and accurate information as the graph changes. This has led to the development of dynamic graph algorithms that can maintain analytic information without resorting to full static recomputation. In this work we present a novel parallel algorithm for tracking the connected components of a dynamic graph. Our approach has a low memory requirement of O(V) and is appropriate for all graph densities. On a graph with 512 million edges, we show that our new dynamic algorithm is up to 128X faster than well-known static algorithms and that our algorithm achieves a 14X parallel speedup on a x86 64-core shared-memory system. To the best of the authors' knowledge, this is the first parallel implementation of dynamic connected components that does not eventually require static recomputation.

Publication
20th Annual International Conference on High Performance Computing, HiPC 2013, Bengaluru (Bangalore), Karnataka, India, December 18-21, 2013