Large-Scale Network Analysis

Abstract

Centrality analysis deals with the identification of critical vertices and edges in real-world graph abstractions. Graph-theoretic centrality heuristics such as betweenness and closeness are widely used in application domains ranging from social network analysis to systems biology. In this chapter, we discuss several new results related to large-scale graph analysis using centrality indices. We present the first parallel algorithms and efficient implementations for evaluating these compute-intensive metrics. Our parallel algorithms are optimized for real-world networks, and they exploit topological properties such as the low graph diameter and unbalanced degree distributions. We evaluate centrality indices for several large-scale networks such as web crawls, protein-interaction networks (PINs), movie-actor networks, and patent citation networks that are three orders of magnitude larger than instances that can be processed by current social network analysis packages. As an application to systems biology, we present the novel case study of betweenness centrality analysis applied to eukaryotic PINs. We make an important observation that proteins with high betweenness centrality, but low degree, are abundant in the human and yeast PINs.

Publication
Graph Algorithms in the Language of Linear Algebra