### Abstract

The construction of efficient parallel graph algorithms is important for quickly solving problems in areas such as urban planning, social network analysis, and hardware verification. Existing GPU implementations of graph algorithms tend to be monolithic and thus contributions from the literature are typically rebuilt rather than reused. Recent work has focused on traversal-based abstractions that efficiently execute a single breadth-first search or enact algorithms in the “think like a vertex” paradigm. However, graph analytics such as the all-pairs shortest paths problem, diameter computations, betweenness centrality, and reachability querying require the execution of many such graph traversals. Typically, these traversals are independent of one another and can thus be executed in parallel. This paper presents multi-search, a simple abstraction that is designed for graph algorithms requiring many breadth-first searches that can be executed simultaneously. Although algorithms have implicitly leveraged this abstraction in the past, we provide an explicit, reusable implementation that efficiently maps this abstraction to the GPU, performing more than twice as fast as previous approaches on large graphs of varying diameter. This approach allows us to scale our APSP implementation to graphs with millions of vertices using a single GPU whereas prior approaches were either constrained to much smaller graph instances or required large supercomputers to process graphs of similar size. To show the flexibility of our abstraction, we use it to express betweenness centrality and achieve more than a 5.82x average speedup over parallel CPU implementations from existing frameworks and a 2.24x average speedup over a manual, highly optimized GPU implementation of the algorithm.

Publication

*21st IEEE International Conference on Parallel and Distributed Systems, ICPADS 2015, Melbourne, Australia, December 14-17, 2015*