Interactive Graph Analytics at Scale in Arkouda

Abstract

Data from many real-world applications can be abstracted as graphs. However, as graphs grow, popular exploratory data analysis tools can no longer hold them in the memory of a typical laptop or personal computer. Arkouda is a framework under early development that combines the productivity of Python on the user side with the high performance of Chapel on the server side. In this work, we design a succinct double-index data structure that builds both a static graph and the sketch of a graph stream with a much smaller memory footprint. Two typical graph algorithms, breadth-first search (BFS) and triangle counting, are developed to evaluate the efficiency of the proposed graph analytics workflow. Experimental results show that our method can take advantage of distributed resources to handle large graphs. This work gives the large and rapidly growing Python community a powerful way to analyze graph data at terabyte scale and beyond from their laptops. All our methods and code have been implemented in Arkouda and are available on GitHub (https://github.com/Bader-Research/arkouda/tree/streaming).
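The abstract names a double-index data structure but does not spell out its layout. As a rough illustration only, the sketch below assumes one plausible reading: an edge index (two parallel arrays of sources and destinations, sorted by source) plus a vertex index (for each vertex, the position of its first edge and its neighbor count). It runs in plain Python on a single machine and does not use the Arkouda or Chapel APIs; every function and array name here (`symmetrize`, `build_double_index`, `src`, `dst`, `start`, `nei`, and so on) is hypothetical rather than taken from the paper or its code. BFS and triangle counting are then written against that layout.

```python
from collections import deque


def symmetrize(edges):
    """Return both directions of every undirected edge, deduplicated and sorted."""
    both = {(u, v) for u, v in edges} | {(v, u) for u, v in edges}
    return sorted(both)


def build_double_index(num_vertices, directed_edges):
    """Build an assumed double-index layout from a list of directed edges.

    Edge index:   src[i], dst[i] describe edge i, sorted by (src, dst).
    Vertex index: start[v] is the position of v's first edge (-1 if none),
                  nei[v] is the number of edges leaving v.
    """
    ordered = sorted(directed_edges)
    src = [u for u, _ in ordered]
    dst = [v for _, v in ordered]
    start = [-1] * num_vertices
    nei = [0] * num_vertices
    for i, u in enumerate(src):
        if nei[u] == 0:
            start[u] = i
        nei[u] += 1
    return src, dst, start, nei


def neighbors(dst, start, nei, v):
    """Slice v's adjacency list out of the edge index via the vertex index."""
    if nei[v] == 0:
        return []
    return dst[start[v]:start[v] + nei[v]]


def bfs_depths(dst, start, nei, root):
    """Return the BFS depth of every vertex from root (-1 if unreachable)."""
    depth = [-1] * len(start)
    depth[root] = 0
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in neighbors(dst, start, nei, u):
            if depth[v] == -1:
                depth[v] = depth[u] + 1
                queue.append(v)
    return depth


def count_triangles(src, dst, start, nei):
    """Count triangles in a symmetrized graph, each exactly once (u < v < w)."""
    total = 0
    for u, v in zip(src, dst):
        if u < v:
            common = set(neighbors(dst, start, nei, u)) & set(neighbors(dst, start, nei, v))
            total += sum(1 for w in common if w > v)
    return total


# Toy undirected graph: a triangle 0-1-2 with a tail 2-3.
toy_edges = symmetrize([(0, 1), (1, 2), (0, 2), (2, 3)])
src, dst, start, nei = build_double_index(4, toy_edges)
depths = bfs_depths(dst, start, nei, 0)
triangles = count_triangles(src, dst, start, nei)
```

In this layout the vertex index lets a traversal jump straight to a vertex's adjacency slice, while the edge index keeps the graph as flat arrays, which is the kind of representation that distributes naturally across server-side memory.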

Publication
Massive Graph Analytics