Processing a one trillion-edge graph has recently been demonstrated by distributed graph engines running on clusters of tens to hundreds of nodes.
In this paper, we employ a single heterogeneous machine with fast storage media (e.g., NVMe SSD) and massively parallel coprocessors (e.g., Xeon Phi) to reach similar dimensions. By fully exploiting the heterogeneous devices, we design a new graph processing engine, named Mosaic, for a single machine. We propose a new locality-optimizing, space-efficient graph representation—Hilbert-ordered tiles, and a hybrid execution model that enables vertex-centric operations in fast host processors and edge-centric operations in massively parallel coprocessors.
Our evaluation shows that for smaller graphs, Mosaic consistently outperforms other state-of-the-art out-of-core engines by 3.2-58.6x and shows comparable performance to distributed graph engines. Furthermore, Mosaic can complete one iteration of the Pagerank algorithm on a trillion-edge graph in 21 minutes, outperforming a distributed disk-based engine by 9.2×.
Taesoo Kim is a Catherine M. and James E. Allchin Early Career, Assistant Professor in the School Computer Science at Georgia Tech. He also serves as the director of the Georgia Tech Systems Software and Security Center (GTS3). He is interested in building a system that has underline principles for why it should be secure. Those principles include the design of a system, analysis of its implementation, and clear separation of trusted components. His thesis work, in particular, focused on detecting and recovering from attacks on computer systems. He holds a BS from KAIST (2009), a SM (2011) and a Ph.D. (2014) from MIT in CS.