As capacity for data collection and storage continues to grow, data analytics requirements regularly exceed the DRAM capacity of a single machine.
I present work on using flash storage with near-storage acceleration for large-scale data analytics, an approach that avoids the communication and cost overhead of distributed in-memory processing over a cluster.
Using high-performance flash storage, FPGA-based accelerators, and cross-layer optimizations, I demonstrate that the capital and operational costs of important applications, including graph analytics and key-value caches, can be reduced significantly without sacrificing performance.
For graph analytics, a desktop with a low-tier FPGA accelerator outperformed a 32-thread Xeon server with 512 GB of DRAM, tripling the cost-effectiveness of the system.
Sang-Woo Jun is an assistant professor in the computer science department at UC Irvine, focusing on large-scale data analytics using flash storage and FPGA acceleration. Before joining UCI, he obtained his Ph.D. from MIT, working with Professor Arvind, and his B.S. from Seoul National University in Korea.