Platform-Agnostic Lightweight Deep Learning for Garbage Collection Scheduling in SSDs

Junhyeok Jang, Donghyun Gouk, Jinwoo Shin, and Myoungsoo Jung

12th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage), 2020, Poster

Flash block reclaiming, called garbage collection (GC), is the major performance bottleneck and sits on the critical path in modern SSDs. Thus, both industry and academia have paid significant attention to reducing the overhead imposed by GC. To hide GC overhead from the user's viewpoint, several studies perform GC during user idle times. While this scheduling approach, called background GC, is very practical, its main challenge is predicting the exact arrival time of the next I/O request.

We propose GC-Tutor, a GC scheduler that makes GC overhead invisible to users by precisely predicting future I/O arrival times with a deep learning algorithm. Applying conventional deep neural networks (DNNs) to this prediction inside an SSD is unfortunately infeasible, as typical model training takes tens of hours or days. Instead, GC-Tutor leverages a lightweight online-learning method that learns the dynamic request arrival behavior from a small amount of runtime information within the target SSD. Our evaluation results show that GC-Tutor reduces request suspension time by 82.4% and 67.9% compared with conventional rule-based and DNN-only GC schedulers, respectively, while increasing prediction accuracy by 16.9% on average under diverse real workloads.
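
To make the idea of online arrival-time prediction concrete, the following is a minimal sketch and not GC-Tutor's actual model, which the abstract does not specify. It assumes a simple normalized least-mean-squares (NLMS) predictor over the last few inter-arrival gaps. The class name `OnlineIdlePredictor`, the helper `should_run_background_gc`, and the parameters `window`, `lr`, and `gc_cost_us` are all hypothetical names introduced for illustration.

```python
from collections import deque


class OnlineIdlePredictor:
    """Hypothetical NLMS predictor of the next I/O inter-arrival gap.

    This is an illustrative stand-in for a lightweight online learner,
    not the method used by GC-Tutor.
    """

    def __init__(self, window=4, lr=0.5):
        self.window = window                    # number of past gaps used as features
        self.lr = lr                            # NLMS step size (assumed value)
        self.weights = [1.0 / window] * window  # start as a moving average
        self.history = deque(maxlen=window)     # most recent gaps, oldest first

    def predict(self):
        """Return the predicted next gap, or None until the window is full."""
        if len(self.history) < self.window:
            return None
        return sum(w * x for w, x in zip(self.weights, self.history))

    def observe(self, gap):
        """Update the weights with the gap that actually occurred."""
        pred = self.predict()
        if pred is not None:
            err = gap - pred
            # Normalize by input energy so the step is scale-independent.
            norm = sum(x * x for x in self.history) + 1e-9
            self.weights = [
                w + self.lr * err * x / norm
                for w, x in zip(self.weights, self.history)
            ]
        self.history.append(gap)


def should_run_background_gc(predictor, gc_cost_us):
    """Start background GC only if the predicted idle gap covers its cost."""
    pred = predictor.predict()
    return pred is not None and pred >= gc_cost_us
```

In this sketch, the scheduler would call `observe` on every request arrival and consult `should_run_background_gc` before starting a reclaim. A learner of this kind updates in constant time per request, which is the property that makes online learning plausible inside a device, in contrast to offline DNN training.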
