Title: APINT: A Full-Stack Framework for Acceleration of Privacy-Preserving Inference of Transformers based on Garbled Circuits
Venue: ICCAD 2024
Abstract: This paper introduces APINT, the first full-stack framework for accelerating Private Inference of Transformers (PiT), a key approach to AI security challenges in cloud environments. The paper identifies garbled circuits (GC) as the main bottleneck in recent PiT protocols and addresses it across the stack: a novel PiT protocol, GC-friendly circuit generation, netlist scheduling, and a hardware accelerator with compiler speculation, which together reduce latency and energy consumption. APINT outperforms existing CPU-based platforms in latency by 12.2× in the online phase and 2.2× in preprocessing. The APINT accelerator further reduces latency by 3.3× and energy consumption by 4.6× compared to the state-of-the-art GC accelerator.