A bounding box commonly serves as the proxy for 2D object detection. However, extending this practice to 3D detection increases sensitivity to localization error. The problem is especially acute for flat objects, where even a small localization error can lead to low overlap between the prediction and the ground truth. To address this problem, this paper proposes the Sphere Region Proposal Network (SphereRPN), which detects objects by learning spheres rather than bounding boxes. We demonstrate that spherical proposals are more robust to localization error than bounding boxes. The proposed SphereRPN is not only accurate but also fast. Experimental results on the standard ScanNet dataset show that the proposed SphereRPN outperforms previous state-of-the-art methods by a large margin while being 2x to 7x faster.
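The sensitivity argument can be illustrated with a small numerical sketch. It is not the paper's evaluation code: the function names `box_iou_3d` and `sphere_iou`, the example object size, and the choice of the circumscribed sphere as the spherical proxy are all assumptions made for illustration. It compares the IoU of a flat axis-aligned box with the IoU of its enclosing sphere when both are shifted by the same small offset.

```python
import math


def box_iou_3d(size, shift):
    """IoU of two identical axis-aligned boxes, one offset by `shift`."""
    inter = 1.0
    vol = 1.0
    for s, d in zip(size, shift):
        inter *= max(0.0, s - abs(d))  # overlap length along this axis
        vol *= s
    return inter / (2.0 * vol - inter)


def sphere_iou(radius, dist):
    """IoU of two equal-radius spheres whose centers are `dist` apart."""
    if dist >= 2.0 * radius:
        return 0.0
    # Volume of the lens-shaped intersection of two equal spheres.
    inter = math.pi * (4.0 * radius + dist) * (2.0 * radius - dist) ** 2 / 12.0
    vol = 4.0 / 3.0 * math.pi * radius ** 3
    return inter / (2.0 * vol - inter)


if __name__ == "__main__":
    # Hypothetical flat object (e.g. a tabletop), sizes in meters.
    size = (1.0, 1.0, 0.1)
    shift = (0.0, 0.0, 0.05)  # 5 cm error along the thin axis

    # Assumption: the spherical proxy is the sphere circumscribing the box.
    radius = 0.5 * math.sqrt(sum(s * s for s in size))
    dist = math.sqrt(sum(d * d for d in shift))

    print("box IoU:   ", box_iou_3d(size, shift))
    print("sphere IoU:", sphere_iou(radius, dist))
```

Under these assumptions, a 5 cm offset along the thin axis cuts the box IoU to about one third, while the sphere IoU stays much closer to 1, because the sphere's extent does not collapse along any single axis.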
Figure 12: Architecture of the proposed SphereRPN