(June 9) 3D Recurrent Reconstruction Neural Network (3D-R2N2)

2016.06.09

Subject

3D Recurrent Reconstruction Neural Network (3D-R2N2)

Date

2016/06/09 (Thursday) 16:00

Speaker

Bongsoo Choy/Stanford University

Place

N1 Building #201

Overview:

Inspired by the recent success of methods that employ shape priors to achieve robust 3D reconstructions, we propose a novel recurrent neural network architecture that we call the 3D Recurrent Reconstruction Neural Network (3D-R2N2). The network learns a mapping from images of objects to their underlying 3D shapes from a large collection of synthetic data. Our network takes in one or more images of an object instance from arbitrary viewpoints and outputs a reconstruction of the object in the form of a 3D occupancy grid. Unlike most previous work, our network does not require any image annotations or object class labels for training or testing. Our extensive experimental analysis shows that our reconstruction framework i) outperforms state-of-the-art methods for single-view reconstruction, and ii) enables the 3D reconstruction of objects in situations where traditional SFM/SLAM methods fail (because of a lack of texture and/or a wide baseline).
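For readers unfamiliar with the output format, the following is a minimal sketch of what a 3D occupancy grid looks like in code. It is an illustration only and not taken from the talk: the function names `to_occupancy_grid` and `count_occupied`, the nested-list layout, and the 0.5 threshold are all assumptions made for this example.

```python
def to_occupancy_grid(probabilities, threshold=0.5):
    """Binarize per-voxel occupancy probabilities.

    probabilities: nested list indexed as probabilities[x][y][z], each value
        in [0, 1] (an assumed layout for this sketch).
    threshold: hypothetical cutoff; a voxel is occupied if its value exceeds it.
    """
    return [
        [[p > threshold for p in row] for row in plane]
        for plane in probabilities
    ]


def count_occupied(grid):
    """Count the voxels marked as occupied in a binarized grid."""
    return sum(cell for plane in grid for row in plane for cell in row)


# A tiny 2 x 2 x 2 grid of made-up probabilities.
example = [
    [[0.9, 0.1], [0.7, 0.4]],
    [[0.2, 0.8], [0.5, 0.6]],
]
grid = to_occupancy_grid(example)
occupied = count_occupied(grid)
```

Each cell of `grid` is simply `True` (occupied) or `False` (empty), so a shape is represented by which voxels of a fixed-size cube it fills.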

If time permits, I’ll also go over our latest work on visual/semantic correspondence, the Universal Correspondence Network.

We present a deep learning framework for accurate visual correspondences and demonstrate its effectiveness for both geometric and semantic matching, spanning from rigid motions to intra-class shape or appearance variations. In contrast to previous CNN-based approaches that optimize a surrogate patch similarity objective, we use deep metric learning to directly learn a feature space that preserves either geometric or semantic similarity. Our fully convolutional architecture, along with a novel correspondence contrastive loss, allows faster training through effective reuse of computations, accurate gradient computation through the use of thousands of examples per image pair, and faster testing with O(n) feedforward passes for n keypoints, instead of O(n^2) for typical patch similarity methods. We propose a convolutional spatial transformer to mimic patch normalization in traditional features like SIFT, which is shown to dramatically boost accuracy for semantic correspondences across intra-class shape variations. Extensive experiments on the KITTI, PASCAL and CUB-2011 datasets demonstrate the significant advantages of our features over prior work that uses either hand-constructed or learned features.
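To make the idea of a contrastive loss over correspondences concrete, here is a minimal sketch of the standard contrastive loss applied to pairs of feature vectors. It is an assumption-laden illustration, not the speaker's implementation: the function name `correspondence_contrastive_loss`, the plain-list inputs, and the default margin of 1.0 are hypothetical, and a real system would compute the features with a network and run this on a GPU.

```python
import math


def correspondence_contrastive_loss(feats_a, feats_b, is_match, margin=1.0):
    """Average contrastive loss over candidate correspondences.

    feats_a, feats_b: lists of feature vectors (lists of floats); entry i of
        each list forms one candidate correspondence between two images.
    is_match: list of bools, True where pair i is a true correspondence.
    margin: hypothetical margin; non-matching pairs closer than this are
        penalized, pairs farther apart contribute nothing.
    """
    total = 0.0
    for fa, fb, match in zip(feats_a, feats_b, is_match):
        # Euclidean distance between the two feature vectors.
        dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))
        if match:
            # Pull matching features together.
            total += dist ** 2
        else:
            # Push non-matching features at least `margin` apart.
            total += max(0.0, margin - dist) ** 2
    return total / (2 * len(is_match))


# One matching pair at distance 5 and one non-matching pair at distance 0.
loss = correspondence_contrastive_loss(
    feats_a=[[0.0, 0.0], [0.0, 0.0]],
    feats_b=[[3.0, 4.0], [0.0, 0.0]],
    is_match=[True, False],
)
```

Because each keypoint's feature is computed once and then compared by distance, features for n keypoints need only n forward computations, which is the intuition behind the O(n) versus O(n^2) contrast drawn in the abstract.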

Profile:

Christopher Bongsoo Choy received his Bachelor of Science degree from KAIST in 2012 and his Master of Science degree from Stanford University in 2014, both in electrical engineering. He is currently pursuing his Ph.D. degree under the direction of Prof. Silvio Savarese at Stanford University. He was awarded the Presidential Scholarship in 2007 and the Korea Foundation for Advanced Studies Fellowship in 2012.
