TY - GEN
T1 - 360SD-Net: 360° Stereo Depth Estimation with Learnable Cost Volume
T2 - 2020 IEEE International Conference on Robotics and Automation, ICRA 2020
AU - Wang, Ning-Hsu
AU - Solarte, Bolivar
AU - Tsai, Yi-Hsuan
AU - Chiu, Wei-Chen
AU - Sun, Min
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/5
Y1 - 2020/5
AB - Recently, end-to-end trainable deep neural networks have significantly improved stereo depth estimation for perspective images. However, 360° images captured under equirectangular projection cannot benefit from directly adopting existing methods due to the distortion introduced (i.e., lines in 3D are not projected onto lines in 2D). To tackle this issue, we present a novel architecture specifically designed for spherical disparity, using the setting of top-bottom 360° camera pairs. Moreover, we propose to mitigate the distortion issue with (1) an additional input branch capturing the position and relation of each pixel in spherical coordinates, and (2) a cost volume built upon a learnable shifting filter. Due to the lack of 360° stereo data, we collect two 360° stereo datasets from Matterport3D and Stanford3D for training and evaluation. Extensive experiments and an ablation study are provided to validate our method against existing algorithms. Finally, we show promising results in real-world environments, capturing images with two consumer-level cameras. Our project page is at https://albert100121.github.io/360SD-Net-Project-Page.
UR - http://www.scopus.com/inward/record.url?scp=85092700985&partnerID=8YFLogxK
U2 - 10.1109/ICRA40945.2020.9196975
DO - 10.1109/ICRA40945.2020.9196975
M3 - Conference contribution
AN - SCOPUS:85092700985
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 582
EP - 588
BT - 2020 IEEE International Conference on Robotics and Automation, ICRA 2020
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 31 May 2020 through 31 August 2020
ER -