TY - JOUR
T1 - BiFuse: Monocular 360 Depth Estimation via Bi-Projection Fusion
T2 - 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020
AU - Wang, Fu-En
AU - Yeh, Yu-Hsuan
AU - Sun, Min
AU - Chiu, Wei-Chen
AU - Tsai, Yi-Hsuan
N1 - Publisher Copyright:
© 2020 IEEE
Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2020
Y1 - 2020
N2 - Depth estimation from a monocular 360° image is an emerging problem that is gaining popularity due to the availability of consumer-level 360° cameras and their complete surround-sensing capability. While the standard of 360° imaging is under rapid development, we propose to predict the depth map of a monocular 360° image by mimicking both the peripheral and foveal vision of the human eye. To this end, we adopt a two-branch neural network leveraging two common projections: equirectangular and cubemap. In particular, the equirectangular projection provides a complete field of view but introduces distortion, whereas the cubemap projection avoids distortion but introduces discontinuities at the boundaries of the cube faces. We therefore propose a bi-projection fusion scheme with learnable masks to balance the feature maps from the two projections. Moreover, for the cubemap projection, we propose a spherical padding procedure that mitigates the discontinuity at the boundary of each face. We apply our method to four panorama datasets and show favorable results against existing state-of-the-art methods.
UR - http://www.scopus.com/inward/record.url?scp=85094645321&partnerID=8YFLogxK
U2 - 10.1109/CVPR42600.2020.00054
DO - 10.1109/CVPR42600.2020.00054
M3 - Conference article
AN - SCOPUS:85094645321
SN - 1063-6919
SP - 459
EP - 468
JO - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
JF - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
M1 - 9157424
Y2 - 14 June 2020 through 19 June 2020
ER -