TY - GEN
T1 - Finding Robust 2D-to-3D Correspondence with LSTM Score Estimation for Camera Localization
AU - Huang, Tsu Kuan
AU - Chen, Po Heng
AU - Wang, Li Yang
AU - Chen, Kuan Wen
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
AB - 2D-to-3D correspondence estimation is the key step in 3D model-based image localization, and most existing research in this field focuses on improving feature matching performance. Even the best feature matching methods produce some outliers, so almost all methods simply apply the RANSAC algorithm to select the inliers and then estimate the camera pose. However, the reliability of RANSAC depends considerably on the inlier ratio. When the inlier ratio decreases, for example in a challenging scenario, RANSAC fails to select the inliers well, which leads to a worse camera pose. In this study, we attempted to build a neural network that learns the geometric relationship between 2D images and the 3D model in order to select the correct correspondences from the initial 2D-to-3D matching results and thereby improve camera localization performance. Because the number of inputs, i.e., the number of 2D-to-3D correspondences, is unknown and differs for each image, we propose a PointNet-based Geometric Consistency Network (GCC-Net) for correct correspondence estimation and an LSTM-based Hypothesis Rating Network (HR-Net) to enhance GCC-Net with the camera localization loss. Experimental results showed that the proposed method outperforms RANSAC considerably in camera pose estimation, particularly when the inlier ratio of the initial correspondences was low.
UR - http://www.scopus.com/inward/record.url?scp=85124334739&partnerID=8YFLogxK
U2 - 10.1109/IROS51168.2021.9636516
DO - 10.1109/IROS51168.2021.9636516
M3 - Conference contribution
AN - SCOPUS:85124334739
T3 - IEEE International Conference on Intelligent Robots and Systems
SP - 2232
EP - 2238
BT - IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2021
Y2 - 27 September 2021 through 1 October 2021
ER -