TY - JOUR
T1 - Improving LiDAR Semantic Segmentation on Minority Classes and Generalization Capability Using U-Net++ for Self-Driving Scenes
AU - Tseng, Chiao Hua
AU - Lin, Yu Ting
AU - Lin, Wen Chieh
AU - Wang, Chieh Chih
N1 - Publisher Copyright: © 2024 Institute of Information Science. All rights reserved.
PY - 2024/5
Y1 - 2024/5
AB - LiDAR is an important sensor in autonomous driving systems. Compared with radar or camera measurements, LiDAR provides more precise geometric information and can be fused with other types of sensors to tackle various perception tasks in autonomous driving. Among these tasks, semantic segmentation of LiDAR point clouds has attracted growing research interest and achieved compelling results. However, two issues remain unsolved. The first concerns minority classes caused by data imbalance, an inevitable problem in large-scale outdoor scenes. Minority classes, which occupy little of a scene and therefore yield very few LiDAR points, can be important objects for self-driving cars to recognize, e.g., pedestrians, motorcycles, and traffic signs. To address this class imbalance problem, we use the U-Net++ architecture and dice loss to improve the IoU score of the minority classes. The second issue is generalization across different LiDAR resolutions. Most existing methods need to be retrained to handle data collected by LiDARs with different resolutions. We adopt KPConv as the convolution operator to tackle this issue. With U-Net++ and dice loss, we obtain a 5.1% mIoU improvement on SemanticKITTI and, in particular, a 9.5% mIoU improvement on minority classes compared with the baseline. Moreover, we demonstrate the generalization capability of our model with KPConv by training on a 64-beam dataset and testing on 32-beam and 128-beam datasets. We obtain a 3.3% mIoU improvement on the 128-beam dataset and a 1.9% mIoU improvement on the 32-beam dataset.
KW - autonomous driving
KW - deep learning
KW - generalization capability
KW - LiDAR semantic segmentation
KW - minority class
UR - http://www.scopus.com/inward/record.url?scp=85192677036&partnerID=8YFLogxK
U2 - 10.6688/JISE.202405_40(3).0012
DO - 10.6688/JISE.202405_40(3).0012
M3 - Article
AN - SCOPUS:85192677036
SN - 1016-2364
VL - 40
SP - 615
EP - 629
JO - Journal of Information Science and Engineering
JF - Journal of Information Science and Engineering
IS - 3
ER -