End-to-End Multi-Sensor Fusion for 3d Object Detection in Lidar Point Clouds

DOI: 10.23977/appep.2021.020113

Author(s)

Kanrun Huang 1

Affiliation(s)

1 Nauto, Inc., California, 94306, United States

Corresponding Author

Kanrun Huang

ABSTRACT

In this paper, we propose a new method for jointly fusing camera images and LiDAR point clouds that outperforms state-of-the-art networks at long range and under challenging conditions such as poor weather and highly reflective surfaces. LiDAR point clouds tend to be sparse and highly variable, especially at long range and in poor weather, which can cause 3D object detectors to miss small but important objects (pedestrians, traffic signs, cones, etc.). Our model does not require object labels in the 2D images. We evaluate the proposed model on both the Waymo Open Dataset and the KITTI dataset [1] and demonstrate that it achieves state-of-the-art accuracy on 3D object detection.

KEYWORDS

Object detection, Deep learning, Sensor fusion

CITE THIS PAPER

Kanrun Huang. End-to-End Multi-Sensor Fusion for 3d Object Detection in Lidar Point Clouds. Applied & Educational Psychology (2021) 2: 67-72. DOI: http://dx.doi.org/10.23977/appep.2021.020113.

REFERENCES

[1] Andreas Geiger, Philip Lenz, Raquel Urtasun (2012). Are we ready for autonomous driving? The KITTI vision benchmark suite. In 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3354-3361.
[2] Xiaozhi Chen, Huimin Ma, Ji Wan, Bo Li, Tian Xia (2017). Multi-view 3D object detection network for autonomous driving. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1907-1915.
[3] Jason Ku, Melissa Mozifian, Jungwook Lee, Ali Harakeh, Steven L. Waslander (2018). Joint 3D proposal generation and object detection from view aggregation. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 1-8.
[4] Ming Liang, Bin Yang, Shenlong Wang, Raquel Urtasun (2018). Deep continuous fusion for multi-sensor 3D object detection. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 641-656.
[5] Yin Zhou, Oncel Tuzel (2018). VoxelNet: End-to-end learning for point cloud based 3D object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4490-4499.
[6] Yan Yan, Yuxing Mao, Bo Li (2018). SECOND: Sparsely embedded convolutional detection. Sensors, 18(10), 3337.
[7] Alex H. Lang, Sourabh Vora, Holger Caesar, Lubing Zhou, Jiong Yang, Oscar Beijbom (2019). PointPillars: Fast encoders for object detection from point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12697-12705.

All published work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright © 2016 - 2031 Clausius Scientific Press Inc. All Rights Reserved.