Unified Prior Mask-Guided End-to-End Online Vectorized HD Map Construction

DOI: 10.23977/autml.2026.070111

Author(s)

Yupeng Luo 1, Yuan Zhu 1, Ke Lu 1

Affiliation(s)

1 College of Automotive and Energy Engineering, Tongji University, No. 4800 Caoan Road, Shanghai, China

Corresponding Author

Ke Lu

ABSTRACT

Online vectorized high-definition (HD) map construction has attracted increasing attention because it reduces the cost of manual annotation compared with traditional SLAM-based methods and provides critical static road information for downstream tasks such as localization and motion planning. However, although standard-definition (SD) maps and global maps built from historical predictions are readily available priors, existing methods do not exploit them in a unified and complete way, which leads to suboptimal mapping performance, especially in long-range, complex scenarios. To address this issue, we propose PGMapNet, a prior-guided mapping method built on a unified prior mask. Specifically, we design a unified mask-guided prior embedding generator that fuses bird's-eye view (BEV) features with prior maps to generate a unified prior mask. The mask is then used to predict map instance information and generate prior embeddings, which provide positional and structural instance priors for the queries of the map decoder. Furthermore, we design a multi-point selective state space model (SSM) module that adaptively samples key region features of map instance points from the BEV features and models interactions within the sampled sequences via the SSM, improving the prediction accuracy of the map instance point sets. Extensive experiments on the nuScenes dataset validate the effectiveness of the proposed method. Compared with the baseline model, the proposed model improves mAP by 2.5% within the 30-meter range and by 2.9% for the extended 50-meter range.
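
The abstract describes the multi-point selective SSM module only at a high level. As a rough illustration of the general idea (sample BEV features at the points of one map instance, then run an input-dependent recurrence over the resulting sequence), a minimal pure-Python sketch might look like the following. Everything in it is an assumption made for illustration and is not the paper's implementation: the function names `bilinear_sample` and `selective_scan`, the parameter `w_delta`, the tiny 3x3 grid, the fixed (rather than adaptively predicted) sample locations, and the simplified diagonal recurrence with A = -1 and B = C = 1.

```python
import math


def bilinear_sample(bev, x, y):
    """Bilinearly interpolate a feature vector from bev[row][col] at continuous (x, y)."""
    h, w = len(bev), len(bev[0])
    # Clamp so the 2x2 neighbourhood stays inside the grid.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    return [
        (1 - fy) * ((1 - fx) * a + fx * b) + fy * ((1 - fx) * c + fx * d)
        for a, b, c, d in zip(bev[y0][x0], bev[y0][x1], bev[y1][x0], bev[y1][x1])
    ]


def softplus(z):
    """Smooth positive function used to keep the step size above zero."""
    return math.log1p(math.exp(z))


def selective_scan(seq, w_delta):
    """Toy diagonal selective scan (assumed form, with A = -1 and B = C = 1).

    The step size delta_t is computed from the input x_t, which is what makes
    the recurrence "selective":
        h_t = exp(-delta_t) * h_{t-1} + delta_t * x_t
    """
    state = [0.0] * len(seq[0])
    outputs = []
    for x_t in seq:
        delta = softplus(sum(w * v for w, v in zip(w_delta, x_t)))
        decay = math.exp(-delta)
        state = [decay * s + delta * v for s, v in zip(state, x_t)]
        outputs.append(list(state))
    return outputs


# Hypothetical 3x3 BEV grid with 2 channels: channel 0 = column index, channel 1 = row index.
bev = [[[float(c), float(r)] for c in range(3)] for r in range(3)]
# Hypothetical points of one map instance, given as (x, y) grid coordinates.
points = [(0.5, 0.5), (1.0, 1.5), (2.0, 2.0)]
sequence = [bilinear_sample(bev, x, y) for x, y in points]
refined = selective_scan(sequence, w_delta=[0.1, 0.1])
```

Because the toy grid stores its own coordinates as features, each sampled vector equals the query location, which makes the interpolation easy to check by hand. A real module would presumably operate on learned multi-channel BEV features and learned SSM parameters.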

KEYWORDS

Vectorized High-Definition Map Construction, Autonomous Driving, Bird's-Eye View Perception, Selective State Space Model

CITE THIS PAPER

Yupeng Luo, Yuan Zhu, Ke Lu. Unified Prior Mask-Guided End-to-End Online Vectorized HD Map Construction. Automation and Machine Learning (2026). Vol. 7, No. 1, 90-97. DOI: http://dx.doi.org/10.23977/autml.2026.070111.

All published work is licensed under a Creative Commons Attribution 4.0 International License.
