Open Access

Attention-based mechanism for SuperPoint feature point extraction in endoscopy

DOI: 10.23977/acss.2024.080311

Author(s)

Mingyue Zhang 1

Affiliation(s)

1 School of Information and Electronic Technology, Key Laboratory of Autonomous Intelligence and Information Processing in Heilongjiang Province, Jiamusi University, Jiamusi, China

Corresponding Author

Mingyue Zhang

ABSTRACT

Endoscopes are widely used in routine medical diagnosis, and three-dimensional (3D) reconstruction from endoscopic images is becoming an important direction for the medical domain. Local feature extraction and matching is a key step in 3D reconstruction. Handcrafted local features such as SIFT, SURF, and ORB are still the predominant tools for such tasks. However, endoscopic scenes generally have weak textures and large lighting changes, so traditional feature point extraction algorithms struggle to extract feature points reliably. We explore the potential of the self-supervised method SuperPoint. Many existing works have shown the benefits of enhancing spatial encoding. We propose a new architectural unit that incorporates the squeeze-and-excitation (SE) attention module, which explicitly models the interdependence between convolutional feature channels to improve the network's representational ability. Experimental results show that this multi-scale channel attention feature point extraction algorithm based on SuperPoint performs better and achieves higher matching quality on endoscopic images than both handcrafted local features and the original SuperPoint algorithm.
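
To illustrate the channel attention idea described in the abstract, the following is a minimal, dependency-free sketch of a generic squeeze-and-excitation step: squeeze each channel to one scalar, pass the result through a small bottleneck, and rescale the channels by the resulting gates. It is not the paper's implementation. The function name `se_block`, the bias-free weight matrices `w1` and `w2`, and the toy input are assumptions for illustration only, and the abstract does not state where the module sits inside SuperPoint.

```python
import math


def se_block(feature_map, w1, w2):
    """Generic squeeze-and-excitation channel reweighting (illustrative only).

    feature_map: list of C channels, each an H x W list of lists of floats.
    w1: (C // r) x C weights of the reduction layer (hypothetical, no bias).
    w2: C x (C // r) weights of the expansion layer (hypothetical, no bias).
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling yields one scalar per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: expand back to C channels; the sigmoid maps each
    # channel's score to a gate between 0 and 1.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[g * v for v in row] for row in ch]
        for ch, g in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy usage: two 2x2 channels with reduction ratio r = 2 (made-up values).
toy_map = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]
toy_w1 = [[1.0, 1.0]]      # shape (1, 2)
toy_w2 = [[0.5], [-0.5]]   # shape (2, 1)
toy_scaled, toy_gates = se_block(toy_map, toy_w1, toy_w2)
```

In this toy call the first channel receives a gate above 0.5 and the second a gate below 0.5, so the first channel is emphasised relative to the second. In a trained network the weights are learned, which is how the gates come to reflect dependencies between channels.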

KEYWORDS

Attention mechanism, deep learning, self-supervision, local features, endoscopy

CITE THIS PAPER

Mingyue Zhang, Attention-based mechanism for SuperPoint feature point extraction in endoscopy. Advances in Computer, Signals and Systems (2024) Vol. 8: 76-83. DOI: http://dx.doi.org/10.23977/acss.2024.080311.

REFERENCES

[1] H. Dubois, J. Creutzfeldt, M. Törnqvist, and M. Bergenmar, 2020, "Patient participation in gastrointestinal endoscopy—From patients' perspectives," Health Expectations, 23, 4, 893-903.
[2] A. Darzi and Y. Munz, 2004, "The impact of minimally invasive surgical techniques," Annu. Rev. Med., 55, 223-237.
[3] F. Chadebecq, F. Vasconcelos, E. Mazomenos, and D. Stoyanov, 2020, "Computer vision in the surgical operating room," Visceral Medicine, 36, 6, 456-462.
[4] C. Fergo, J. Burcharth, H.-C. Pommergaard, N. Kildebro, and J. Rosenberg, 2017, "Three-dimensional laparoscopy vs 2-dimensional laparoscopy with high-definition technology for abdominal surgery: a systematic review," The American Journal of Surgery, 213, 1, 159-170.
[5] A. Zaman, F. Yangyu, M. Irfan, M. S. Ayub, L. Guoyun, and L. Shiya, 2022, "LifelongGlue: Keypoint matching for 3D reconstruction with continual neural networks," Expert Systems with Applications, 195, 116613.
[6] O. L. Barbed, F. Chadebecq, J. Morlana, J. M. Montiel, and A. C. Murillo, 2022, "Superpoint features in endoscopy," in MICCAI Workshop on Imaging Systems for GI Endoscopy, 45-55.
[7] X. Liu et al., 2020, "Extremely dense point correspondences using a learned feature descriptor," in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 4847-4856.
[8] D. DeTone, T. Malisiewicz, and A. Rabinovich, 2018, "Superpoint: Self-supervised interest point detection and description," in Proceedings of the IEEE conference on computer vision and pattern recognition workshops, 224-236.
[9] J. Hu, L. Shen, and G. Sun, 2018, "Squeeze-and-excitation networks," in Proceedings of the IEEE conference on computer vision and pattern recognition, 7132-7141.
[10] Q. Liu, J. Zhang, J. Liu, and Z. Yang, 2022, "Feature extraction and classification algorithm, which one is more essential? An experimental study on a specific task of vibration signal diagnosis," International Journal of Machine Learning and Cybernetics, 13, 6, 1685-1696.
[11] A. Witkin, 1984, "Scale-space filtering: A new approach to multi-scale description," in ICASSP'84. IEEE international conference on acoustics, speech, and signal processing, 9, 150-153.
[12] H. Bay, T. Tuytelaars, and L. Van Gool, 2006, "Surf: Speeded up robust features," in Computer Vision–ECCV 2006: 9th European Conference on Computer Vision, Graz, Austria, 404-417.
[13] E. Rublee, V. Rabaud, K. Konolige, and G. Bradski, 2011, "ORB: An efficient alternative to SIFT or SURF," in 2011 International conference on computer vision, 2564-2571.
[14] D. G. Viswanathan, 2009, "Features from accelerated segment test (fast)," in Proceedings of the 10th workshop on image analysis for multimedia interactive services, London, UK, 6-8.
[15] M. Calonder, V. Lepetit, C. Strecha, and P. Fua, 2010, "Brief: Binary robust independent elementary features," in Computer Vision–ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, 778-792.
[16] Y. Tian, B. Fan, and F. Wu, 2017, "L2-net: Deep learning of discriminative patch descriptor in euclidean space," in Proceedings of the IEEE conference on computer vision and pattern recognition, 661-669.
[17] K. B. Ozyoruk et al., 2021, "EndoSLAM dataset and an unsupervised monocular visual odometry and depth estimation approach for endoscopic videos," Medical image analysis, 71, 102058.
[18] H. Borgli et al., 2020, "HyperKvasir, a comprehensive multi-class image and video dataset for gastrointestinal endoscopy," Scientific data, 7, 1, 283.
[19] K. Mikolajczyk and C. Schmid, 2005, "A performance evaluation of local descriptors," IEEE Transactions on Pattern Analysis and Machine Intelligence, 27, 10, 1615-1630.

All published work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright © 2016 - 2031 Clausius Scientific Press Inc. All Rights Reserved.