A survey of Few-Shot Action Recognition
DOI: 10.23977/jaip.2023.060105
Author(s)
Congmin Wang 1, Yancong Zhou 2
Affiliation(s)
1 School of Science, Tianjin University of Commerce, Tianjin, China
2 School of Information Engineering, Tianjin University of Commerce, Tianjin, China
Corresponding Author
Yancong Zhou
ABSTRACT
In recent years, the development of network technology has led to countless videos being produced every day, and action recognition in computer vision has made considerable progress. Training action recognition models requires a large number of labeled samples, but in practice labeled data is scarce, and collecting it at scale is extremely difficult because of cost and other constraints. Few-shot learning aims to learn new categories from only a few samples. This paper reviews recent research on few-shot action recognition. Organized by stage of the training process, it summarizes the research progress and typical models from four perspectives: data processing, feature embedding, feature augmentation, and metric learning. Finally, it points out the challenges facing current research and directions for future development.
KEYWORDS
Few-Shot Learning, Action Recognition, Deep Learning
CITE THIS PAPER
Congmin Wang, Yancong Zhou, A survey of Few-Shot Action Recognition. Journal of Artificial Intelligence Practice (2023) Vol. 6: 34-40. DOI: http://dx.doi.org/10.23977/jaip.2023.060105.