Text classification system based on LLM
DOI: 10.23977/jaip.2024.070321
Author(s)
Yang Yu 1, Jinqi Li 2, Hongpeng Liu 2, Fan Yang 2, Xuanyao Yu 3
Affiliation(s)
1 Kunlun Digital Technology Co., Ltd, Beijing, China
2 Petrochina Kunlun Gas Co., Ltd., Shandong Branch, Jinan, China
3 China University of Petroleum (Beijing), Beijing, China
Corresponding Author
Xuanyao Yu
ABSTRACT
Long-text classification and multi-label classification are each long-standing challenges, and a task that combines the two is harder still. This study addresses long-text multi-label classification. A GLM large language model is used to generate a summary of each text, which shortens the input while retaining its main content. A BERT-BiLSTM model then classifies the summarized text, improving accuracy on long sequences. The model performs particularly well on multi-label news classification, where it assigns the full set of category labels to each article with much higher accuracy than conventional models.
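The two-stage idea in the abstract (shorten the text first, then predict several labels at once) can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_summarize` and `toy_score` are hypothetical stand-ins for the GLM summarizer and the BERT-BiLSTM classifier, and the label set, keywords, and 0.5 threshold are assumptions chosen only so the example runs without any model. The part that carries over to a real system is the decoding step, where each label gets its own sigmoid probability and every label above the threshold is kept.

```python
import math
from typing import Callable, List, Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def decode_multilabel(logits: Sequence[float], labels: Sequence[str],
                      threshold: float = 0.5) -> List[str]:
    """Keep every label whose independent sigmoid probability reaches the threshold."""
    if len(logits) != len(labels):
        raise ValueError("one logit per label is required")
    return [label for label, z in zip(labels, logits) if sigmoid(z) >= threshold]


def classify_long_text(text: str,
                       summarize: Callable[[str], str],
                       score: Callable[[str], List[float]],
                       labels: Sequence[str],
                       threshold: float = 0.5) -> List[str]:
    """Two-stage pipeline: shorten the text, then score the summary per label."""
    summary = summarize(text)
    return decode_multilabel(score(summary), labels, threshold)


# --- Hypothetical stand-ins (assumptions, not the models from the paper) ---
LABELS = ["energy", "finance", "sports"]
KEYWORDS = {
    "energy": ("gas", "pipeline"),
    "finance": ("price", "market"),
    "sports": ("match", "league"),
}


def toy_summarize(text: str, max_words: int = 20) -> str:
    # Placeholder for the LLM summarizer: keeps only the first max_words words.
    return " ".join(text.split()[:max_words])


def toy_score(summary: str) -> List[float]:
    # Placeholder for the classifier: logit +2.0 if a keyword occurs, else -2.0.
    words = set(summary.lower().split())
    return [2.0 if words & set(KEYWORDS[label]) else -2.0 for label in LABELS]


article = "Gas price rises as the pipeline market tightens " + "filler " * 500
predicted = classify_long_text(article, toy_summarize, toy_score, LABELS)
print(predicted)
```

Because each label is thresholded independently instead of competing in a softmax, an article can receive zero, one, or several labels, which is what distinguishes multi-label from single-label classification.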
KEYWORDS
Text classification, GLM, Multi-label
CITE THIS PAPER
Yang Yu, Jinqi Li, Hongpeng Liu, Fan Yang, Xuanyao Yu, Text classification system based on LLM. Journal of Artificial Intelligence Practice (2024) Vol. 7: 172-178. DOI: http://dx.doi.org/10.23977/jaip.2024.070321.