Open Access

Text classification system based on LLM


DOI: 10.23977/jaip.2024.070321

Author(s)

Yang Yu 1, Jinqi Li 2, Hongpeng Liu 2, Fan Yang 2, Xuanyao Yu 3

Affiliation(s)

1 Kunlun Digital Technology Co., Ltd., Beijing, China
2 PetroChina Kunlun Gas Co., Ltd., Shandong Branch, Jinan, China
3 China University of Petroleum (Beijing), Beijing, China

Corresponding Author

Xuanyao Yu

ABSTRACT

Long-text classification and multi-label classification have each long been challenging, and their combination compounds the difficulty. This study addresses multi-label classification of long texts. The GLM large language model is first used to generate an abstract of each text, which effectively shortens the input while retaining its main content. A BERT-BiLSTM model is then applied to improve accuracy on long text sequences. The model performs particularly well on multi-label classification: in multi-label news classification it correctly assigns all category labels, with accuracy substantially higher than that of conventional models.
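
The paper's implementation is not included on this page; the following is a minimal, hypothetical sketch of the described two-stage pipeline in Python, assuming Hugging Face Transformers and PyTorch, the THUDM/chatglm3-6b checkpoint as the GLM summarizer, and bert-base-chinese as the encoder. The model names, prompt wording, pooling choice, label count, and threshold are illustrative assumptions, not the authors' settings.

# Sketch of the pipeline from the abstract: (1) a GLM model condenses each
# long document into a short abstract, (2) a BERT-BiLSTM head performs
# multi-label classification on the condensed text. Model names and
# hyperparameters below are assumptions, not the paper's.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel, BertModel, BertTokenizer

# --- Stage 1: summarization with a ChatGLM checkpoint (assumed choice) ---
glm_tok = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)
glm = AutoModel.from_pretrained("THUDM/chatglm3-6b",
                                trust_remote_code=True).half().cuda().eval()

def summarize(document: str) -> str:
    # ChatGLM3 exposes a chat() helper when loaded with trust_remote_code.
    prompt = "Summarize the following news article in a few sentences:\n" + document
    summary, _history = glm.chat(glm_tok, prompt, history=[])
    return summary

# --- Stage 2: BERT-BiLSTM multi-label classifier ---
class BertBiLSTM(nn.Module):
    def __init__(self, num_labels: int, lstm_hidden: int = 256):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-chinese")
        self.lstm = nn.LSTM(self.bert.config.hidden_size, lstm_hidden,
                            batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * lstm_hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        hidden = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        out, _ = self.lstm(hidden)      # (batch, seq_len, 2 * lstm_hidden)
        pooled = out.mean(dim=1)        # simplified mean-pooling over tokens
        return self.classifier(pooled)  # one raw logit per label

tok = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertBiLSTM(num_labels=10)       # label count is a placeholder
loss_fn = nn.BCEWithLogitsLoss()        # standard multi-label objective

# Inference: labels whose sigmoid score exceeds 0.5 are assigned.
long_news_text = "..."                  # a long news article
enc = tok(summarize(long_news_text), truncation=True, max_length=512,
          return_tensors="pt")
with torch.no_grad():
    logits = model(enc["input_ids"], enc["attention_mask"])
predicted_labels = (torch.sigmoid(logits)[0] > 0.5).nonzero(as_tuple=True)[0]

Training under this sketch would minimize BCEWithLogitsLoss against multi-hot label vectors, so each label is scored independently; the independent sigmoid-plus-threshold readout is what lets a single document receive several labels at once.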

KEYWORDS

Text classification, GLM, Multi-label

CITE THIS PAPER

Yang Yu, Jinqi Li, Hongpeng Liu, Fan Yang, Xuanyao Yu, Text classification system based on LLM. Journal of Artificial Intelligence Practice (2024) Vol. 7: 172-178. DOI: http://dx.doi.org/10.23977/jaip.2024.070321.

REFERENCES

[1] Li Q, Peng H, Li J, Xia C, Yang R, Sun L, Yu P S, He L. A survey on text classification: from shallow to deep learning. arXiv preprint arXiv:2008.00364, 2020.
[2] Gangamohan P, Kadiri S R, Yegnanarayana B. Analysis of emotional speech—a review. Toward Robotic Socially Believable Behaving Systems—Volume I: Modeling Emotions, 2016: 205-238.
[3] Jayady S H, Antong H. Theme identification using machine learning techniques. Journal of Integrated and Advanced Engineering (JIAE), 2021, 1(2): 123-134.
[4] Minaee S, et al. Deep learning-based text classification: a comprehensive review. ACM Computing Surveys (CSUR), 2021, 54(3): 1-40.
[5] Gu J X, Wang Z H, Kuen J, et al. Recent advances in convolutional neural networks. Pattern Recognition, 2018, 77: 354-377. doi: 10.1016/j.patcog.2017.10.013
[6] McClelland J L, Elman J L. The TRACE model of speech perception. Cognitive Psychology, 1986, 18(1): 1-86. doi: 10.1016/0010-0285(86)90015-0
[7] Vaswani A, et al. Attention is all you need. Advances in Neural Information Processing Systems, 2017, 30.
[8] Hochreiter S, Schmidhuber J. Long short-term memory. Neural Computation, 1997, 9(8): 1735-1780. doi: 10.1162/neco.1997.9.8.1735
[9] Zhou P, Qi Z Y, Zheng S C, et al. Text classification improved by integrating bidirectional LSTM with two-dimensional max pooling. Proceedings of COLING 2016: Technical Papers. Osaka: The COLING 2016 Organizing Committee, 2016: 3485-3495.
[10] Devlin J, Chang M W, Lee K, Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. Proceedings of NAACL-HLT, 2019: 4171-4186.
[11] Sun C, Qiu X, Xu Y, et al. How to fine-tune BERT for text classification? Chinese Computational Linguistics: 18th China National Conference (CCL 2019), Kunming, China, October 18-20, 2019. Springer, 2019: 194-206.
[12] Pappagari R, Zelasko P, Villalba J, et al. Hierarchical transformers for long document classification. 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2019: 838-844.
[13] Dai A M, Le Q V. Semi-supervised sequence learning. Advances in Neural Information Processing Systems, 2015, 28.
[14] Devlin J, Chang M W, Lee K, Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
[15] Zeng A, et al. GLM-130B: an open bilingual pre-trained model. ICLR 2023.
[16] GLM Team, et al. ChatGLM: a family of large language models from GLM-130B to GLM-4 All Tools. arXiv preprint arXiv:2406.12793, 2024.



All published work is licensed under a Creative Commons Attribution 4.0 International License.
