Open Access

Research on Malicious Web Page Detection Algorithm Based on Deep Learning

DOI: 10.23977/ICCIA2020035

Author(s)

Ruoyang Tan, Shuhao Zhang, Sibo Zhang

Corresponding Author

Ruoyang Tan

ABSTRACT

In recent years, malicious web page detection has relied mainly on semantic analysis or simulated code execution to extract features, but these methods are complex to implement, incur high computational overhead, and increase the attack surface. To address this, a malicious web page detection method based on deep learning is proposed. First, simple regular expressions are used to extract semantically independent tags directly from static HTML documents. A neural network model then captures local representations of the document at multiple hierarchical spatial scales, which makes it possible to quickly find small malicious code fragments in web pages of arbitrary length. The method is compared with a variety of baseline models and simplified models, and the results show that it achieves a detection rate of 96.4% at a false alarm rate of 0.1%, giving better classification accuracy than the compared models. Its speed and accuracy make it suitable for deployment to endpoints, firewalls, and web agents.
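
The two feature-extraction steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the regular expression, the bucket count `NUM_BUCKETS`, the scale set `(1, 2, 4)`, and the function names are all assumptions chosen for illustration, and the neural network that would consume these features is omitted. The sketch splits a static HTML document into tokens with one regular expression, then builds hashed token-count vectors over the whole document, its halves, and its quarters, so that a small fragment still dominates at least one fine-scale chunk however long the page is.

```python
import math
import re
import zlib
from typing import List, Sequence

# Assumption: a token is a run of alphanumeric characters. The abstract only
# says "simple regular expressions"; the actual pattern is not given.
TOKEN_RE = re.compile(r"[A-Za-z0-9_]+")
NUM_BUCKETS = 16  # hypothetical size of each hashed feature vector


def tokenize(html: str) -> List[str]:
    """Extract tokens directly from static HTML, without parsing or executing it."""
    return TOKEN_RE.findall(html)


def hash_features(tokens: Sequence[str], num_buckets: int = NUM_BUCKETS) -> List[int]:
    """Count tokens into a fixed-length vector using a deterministic hash."""
    vec = [0] * num_buckets
    for tok in tokens:
        vec[zlib.crc32(tok.encode("utf-8")) % num_buckets] += 1
    return vec


def multiscale_features(html: str, scales: Sequence[int] = (1, 2, 4)) -> List[List[int]]:
    """Return one hashed vector per chunk at each scale (whole, halves, quarters)."""
    tokens = tokenize(html)
    features = []
    for n_chunks in scales:
        # Chunk size is rounded up so the chunks always cover every token.
        size = max(1, math.ceil(len(tokens) / n_chunks))
        for i in range(n_chunks):
            features.append(hash_features(tokens[i * size:(i + 1) * size]))
    return features


if __name__ == "__main__":
    page = "<html><script>eval(unescape('%41'))</script></html>"
    feats = multiscale_features(page)
    print(len(feats), "vectors of length", len(feats[0]))
```

In a full system, the per-chunk vectors would be passed to a trained classifier; the fixed number of chunks per scale is what keeps the input size independent of the page length.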

KEYWORDS

Deep learning; malicious web content; web page classification; malicious web page recognition

All published work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright © 2016 - 2031 Clausius Scientific Press Inc. All Rights Reserved.