Research on Malicious Web Page Detection Algorithm Based on Deep Learning
DOI: 10.23977/ICCIA2020035
Author(s)
Ruoyang Tan, Shuhao Zhang, Sibo Zhang
Corresponding Author
Ruoyang Tan
ABSTRACT
In recent years, malicious web page detection has relied mainly on semantic analysis or simulated code execution to extract features, but these methods are complex to implement, incur high computational overhead, and increase the attack surface. To address this, a malicious web page detection method based on deep learning is proposed. First, simple regular expressions are used to extract semantically independent tags directly from static HTML documents. Then a neural network model captures local representations of the document at multiple hierarchical spatial scales, which makes it possible to quickly find small malicious code fragments in web pages of arbitrary length. The method is compared with a variety of baseline models and simplified models, and the results show that it achieves a detection rate of 96.4% at a false alarm rate of 0.1%, giving better classification accuracy. The speed and accuracy of the method make it suitable for deployment to endpoints, firewalls, and web agents.
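The two steps named in the abstract, regex-based extraction from static HTML and representation at several hierarchical scales, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the pattern `TOKEN_RE` (runs of alphanumeric characters), the hashed count vectors, the scale set `(1, 2, 4)`, and the vector size `dim`. The neural network that would consume these features is not shown.

```python
import re
from zlib import crc32

# Assumed pattern: treat every run of alphanumeric characters as one unit.
# The paper only says "simple regular expressions"; this choice is hypothetical.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+")


def tokenize(html):
    """Split a static HTML document into units without parsing or executing it."""
    return TOKEN_RE.findall(html)


def hashed_counts(tokens, dim=16):
    """Map a token list to a fixed-length count vector via a deterministic hash."""
    vec = [0] * dim
    for tok in tokens:
        vec[crc32(tok.encode("utf-8")) % dim] += 1
    return vec


def multiscale_features(html, scales=(1, 2, 4), dim=16):
    """Build one count vector per chunk at each scale.

    At scale k the token sequence is cut into k contiguous chunks, so finer
    scales describe smaller regions of the page. The output size depends only
    on `scales` and `dim`, not on the length of the document.
    """
    tokens = tokenize(html)
    n = len(tokens)
    features = []
    for parts in scales:
        for i in range(parts):
            start = i * n // parts
            end = (i + 1) * n // parts
            features.append(hashed_counts(tokens[start:end], dim))
    return features


if __name__ == "__main__":
    page = "<html><body><script>eval(unescape('%75'))</script></body></html>"
    feats = multiscale_features(page)
    print(len(feats), "vectors of length", len(feats[0]))
```

Because each scale partitions the same token sequence, a short suspicious fragment that is diluted in the whole-document vector can still dominate one of the finer chunks, which is the intuition behind looking at several scales at once.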
KEYWORDS
Deep learning; malicious web content; web page classification; malicious web page recognition