Machine Learning Techniques in Detection of Malicious Web Traffic
Abstract: This research shows how to apply deep learning techniques to malware classification, specifically to determine whether a web page is malicious. The data types we examine are HTML, JavaScript and CSS, the building blocks of the modern web and also common vehicles for delivering malicious code. Various deep learning techniques, including convolutional neural networks (CNN), long short-term memory (LSTM) networks and combinations of CNN and LSTM, are applied and compared to determine which is the most effective for this task. We also compare our results with those of conventional machine learning methods, such as support vector machines (SVMs) and k-nearest neighbors (k-NN). To maximize training speed, we use a distributed environment with synchronous updates during the training phase.
Index terms: malware classification, distributed deep learning, convolutional neural network, long short-term memory
Supervisor(s): Roman Kruszelnicki
Status: finished
Topic: Machine Learning Methods
Author: Ching-Yu Kao
Submission: 2016-08-15
Type of Thesis: Master's thesis
Proof of Concept: No
Thesis topic in co-operation with the Fraunhofer Institute for Applied and Integrated Security AISEC, Garching