Comparative study of supervised learning methods for malware analysis

Kruczkowski, M; Niewiadomska-Szynkiewicz, E

  • Journal of Telecommunications and Information Technology;
  • Volume: 4;
  • Pages: 24-33;
  • 2014;

Malware is software designed to disrupt or damage a computer system, or to perform other unwanted actions on it. Malware is now a common threat on the World Wide Web. Anti-malware protection and intrusion detection can be significantly supported by comprehensive analysis of data collected on the Web. The aim of such analysis is to classify the collected data into two sets, i.e., normal and malicious data. In this paper we investigate the use of three supervised learning methods for data mining to support malware detection. We describe the results of applying Support Vector Machine, Naive Bayes and k-Nearest Neighbours techniques to the classification of data taken from devices located in many units, organizations and monitoring systems serviced by CERT Poland. The performance of all methods is compared and discussed. The results of our experiments indicate that supervised learning algorithms can be successfully used for computer data analysis and can support computer emergency response teams in threat detection.
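To illustrate the two-class setting described in the abstract, the following is a minimal sketch of one of the three methods, k-Nearest Neighbours, in plain Python. It is not the authors' implementation: the function name `knn_predict`, the two features (requests per minute, share of failed logins) and all data values are hypothetical and chosen only for illustration.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # train is a list of (feature_vector, label) pairs; distance is Euclidean.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (requests per minute, share of failed logins).
train = [
    ((2.0, 0.01), "normal"),
    ((3.0, 0.02), "normal"),
    ((1.5, 0.00), "normal"),
    ((40.0, 0.70), "malicious"),
    ((55.0, 0.90), "malicious"),
    ((48.0, 0.80), "malicious"),
]

# Hypothetical held-out samples with known labels, used to measure accuracy.
test = [((2.5, 0.03), "normal"), ((50.0, 0.85), "malicious")]
correct = sum(knn_predict(train, x) == y for x, y in test)
accuracy = correct / len(test)
print(accuracy)
```

In a comparison such as the one the abstract describes, the same held-out samples would be scored by each classifier in turn so that the resulting accuracies can be set side by side.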

Keywords: Data classification, malware analysis, Support Vector Machine, Naive Bayes, k-Nearest Neighbours