Malicious and benign websites classification using machine learning methods

Authors

  • M. Lavreniuk National Technical University of Ukraine «Igor Sikorsky Kyiv Polytechnic Institute», Institute of Physics and Technology, Ukraine
  • O. Novikov National Technical University of Ukraine «Igor Sikorsky Kyiv Polytechnic Institute», Institute of Physics and Technology, Ukraine

DOI:

https://doi.org/10.20535/tacs.2664-29132020.1.209434

Abstract

Web surfing is now an integral part of everyday life, and everyone wants to protect their data from thieves and malicious web pages. This paper therefore proposes a solution to the problem of discriminating between malicious and benign websites with the desired accuracy. We propose using machine learning methods to classify websites as malicious or benign based on URL and other host-based features. State-of-the-art gradient-boosted decision trees are proposed for this task and compared with well-known machine learning methods such as random forest and multilayer perceptron. All of the methods achieved the desired accuracy, above 95%, and the proposed gradient-boosted decision trees outperformed the random forest and neural network approaches in terms of both overall accuracy and F1-score.
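
As a rough illustration of two ideas mentioned in the abstract, URL-derived features and evaluation by overall accuracy and F1-score, the following minimal Python sketch may help. It is not the authors' implementation: the function names `url_features` and `accuracy_and_f1`, the particular lexical features, and the toy labels are all assumptions made for this example, and the abstract does not list the feature set actually used.

```python
from urllib.parse import urlparse


def url_features(url):
    """Extract a few simple lexical features from a URL.

    These features are hypothetical examples; the paper's actual
    URL and host-based feature set is not given in the abstract.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "uses_https": int(parsed.scheme == "https"),
    }


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (overall accuracy, F1-score for the positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


if __name__ == "__main__":
    print(url_features("https://example.com/a1"))
    # Toy labels (1 = malicious, 0 = benign), made up for illustration only.
    print(accuracy_and_f1([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```

In practice, feature dictionaries like these would be turned into numeric vectors and passed to a classifier such as gradient-boosted decision trees, random forest, or a multilayer perceptron. Reporting F1 alongside accuracy matters when malicious and benign sites are unevenly represented in the data.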

Published

2020-08-06

Section

Mathematical methods, models and technologies for secure cyberspace functioning research