TY - GEN
T1 - Classifier Selection for an Ensemble of Network Traffic Analysis Machine Learning Models
AU - Roponena, Evita
AU - Polaka, Inese
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - During the COVID-19 pandemic, the need for digitalization of business processes has increased. Consequently, the number of cyberattacks has also increased, which has a negative impact on businesses. One way to detect cyber threats in a system is to perform network traffic analysis using automated techniques, and machine learning algorithms can automate this data analysis. This research was conducted to understand how to select the most suitable classifiers for a machine learning ensemble for network traffic analysis. The CICIDS-2017 intrusion detection evaluation dataset was selected for training and testing the proposed approach. The binary classification ensemble consisted of random forest (RF), three types of decision trees (DT), XGBoost, and extremely randomized trees (ET) classifiers. The multiclass classification ensemble consisted of all of these classifiers except XGBoost. For binary classification, the ensemble reached an accuracy of 0.9997 on test data, with a training time of 449.5 seconds and a testing rate of 32768 records per second. The multiclass ensemble reached an accuracy of 0.9991 on test data, with a training time of 1671.39 seconds and a testing rate of 7695 records per second.
AB - During the COVID-19 pandemic, the need for digitalization of business processes has increased. Consequently, the number of cyberattacks has also increased, which has a negative impact on businesses. One way to detect cyber threats in a system is to perform network traffic analysis using automated techniques, and machine learning algorithms can automate this data analysis. This research was conducted to understand how to select the most suitable classifiers for a machine learning ensemble for network traffic analysis. The CICIDS-2017 intrusion detection evaluation dataset was selected for training and testing the proposed approach. The binary classification ensemble consisted of random forest (RF), three types of decision trees (DT), XGBoost, and extremely randomized trees (ET) classifiers. The multiclass classification ensemble consisted of all of these classifiers except XGBoost. For binary classification, the ensemble reached an accuracy of 0.9997 on test data, with a training time of 449.5 seconds and a testing rate of 32768 records per second. The multiclass ensemble reached an accuracy of 0.9991 on test data, with a training time of 1671.39 seconds and a testing rate of 7695 records per second.
KW - binary classification
KW - feature selection
KW - machine learning ensemble
KW - multiclass classification
KW - netflow analysis
UR - https://www.scopus.com/pages/publications/85142927789
U2 - 10.1109/ITMS56974.2022.9937116
DO - 10.1109/ITMS56974.2022.9937116
M3 - Conference paper
AN - SCOPUS:85142927789
T3 - 2022 63rd International Scientific Conference on Information Technology and Management Science of Riga Technical University, ITMS 2022 - Proceedings
BT - 2022 63rd International Scientific Conference on Information Technology and Management Science of Riga Technical University, ITMS 2022 - Proceedings
A2 - Grabis, Janis
A2 - Romanovs, Andrejs
A2 - Kulesova, Galina
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 63rd International Scientific Conference on Information Technology and Management Science of Riga Technical University, ITMS 2022
Y2 - 6 October 2022 through 7 October 2022
ER -