Entity: Universitat Rovira i Virgili (URV)
Confidentiality: No
Education area(s): Computer Security Engineering and Artificial Intelligence
Title in different languages: Novelty detection and early classification of malicious activities on industrial control systems
Abstract: Industrial Control Systems (ICS) are sets of industrial processes that manage, direct and regulate the behaviour of other devices. These processes are vital to critical infrastructure services such as communications, manufacturing, and energy, so an attack on these infrastructures can threaten the day-to-day functioning of states. Unfortunately, in recent years ICS have been subject to an increasing number of attacks. Although Network Anomaly Detection Systems (NADS) are capable of detecting both existing and zero-day attacks, they are still not universally deployed in industry and real applications, since current systems produce high False Positive Rates (FPRs) and low Detection Rates (DRs). Consequently, anomaly detection is still under-utilized in the cybersecurity arena. However, the alternative technique, misuse detection, is limited by the fact that it only addresses known vulnerabilities. Therefore, there is a pressing need for anomaly detection to become operational in order to increase the coverage of current Intrusion Detection Systems (IDS). The goal of this master's thesis is to apply machine learning and deep learning techniques that define the space of normal behaviour, in order to address novelty detection and early classification of malicious activities on ICS. The rationale is that if the underlying infrastructure of malware samples is similar (e.g., as a result of being controlled by the same attacker, or of reusing code from another author), their behaviours or the order in which they perform certain actions will be similar. Specifically, the solution looks at similarities in industrial network traffic, modelled by an enhanced feature set composed of simple, high-level features extracted from the headers and payloads of the network packets. Regarding machine learning techniques, several traditional machine learning methods and deep neural networks well suited to high-dimensional datasets have been implemented and tested.
After several trials, the final solution combines an unsupervised anomaly detection technique, the AutoEncoder (for the detection of unknown attacks), with Random Forest, a supervised machine learning method that supports the detection of known attacks and also helps reduce the false positive rate. This novel machine learning pipeline successfully detects both zero-day and specific attacks: it achieves a precision of 0.998425, a recall of 0.9607375 and an F1-score of 0.9733375 while reducing the false positive rate to a negligible value. The performance of this approach has been evaluated using the CICIDS2017 benchmark dataset, created by the Canadian Institute for Cybersecurity (CIC) and the University of New Brunswick (UNB). This dataset contains a variety of up-to-date multistage attacks and intruder strategies alongside modern normal behaviours.
Subject: Computer engineering
Academic year: 2020-2021
Language: en
Work's public defense date: 2021-09-21
Subject areas: Computer engineering
Student: Palacios Prados, Maria Carmen
Work's codirector: 46557028Z
Department: Enginyeria Informàtica i Matemàtiques
Creation date in repository: 2022-05-17
Keywords: Intrusion Detection, AutoEncoder, Random Forest
Title in original language: Novelty detection and early classification of malicious activities on industrial control systems
Access Rights: info:eu-repo/semantics/openAccess
Project director: Gómez Jiménez, Sergio