Intrusion detection on the in-vehicle network using machine learning
Controller Area Network (CAN) is a protocol for
the in-vehicle network that connects microcontrollers called
Electronic Control Units (ECUs) and other components in a
vehicle so that they may communicate among themselves and
control the operations of the vehicle. The CAN protocol was
initially not designed with security in mind, but as modern
vehicles are increasingly becoming connected to the outside
world through wired and wireless interfaces, the CAN bus has
become susceptible to intrusions and attacks such as message
injection, replay attacks, denial of service (DoS) attacks, and
eavesdropping. This paper presents an intrusion detection
method based on the Isolation Forest (iForest) algorithm that
detects message insertion attacks using message timing
information. The resulting intrusion detection system benefits
from the linear time complexity and low memory requirement
of the iForest algorithm, as well as the ability to train the
classifier with only a small sample of normal CAN traffic. The
usage of only timing information for intrusion detection makes
it a vehicle-agnostic method that does not rely on the message
content, which is often proprietary and confidential
information. The intrusion detection system was trained with
a normal CAN traffic trace and tested with two spoof attack CAN
datasets. The high values obtained for the Area Under the Curve
(AUC) measure in the two cases, 0.966 and 0.974, indicated the
effectiveness of this approach for intrusion detection.
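The timing-based idea in this abstract can be illustrated with a minimal, standard-library sketch of an isolation forest. Everything here is an assumption made for illustration and is not the paper's implementation: the single feature (the gap between consecutive messages), the synthetic 10 ms periodic trace, and the helper names `inter_arrival_times`, `fit_forest`, and `anomaly_score` are all hypothetical.

```python
import itertools
import math
import random

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """Normalisation term c(n): expected depth of a failed search among n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0  # common convention for the two-point case
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(values, depth, max_depth, rng):
    """Split 1-D values at random cut points until isolated or depth-limited."""
    if depth >= max_depth or len(values) <= 1 or min(values) == max(values):
        return {"size": len(values)}
    split = rng.uniform(min(values), max(values))
    left = [v for v in values if v < split]
    right = [v for v in values if v >= split]
    return {
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, value, depth=0):
    """Depth at which `value` lands in a leaf, adjusted for unsplit leaf size."""
    if "size" in tree:
        return depth + average_path_length(tree["size"])
    branch = "left" if value < tree["split"] else "right"
    return path_length(tree[branch], value, depth + 1)


def fit_forest(values, n_trees=100, sample_size=256, seed=0):
    """Train each tree on a small random subsample of normal traffic only."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(values))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(values, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, sample_size


def anomaly_score(forest, value):
    """Score in (0, 1]; shorter average paths give scores closer to 1."""
    trees, sample_size = forest
    mean_depth = sum(path_length(t, value) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / average_path_length(sample_size))


def inter_arrival_times(timestamps):
    """Gaps between consecutive messages of one CAN ID (assumed feature)."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


# Synthetic normal trace: one message roughly every 10 ms with small jitter.
trace_rng = random.Random(42)
timestamps = list(
    itertools.accumulate(trace_rng.gauss(0.010, 0.0002) for _ in range(1001))
)
forest = fit_forest(inter_arrival_times(timestamps))

normal_score = anomaly_score(forest, 0.010)  # a typical gap
attack_score = anomaly_score(forest, 0.002)  # gap shortened by an inserted message
print(f"normal gap score: {normal_score:.3f}, inserted-message gap score: {attack_score:.3f}")
```

An inserted message shortens the observed gap, so it falls outside the range seen in training and is isolated after fewer splits, which yields a higher score. In practice a library implementation such as scikit-learn's `IsolationForest` would likely be preferable to this hand-rolled version.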
A new framework of feature engineering for machine learning in financial fraud detection
Financial fraud activities have soared despite the advancement of fraud detection models empowered by machine learning (ML). To address this issue, we propose a new framework of feature engineering for ML models. The framework consists of feature creation that combines feature aggregation and feature transformation, and feature selection that accommodates a variety of ML algorithms. To illustrate the effectiveness of the framework, we conduct an experiment using an actual financial transaction dataset and show that the framework significantly improves the performance of ML fraud detection models. Specifically, all the ML models complemented by a feature set generated from our framework surpass the same models without such a feature set by nearly 40% on the F1-measure and 20% on the Area Under the Curve (AUC) value.
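As a rough illustration of what combining feature aggregation with feature transformation might look like, the sketch below derives per-account aggregates and per-transaction transforms from a toy transaction list. The abstract does not specify the actual features, so the field names, the sample data, and the helpers `aggregate_by_account` and `build_features` are hypothetical, and the feature selection stage is not shown.

```python
import math
from collections import defaultdict

# Hypothetical transactions: (account_id, amount). Not from the paper's dataset.
transactions = [
    ("acct_a", 20.0),
    ("acct_a", 25.0),
    ("acct_a", 900.0),
    ("acct_b", 50.0),
    ("acct_b", 55.0),
]


def aggregate_by_account(rows):
    """Feature aggregation: transaction count and mean amount per account."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for account, amount in rows:
        totals[account]["count"] += 1
        totals[account]["sum"] += amount
    return {
        account: {"count": t["count"], "mean": t["sum"] / t["count"]}
        for account, t in totals.items()
    }


def build_features(rows):
    """Combine aggregates with per-transaction transformations."""
    stats = aggregate_by_account(rows)
    features = []
    for account, amount in rows:
        features.append(
            {
                # Transformation: compress the heavy-tailed amount scale.
                "log_amount": math.log1p(amount),
                # Aggregation: how active the account is overall.
                "account_txn_count": stats[account]["count"],
                # Both combined: amount relative to the account's own mean.
                "amount_to_account_mean": amount / stats[account]["mean"],
            }
        )
    return features


features = build_features(transactions)
for row in features:
    print(row)
```

The third transaction stands out because its amount is several times the account's own mean, which is the kind of signal a raw amount column alone would not expose.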
Comparison of Machine Learning Techniques for Mortality Prediction in a Prospective Cohort of Older Adults
As global demographics change, ageing is a phenomenon of increasing interest in our modern and rapidly changing society. Thus, the application of proper prognostic indices in clinical decisions regarding mortality prediction has assumed significant importance for personalized risk management (i.e., identifying patients who are at high or low risk of death) and to help ensure effective healthcare services to patients. Consequently, prognostic modelling expressed as all‐cause mortality prediction is an important step for effective patient management. Machine learning has the potential to transform prognostic modelling. In this paper, results on the development of machine learning models for all‐cause mortality prediction in a cohort of healthy older adults are reported. The models are based on features covering anthropometric variables, physical and lab examinations, questionnaires, and lifestyles, as well as wearable data collected in free‐living settings, obtained for the “Healthy Ageing Initiative” study conducted on 2291 recruited participants. Several machine learning techniques including feature engineering, feature selection, data augmentation and resampling were investigated for this purpose. A detailed empirical comparison of the impact of the different techniques is presented and discussed. The achieved performances were also compared with a standard epidemiological model. This investigation showed that, for the dataset under consideration, the best results were achieved with Random Under‐Sampling in conjunction with Random Forest (either with or without probability calibration). However, while including probability calibration slightly reduced the average performance, it increased the model robustness, as indicated by the lower 95% confidence intervals.
The analysis showed that machine learning models could provide comparable results to standard epidemiological models while being completely data‐driven and disease‐agnostic, thus demonstrating the opportunity for building machine learning models on health records data for research and clinical practice. However, further testing is required to significantly improve the model performance and its robustness.
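Random under-sampling, the resampling step named in this abstract, can be sketched in a few lines: the majority class is randomly reduced to the size of the minority class before training. The data below are synthetic, the 90/10 class split is an arbitrary assumption, and `random_under_sample` is a hypothetical helper; the Random Forest and calibration stages the abstract pairs it with are not shown.

```python
import random
from collections import Counter, defaultdict


def random_under_sample(samples, labels, seed=0):
    """Randomly drop examples so every class matches the smallest class size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    # Every class is reduced to the size of the rarest one.
    target = min(len(group) for group in by_class.values())

    balanced = []
    for label, group in by_class.items():
        for sample in rng.sample(group, target):
            balanced.append((sample, label))
    rng.shuffle(balanced)  # avoid leaving the classes in contiguous blocks
    return balanced


# Synthetic, illustrative data: 90 examples of class 0 and 10 of class 1.
samples = list(range(100))
labels = [0] * 90 + [1] * 10

balanced = random_under_sample(samples, labels)
class_counts = Counter(label for _, label in balanced)
print(class_counts)
```

Under-sampling should be applied to the training split only, so that evaluation still reflects the original class balance.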
Classification and taxonomy of anomalies in the context of cybersecurity and information protection
The object of the study is anomalies and anomaly detection systems in
cyberspace.
The subject of the study is the way anomalies arise in the network, the
difficulty of detecting anomalous behavior, and the analysis of existing
anomaly classifications as a basis for building a taxonomy.
The aim of this work is to classify anomalies from several perspectives, to
relate anomalies to attacks, to analyze network and non-network anomalies,
and to review anomalies in terms of the methods and techniques used to detect
them, in order to build a taxonomy that will be useful in future studies of
attacks and of IDS construction.