1,537 research outputs found
AutoOC: Automated multi-objective design of deep autoencoders and one-class classifiers using grammatical evolution
One-Class Classification (OCC) corresponds to a subclass of unsupervised Machine Learning (ML) that is valuable when labeled data is non-existent. In this paper, we present AutoOC, a computationally efficient Grammatical Evolution (GE) approach that automatically searches for OCC models. AutoOC assumes a multi-objective optimization, aiming to increase the OCC predictive performance while reducing the ML training time. AutoOC also includes two execution speedup mechanisms, a periodic training sampling, and a multi-core fitness evaluation. In particular, we study two AutoOC variants: a pure Neuroevolution (NE) setup that optimizes two types of deep learning models, namely dense Autoencoder (AE) and Variational Autoencoder (VAE); and a general Automated Machine Learning (AutoML) ALL setup that considers five distinct OCC base learners, specifically Isolation Forest (IF), Local Outlier Factor (LOF), One-Class SVM (OC-SVM), AE and VAE. Several experiments were conducted, using eight public OpenML datasets and two validation scenarios (unsupervised and supervised). The results show that AutoOC requires a reasonable amount of execution time and tends to obtain lightweight OCC models. Moreover, AutoOC provides quality predictive results, outperforming a baseline IF for all analyzed datasets and surpassing the best supervised OpenML human modeling for two datasets.- (undefined
A survey of machine learning techniques applied to self organizing cellular networks
In this paper, a survey of the literature of the past fifteen years involving Machine Learning (ML) algorithms applied to self organizing cellular networks is performed. In order for future networks to overcome the current limitations and address the issues of current cellular systems, it is clear that more intelligence needs to be deployed, so that a fully autonomous and flexible network can be enabled. This paper focuses on the learning perspective of Self Organizing Networks (SON) solutions and provides, not only an overview of the most common ML techniques encountered in cellular networks, but also manages to classify each paper in terms of its learning solution, while also giving some examples. The authors also classify each paper in terms of its self-organizing use-case and discuss how each proposed solution performed. In addition, a comparison between the most commonly found ML algorithms in terms of certain SON metrics is performed and general guidelines on when to choose each ML algorithm for each SON function are proposed. Lastly, this work also provides future research directions and new paradigms that the use of more robust and intelligent algorithms, together with data gathered by operators, can bring to the cellular networks domain and fully enable the concept of SON in the near future
A framework for automated anomaly detection in high frequency water-quality data from in situ sensors
River water-quality monitoring is increasingly conducted using automated in
situ sensors, enabling timelier identification of unexpected values. However,
anomalies caused by technical issues confound these data, while the volume and
velocity of data prevent manual detection. We present a framework for automated
anomaly detection in high-frequency water-quality data from in situ sensors,
using turbidity, conductivity and river level data. After identifying end-user
needs and defining anomalies, we ranked their importance and selected suitable
detection methods. High priority anomalies included sudden isolated spikes and
level shifts, most of which were classified correctly by regression-based
methods such as autoregressive integrated moving average models. However, using
other water-quality variables as covariates reduced performance due to complex
relationships among variables. Classification of drift and periods of
anomalously low or high variability improved when we applied replaced anomalous
measurements with forecasts, but this inflated false positive rates.
Feature-based methods also performed well on high priority anomalies, but were
also less proficient at detecting lower priority anomalies, resulting in high
false negative rates. Unlike regression-based methods, all feature-based
methods produced low false positive rates, but did not and require training or
optimization. Rule-based methods successfully detected impossible values and
missing observations. Thus, we recommend using a combination of methods to
improve anomaly detection performance, whilst minimizing false detection rates.
Furthermore, our framework emphasizes the importance of communication between
end-users and analysts for optimal outcomes with respect to both detection
performance and end-user needs. Our framework is applicable to other types of
high frequency time-series data and anomaly detection applications
Applying Machine Learning to Cyber Security
Intrusion Detection Systems (IDS) nowadays are a very important part of a system. In the last years many methods have been proposed to implement this kind of security measure against cyber attacks, including Machine Learning and Data Mining based. In this work we discuss in details the family of anomaly based IDSs, which are able to detect never seen attacks, paying particular attention to adherence to the FAIR principles. This principles include the Accessibility and the Reusability of software. Moreover, as the purpose of this work is the assessment of what is going on in the state of the art we have selected three approaches, according to their reproducibility and we have compared their performances with a common experimental setting. Lastly real world use case has been analyzed, resulting in the proposal of an usupervised ML model for pre-processing and analyzing web server logs. The proposed solution uses clustering and outlier detection techniques to detect attacks in an unsupervised way
- …