48 research outputs found

    Featured Anomaly Detection Methods and Applications

    Anomaly detection is a fundamental research topic that has been widely investigated. From critical industrial systems, e.g., network intrusion detection systems, to people’s daily activities, e.g., mobile fraud detection, anomaly detection has become a first line of defence for protecting public and personal property. Although anomaly detection methods have been under consistent development over the years, the explosive growth of data volume and the continued dramatic variation of data patterns pose great challenges to anomaly detection systems and are fuelling demand for more intelligent anomaly detection methods with distinct characteristics to cope with various needs. To this end, this thesis starts by presenting a thorough review of existing anomaly detection strategies and methods, elaborating their advantages and disadvantages. Afterward, four distinctive anomaly detection methods, especially for time series, are proposed to address specific needs of anomaly detection under different scenarios, e.g., enhanced accuracy, interpretable results, and self-evolving models. Experiments are presented and analysed to offer a better understanding of the performance of the methods and their distinct features. More specifically, the key contents of this thesis are as follows:
    1) Support Vector Data Description (SVDD) is investigated as a primary method for accurate anomaly detection. The applicability of SVDD to noisy time series datasets is carefully examined, and it is demonstrated that relaxing the decision boundary of SVDD always results in better accuracy in network time series anomaly detection. A theoretical analysis of the parameter used in the model is also presented to ensure the validity of the relaxation of the decision boundary.
    2) To support a clear explanation of the detected time series anomalies, i.e., anomaly interpretation, the periodic pattern of time series data is treated as contextual information and integrated into SVDD for anomaly detection. The formulation of SVDD with contextual information maintains multiple discriminants, which help in distinguishing the root causes of the anomalies.
    3) To further analyse a dataset for anomaly detection and interpretation, Convex Hull Data Description (CHDD) is developed to realise one-class classification together with data clustering. CHDD approximates the convex hull of a given dataset with its extreme points, which constitute a dictionary of data representatives. Using this dictionary, CHDD can represent and cluster all the normal data instances, so that anomaly detection is realised with a degree of interpretation.
    4) Besides better anomaly detection accuracy and interpretability, better solutions for anomaly detection over streaming data with evolving patterns are also researched. Under the framework of Reinforcement Learning (RL), a time series anomaly detector that is continually trained to cope with the evolving patterns is designed. Because the anomaly detector is trained with labeled time series, it avoids the cumbersome work of threshold setting and the uncertain definitions of anomalies in time series anomaly detection tasks.
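    The idea of enclosing normal data in a boundary and then relaxing that boundary can be illustrated with a deliberately simplified sketch. This is not SVDD as formulated in the thesis: it assumes a plain centroid-based hypersphere, and the function names and the `quantile` and `relax` parameters are hypothetical choices made only for illustration.

```python
import math
import statistics


def fit_sphere(points, quantile=0.95, relax=1.0):
    """Fit a centroid-based hypersphere around the training points.

    Simplified stand-in for a data description model (not real SVDD):
    the centre is the per-dimension mean, and the radius is a quantile
    of the training distances multiplied by a relaxation factor.
    """
    dim = len(points[0])
    centre = [statistics.fmean(p[i] for p in points) for i in range(dim)]
    dists = sorted(math.dist(p, centre) for p in points)
    idx = min(len(dists) - 1, int(quantile * len(dists)))
    return centre, dists[idx] * relax


def is_anomaly(point, centre, radius):
    """A point is anomalous when it falls outside the hypersphere."""
    return math.dist(point, centre) > radius


# Hypothetical training data: five points around (0.5, 0.5).
train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
strict_centre, strict_radius = fit_sphere(train, relax=1.0)
relaxed_centre, relaxed_radius = fit_sphere(train, relax=1.2)

borderline = (1.1, 1.0)  # slightly outside the strict boundary
far_away = (3.0, 3.0)    # clearly outside both boundaries
```

    With these made-up points, the borderline point is rejected by the strict boundary but accepted by the relaxed one, while the distant point is rejected by both. Whether relaxation helps on real data depends on the noise in the dataset, which is what the thesis analyses.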

    Ensemble Learning based Anomaly Detection for IoT Cybersecurity via Bayesian Hyperparameters Sensitivity Analysis

    The Internet of Things (IoT) integrates billions of intelligent devices across the globe that can communicate with other connected devices with little to no human intervention. IoT enables data aggregation and analysis on a large scale to improve quality of life in many domains. In particular, data collected by IoT contain a tremendous amount of information for anomaly detection. The heterogeneous nature of IoT is both a challenge and an opportunity for cybersecurity. Traditional approaches in cybersecurity monitoring often require different kinds of data pre-processing and handling for various data types, which might be problematic for datasets that contain heterogeneous features. However, heterogeneous types of network devices can often capture a more diverse set of signals than readings from a single type of device, which is particularly useful for anomaly detection. In this paper, we present a comprehensive study on using ensemble machine learning methods for enhancing IoT cybersecurity via anomaly detection. Rather than using one single machine learning model, ensemble learning combines the predictive power of multiple models, enhancing predictive accuracy on heterogeneous datasets. We propose a unified framework with ensemble learning that utilises Bayesian hyperparameter optimisation to adapt to a network environment that contains multiple IoT sensor readings. Experimentally, we illustrate the high predictive power of these methods when compared to traditional methods.
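    As a rough illustration of combining several models rather than relying on one, the sketch below applies majority voting across three simple statistical detectors. It is not the paper's framework: the detectors, their thresholds, and the sensor readings are assumptions chosen for illustration, and Bayesian hyperparameter optimisation is not shown.

```python
import statistics


def zscore_flags(values, threshold=2.0):
    """Flag values whose z-score (population std) exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [False] * len(values)
    return [abs(v - mean) / std > threshold for v in values]


def iqr_flags(values, k=1.5):
    """Flag values outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v < low or v > high for v in values]


def mad_flags(values, threshold=3.5):
    """Flag values whose modified z-score (median/MAD based) is too large."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


def ensemble_flags(values, detectors, min_votes=2):
    """Majority vote: a value is anomalous if enough detectors flag it."""
    votes = [detector(values) for detector in detectors]
    return [sum(column) >= min_votes for column in zip(*votes)]


# Hypothetical sensor readings with one obvious spike at index 7.
readings = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.3, 25.0, 10.1, 9.9]
flags = ensemble_flags(readings, [zscore_flags, iqr_flags, mad_flags])
```

    On these made-up readings only the spike at index 7 collects enough votes. In a tuned system the thresholds (`threshold`, `k`, `min_votes`) are the kind of hyperparameters an optimiser would search over.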

    A Comprehensive Survey on the Cyber-Security of Smart Grids: Cyber-Attacks, Detection, Countermeasure Techniques, and Future Directions

    One of the significant challenges that smart grid networks face is cyber-security. Several surveys have been conducted to highlight those security challenges. However, the majority of these surveys classify attacks based on the security requirements of confidentiality, integrity, and availability, without taking the accountability requirement into consideration. In addition, some of these surveys focus on the Transmission Control Protocol/Internet Protocol (TCP/IP) model, which does not differentiate among the application, session, and presentation layers, or between the data link and physical layers, of the Open System Interconnection (OSI) model. In this survey paper, we provide a classification of attacks based on the OSI model and discuss in more detail the cyber-attacks that can target the different layers of smart grid network communication. We also propose new classifications for the detection and countermeasure techniques and describe existing techniques under each category. Finally, we discuss challenges and future research directions.

    Unsupervised Intrusion Detection with Cross-Domain Artificial Intelligence Methods

    Cybercrime is a major concern for corporations, business owners, governments and citizens, and it continues to grow in spite of increasing investments in security and fraud prevention. The main challenges in this research field are: being able to detect unknown attacks, and reducing the false positive ratio. The aim of this research work was to target both problems by leveraging four artificial intelligence techniques. The first technique is a novel unsupervised learning method based on skip-gram modeling. It was designed, developed and tested against a public dataset with popular intrusion patterns. A high accuracy and a low false positive rate were achieved without prior knowledge of attack patterns. The second technique is a novel unsupervised learning method based on topic modeling. It was applied to three related domains (network attacks, payments fraud, IoT malware traffic). A high accuracy was achieved in the three scenarios, even though the malicious activity significantly differs from one domain to the other. The third technique is a novel unsupervised learning method based on deep autoencoders, with feature selection performed by a supervised method, random forest. Obtained results showed that this technique can outperform other similar techniques. The fourth technique is based on an MLP neural network, and is applied to alert reduction in fraud prevention. This method automates manual reviews previously done by human experts, without significantly impacting accuracy

    Artificial immune system based security algorithm for mobile ad hoc networks

    Securing Mobile Ad hoc Networks (MANETs), which are collections of mobile, decentralized, and self-organized nodes, is a challenging task. The most fundamental aspect of a MANET is its lack of infrastructure, and most design issues and challenges stem from this characteristic. The lack of a centralized control mechanism brings added difficulty in fault detection and correction. The dynamically changing nature of mobile nodes causes the formation of an unpredictable topology. This varying topology causes frequent traffic routing changes, network partitioning and packet losses. The various attacks that can be carried out on MANETs challenge the security capabilities of the mobile wireless network in which nodes can join, leave and move dynamically. The Human Immune System (HIS) provides a foundation upon which Artificial Immune algorithms are based. The algorithms can be used to secure both host-based and network-based systems. However, drawing on the HIS when developing Artificial Immune System (AIS) based algorithms matters no more than delivering an algorithm with high performance. Therefore, striking a balance between utilizing the HIS and the performance of AIS-based intrusion detection algorithms is a crucial issue to investigate. The immune system is key to the defence of a host against foreign objects or pathogens. Proper functioning of the immune system is necessary to maintain host homeostasis. The cells that play a fundamental role in this defence process are known as Dendritic Cells (DC). The AIS-based Dendritic Cell Algorithm is widely known for its large number of applications and is well established in the literature. The dynamic, distributed topology of a MANET provides many challenges, including decentralized infrastructure wherein each node can act as a host, router and relay for traffic. MANETs are a suitable solution for distributed regional, military and emergency networks. 
MANETs do not utilize fixed infrastructure except where a connection to a carrier network is required, and MANET nodes provide the transmission capability to receive, transmit and route traffic from a sender node to the destination node. In the HIS, cells can distinguish between a range of issues including foreign body attacks as well as cellular senescence. The primary purpose of this research is to improve the security of MANETs using the AIS framework. This research presents a new defence approach using AIS which mimics the strategy of the HIS combined with Danger Theory. The proposed framework is known as the Artificial Immune System based Security Algorithm (AISBA). This research also modelled each participating node as a DC and proposed various signals to indicate the state of MANET communications. Two trust models were introduced based on AIS signals and effective communication. The trust models proposed in this research helped to distinguish between a “good node” and a “selfish node”. A new MANET security attack, titled the Packet Storage Time attack, was identified, wherein the attacker node modifies its queue time to make packets stay longer than necessary and then circulates stale packets in the network. This attack is detected using the proposed AISBA. Extensive simulations were performed, with results supporting the effectiveness of the proposed framework, and statistical analysis showed that the false positive and false negative probabilities fall below 5%. Finally, two variations of the AISBA were proposed and investigated: the Grudger based Artificial Immune System Algorithm, which stimulates selfish nodes to cooperate for the benefit of the MANET, and the Pain reduction based Artificial Immune System Algorithm, which models Pain analogous to the HIS.

    Machine Learning

    Machine Learning can be defined in various ways, all relating to a scientific domain concerned with the design and development of theoretical and implementation tools that allow building systems with some human-like intelligent behavior. More specifically, machine learning addresses the ability to improve automatically through experience.

    Metabolic profiling on 2D NMR TOCSY spectra using machine learning

    Due to the dynamicity of biological cells, metabolic profiling plays a highly significant role in discovering biological fingerprints of diseases and their evolution, as well as the cellular pathways of different biological or chemical stimuli. Two-dimensional nuclear magnetic resonance (2D NMR) is one of the fundamental and powerful analytical instruments for metabolic profiling. Although total correlation spectroscopy (2D NMR 1H-1H TOCSY) can be used to reduce the spectral overlap of 1D NMR, strong peak shift, signal overlap, spectral crowding and matrix effects in complex biological mixtures are extremely challenging in 2D NMR analysis. In this work, we introduce an automated metabolic deconvolution and assignment based on the deconvolution of 2D TOCSY of real breast cancer tissue, in addition to different differentiation pathways of adipose tissue-derived human Mesenchymal Stem cells. As a major alternative to the common approaches in NMR-based machine learning, where images of the spectra are used as input, our metabolic assignment is based only on the vertical and horizontal frequencies of metabolites in the 1H-1H TOCSY. One- and multi-class kernel null Foley–Sammon transform, support vector machines, polynomial classifier kernel density estimation, and support vector data description classifiers were tested in semi-supervised learning and novelty detection settings. The classifiers’ performance was evaluated by comparing the conventional human-based methodology and automatic assignments under different initial training size settings. The results of our novel metabolic profiling methods demonstrate their suitability, robustness, and speed in automated nontargeted NMR metabolic analysis.


    Cancer occurs when normal cells grow and multiply without normal control. As the cells multiply, they form an area of abnormal cells, known as a tumour. Many tumours exhibit abnormal chromosomal segregation at cell division. These anomalies play an important role in detecting molar pregnancy cancer. Molar pregnancy, also known as hydatidiform mole, can be categorised into partial (PHM) and complete (CHM) mole, persistent gestational trophoblastic and choriocarcinoma. Hydatidiform moles are most commonly found in women under the age of 17 or over the age of 35. Hydatidiform moles can be detected by morphological and histopathological examination. Even experienced pathologists cannot easily distinguish between complete and partial hydatidiform moles. However, this distinction is important in order to recommend the appropriate treatment method. Therefore, research into molar pregnancy image analysis and understanding is critical. The hypothesis of this research project is that an anomaly detection approach to analysing molar pregnancy images can improve image analysis and classification of normal, PHM and CHM villi. The primary aim of this research project is to develop a novel method, based on anomaly detection, to identify and classify anomalous villi in molar pregnancy stained images. The novel method is developed to simulate expert pathologists’ approach to the diagnosis of anomalous villi. The knowledge and heuristics elicited from two expert pathologists are combined with the morphological domain knowledge of molar pregnancy to develop a heuristic multi-neural network architecture designed to classify the villi into their appropriate anomalous types. This study confirmed that a single feature cannot give enough discriminative power for villi classification. 
Whereas expert pathologists consider size and shape before textural features, this thesis demonstrated that the textural feature has a higher discriminative power than size and shape. The first heuristic-based multi-neural network, which was based on 15 elicited features, achieved an improved average accuracy of 81.2%, compared to the traditional multi-layer perceptron (80.5%); however, the recall of the CHM villi class was still low (64.3%). Two further textural features, which were elicited and added to the second heuristic-based multi-neural network, improved the average accuracy from 81.2% to 86.1% and the recall of the CHM villi class from 64.3% to 73.5%. The precision of multi-neural network II also increased from 82.7% to 89.5% for the normal villi class, from 81.3% to 84.7% for the PHM villi class and from 80.8% to 86% for the CHM villi class. To help pathologists visualise the results of the segmentation, a software tool, the Hydatidiform Mole Analysis Tool (HYMAT), was developed, compiling the morphological and pathological data for each villus analysis.
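    For readers less familiar with the reported metrics, the sketch below shows how accuracy, per-class precision and per-class recall follow from a three-class confusion matrix. The counts are invented purely for illustration and are not the thesis's data; the function names are likewise hypothetical.

```python
def per_class_metrics(confusion, labels):
    """Compute precision and recall for each class.

    confusion[i][j] is the number of samples whose true class is
    labels[i] and whose predicted class is labels[j].
    """
    metrics = {}
    for k, label in enumerate(labels):
        true_positive = confusion[k][k]
        actual_total = sum(confusion[k])                    # row sum
        predicted_total = sum(row[k] for row in confusion)  # column sum
        metrics[label] = {
            "precision": true_positive / predicted_total if predicted_total else 0.0,
            "recall": true_positive / actual_total if actual_total else 0.0,
        }
    return metrics


def accuracy(confusion):
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Invented counts (rows = true class, columns = predicted class).
labels = ["normal", "PHM", "CHM"]
confusion = [
    [90, 8, 2],    # true normal villi
    [6, 80, 14],   # true PHM villi
    [4, 12, 34],   # true CHM villi
]
scores = per_class_metrics(confusion, labels)
overall = accuracy(confusion)
```

    In this made-up matrix the CHM class has the lowest recall (34 of 50 true CHM villi recovered) even though overall accuracy is above 80%, which mirrors the kind of gap between average accuracy and minority-class recall that the abstract reports.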