    Reduction of False Positives in Intrusion Detection Based on Extreme Learning Machine with Situation Awareness

    Protecting computer networks from intrusions is more important than ever for our privacy, economy, and national security. Seemingly a month does not pass without news of a major data breach involving sensitive personal identity, financial, medical, trade secret, or national security data. Democratic processes can now potentially be compromised through breaches of electronic voting systems. As more devices, including medical machines, automobiles, and control systems for critical infrastructure, become networked, human life is also more at risk from cyber-attacks. Research into Intrusion Detection Systems (IDSs) began several decades ago, and IDSs remain a mainstay of computer and network protection and continue to evolve. However, detecting previously unseen, or zero-day, threats is still an elusive goal. Many commercial IDS deployments still use misuse detection based on known threat signatures. In academic research, systems utilizing anomaly detection have shown great promise for detecting previously unseen threats. However, their success has been limited, in large part by the excessive number of false positives that they produce. This research demonstrates that false positives can be better minimized, while maintaining detection accuracy, by combining Extreme Learning Machine (ELM) and Hidden Markov Models (HMM) as classifiers within the context of a situation awareness framework. This research was performed using the University of New South Wales - Network Based 2015 (UNSW-NB15) data set, which is more representative of contemporary cyber-attack and normal network traffic than the older data sets typically used in IDS research. It is shown that this approach provides better results than either HMM or ELM alone, with a lower False Positive Rate (FPR) than other comparable approaches that also used the UNSW-NB15 data set.
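
    As a rough illustration of two terms used in this abstract, the sketch below trains a basic Extreme Learning Machine (random, fixed hidden layer with output weights solved by least squares) and computes a False Positive Rate. It is not the authors' method: the HMM stage and the situation awareness framework are omitted, and the function names, tanh activation, hidden-layer size, and 0.5 decision threshold are all assumptions made for the example.

```python
import numpy as np

def train_elm(X, y, n_hidden=50, seed=0):
    """Fit a basic ELM: random hidden weights, least-squares output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases, never trained
    H = np.tanh(X @ W + b)                       # hidden-layer activations
    beta = np.linalg.pinv(H) @ y                 # output weights via pseudo-inverse
    return W, b, beta

def predict_elm(model, X, threshold=0.5):
    """Return 1 (attack) where the ELM output reaches the assumed threshold."""
    W, b, beta = model
    return (np.tanh(X @ W + b) @ beta >= threshold).astype(int)

def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN), with label 0 = normal traffic and 1 = attack."""
    fp = np.sum((y_true == 0) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    return float(fp / (fp + tn)) if (fp + tn) else 0.0
```

    In this sketch only the output weights are fitted, which is what makes ELM training a single linear solve rather than an iterative optimisation.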

    Electronic fraud detection in the U.S. Medicaid Healthcare Program: lessons learned from other industries

    It is estimated that between $600 and $850 billion annually is lost to fraud, waste, and abuse in the US healthcare system, with $125 to $175 billion of this due to fraudulent activity (Kelley 2009). Medicaid, a state-run, federally-matched government program which accounts for roughly one-quarter of all healthcare expenses in the US, has been a particularly susceptible target for fraud in recent years. With escalating overall healthcare costs, payers, especially government-run programs, must seek savings throughout the system to maintain reasonable quality of care standards. As such, the need for effective fraud detection and prevention is critical. Electronic fraud detection systems are widely used in the insurance, telecommunications, and financial sectors. What lessons can be learned from these efforts and applied to improve fraud detection in the Medicaid health care program? In this paper, we conduct a systematic literature study to analyze the applicability of existing electronic fraud detection techniques in similar industries to the US Medicaid program.

    Featured Anomaly Detection Methods and Applications

    Anomaly detection is a fundamental research topic that has been widely investigated. From critical industrial systems, e.g., network intrusion detection systems, to people’s daily activities, e.g., mobile fraud detection, anomaly detection has become a vital first line of defence for protecting public and personal property. Although anomaly detection methods have been under consistent development over the years, the explosive growth of data volume and the continued dramatic variation of data patterns pose great challenges to anomaly detection systems and are fuelling demand for more intelligent anomaly detection methods with distinct characteristics to cope with various needs. To this end, this thesis starts by presenting a thorough review of existing anomaly detection strategies and methods. The advantages and disadvantages of the strategies and methods are elaborated. Afterward, four distinctive anomaly detection methods, especially for time series, are proposed, aiming to resolve specific needs of anomaly detection under different scenarios, e.g., enhanced accuracy, interpretable results, and self-evolving models. Experiments are presented and analysed to offer a better understanding of the performance of the methods and their distinct features. More specifically, the key contents of this thesis are as follows: 1) Support Vector Data Description (SVDD) is investigated as a primary method for accurate anomaly detection. The applicability of SVDD to noisy time series datasets is carefully examined, and it is demonstrated that relaxing the decision boundary of SVDD always results in better accuracy in network time series anomaly detection. A theoretical analysis of the parameter utilised in the model is also presented to ensure the validity of the relaxation of the decision boundary.
2) To support a clear explanation of the detected time series anomalies, i.e., anomaly interpretation, the periodic pattern of time series data is considered as contextual information to be integrated into SVDD for anomaly detection. The formulation of SVDD with contextual information maintains multiple discriminants, which help in distinguishing the root causes of the anomalies. 3) In an attempt to further analyse a dataset for anomaly detection and interpretation, Convex Hull Data Description (CHDD) is developed for realising one-class classification together with data clustering. CHDD approximates the convex hull of a given dataset with the extreme points, which constitute a dictionary of data representatives. Using this dictionary, CHDD is capable of representing and clustering all the normal data instances, so that anomaly detection is realised with a degree of interpretation. 4) Besides better anomaly detection accuracy and interpretability, better solutions for anomaly detection over streaming data with evolving patterns are also researched. Under the framework of Reinforcement Learning (RL), a time series anomaly detector that is continually trained to cope with the evolving patterns is designed. Because the anomaly detector is trained with labelled time series, it avoids the cumbersome work of threshold setting and the uncertain definitions of anomalies in time series anomaly detection tasks.
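
    The idea of a data description with a relaxable decision boundary, mentioned in item 1) above, can be sketched with a much simpler stand-in than SVDD. The example below is not SVDD (there is no kernel and no quadratic program); it encloses the training data in a sphere centred on the mean and lets a hypothetical `relax` factor enlarge the radius. All names and the quantile-based radius are assumptions made for illustration.

```python
import numpy as np

def fit_sphere(X, quantile=0.95):
    """Describe normal data by a centre and a radius covering `quantile` of it."""
    center = X.mean(axis=0)
    dists = np.linalg.norm(X - center, axis=1)   # distance of each point to the centre
    return center, float(np.quantile(dists, quantile))

def is_anomaly(x, center, radius, relax=1.0):
    """Flag x if it lies outside the sphere; relax > 1 loosens the boundary."""
    return bool(np.linalg.norm(np.asarray(x, dtype=float) - center) > relax * radius)

# Toy usage: four normal points at distance 1 from the origin.
X_train = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
center, radius = fit_sphere(X_train)
strict = is_anomaly([1.2, 0.0], center, radius)              # outside the tight sphere
relaxed = is_anomaly([1.2, 0.0], center, radius, relax=1.5)  # inside the relaxed sphere
```

    With the relaxed boundary, a slightly noisy point is no longer flagged, which is the kind of trade-off the abstract attributes to relaxing the SVDD decision boundary on noisy time series.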

    Unsupervised Intrusion Detection with Cross-Domain Artificial Intelligence Methods

    Cybercrime is a major concern for corporations, business owners, governments and citizens, and it continues to grow in spite of increasing investments in security and fraud prevention. The main challenges in this research field are: being able to detect unknown attacks, and reducing the false positive ratio. The aim of this research work was to target both problems by leveraging four artificial intelligence techniques. The first technique is a novel unsupervised learning method based on skip-gram modeling. It was designed, developed and tested against a public dataset with popular intrusion patterns. A high accuracy and a low false positive rate were achieved without prior knowledge of attack patterns. The second technique is a novel unsupervised learning method based on topic modeling. It was applied to three related domains (network attacks, payments fraud, IoT malware traffic). A high accuracy was achieved in the three scenarios, even though the malicious activity significantly differs from one domain to the other. The third technique is a novel unsupervised learning method based on deep autoencoders, with feature selection performed by a supervised method, random forest. Obtained results showed that this technique can outperform other similar techniques. The fourth technique is based on an MLP neural network, and is applied to alert reduction in fraud prevention. This method automates manual reviews previously done by human experts, without significantly impacting accuracy

    Self-adaptive structure semi-supervised methods for streamed emblematic gestures

    Although many researchers are trying to improve machine intelligence, there is still a long way to go before machines achieve intelligence similar to that of humans. Scientists and engineers are continuously trying to make modern technology, e.g. smartphones and robots, smarter. Humans communicate with each other using voice and gestures. Hence, gestures are essential for conveying information to a partner. To reach a higher level of intelligence, a machine should learn from and react to human gestures, which means learning from continuously streamed gestures. This task faces serious challenges, since processing streamed data raises several problems. Besides being unlabelled, the stream is long. Furthermore, “concept drift” and “concept evolution” are among its main problems. Data streams have several other problems worth mentioning here: the data change dynamically, are presented only once, arrive at high speed, and are non-linearly distributed. In addition to the general problems of data streams, gestures pose additional problems. For example, different techniques are required to handle the variety of gesture types. The available methods solve some of these problems individually, while we present a technique to solve them all together. Unlabelled data may carry additional information that describes the labelled data more precisely. Hence, semi-supervised learning is used to handle the labelled and unlabelled data. However, the data size increases continuously, which makes training classifiers hard. Hence, we integrate incremental learning with semi-supervised learning, which enables the model to update itself on new data without needing the old data. Additionally, we integrate incremental class learning into the semi-supervised learning, since there is a high likelihood of new concepts arriving in the streamed gestures.
Moreover, the system should be able to distinguish among different concepts and should also be able to identify random movements. Hence, we integrate novelty detection to distinguish between the gestures that belong to known concepts and those that belong to unknown concepts. Extreme value theory is used for this purpose, which removes the need for additional labelled data to set the novelty threshold and has several other supportive features. Clustering algorithms are used to distinguish among different new concepts and also to identify random movements. Furthermore, the system should update itself only on trustworthy assignments, since updating the classifier on a wrongly assigned gesture degrades the performance of the system. Hence, we propose confidence measures for the assigned labels. We propose six types of semi-supervised algorithms that depend on different techniques to handle different types of gestures. The proposed classifiers are based on the Parzen window classifier, support vector machine classifier, neural network (extreme learning machine), polynomial classifier, Mahalanobis classifier, and nearest class mean classifier. All of these classifiers are provided with the features mentioned above. Additionally, we present a wrapper method that uses one of the proposed classifiers, or an ensemble of them, to autonomously issue new labels for the new concepts and to update the classifiers on the newly incoming information, depending on whether it belongs to the known classes or to new classes. It can recognise the different novel concepts and also identify random movements. To evaluate the system, we acquired gesture data with nine different gesture classes. Each of them represents a different order to the machine, e.g. come, go, etc. The data were collected using the Microsoft Kinect sensor. The acquired data contain 2878 gestures performed by ten volunteers. Different sets of features are computed and used in the evaluation of the system.
Additionally, we used real data, synthetic data and public data to support the evaluation process. All the features (incremental learning, incremental class learning, and novelty detection) are evaluated individually. The outputs of the classifiers are compared with the original classifier or with the benchmark classifiers. The results show high performance of the proposed algorithms.
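
    To make the combination of incremental learning, incremental class learning, and novelty detection more concrete, here is a minimal sketch built on a nearest class mean classifier, one of the six base classifiers named above. It is not the thesis implementation: the class name `IncrementalNCM` is hypothetical, and a fixed distance threshold stands in for the extreme value theory threshold the abstract describes.

```python
import numpy as np

class IncrementalNCM:
    """Nearest class mean classifier updated one sample at a time."""

    def __init__(self, novelty_threshold):
        self.means = {}    # label -> running mean feature vector
        self.counts = {}   # label -> number of samples seen
        self.novelty_threshold = novelty_threshold  # assumed fixed, not EVT-derived

    def partial_fit(self, x, label):
        """Update one class mean; old samples do not need to be stored."""
        x = np.asarray(x, dtype=float)
        if label not in self.means:      # incremental class learning: new concept
            self.means[label] = x.copy()
            self.counts[label] = 1
        else:                            # incremental learning: running-mean update
            self.counts[label] += 1
            self.means[label] += (x - self.means[label]) / self.counts[label]

    def predict(self, x):
        """Return the nearest known label, or None if x looks novel."""
        if not self.means:
            return None
        x = np.asarray(x, dtype=float)
        label, dist = min(
            ((lbl, np.linalg.norm(x - m)) for lbl, m in self.means.items()),
            key=lambda pair: pair[1],
        )
        return label if dist <= self.novelty_threshold else None

# Toy usage with two gesture classes and 2-D features.
clf = IncrementalNCM(novelty_threshold=1.0)
clf.partial_fit([0.0, 0.0], "come")
clf.partial_fit([2.0, 0.0], "come")
clf.partial_fit([10.0, 10.0], "go")
```

    A sample that falls far from every class mean returns `None` here; in the setting the abstract describes, such samples would be passed on to clustering to separate new concepts from random movements.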