7 research outputs found

    A LogitBoost-based algorithm for detecting known and unknown web attacks

    © 2017 The Authors. Published by IEEE. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: https://doi.org/10.1109/ACCESS.2017.2766844
    The rapid growth in the volume and importance of web communication throughout the Internet has heightened the need for better security protection. Security experts, when protecting systems, maintain a database featuring signatures of a large number of attacks to assist with attack detection. However, used in isolation, such a database limits the system to recognizing only known attacks. To overcome this problem, we propose an anomaly-based intrusion detection system using an ensemble classification approach to detect unknown attacks on web servers. The process involves removing irrelevant and redundant features utilising a filter and wrapper selection procedure. LogitBoost is then employed together with random forests as a weak classifier. The proposed ensemble technique was evaluated with artificial data sets, namely NSL-KDD, an improved version of the old KDD Cup data set from 1999, and the recently published UNSW-NB15 data set. The experimental results show that our approach is superior to the traditional approaches in terms of accuracy and detection rate, whilst preserving low false rejection rates.
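The boosting step named in the abstract above can be illustrated with a minimal two-class LogitBoost loop in Python with NumPy. This is a sketch under stated assumptions, not the authors' implementation: regression stumps stand in for the random forests the paper uses as weak learners, the filter and wrapper feature-selection stage is omitted, and the function names and the toy data are hypothetical.

```python
import numpy as np

def fit_stump(X, z, w):
    """Weighted least-squares regression stump: returns (feature, threshold, left, right)."""
    best = None
    for j in range(X.shape[1]):
        # Thresholds that leave at least one sample on each side.
        for t in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= t
            cl = np.average(z[left], weights=w[left])
            cr = np.average(z[~left], weights=w[~left])
            err = np.sum(w * (z - np.where(left, cl, cr)) ** 2)
            if best is None or err < best[0]:
                best = (err, j, t, cl, cr)
    return best[1:]

def logitboost_fit(X, y, rounds=20):
    """Two-class LogitBoost; y holds labels in {0, 1}."""
    F = np.zeros(len(y))
    stumps = []
    for _ in range(rounds):
        p = 1.0 / (1.0 + np.exp(-2.0 * F))        # current class-1 probability
        w = np.clip(p * (1.0 - p), 1e-10, None)   # Newton weights, floored for stability
        z = np.clip((y - p) / w, -4.0, 4.0)       # working response, clipped
        j, t, cl, cr = fit_stump(X, z, w)
        F += 0.5 * np.where(X[:, j] <= t, cl, cr)
        stumps.append((j, t, cl, cr))
    return stumps

def logitboost_predict(X, stumps):
    """Sign of the additive model F(x) gives the predicted class."""
    F = sum(0.5 * np.where(X[:, j] <= t, cl, cr) for j, t, cl, cr in stumps)
    return (F > 0).astype(int)

# Hypothetical toy data: the label depends on a single feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] > 0.3).astype(float)

stumps = logitboost_fit(X, y, rounds=20)
acc = float(np.mean(logitboost_predict(X, stumps) == y))
print(f"training accuracy: {acc:.2f}")
```

Replacing `fit_stump` with a stronger regressor would move the sketch closer to the ensemble the abstract describes, at the price of a longer example.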

    SuperB: Superior Behavior-based Anomaly Detection Defining Authorized Users' Traffic Patterns

    Network anomalies are correlated to activities that deviate from regular behavior patterns in a network, and they are undetectable until their actions are defined as malicious. Current work in network anomaly detection includes network-based and host-based intrusion detection systems. However, network anomaly detection schemes can suffer from high false detection rates due to the base rate fallacy. When the detection rate is less than the false positive rate, which is found in network anomaly detection schemes working with live data, a high false detection rate can occur. To overcome such a drawback, this paper proposes a superior behavior-based anomaly detection system (SuperB) that defines legitimate network behaviors of authorized users in order to identify unauthorized accesses. I define the network behaviors of the authorized users by training the proposed deep learning model with time series data extracted from network packets of each of the users. Then, the trained model is used to distinguish all other behaviors (which I define as anomalies) from the defined legitimate behaviors. As a result, SuperB effectively detects all anomalies of network behaviors. The simulation results show that SuperB needs at least five end-to-end network conversations to achieve over 95% accuracy and over 93% true positive rate. Some simulations achieved 100% accuracy and true positive rate. The simulations use live network data combined with the CICIDS2017 data set. The performance has an average of less than 1.1% false positive rate, with some simulations showing 0%. The execution time to process each conversation is 85.20 ± 0.60 milliseconds (ms), and thus it takes only about 426 ms to process five conversations to identify an anomaly.
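The base rate fallacy mentioned in the abstract above can be made concrete with a short Bayes' rule calculation in Python. The sketch below is illustrative only: the function name and every number in it (base rate, true positive rate, false positive rate) are hypothetical inputs, not results from the paper.

```python
def alert_precision(base_rate: float, tpr: float, fpr: float) -> float:
    """P(attack | alert) by Bayes' rule.

    base_rate: fraction of traffic that is actually malicious
    tpr:       true positive rate of the detector
    fpr:       false positive rate of the detector
    """
    true_alerts = base_rate * tpr
    false_alerts = (1.0 - base_rate) * fpr
    return true_alerts / (true_alerts + false_alerts)

# Hypothetical detector: 95% true positive rate, 1% false positive rate.
rare = alert_precision(base_rate=0.001, tpr=0.95, fpr=0.01)
balanced = alert_precision(base_rate=0.5, tpr=0.95, fpr=0.01)

print(f"attacks are 0.1% of traffic: {rare:.1%} of alerts are real")
print(f"attacks are 50% of traffic:  {balanced:.1%} of alerts are real")
```

With attacks making up 0.1% of traffic, fewer than one alert in ten is genuine even though the detector looks accurate, which is why a low false positive rate matters so much on live data.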

    A Low-Cost Machine Learning Based Network Intrusion Detection System With Data Privacy Preservation

    Network intrusion is a well-studied area of cyber security. Current machine learning-based network intrusion detection systems (NIDSs) monitor network data and the patterns within those data, but at the cost of significant privacy violations that may threaten end users. Therefore, to mitigate risk and preserve a balance between security and privacy, it is imperative to protect user privacy with respect to intrusion data. Moreover, cost is a driver of a machine learning-based NIDS because such systems are increasingly being deployed on resource-limited edge devices. To solve these issues, in this paper we propose a NIDS called PCC-LSM-NIDS that is composed of a Pearson Correlation Coefficient (PCC) based feature selection algorithm and a Least Square Method (LSM) based privacy-preserving algorithm to achieve low-cost intrusion detection while providing privacy preservation for sensitive data. The proposed PCC-LSM-NIDS is tested on the benchmark intrusion database UNSW-NB15, using five popular classifiers. The experimental results show that the proposed PCC-LSM-NIDS requires less computational time while offering an appropriate degree of privacy protection.
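A Pearson-correlation feature filter of the kind the abstract above names can be sketched in Python with NumPy. The abstract does not say how the correlation is used, so this sketch assumes the simplest variant: rank each feature by the absolute value of its Pearson correlation with the class label and keep the top k. The function name `pcc_select`, the value of `k`, and the synthetic data are all hypothetical, and the LSM privacy-preserving step is not shown.

```python
import numpy as np

def pcc_select(X, y, k):
    """Return (indices of the k features with largest |Pearson r| vs. y, all r values)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)          # centre each feature column
    yc = y - y.mean()                # centre the label
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant columns have zero variance; treat their correlation as 0.
    r = np.divide(Xc.T @ yc, denom, out=np.zeros(X.shape[1]), where=denom > 0)
    order = np.argsort(-np.abs(r), kind="stable")
    return order[:k], r

# Synthetic data: columns 0 and 3 track the label, column 1 is noise,
# column 2 is constant.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200).astype(float)
X = np.column_stack([
    y + 0.1 * rng.normal(size=200),
    rng.normal(size=200),
    np.ones(200),
    -y + 0.1 * rng.normal(size=200),
])

selected, r = pcc_select(X, y, k=2)
print("selected feature indices:", sorted(selected.tolist()))
```

Dropping weakly correlated columns before training shrinks the input that every downstream classifier has to process, which is one plausible route to the lower computational time the abstract reports.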

    A Low-Cost Machine Learning Based Network Intrusion Detection System with Data Privacy Preservation

    Network intrusion is a well-studied area of cyber security. Current machine learning-based network intrusion detection systems (NIDSs) monitor network data and the patterns within those data, but at the cost of significant privacy violations that may threaten end users. Therefore, to mitigate risk and preserve a balance between security and privacy, it is imperative to protect user privacy with respect to intrusion data. Moreover, cost is a driver of a machine learning-based NIDS because such systems are increasingly being deployed on resource-limited edge devices. To solve these issues, in this paper we propose a NIDS called PCC-LSM-NIDS that is composed of a Pearson Correlation Coefficient (PCC) based feature selection algorithm and a Least Square Method (LSM) based privacy-preserving algorithm to achieve low-cost intrusion detection while providing privacy preservation for sensitive data. The proposed PCC-LSM-NIDS is tested on the benchmark intrusion database UNSW-NB15, using five popular classifiers. The experimental results show that the proposed PCC-LSM-NIDS requires less computational time while offering an appropriate degree of privacy protection.

    A Robust Cardiovascular Disease Predictor Based on Genetic Feature Selection and Ensemble Learning Classification

    Timely detection of heart diseases is crucial for treating cardiac patients prior to the occurrence of any fatality. Automated early detection of these diseases is a necessity in areas where specialized doctors are limited. Deep learning methods provided with a decent set of heart disease data can be used to achieve this. This article proposes a robust heart disease prediction strategy using genetic algorithms and ensemble deep learning techniques. Genetic algorithms are used to select the more significant features from a high-dimensional dataset, and are combined with deep learning techniques such as Adaptive Neuro-Fuzzy Inference System (ANFIS), Multi-Layer Perceptron (MLP), and Radial Basis Function (RBF) to achieve the goal. The boosting algorithm LogitBoost is used as a meta-learning classifier for predicting heart disease. On the Cleveland heart disease dataset found in the UCI repository, the approach yields an overall accuracy of 99.66%, which is higher than many of the most efficient approaches now in existence.

    Reduction of False Positives in Intrusion Detection Based on Extreme Learning Machine with Situation Awareness

    Protecting computer networks from intrusions is more important than ever for our privacy, economy, and national security. Seemingly a month does not pass without news of a major data breach involving sensitive personal identity, financial, medical, trade secret, or national security data. Democratic processes can now be potentially compromised through breaches of electronic voting systems. As more devices, including medical machines, automobiles, and control systems for critical infrastructure, are networked, human life is also more at risk from cyber-attacks. Research into Intrusion Detection Systems (IDSs) began several decades ago, and IDSs are still a mainstay of computer and network protection and continue to evolve. However, detecting previously unseen, or zero-day, threats is still an elusive goal. Many commercial IDS deployments still use misuse detection based on known threat signatures. Systems utilizing anomaly detection have shown great promise to detect previously unseen threats in academic research, but their success has been limited in large part due to the excessive number of false positives that they produce. This research demonstrates that false positives can be better minimized, while maintaining detection accuracy, by combining Extreme Learning Machine (ELM) and Hidden Markov Models (HMM) as classifiers within the context of a situation awareness framework. This research was performed using the University of New South Wales - Network Based 2015 (UNSW-NB15) data set, which is more representative of contemporary cyber-attack and normal network traffic than older data sets typically used in IDS research. It is shown that this approach provides better results than either HMM or ELM alone and with a lower False Positive Rate (FPR) than other comparable approaches that also used the UNSW-NB15 data set.

    On the predictability of U.S. stock market using machine learning and deep learning techniques

    Conventional market theories are considered to be an inconsistent approach in modern financial analysis. This thesis focuses mainly on the application of sophisticated machine learning and deep learning techniques to stock market statistical predictability and economic significance, measured against the benchmark conventional efficient market hypothesis and econometric models. Five chapters and three publishable papers were proposed altogether, and each chapter is developed to solve specific identifiable problem(s). Chapter one gives the general introduction of the thesis. It presents the statement of the research problems identified in the relevant literature, the objective of the study and the significance of the study. Chapter two applies a plethora of machine learning techniques to forecast the direction of the U.S. stock market. Notable sophisticated techniques such as regularization, discriminant analysis, classification trees, Bayesian classifiers and neural networks were employed. The empirical findings revealed that the discriminant analysis classifiers, classification trees, Bayesian classifiers and penalized binary probit models demonstrate significant outperformance over the binary probit models both statistically and economically, proving to be significant alternatives for portfolio managers. Chapter three focuses mainly on the application of regression training (RT) techniques to forecast the U.S. equity premium. The RT models demonstrate significant evidence of equity premium predictability both statistically and economically relative to the benchmark historical average, delivering significant utility gains. Chapter four investigates the statistical predictive power and economic significance of financial stock market data using deep learning techniques. Chapter five gives the summary and conclusion and presents area(s) of further research. The techniques are proven to be robust both statistically and economically when forecasting the equity premium out-of-sample using a recursive window method. Overall, the deep learning techniques produced the best result in this thesis. They seek to provide meaningful economic information on mean-variance portfolio investment for investors who are timing the market to earn future gains at minimal risk.