1,236 research outputs found

    DeepSQLi: Deep Semantic Learning for Testing SQL Injection

    Full text link
    Security is unarguably the most serious concern for Web applications, to which SQL injection (SQLi) attack is one of the most devastating attacks. Automatically testing SQLi vulnerabilities is of ultimate importance, yet is unfortunately far from trivial to implement. This is because the existence of a huge, or potentially infinite, number of variants and semantic possibilities of SQL leading to SQLi attacks on various Web applications. In this paper, we propose a deep natural language processing based tool, dubbed DeepSQLi, to generate test cases for detecting SQLi vulnerabilities. Through adopting deep learning based neural language model and sequence of words prediction, DeepSQLi is equipped with the ability to learn the semantic knowledge embedded in SQLi attacks, allowing it to translate user inputs (or a test case) into a new test case, which is semantically related and potentially more sophisticated. Experiments are conducted to compare DeepSQLi with SQLmap, a state-of-the-art SQLi testing automation tool, on six real-world Web applications that are of different scales, characteristics and domains. Empirical results demonstrate the effectiveness and the remarkable superiority of DeepSQLi over SQLmap, such that more SQLi vulnerabilities can be identified by using a less number of test cases, whilst running much faster

    CleanML: A Study for Evaluating the Impact of Data Cleaning on ML Classification Tasks

    Full text link
    Data quality affects machine learning (ML) model performances, and data scientists spend considerable amount of time on data cleaning before model training. However, to date, there does not exist a rigorous study on how exactly cleaning affects ML -- ML community usually focuses on developing ML algorithms that are robust to some particular noise types of certain distributions, while database (DB) community has been mostly studying the problem of data cleaning alone without considering how data is consumed by downstream ML analytics. We propose a CleanML study that systematically investigates the impact of data cleaning on ML classification tasks. The open-source and extensible CleanML study currently includes 14 real-world datasets with real errors, five common error types, seven different ML models, and multiple cleaning algorithms for each error type (including both commonly used algorithms in practice as well as state-of-the-art solutions in academic literature). We control the randomness in ML experiments using statistical hypothesis testing, and we also control false discovery rate in our experiments using the Benjamini-Yekutieli (BY) procedure. We analyze the results in a systematic way to derive many interesting and nontrivial observations. We also put forward multiple research directions for researchers.Comment: published in ICDE 202

    AN ENHANCED SQL INJECTION DETECTION USING ENSEMBLE METHOD

    Get PDF
    SQL injection is a cybercrime that attacks websites. This issue is still a challenging issue in the realm of security that must be resolved. These attacks are very costly financially, which count millions of dollars each year. Due to large data leaks, the losses also impact the world economy, which averages nearly $50 per year, and most of them are caused by SQL injection. In a study of 300,000 attacks worldwide in any given month, 24.6% were SQL injection. Therefore, implementing a strategy to protect against web application attacks is essential and not easy because we have to protect user privacy and enterprise data. This study proposes an enhanced SQL injection detection using the voting classifier method based on several machine learning algorithms. The proposed classifier could achieve the highest accuracy from this research in 97.07%

    A survey on the application of deep learning for code injection detection

    Get PDF
    Abstract Code injection is one of the top cyber security attack vectors in the modern world. To overcome the limitations of conventional signature-based detection techniques, and to complement them when appropriate, multiple machine learning approaches have been proposed. While analysing these approaches, the surveys focus predominantly on the general intrusion detection, which can be further applied to specific vulnerabilities. In addition, among the machine learning steps, data preprocessing, being highly critical in the data analysis process, appears to be the least researched in the context of Network Intrusion Detection, namely in code injection. The goal of this survey is to fill in the gap through analysing and classifying the existing machine learning techniques applied to the code injection attack detection, with special attention to Deep Learning. Our analysis reveals that the way the input data is preprocessed considerably impacts the performance and attack detection rate. The proposed full preprocessing cycle demonstrates how various machine-learning-based approaches for detection of code injection attacks take advantage of different input data preprocessing techniques. The most used machine learning methods and preprocessing stages have been also identified

    Flexible and Robust Real-Time Intrusion Detection Systems to Network Dynamics

    Get PDF
    Deep learning-based intrusion detection systems have advanced due to their technological innovations such as high accuracy, automation, and scalability to develop an effective network intrusion detection system (NIDS). However, most of the previous research has focused on model generation through intensive analysis of feature engineering instead of considering real environments. They have limitations to applying the previous methods for a real network environment to detect real-time network attacks. In this paper, we propose a new flexible and robust NIDS based on Recurrent Neural Network (RNN) with a multi-classifier to generate a detection model in real time. The proposed system adaptively and intelligently adjusts the generated model with given system parameters that can be used as security parameters to defend against the attacker’s obfuscation techniques in real time. In the experimental results, the proposed system detects network attacks with a high accuracy and high-speed model upgrade in real-time while showing robustness under an attack

    A Machine Learning Approach for Intrusion Detection

    Get PDF
    Master's thesis in Information- and communication technology (IKT590)Securing networks and their confidentiality from intrusions is crucial, and for this rea-son, Intrusion Detection Systems have to be employed. The main goal of this thesis is to achieve a proper detection performance of a Network Intrusion Detection System (NIDS). In this thesis, we have examined the detection efficiency of machine learning algorithms such as Neural Network, Convolutional Neural Network, Random Forestand Long Short-Term Memory. We have constructed our models so that they can detect different types of attacks utilizing the CICIDS2017 dataset. We have worked on identifying 15 various attacks present in CICIDS2017, instead of merely identifying normal-abnormal traffic. We have also discussed the reason why to use precisely this dataset, and why should one classify by attack to enhance the detection. Previous works based on benchmark datasets such as NSL-KDD and KDD99 are discussed. Also, how to address and solve these issues. The thesis also shows how the results are effected using different machine learning algorithms. As the research will demon-strate, the Neural Network, Convulotional Neural Network, Random Forest and Long Short-Term Memory are evaluated by conducting cross validation; the average score across five folds of each model is at 92.30%, 87.73%, 94.42% and 87.94%, respectively. Nevertheless, the confusion metrics was also a crucial measurement to evaluate the models, as we shall see. Keywords: Information security, NIDS, Machine Learning, Neural Network, Convolutional Neural Network, Random Forest, Long Short-Term Memory, CICIDS2017

    Evaluation of machine learning algorithms for anomaly detection

    Get PDF
    Malicious attack detection is one of the critical cyber-security challenges in the peer-to-peer smart grid platforms due to the fact that attackers' behaviours change continuously over time. In this paper, we evaluate twelve Machine Learning (ML) algorithms in terms of their ability to detect anomalous behaviours over the networking practice. The evaluation is performed on three publicly available datasets: CICIDS-2017, UNSW-NB15 and the Industrial Control System (ICS) cyber-attack datasets. The experimental work is performed through the ALICE high-performance computing facility at the University of Leicester. Based on these experiments, a comprehensive analysis of the ML algorithms is presented. The evaluation results verify that the Random Forest (RF) algorithm achieves the best performance in terms of accuracy, precision, Recall, F1-Score and Receiver Operating Characteristic (ROC) curves on all these datasets. It is worth pointing out that other algorithms perform closely to RF and that the decision regarding which ML algorithm to select depends on the data produced by the application system
    • …
    corecore