Search CORE

2,802 research outputs found

A survey on the application of deep learning for code injection detection

Author: Giuseppe Bianchi
Stanislav Abaimov
Publication venue
Publication date: 01/09/2021
Field of study

Abstract Code injection is one of the top cyber security attack vectors in the modern world. To overcome the limitations of conventional signature-based detection techniques, and to complement them when appropriate, multiple machine learning approaches have been proposed. While analysing these approaches, the surveys focus predominantly on the general intrusion detection, which can be further applied to specific vulnerabilities. In addition, among the machine learning steps, data preprocessing, being highly critical in the data analysis process, appears to be the least researched in the context of Network Intrusion Detection, namely in code injection. The goal of this survey is to fill in the gap through analysing and classifying the existing machine learning techniques applied to the code injection attack detection, with special attention to Deep Learning. Our analysis reveals that the way the input data is preprocessed considerably impacts the performance and attack detection rate. The proposed full preprocessing cycle demonstrates how various machine-learning-based approaches for detection of code injection attacks take advantage of different input data preprocessing techniques. The most used machine learning methods and preprocessing stages have been also identified

Directory of Open Access Journals

Open Access Repository

A Novel Technique to Pre-Process Web Log Data Using SQL Server Management Studio

Author: Kalaivani S. (S)
Shyamala K. (K)
Publication venue: 'Infogain Publication'
Publication date: 01/07/2016
Field of study

Web log data available at server side helps in identifying user access pattern. Analysis of Web log data poses challenges as it consists of plentiful information of a Web page. Log file contains information about User name, IP address, Access Request, Number of Bytes Transferred, Result Status, Uniform Resource Locator (URL), User Agent and Time stamp. Analysing the log file gives clear idea about the user. Data Pre-Processing is an important step in mining process. Web log data contains irrelevant data so it has to be Pre-Processed. If the collected Web log data is Pre-Processed, then it becomes easy to find the desire information about visitors and also retrieve other information from Web log data. This paper proposes a novel technique to Pre-Process the Web log data and given detailed discussion about the content of Web log data. Each Uniform Resource Locator (URL) in the Web log data is parsed into tokens based on the Web structure and then it is implemented using SQL server management studio

Neliti

Big data security analysis approach using Computational Intelligence techniques in R for desktop users

Author: Jenkins P.
Katos Vasilis
Naik N.
Savage N.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 09/02/2017
Field of study

© 2016 IEEE.Big Data security analysis is commonly used for the analysis of large volume security data from an organisational perspective, requiring powerful IT infrastructure and expensive data analysis tools. Therefore, it can be considered to be inaccessible to the vast majority of desktop users and is difficult to apply to their rapidly growing data sets for security analysis. A number of commercial companies offer a desktop-oriented big data security analysis solution; however, most of them are prohibitive to ordinary desktop users with respect to cost and IT processing power. This paper presents an intuitive and inexpensive big data security analysis approach using Computational Intelligence (CI) techniques for Windows desktop users, where the combination of Windows batch programming, EmEditor and R are used for the security analysis. The simulation is performed on a real dataset with more than 10 million observations, which are collected from Windows Firewall logs to demonstrate how a desktop user can gain insight into their abundant and untouched data and extract useful information to prevent their system from current and future security threats. This CI-based big data security analysis approach can also be extended to other types of security logs such as event logs, application logs and web logs

Crossref

Portsmouth University Research Portal (Pure)

Bournemouth University Research Online

Towards Enhancement of Machine Learning Techniques Using CSE-CIC-IDS2018 Cybersecurity Dataset

Author: Ravikumar Dharshini
Publication venue: RIT Scholar Works
Publication date: 01/01/2021
Field of study

In machine learning, balanced datasets play a crucial role in the bias observed towards classification and prediction. The CSE-CIC IDS datasets published in 2017 and 2018 have both attracted considerable scholarly attention towards research in intrusion detection systems. Recent work published using this dataset indicates little attention paid to the imbalance of the dataset. The study presented in this paper sets out to explore the degree to which imbalance has been treated and provide a taxonomy of the machine learning approaches developed using these datasets. A survey of published works related to these datasets was done to deliver a combined qualitative and quantitative methodological approach for our analysis towards deriving a taxonomy. The research presented here confirms that the impact of bias due to the imbalance datasets is rarely addressed. This data supports further research and development of supervised machine learning techniques that reduce bias in classification or prediction due to these imbalance datasets. This study\u27s experiment is to train the model using the train, and test split function from sci-kit learn library on the CSE-CIC-IDS2018. The system needs to be trained by a learning algorithm to accomplish this. There are many machine learning algorithms available and presented by the literature. Among which there are three types of classification based Supervised ML techniques which are used in our study: 1) KNN, 2) Random Forest (RF) and 3) Logistic Regression (LR). This experiment also determines how each of the dataset\u27s 67 preprocessed features affects the ML model\u27s performance. Feature drop selection is performed in two ways, independent and group drop. Experimental results generate the threshold values for each classifier and performance metric values such as accuracy, precision, recall, and F1-score. Also, results are generated from the comparison of manual feature drop methods. A good amount of drop is noticed in the group for most of the classifiers

RIT Scholar Works

Big data analytics: a predictive analysis applied to cybersecurity in a financial organization

Author: Pereira Pedro Filipe Martins Tourais
Publication venue
Publication date: 05/07/2019
Field of study

Project Work presented as partial requirement for obtaining the Master’s degree in Information Management, with a specialization in Knowledge Management and Business IntelligenceWith the generalization of the internet access, cyber attacks have registered an alarming growth in frequency and severity of damages, along with the awareness of organizations with heavy investments in cybersecurity, such as in the financial sector. This work is focused on an organization’s financial service that operates on the international markets in the payment systems industry. The objective was to develop a predictive framework solution responsible for threat detection to support the security team to open investigations on intrusive server requests, over the exponentially growing log events collected by the SIEM from the Apache Web Servers for the financial service. A Big Data framework, using Hadoop and Spark, was developed to perform classification tasks over the financial service requests, using Neural Networks, Logistic Regression, SVM, and Random Forests algorithms, while handling the training of the imbalance dataset through BEV. The main conclusions over the analysis conducted, registered the best scoring performances for the Random Forests classifier using all the preprocessed features available. Using the all the available worker nodes with a balanced configuration of the Spark executors, the most performant elapsed times for loading and preprocessing of the data were achieved using the column-oriented ORC with native format, while the row-oriented CSV format performed the best for the training of the classifiers.Com a generalização do acesso à internet, os ciberataques registaram um crescimento alarmante em frequência e severidade de danos causados, a par da consciencialização das organizações, com elevados investimentos em cibersegurança, como no setor financeiro. Este trabalho focou-se no serviço financeiro de uma organização que opera nos mercados internacionais da indústria de sistemas de pagamento. O objetivo consistiu no desenvolvimento uma solução preditiva responsável pela detecção de ameaças, por forma a dar suporte à equipa de segurança na abertura de investigações sobre pedidos intrusivos no servidor, relativamente aos exponencialmente crescentes eventos de log coletados pelo SIEM, referentes aos Apache Web Servers, para o serviço financeiro. Uma solução de Big Data, usando Hadoop e Spark, foi desenvolvida com o objectivo de executar tarefas de classificação sobre os pedidos do serviço financeiros, usando os algoritmos Neural Networks, Logistic Regression, SVM e Random Forests, solucionando os problemas associados ao treino de um dataset desequilibrado através de BEV. As principais conclusões sobre as análises realizadas registaram os melhores resultados de classificação usando o algoritmo Random Forests com todas as variáveis pré-processadas disponíveis. Usando todos os nós do cluster e uma configuração balanceada dos executores do Spark, os melhores tempos para carregar e pré-processar os dados foram obtidos usando o formato colunar ORC nativo, enquanto o formato CSV, orientado a linhas, apresentou os melhores tempos para o treino dos classificadores

Repositório da Universidade Nova de Lisboa

A Secured Cloud System based on Log Analysis

Author: Doddapaneni Sindhusha
Publication venue: SJSU ScholarWorks
Publication date: 01/10/2015
Field of study

Now-a-days, enterprises’ acceptance over the Cloud is increasing but businesses are now finding issues related to security. Everyday, users store a large amount of data in the Cloud and user input may be malicious. Therefore, security has become the critical feature in the applications stored in the Cloud. Though there are many existing systems which provide us different encryption algorithms and security methods, there is still a possibility of attacks to applications and increasing data modifications. The idea behind this project is to find attacks and protect the applications stored in the Cloud using log analysis. The proposed solution detects the SQL injection attack, which is supposed to be the most critical security risk of vulnerable applications. The goal of this research is to detect the SQL injection attacks for an application stored in the Cloud by analyzing the logs. To achieve this, the proposed system automates the intrusion detection process for an application by performing log analysis. Log Analysis is performed by combining the implementation of two different methodologies called learn and detect methodology and pattern recognition system. The accuracy of SQL injections detected on log data is dependent on the order in which these two methodologies are applied. The outcome after applying these two methodologies results in information which helps a security analyst to understand and know the root cause of every attack that is detected on an application

SJSU ScholarWorks

Deep learning algorithms for intrusion detection systems in internet of things using CIC-IDS 2017 dataset

Author: Jose Deepa V.
Jose Jinsi
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/02/2023
Field of study

Due to technological advancements in recent years, the availability and usage of smart electronic gadgets have drastically increased. Adoption of these smart devices for a variety of applications in our day-to-day life has become a new normal. As these devices collect and store data, which is of prime importance, securing is a mandatory requirement by being vigilant against intruders. Many traditional techniques are prevailing for the same, but they may not be a good solution for the devices with resource constraints. The impact of artificial intelligence is not negligible in this concern. This study is an attempt to understand and analyze the performance of deep learning algorithms in intrusion detection. A comparative analysis of the performance of deep neural network, convolutional neural network, and long short-term memory using the CIC-IDS 2017 dataset

ZENODO

Institute of Advanced Engineering and Science