221 research outputs found

    A Graph-Based Semi-Supervised k Nearest-Neighbor Method for Nonlinear Manifold Distributed Data Classification

    k Nearest Neighbors (kNN) is one of the most widely used supervised learning algorithms to classify Gaussian distributed data, but it does not achieve good results when it is applied to nonlinear manifold distributed data, especially when only a very limited number of labeled samples is available. In this paper, we propose a new graph-based kNN algorithm which can effectively handle both Gaussian distributed data and nonlinear manifold distributed data. To achieve this goal, we first propose a constrained Tired Random Walk (TRW) by constructing an R-level nearest-neighbor strengthened tree over the graph, and then compute a TRW matrix for similarity measurement purposes. After this, the nearest neighbors are identified according to the TRW matrix and the class label of a query point is determined by the sum of all the TRW weights of its nearest neighbors. To deal with online situations, we also propose a new algorithm to handle sequential samples based on a local neighborhood reconstruction. Comparison experiments are conducted on both synthetic data sets and real-world data sets to demonstrate the validity of the proposed new kNN algorithm and its improvements over other versions of the kNN algorithm. Given the widespread appearance of manifold structures in real-world problems and the popularity of the traditional kNN algorithm, the proposed manifold version of kNN shows promising potential for classifying manifold-distributed data. Comment: 32 pages, 12 figures, 7 tables
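
    The classification step described in this abstract (build a neighbor graph, derive a walk-based similarity matrix, then sum the similarities of the nearest labeled neighbors per class) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract gives no formula for the TRW matrix, so the decaying sum of powers of the transition matrix used here is an assumption, the R-level strengthened tree and the online update are omitted, and the names `knn_graph`, `decayed_walk_matrix`, `classify`, `alpha` and `steps` are hypothetical.

```python
import math


def knn_graph(points, k):
    """Symmetric k-nearest-neighbor adjacency matrix with unit edge weights."""
    n = len(points)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        for j in others[:k]:
            adj[i][j] = adj[j][i] = 1.0
    return adj


def decayed_walk_matrix(adj, alpha=0.9, steps=100):
    """Truncated sum of (alpha * P)^t for t = 0..steps, P = row-normalized adj.

    Assumed stand-in for the TRW matrix; alpha < 1 makes long walks count less.
    """
    n = len(adj)
    P = [[a / sum(row) for a in row] for row in adj]
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for _ in range(steps):
        # term <- alpha * term @ P, then accumulate into the running sum.
        term = [[alpha * sum(term[i][m] * P[m][j] for m in range(n))
                 for j in range(n)] for i in range(n)]
        for i in range(n):
            for j in range(n):
                total[i][j] += term[i][j]
    return total


def classify(walk, labels, query, k):
    """Pick the class whose k most similar labeled points have the largest summed weight."""
    nearest = sorted(labels, key=lambda j: walk[query][j], reverse=True)[:k]
    scores = {}
    for j in nearest:
        scores[labels[j]] = scores.get(labels[j], 0.0) + walk[query][j]
    return max(scores, key=scores.get)


# Toy data: two well-separated groups with one labeled point each.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 0.0), (5.1, 0.0), (5.2, 0.0)]
labels = {0: "a", 3: "b"}
walk = decayed_walk_matrix(knn_graph(points, k=2))
```

    Calling `classify(walk, labels, query=2, k=1)` on the toy data assigns the unlabeled point to the class of the labeled point it can reach through the graph, which is the behavior that distinguishes graph-based similarity from plain Euclidean distance on manifold-shaped data.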

    Network Intrusion Detection with Edge-Directed Graph Multi-Head Attention Networks

    A network intrusion usually involves a number of network locations. Data flow (including the data generated by intrusion behaviors) among these locations (usually represented by IP addresses) naturally forms a graph. Thus, graph neural networks (GNNs) have been used in the construction of intrusion detection models in recent years, since they have an excellent ability to capture graph topological features of intrusion data flow. However, existing GNN models weight all neighbors equally (mean aggregation) when aggregating node information. In reality, the correlations between nodes and their neighbors, as well as the linked edges, are different. Assigning higher weights to nodes and edges with high similarity can highlight the correlation among them, which will enhance the accuracy and expressiveness of the model. To this end, this paper proposes novel Edge-Directed Graph Multi-Head Attention Networks (EDGMAT) for network intrusion detection. The proposed EDGMAT model introduces a multi-head attention mechanism into the intrusion detection model. Additional weight learning is realized through the combination of a multi-head attention mechanism and edge features. Weighted aggregation makes better use of the relationship between different network traffic data. Experimental results on four recent NIDS benchmark datasets show that the performance of EDGMAT in terms of weighted F1-Score is significantly better than that of four state-of-the-art models in multi-class detection tasks.
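
    The contrast this abstract draws between mean aggregation and similarity-weighted aggregation can be illustrated with a small sketch. It is not the EDGMAT model: it uses a single attention head with no learned parameters, the score (dot product of node and neighbor features plus a scalar per edge) is an assumption chosen for brevity, and `edge_scores` is a hypothetical stand-in for whatever projection of edge features a trained model would learn.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_aggregate(neighbors):
    """Baseline: every neighbor contributes equally."""
    return [sum(col) / len(neighbors) for col in zip(*neighbors)]


def attention_aggregate(node, neighbors, edge_scores):
    """Weight each neighbor by softmax(node . neighbor + edge score).

    The scoring rule is an illustrative assumption, not the paper's formula.
    """
    scores = [sum(a * b for a, b in zip(node, nb)) + e
              for nb, e in zip(neighbors, edge_scores)]
    weights = softmax(scores)
    return [sum(w * nb[d] for w, nb in zip(weights, neighbors))
            for d in range(len(node))]


# Toy example: the first neighbor is more similar to the node than the second.
node = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0]]
uniform = mean_aggregate(neighbors)
weighted = attention_aggregate(node, neighbors, edge_scores=[0.0, 0.0])
```

    In the toy example `uniform` treats both neighbors alike, while `weighted` leans toward the neighbor whose features resemble the node's own. A multi-head variant would repeat the weighted step with several independent scoring functions and combine the results.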

    Zero-day Network Intrusion Detection using Machine Learning Approach

    Zero-day network attacks are a growing global cybersecurity concern. Hackers exploit vulnerabilities in network systems, making network traffic analysis crucial in detecting and mitigating unauthorized attacks. However, inadequate and ineffective network traffic analysis can lead to prolonged network compromises. To address this, machine learning-based zero-day network intrusion detection systems (ZDNIDS) rely on monitoring and collecting relevant information from network traffic data. The selection of pertinent features is essential for optimal ZDNIDS performance given the voluminous nature of network traffic data, characterized by attributes. Unfortunately, current machine learning models used in this field are inefficient at detecting zero-day network attacks, resulting in a high false alarm rate and overall performance degradation. To overcome these limitations, this paper introduces a novel approach combining the anomaly-based extended isolation forest algorithm with the BAT algorithm and Nevergrad. Furthermore, the proposed model was evaluated using 5G network traffic, showcasing its effectiveness in efficiently detecting both known and unknown attacks, thereby reducing false alarms when compared to existing systems. This advancement contributes to improved internet security.

    Clustering based Intrusion Detection System for effective Detection of known and Zero-day Attacks

    Developing effective security measures is among the most challenging tasks nowadays and hence calls for the development of intelligent intrusion detection systems. Most existing intrusion detection systems perform best at detecting known attacks but fail to detect zero-day attacks due to the lack of labeled examples. The authors of this paper propose a clustering-based IDS framework that can effectively detect both known and zero-day attacks using unsupervised machine learning techniques. This research uses the NSL-KDD dataset for experimentation, and the best experimental results reach an accuracy of 78%.

    Performance Evaluation of an Intelligent and Optimized Machine Learning Framework for Attack Detection

    In recent decades, the size and complexity of network traffic data have risen significantly, which increases the likelihood of network penetration. Botnets are one of today's most serious security concerns. They are the mechanisms behind several online assaults, including Distributed Denial of Service (DDoS), spam, rebate fraud, phishing, and malware attacks. Several methodologies have been created over time to address these issues. Existing intrusion detection techniques have trouble processing data from high-speed networks and are unable to identify recently launched assaults. Network traffic classification is slowed down by redundant and irrelevant features. Identifying the critical attributes and removing the unimportant ones with a feature selection approach can reduce the feature space dimensionality and resolve this problem. Therefore, this article develops an innovative network attack recognition model combining an optimization strategy with a machine learning framework, namely the Grey Wolf with Artificial Bee Colony optimization-based Support Vector Machine (GWABC-SVM) model. The efficient selection of attributes is accomplished using a novel Grey Wolf with Artificial Bee Colony optimization approach, and finally the botnet DDoS attack detection is accomplished through a Support Vector Machine. This article conducted an experimental assessment of the machine learning approaches on the UNSW-NB15 and KDD99 databases for botnet DDoS attack identification. The proposed optimized machine learning (ML) based network attack detection framework is evaluated in the last phase for its effectiveness in detecting possible threats. The main advantage of employing SVM is that it offers a wide range of possibilities for intrusion detection program development in complex settings such as cloud computing. In comparison to conventional ML-based models, the suggested technique has a better detection rate of 99.62%, is less time-consuming, and is more robust.

    Deep Transfer Learning Applications in Intrusion Detection Systems: A Comprehensive Review

    Globally, contemporary industrial control systems are increasingly being connected to the external Internet. As a result, there is an immediate need to protect the network from several threats. The key infrastructure of industrial activity may be protected from harm by using an intrusion detection system (IDS), a preventive mechanism, to recognize new kinds of dangerous threats and hostile activities. The most recent artificial intelligence (AI) techniques used to create IDS in many kinds of industrial control networks are examined in this study, with a particular emphasis on IDS-based deep transfer learning (DTL). The latter can be seen as a type of information fusion that merges and/or adapts knowledge from multiple domains to enhance the performance of the target task, particularly when the labeled data in the target domain is scarce. Publications issued after 2015 were taken into account. These selected publications were divided into three categories: DTL-only and IDS-only papers are covered in the introduction and background, while DTL-based IDS papers form the core of this review. Researchers will be able to have a better grasp of the current state of DTL approaches used in IDS in many different types of networks by reading this review paper. Other useful information, such as the datasets used, the sort of DTL employed, the pre-trained network, IDS techniques, the evaluation metrics including accuracy/F-score and false alarm rate (FAR), and the improvement gained, is also covered. The algorithms and methods used in several studies, or those that illustrate deeply and clearly the principle of any DTL-based IDS subcategory, are presented to the reader.

    Extending Structural Learning Paradigms for High-Dimensional Machine Learning and Analysis

    Structure-based machine-learning techniques are frequently used in extensions of supervised learning, such as active, semi-supervised, multi-modal, and multi-task learning. A common step in many successful methods is a structure-discovery process that is made possible through the addition of new information, which can be user feedback, unlabeled data, data from similar tasks, alternate views of the problem, etc. Learning paradigms developed in the above-mentioned fields have led to some extremely flexible, scalable, and successful multivariate analysis approaches. This success and flexibility offer opportunities to expand the use of machine learning paradigms to more complex analyses. In particular, while information is often readily available concerning complex problems, the relationships among the information rarely follow the simple labeled-example-based setup that supervised learning is based upon. Even when it is possible to incorporate additional data in such forms, the result is often an explosion in the dimensionality of the input space, such that both sample complexity and computational complexity can limit real-world success. In this work, we review many of the latest structural learning approaches for dealing with sample complexity. We expand their use to generate new paradigms for combining some of these learning strategies to address more complex problem spaces. We overview extreme-scale data analysis problems where sample complexity is a much more limiting factor than computational complexity, and outline new structural-learning approaches for dealing jointly with both. We develop and demonstrate a method for dealing with sample complexity in complex systems that leads to a more scalable algorithm than other approaches to large-scale multi-variate analysis. This new approach reflects the underlying problem structure more accurately by using interdependence to address sample complexity, rather than ignoring it for the sake of tractability