6,722 research outputs found

    Survey on Incremental Approaches for Network Anomaly Detection

    Full text link
    As the communication industry has connected distant corners of the globe using advances in network technology, intruders or attackers have also increased attacks on networking infrastructure commensurately. System administrators can attempt to prevent such attacks using intrusion detection tools and systems. There are many commercially available signature-based Intrusion Detection Systems (IDSs). However, most IDSs lack the capability to detect novel or previously unknown attacks. A special type of IDSs, called Anomaly Detection Systems, develop models based on normal system or network behavior, with the goal of detecting both known and unknown attacks. Anomaly detection systems face many problems including high rate of false alarm, ability to work in online mode, and scalability. This paper presents a selective survey of incremental approaches for detecting anomaly in normal system or network traffic. The technological trends, open problems, and challenges over anomaly detection using incremental approach are also discussed.Comment: 14 pages, 1 figure, 11 tables referred journal publicatio

    Fast Incremental von Neumann Graph Entropy Computation: Theory, Algorithm, and Applications

    Full text link
    The von Neumann graph entropy (VNGE) facilitates measurement of information divergence and distance between graphs in a graph sequence. It has been successfully applied to various learning tasks driven by network-based data. While effective, VNGE is computationally demanding as it requires the full eigenspectrum of the graph Laplacian matrix. In this paper, we propose a new computational framework, Fast Incremental von Neumann Graph EntRopy (FINGER), which approaches VNGE with a performance guarantee. FINGER reduces the cubic complexity of VNGE to linear complexity in the number of nodes and edges, and thus enables online computation based on incremental graph changes. We also show asymptotic equivalence of FINGER to the exact VNGE, and derive its approximation error bounds. Based on FINGER, we propose efficient algorithms for computing Jensen-Shannon distance between graphs. Our experimental results on different random graph models demonstrate the computational efficiency and the asymptotic equivalence of FINGER. In addition, we apply FINGER to two real-world applications and one synthesized anomaly detection dataset, and corroborate its superior performance over seven baseline graph similarity methods.Comment: Published at ICML 201

    On the Runtime-Efficacy Trade-off of Anomaly Detection Techniques for Real-Time Streaming Data

    Full text link
    Ever growing volume and velocity of data coupled with decreasing attention span of end users underscore the critical need for real-time analytics. In this regard, anomaly detection plays a key role as an application as well as a means to verify data fidelity. Although the subject of anomaly detection has been researched for over 100 years in a multitude of disciplines such as, but not limited to, astronomy, statistics, manufacturing, econometrics, marketing, most of the existing techniques cannot be used as is on real-time data streams. Further, the lack of characterization of performance -- both with respect to real-timeliness and accuracy -- on production data sets makes model selection very challenging. To this end, we present an in-depth analysis, geared towards real-time streaming data, of anomaly detection techniques. Given the requirements with respect to real-timeliness and accuracy, the analysis presented in this paper should serve as a guide for selection of the "best" anomaly detection technique. To the best of our knowledge, this is the first characterization of anomaly detection techniques proposed in very diverse set of fields, using production data sets corresponding to a wide set of application domains.Comment: 12 page

    A Network Intrusions Detection System based on a Quantum Bio Inspired Algorithm

    Full text link
    Network intrusion detection systems (NIDSs) have a role of identifying malicious activities by monitoring the behavior of networks. Due to the currently high volume of networks trafic in addition to the increased number of attacks and their dynamic properties, NIDSs have the challenge of improving their classification performance. Bio-Inspired Optimization Algorithms (BIOs) are used to automatically extract the the discrimination rules of normal or abnormal behavior to improve the classification accuracy and the detection ability of NIDS. A quantum vaccined immune clonal algorithm with the estimation of distribution algorithm (QVICA-with EDA) is proposed in this paper to build a new NIDS. The proposed algorithm is used as classification algorithm of the new NIDS where it is trained and tested using the KDD data set. Also, the new NIDS is compared with another detection system based on particle swarm optimization (PSO). Results shows the ability of the proposed algorithm of achieving high intrusions classification accuracy where the highest obtained accuracy is 94.8 %

    Sequential Outlier Detection based on Incremental Decision Trees

    Full text link
    We introduce an online outlier detection algorithm to detect outliers in a sequentially observed data stream. For this purpose, we use a two-stage filtering and hedging approach. In the first stage, we construct a multi-modal probability density function to model the normal samples. In the second stage, given a new observation, we label it as an anomaly if the value of aforementioned density function is below a specified threshold at the newly observed point. In order to construct our multi-modal density function, we use an incremental decision tree to construct a set of subspaces of the observation space. We train a single component density function of the exponential family using the observations, which fall inside each subspace represented on the tree. These single component density functions are then adaptively combined to produce our multi-modal density function, which is shown to achieve the performance of the best convex combination of the density functions defined on the subspaces. As we observe more samples, our tree grows and produces more subspaces. As a result, our modeling power increases in time, while mitigating overfitting issues. In order to choose our threshold level to label the observations, we use an adaptive thresholding scheme. We show that our adaptive threshold level achieves the performance of the optimal pre-fixed threshold level, which knows the observation labels in hindsight. Our algorithm provides significant performance improvements over the state of the art in our wide set of experiments involving both synthetic as well as real data

    Should I Raise The Red Flag? A comprehensive survey of anomaly scoring methods toward mitigating false alarms

    Full text link
    Nowadays, advanced intrusion detection systems (IDSs) rely on a combination of anomaly detection and signature-based methods. An IDS gathers observations, analyzes behavioral patterns, and reports suspicious events for further investigation. A notorious issue anomaly detection systems (ADSs) and IDSs face is the possibility of high false alarms, which even state-of-the-art systems have not overcome. This is especially a problem with large and complex systems. The number of non-critical alarms can easily overwhelm administrators and increase the likelihood of ignoring future alerts. Mitigation strategies thus aim to avoid raising `too many' false alarms without missing potentially dangerous situations. There are two major categories of false alarm-mitigation strategies: (1) methods that are customized to enhance the quality of anomaly scoring; (2) approaches acting as filtering methods in contexts that aim to decrease false alarm rates. These methods have been widely utilized by many scholars. Herein, we review and compare the existing techniques for false alarm mitigation in ADSs. We also examine the use of promising techniques in signature-based IDS and other relevant contexts, such as commercial security information and event management tools, which are promising for ADSs. We conclude by highlighting promising directions for future research.Comment: arXiv admin note: text overlap with arXiv:1802.04431, arXiv:1503.01158 by other author

    Unsupervised Place Discovery for Place-Specific Change Classifier

    Full text link
    In this study, we address the problem of supervised change detection for robotic map learning applications, in which the aim is to train a place-specific change classifier (e.g., support vector machine (SVM)) to predict changes from a robot's view image. An open question is the manner in which to partition a robot's workspace into places (e.g., SVMs) to maximize the overall performance of change classifiers. This is a chicken-or-egg problem: if we have a well-trained change classifier, partitioning the robot's workspace into places is rather easy. However, training a change classifier requires a set of place-specific training data. In this study, we address this novel problem, which we term unsupervised place discovery. In addition, we present a solution powered by convolutional-feature-based visual place recognition, and validate our approach by applying it to two place-specific change classifiers, namely, nuisance and anomaly predictors.Comment: 6 pages, 6 figures, 2 tables, Technical repor

    Review of Smart Meter Data Analytics: Applications, Methodologies, and Challenges

    Full text link
    The widespread popularity of smart meters enables an immense amount of fine-grained electricity consumption data to be collected. Meanwhile, the deregulation of the power industry, particularly on the delivery side, has continuously been moving forward worldwide. How to employ massive smart meter data to promote and enhance the efficiency and sustainability of the power grid is a pressing issue. To date, substantial works have been conducted on smart meter data analytics. To provide a comprehensive overview of the current research and to identify challenges for future research, this paper conducts an application-oriented review of smart meter data analytics. Following the three stages of analytics, namely, descriptive, predictive and prescriptive analytics, we identify the key application areas as load analysis, load forecasting, and load management. We also review the techniques and methodologies adopted or developed to address each application. In addition, we also discuss some research trends, such as big data issues, novel machine learning technologies, new business models, the transition of energy systems, and data privacy and security.Comment: IEEE Transactions on Smart Grid, 201

    CADDeLaG: Framework for distributed anomaly detection in large dense graph sequences

    Full text link
    Random walk based distance measures for graphs such as commute-time distance are useful in a variety of graph algorithms, such as clustering, anomaly detection, and creating low dimensional embeddings. Since such measures hinge on the spectral decomposition of the graph, the computation becomes a bottleneck for large graphs and do not scale easily to graphs that cannot be loaded in memory. Most existing graph mining libraries for large graphs either resort to sampling or exploit the sparsity structure of such graphs for spectral analysis. However, such methods do not work for dense graphs constructed for studying pairwise relationships among entities in a data set. Examples of such studies include analyzing pairwise locations in gridded climate data for discovering long distance climate phenomena. These graphs representations are fully connected by construction and cannot be sparsified without loss of meaningful information. In this paper we describe CADDeLaG, a framework for scalable computation of commute-time distance based anomaly detection in large dense graphs without the need to load the entire graph in memory. The framework relies on Apache Spark's memory-centric cluster-computing infrastructure and consists of two building blocks: a decomposable algorithm for commute time distance computation and a distributed linear system solver. We illustrate the scalability of CADDeLaG and its dependency on various factors using both synthetic and real world data sets. We demonstrate the usefulness of CADDeLaG in identifying anomalies in a climate graph sequence, that have been historically missed due to ad hoc graph sparsification and on an election donation data set

    Energy-based Models for Video Anomaly Detection

    Full text link
    Automated detection of abnormalities in data has been studied in research area in recent years because of its diverse applications in practice including video surveillance, industrial damage detection and network intrusion detection. However, building an effective anomaly detection system is a non-trivial task since it requires to tackle challenging issues of the shortage of annotated data, inability of defining anomaly objects explicitly and the expensive cost of feature engineering procedure. Unlike existing appoaches which only partially solve these problems, we develop a unique framework to cope the problems above simultaneously. Instead of hanlding with ambiguous definition of anomaly objects, we propose to work with regular patterns whose unlabeled data is abundant and usually easy to collect in practice. This allows our system to be trained completely in an unsupervised procedure and liberate us from the need for costly data annotation. By learning generative model that capture the normality distribution in data, we can isolate abnormal data points that result in low normality scores (high abnormality scores). Moreover, by leverage on the power of generative networks, i.e. energy-based models, we are also able to learn the feature representation automatically rather than replying on hand-crafted features that have been dominating anomaly detection research over many decades. We demonstrate our proposal on the specific application of video anomaly detection and the experimental results indicate that our method performs better than baselines and are comparable with state-of-the-art methods in many benchmark video anomaly detection datasets
    • …
    corecore