    SNAPSKETCH: Graph Representation Approach for Anomaly Detection in Graph Stream

    A novel unsupervised graph representation approach for anomaly detection in a graph stream, called SNAPSKETCH, is proposed. It first performs a fixed-length random walk from each node in a network and constructs n-shingles from each walk path. The top discriminative n-shingles, identified using a frequency measure, are projected into a dimensional projection vector chosen uniformly at random. Finally, the network is sketched into a low-dimensional sketch vector using a simplified hashing of the projection vector and the cost of shingles. Given the learned sketch vector, anomaly detection is performed with the state-of-the-art anomaly detection approach RRCF [1]. SNAPSKETCH has several advantages: fully unsupervised learning, constant memory usage, entire-graph embedding, and real-time anomaly detection.
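    The first two steps described in the abstract above (fixed-length random walks and n-shingle construction) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the adjacency-dict graph format, the function names, and the use of raw shingle counts as the "frequency measure" are all assumptions, and the projection, hashing, and RRCF steps are omitted because the abstract does not specify them.

```python
import random
from collections import Counter

def random_walk(adj, start, length, rng):
    """Walk up to `length` nodes from `start`; stop early at a dead end."""
    path = [start]
    for _ in range(length - 1):
        neighbors = adj.get(path[-1], [])
        if not neighbors:
            break
        path.append(rng.choice(neighbors))
    return path

def shingles(path, n):
    """All contiguous length-n windows (n-shingles) of a walk path."""
    return [tuple(path[i:i + n]) for i in range(len(path) - n + 1)]

def top_shingles(adj, walk_length, n, k, seed=0):
    """Count shingles over one walk per node and keep the k most frequent.

    Raw counts stand in for the paper's frequency measure (an assumption).
    """
    rng = random.Random(seed)
    counts = Counter()
    for node in sorted(adj):
        counts.update(shingles(random_walk(adj, node, walk_length, rng), n))
    return counts.most_common(k)

if __name__ == "__main__":
    # Hypothetical toy graph: a directed 3-cycle.
    graph = {0: [1], 1: [2], 2: [0]}
    print(top_shingles(graph, walk_length=4, n=2, k=3))
```

    On a stream, a routine like this would be run once per graph snapshot, so that each snapshot yields one shingle-frequency profile to be sketched.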

    SENATUS: An Approach to Joint Traffic Anomaly Detection and Root Cause Analysis

    In this paper, we propose a novel approach, called SENATUS, for joint traffic anomaly detection and root-cause analysis. Inspired by the concept of a senate, the proposed approach is divided into three stages: election, voting and decision. At the election stage, a small number of senator flows are chosen to represent approximately the total (usually huge) set of traffic flows. In the voting stage, anomaly detection is applied to the senator flows and the detected anomalies are correlated to identify the most likely anomalous time bins. Finally, in the decision stage, a machine learning technique is applied to the senator flows of each anomalous time bin to find the root cause of the anomalies. We evaluate SENATUS using traffic traces collected from the Pan European network, GEANT, and compare it against another approach that detects anomalies using lossless compression of traffic histograms. We show the effectiveness of SENATUS in diagnosing two anomaly types: network scans and DoS/DDoS attacks.

    Precision and Recall for Range-Based Anomaly Detection

    Classical anomaly detection is principally concerned with point-based anomalies, anomalies that occur at a single data point. In this paper, we present a new mathematical model to express range-based anomalies, anomalies that occur over a range (or period) of time.

    Beyond Individual Input for Deep Anomaly Detection on Tabular Data

    Anomaly detection is crucial in various domains, such as finance, healthcare, and cybersecurity. In this paper, we propose a novel deep anomaly detection method for tabular data that leverages Non-Parametric Transformers (NPTs), a model initially proposed for supervised tasks, to capture both feature-feature and sample-sample dependencies. In a reconstruction-based framework, we train the NPT model to reconstruct masked features of normal samples. At inference, we use the model's ability to reconstruct the masked features to generate an anomaly score. To the best of our knowledge, our proposed method is the first to combine both feature-feature and sample-sample dependencies for anomaly detection on tabular datasets. We evaluate our method on an extensive benchmark of tabular datasets and demonstrate that our approach outperforms existing state-of-the-art methods based on both the F1-Score and AUROC. Moreover, our work opens up new research directions for exploring the potential of NPTs for other tasks on tabular data.

    Finding Skewed Subcubes Under a Distribution

    Say that we are given samples from a distribution ψ over an n-dimensional space. We expect or desire ψ to behave like a product distribution (or a k-wise independent distribution over its marginals for small k). We propose the problem of enumerating/list-decoding all large subcubes where the distribution ψ deviates markedly from what we expect; we refer to such subcubes as skewed subcubes. Skewed subcubes are certificates of dependencies between small subsets of variables in ψ. We motivate this problem by showing that it arises naturally in the context of algorithmic fairness and anomaly detection. In this work we focus on the special but important case where the space is the Boolean hypercube, and the expected marginals are uniform. We show that the obvious definition of skewed subcubes can lead to intractable list sizes, and propose a better definition of minimal skewed subcubes, which are subcubes whose skew cannot be attributed to a larger subcube that contains them. Our main technical contribution is a list-size bound for this definition and an algorithm to efficiently find all such subcubes. Both the bound and the algorithm rely on Fourier-analytic techniques, especially the powerful hypercontractive inequality. On the lower bounds side, we show that finding skewed subcubes is as hard as the sparse noisy parity problem, and hence our algorithms cannot be improved on substantially without a breakthrough on this problem, which is believed to be intractable. Motivated by this, we study alternate models allowing query access to ψ where finding skewed subcubes might be easier.
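    To make the setting in the abstract above concrete, here is a small sketch of how one might compare the empirical mass of a single subcube of the Boolean hypercube with the mass expected under uniform marginals. The function names, the dict encoding of a subcube (coordinate index to fixed bit), and the use of a plain difference as the deviation are assumptions for illustration; the paper's actual skew definition and its Fourier-analytic algorithm are not reproduced here.

```python
def subcube_mass(samples, fixed):
    """Fraction of samples lying in the subcube that fixes coordinates per `fixed`."""
    hits = sum(all(x[i] == bit for i, bit in fixed.items()) for x in samples)
    return hits / len(samples)

def uniform_mass(fixed):
    """Mass of the same subcube under the uniform distribution: 2^(-k) for k fixed bits."""
    return 2.0 ** -len(fixed)

def deviation(samples, fixed):
    """Empirical minus expected mass (a hypothetical stand-in for 'skew')."""
    return subcube_mass(samples, fixed) - uniform_mass(fixed)

if __name__ == "__main__":
    # Hypothetical samples from {0,1}^3.
    data = [(0, 0, 1), (0, 0, 0), (0, 1, 1), (1, 0, 0)]
    cube = {0: 0, 1: 0}  # subcube with x0 = 0 and x1 = 0
    print(subcube_mass(data, cube), uniform_mass(cube), deviation(data, cube))
```

    Checking every subcube this way is exponential in the number of fixed coordinates, which is the kind of cost the enumeration problem in the abstract is concerned with.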