11 research outputs found

    Querying Streaming System Monitoring Data for Enterprise System Anomaly Detection

    Full text link
    The need for countering Advanced Persistent Threat (APT) attacks has led to the solutions that ubiquitously monitor system activities in each enterprise host, and perform timely abnormal system behavior detection over the stream of monitoring data. However, existing stream-based solutions lack explicit language constructs for expressing anomaly models that capture abnormal system behaviors, thus facing challenges in incorporating expert knowledge to perform timely anomaly detection over the large-scale monitoring data. To address these limitations, we build SAQL, a novel stream-based query system that takes as input, a real-time event feed aggregated from multiple hosts in an enterprise, and provides an anomaly query engine that queries the event feed to identify abnormal behaviors based on the specified anomaly models. SAQL provides a domain-specific query language, Stream-based Anomaly Query Language (SAQL), that uniquely integrates critical primitives for expressing major types of anomaly models. In the demo, we aim to show the complete usage scenario of SAQL by (1) performing an APT attack in a controlled environment, and (2) using SAQL to detect the abnormal behaviors in real time by querying the collected stream of system monitoring data that contains the attack traces. The audience will have the option to interact with the system and detect the attack footprints in real time via issuing queries and checking the query results through a command-line UI.Comment: Accepted paper at ICDE 2020 demonstrations track. arXiv admin note: text overlap with arXiv:1806.0933

    The Taint Rabbit: Optimizing Generic Taint Analysis with Dynamic Fast Path Generation

    Full text link
    Generic taint analysis is a pivotal technique in software security. However, it suffers from staggeringly high overhead. In this paper, we explore the hypothesis whether just-in-time (JIT) generation of fast paths for tracking taint can enhance the performance. To this end, we present the Taint Rabbit, which supports highly customizable user-defined taint policies and combines a JIT with fast context switching. Our experimental results suggest that this combination outperforms notable existing implementations of generic taint analysis and bridges the performance gap to specialized trackers. For instance, Dytan incurs an average overhead of 237x, while the Taint Rabbit achieves 1.7x on the same set of benchmarks. This compares favorably to the 1.5x overhead delivered by the bitwise, non-generic, taint engine LibDFT

    libdft: Practical Dynamic Data Flow Tracking for Commodity Systems

    Get PDF
    Dynamic data flow tracking (DFT) deals with tagging and tracking data of interest as they propagate during program execution. DFT has been repeatedly implemented by a variety of tools for numerous purposes, including protection from zero-day and cross-site scripting attacks, detection and prevention of information leaks, and for the analysis of legitimate and malicious software. We present libdft, a dynamic DFT framework that unlike previous work is at once fast, reusable, and works with commodity software and hardware. libdft provides an API for building DFT-enabled tools that work on unmodified binaries, running on common operating systems and hardware, thus facilitating research and rapid prototyping. We explore different approaches for implementing the low-level aspects of instruction-level data tracking, introduce a more efficient and 64-bit capable shadow memory, and identify (and avoid) the common pitfalls responsible for the excessive performance overhead of previous studies. We evaluate libdft using real applications with large codebases like the Apache and MySQL servers, and the Firefox web browser. We also use a series of benchmarks and utilities to compare libdft with similar systems. Our results indicate that it performs at least as fast, if not faster, than previous solutions, and to the best of our knowledge, we are the first to evaluate the performance overhead of a fast dynamic DFT implementation in such depth. Finally, libdft is freely available as open source software

    ShadowReplica: Efficient Parallelization of Dynamic Data Flow Tracking

    No full text
    Dynamic data flow tracking (DFT) is a technique broadly used in a variety of security applications that, unfortunately, exhibits poor performance, preventing its adoption in production systems. We present ShadowReplica, a new and efficient approach for accelerating DFT and other shadow memory-based analyses, by decoupling analysis from execution and utilizing spare CPU cores to run them in parallel. Our approach enables us to run a heavyweight technique, like dynamic taint analysis (DTA), twice as fast, while concurrently consuming fewer CPU cycles than when applying it in-line. DFT is run in parallel by a second shadow thread that is spawned for each application thread, and the two communicate using a shared data structure. We avoid the problems suffered by previous approaches, by introducing an off-line application analysis phase that utilizes both static and dynamic analysis methodologies to generate optimized code for decoupling execution and implementing DFT, while it also minimizes the amount of information that needs to be communicated between the two threads. Furthermore, we use a lock-free ring buffer structure and an N-way buffering scheme to efficiently exchange data between threads and maintain high cache-hit rates on multi-core CPUs. Our evaluation shows that ShadowReplica is on average ∼2.3 × faster than in-line DFT (∼2.75 × slowdown over native execution) when running the SPEC CPU2006 benchmark, while similar speed ups were observed with command-line utilities and popular server software. Astoundingly, ShadowReplica also reduces the CPU cycles used up to 30%

    A General Approach for Efficiently Accelerating Software-based Dynamic Data Flow Tracking on Commodity Hardware

    No full text
    Despite the demonstrated usefulness of dynamic data flow tracking (DDFT) techniques in a variety of security applications, the poor performance achieved by available prototypes prevents their widespread adoption and use in production systems. We present and evaluate a novel methodology for improving the performance overhead of DDFT frameworks, by combining static and dynamic analysis. Our intuition is to separate the program logic from the corresponding tracking logic, extracting the semantics of the latter and abstracting them using a Taint Flow Algebra. We then apply optimization techniques to eliminate redundant tracking logic and minimize interference with the target program. Our optimizations are directly applicable to binary-only software and do not require any high level semantics. Furthermore, they do not require additional resources to improve performance, neither do they restrict or remove functionality. Most importantly, our approach is orthogonal to optimizations devised in the past, and can deliver additive performance benefits. We extensively evaluate the correctness and impact of our optimizations, by augmenting a freely available high-performance DDFT framework, and applying it to multiple applications, including command line utilities, server applications, language runtimes, and web browsers. Our results show a speedup of DDFT by as much as 2.23×, with an average of 1.72 × across all tested applications.
    corecore