
    Protein-Protein Docking Using Long Range Nuclear Magnetic Resonance Constraints

    One of the main methods for experimentally determining protein structure is nuclear magnetic resonance (NMR) spectroscopy. The advantage of NMR over other methods is that the molecule may be studied in its natural state and environment. However, NMR is limited in its ability to analyze multi-domain molecules because inter-atomic NMR constraints between the domains are scarce. In such cases, it may be possible to dock the domains using long range NMR constraints that are related to the molecule's overall structure. We present two computational methods for rigid docking based on long range NMR constraints. The first docking method is based on the overall alignment tensor of the complex: the docking algorithm minimizes the difference between the predicted and experimental alignment tensors. To dock the complex efficiently, we introduce a new, computationally efficient method called PATI for predicting the molecular alignment tensor from the three-dimensional structure of the molecule. The increase in speed compared to the currently best-known method (PALES) is achieved by re-expressing the problem as one of numerical integration, rather than a simple uniform sampling (as in the PALES method), and by using a convex hull rather than a detailed representation of the surface of the molecule. Using PATI, we derive a method called PATIDOCK for efficiently docking a two-domain complex based solely on the novel idea of using the difference between the experimental alignment tensor and the predicted alignment tensor computed by PATI. We show that the alignment tensor fundamentally contains enough information to accurately dock a two-domain complex, and that we can dock the two domains very quickly by pre-computing the right set of data. A second new docking method is based on a similar concept but uses the rotational diffusion tensor. We derive a minimization algorithm for this docking method by separating the problem into two simpler minimization problems and approximating our energy function by a quadratic equation. These methods provide two new efficient procedures for protein docking computations.

    That Escalated Quickly: An ML Framework for Alert Prioritization

    In place of in-house solutions, organizations are increasingly moving towards managed services for cyber defense. Security Operations Centers (SOCs) are specialized cybersecurity units responsible for the defense of an organization, but the large-scale centralization of threat detection is causing SOCs to endure an overwhelming amount of false positive alerts, a phenomenon known as alert fatigue. Large collections of imprecise sensors, an inability to adapt to known false positives, evolution of the threat landscape, and inefficient use of analyst time all contribute to the alert fatigue problem. To combat these issues, we present That Escalated Quickly (TEQ), a machine learning framework that reduces alert fatigue with minimal changes to SOC workflows by predicting alert-level and incident-level actionability. On real-world data, the system is able to reduce the time it takes to respond to actionable incidents by 22.9%, suppress 54% of false positives with a 95.1% detection rate, and reduce the number of alerts an analyst needs to investigate within singular incidents by 14%. Comment: Submitted to Usenix Security Symposium
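The trade-off quoted above (suppressing false positives while holding a given detection rate) can be illustrated with a generic score threshold. The abstract does not say how TEQ selects its operating point, so the following is only a minimal sketch under assumptions: the function names, the toy scores, and the labels are all hypothetical, and the scores stand in for the output of any actionability classifier.

```python
import math

def threshold_for_detection_rate(scores, labels, target_rate):
    """Hypothetical helper: highest score threshold that still keeps at
    least `target_rate` of the actionable (label 1) alerts at or above it."""
    positives = sorted((s for s, y in zip(scores, labels) if y), reverse=True)
    if not positives:
        raise ValueError("need at least one actionable alert")
    keep = max(1, math.ceil(target_rate * len(positives)))
    return positives[keep - 1]

def suppression_stats(scores, labels, threshold):
    """Return (detection rate, fraction of false positives suppressed)
    when every alert scoring below `threshold` is hidden from analysts."""
    pairs = list(zip(scores, labels))
    pos = sum(1 for _, y in pairs if y)
    neg = len(pairs) - pos
    detected = sum(1 for s, y in pairs if y and s >= threshold)
    suppressed = sum(1 for s, y in pairs if not y and s < threshold)
    detection_rate = detected / pos if pos else 0.0
    fp_suppressed = suppressed / neg if neg else 0.0
    return detection_rate, fp_suppressed

# Toy data (made up for illustration): 4 actionable and 4 benign alerts.
scores = [0.9, 0.8, 0.7, 0.2, 0.75, 0.3, 0.1, 0.05]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
t = threshold_for_detection_rate(scores, labels, target_rate=0.75)
rate, fp_cut = suppression_stats(scores, labels, t)
```

Raising the target detection rate lowers the threshold and therefore suppresses fewer false positives; figures such as those in the abstract describe one point on that curve.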

    qwLSH: Cache-conscious Indexing for Processing Similarity Search Query Workloads in High-Dimensional Spaces

    Similarity search queries in high-dimensional spaces are an important type of query in many domains such as image processing and machine learning. Since exact similarity search indexing techniques suffer from the well-known curse of dimensionality in high-dimensional spaces, approximate search techniques are often used instead. Locality Sensitive Hashing (LSH) has been shown to be an effective approximate search method for solving similarity search queries in high-dimensional spaces. In real-world settings, queries often arrive as part of a query workload. LSH and its variants are designed to solve single queries effectively. They suffer from one major drawback when executing query workloads: their index structures are designed without taking into account the data characteristics that matter for effective cache utilization. In this paper, we present qwLSH, an index structure for efficiently processing similarity search query workloads in high-dimensional spaces. We intelligently divide a given cache during processing of a query workload by using novel cost models. Experimental results show that, given a query workload, qwLSH is able to perform faster than existing techniques due to its unique cost models and strategies. Comment: Extended version of the published work
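For readers unfamiliar with LSH, the basic idea can be sketched with random-hyperplane hashing for cosine similarity. This is one standard LSH family chosen here for brevity; the abstract does not say which family qwLSH builds on, and the sketch includes none of its cache partitioning or cost models. All function names are hypothetical.

```python
import math
import random

def make_hyperplanes(num_bits, dim, seed=0):
    """Draw `num_bits` random Gaussian hyperplane normals in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )

def build_index(points, planes):
    """Group point indices into buckets keyed by their bit signature."""
    buckets = {}
    for i, point in enumerate(points):
        buckets.setdefault(signature(point, planes), []).append(i)
    return buckets

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def query(q, points, planes, buckets):
    """Approximate search: rank only the points sharing the query's bucket."""
    candidates = buckets.get(signature(q, planes), [])
    return sorted(candidates, key=lambda i: cosine(q, points[i]), reverse=True)

points = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
planes = make_hyperplanes(num_bits=4, dim=3)
buckets = build_index(points, planes)
result = query([1.0, 0.0, 0.0], points, planes, buckets)
```

Vectors separated by a small angle are likely to share a signature, so a query inspects one bucket instead of scanning every point. Practical systems typically use several hash tables to improve recall.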