Protein-Protein Docking Using Long Range Nuclear Magnetic Resonance Constraints
One of the main methods for experimentally determining protein structure is nuclear magnetic resonance (NMR) spectroscopy. The advantage of using NMR compared to other methods is that the molecule may be studied in its natural state and environment. However, NMR is limited in its ability to analyze multi-domain molecules because of the scarcity of inter-atomic NMR constraints between the domains. In those cases, it may be possible to dock the domains based on long range NMR constraints that are related to the molecule's overall structure.
We present two computational methods for rigid docking based on long range NMR constraints. The first docking method is based on the overall alignment tensor of the complex. The docking algorithm is based on the minimization of the difference between the predicted and experimental alignment tensor. In order to efficiently dock the complex we introduce a new, computationally efficient method called PATI for predicting the molecular alignment tensor based on the three-dimensional structure of the molecule. The increase in speed compared to the currently best-known method (PALES) is achieved by re-expressing the problem as one of numerical integration, rather than a simple uniform sampling (as in the PALES method), and by using a convex hull rather than a detailed representation of the surface of a molecule. Using PATI, we derive a method called PATIDOCK for efficiently docking a two-domain complex based solely on the novel idea of using the difference between the experimental alignment tensor and the predicted alignment tensor computed by PATI. We show that the alignment tensor fundamentally contains enough information to accurately dock a two-domain complex, and that we can very quickly dock the two domains by pre-computing the right set of data.
The second new docking method is based on a similar concept but uses the rotational diffusion tensor. We derive a minimization algorithm for this docking method by separating the problem into two simpler minimization problems and approximating our energy function by a quadratic function.
These methods provide two new efficient procedures for protein docking computations.
That Escalated Quickly: An ML Framework for Alert Prioritization
In place of in-house solutions, organizations are increasingly moving towards
managed services for cyber defense. Security Operations Centers are specialized
cybersecurity units responsible for the defense of an organization, but the
large-scale centralization of threat detection is causing SOCs to endure an
overwhelming amount of false positive alerts -- a phenomenon known as alert
fatigue. Large collections of imprecise sensors, an inability to adapt to known
false positives, evolution of the threat landscape, and inefficient use of
analyst time all contribute to the alert fatigue problem. To combat these
issues, we present That Escalated Quickly (TEQ), a machine learning framework
that reduces alert fatigue with minimal changes to SOC workflows by predicting
alert-level and incident-level actionability. On real-world data, the system is
able to reduce the time it takes to respond to actionable incidents by
, suppress of false positives with a detection rate,
and reduce the number of alerts an analyst needs to investigate within singular
incidents by .
Comment: Submitted to Usenix Security Symposiu
qwLSH: Cache-conscious Indexing for Processing Similarity Search Query Workloads in High-Dimensional Spaces
Similarity search queries in high-dimensional spaces are an important type of
query in many domains, such as image processing and machine learning. Since
exact similarity search indexing techniques suffer from the well-known curse of
dimensionality in high-dimensional spaces, approximate search techniques are
often utilized instead. Locality Sensitive Hashing (LSH) has been shown to be
an effective approximate search method for solving similarity search queries in
high-dimensional spaces. Oftentimes, queries in real-world settings arrive as
part of a query workload. LSH and its variants are particularly designed to
solve single queries effectively. They suffer from one major drawback while
executing query workloads: they do not take into consideration important data
characteristics for effective cache utilization while designing the index
structures. In this paper, we present qwLSH, an index structure for efficiently
processing similarity search query workloads in high-dimensional spaces. We
intelligently divide a given cache during processing of a query workload by
using novel cost models. Experimental results show that, given a query
workload, qwLSH is able to perform faster than existing techniques due to its
unique cost models and strategies.
Comment: Extended version of the published wor
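For readers unfamiliar with the Locality Sensitive Hashing idea the abstract builds on, the following is a minimal sketch of a generic random-hyperplane LSH index. It is not the qwLSH index: it has no cache model and no cost models, and the hash family, the function names (`make_hyperplanes`, `signature`, `build_index`, `query`), and the parameters are assumptions chosen purely for illustration.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes in dim dimensions (illustrative choice)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, planes):
    """Hash a vector to a bit tuple: one bit per hyperplane, set by the side vec falls on."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_index(points, planes):
    """Group point indices into buckets keyed by their signature."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(points):
        buckets[signature(vec, planes)].append(idx)
    return buckets


def query(vec, points, planes, buckets):
    """Return candidates from the query's bucket, ranked by exact squared Euclidean distance."""
    candidates = buckets.get(signature(vec, planes), [])
    return sorted(
        candidates,
        key=lambda i: sum((a - b) ** 2 for a, b in zip(points[i], vec)),
    )


if __name__ == "__main__":
    planes = make_hyperplanes(num_bits=8, dim=4, seed=1)
    points = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.5, 0.5, 0.1, 0.0]]
    buckets = build_index(points, planes)
    print(query(points[1], points, planes, buckets))
```

Only the points that share the query's bucket are compared exactly, which is what makes the search approximate. A practical index would typically use several hash tables to raise recall; the sketch uses one to stay short.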