Interpretable network propagation with application to expanding the repertoire of human proteins that interact with SARS-CoV-2
BACKGROUND: Network propagation has been widely used for nearly 20 years to predict gene functions and phenotypes. Despite the popularity of this approach, little attention has been paid to the question of provenance tracing in this context, e.g., determining how much any experimental observation in the input contributes to the score of every prediction.

RESULTS: We design a network propagation framework with 2 novel components and apply it to predict human proteins that directly or indirectly interact with SARS-CoV-2 proteins. First, we trace the provenance of each prediction to its experimentally validated sources, which in our case are human proteins experimentally determined to interact with viral proteins. Second, we design a technique that helps to reduce the manual adjustment of parameters by users. We find that for every top-ranking prediction, the highest contribution to its score arises from a direct neighbor in a human protein-protein interaction network. We further analyze these results to develop functional insights on SARS-CoV-2 that expand on known biology such as the connection between endoplasmic reticulum stress, HSPA5, and anti-clotting agents.

CONCLUSIONS: We examine how our provenance-tracing method can be generalized to a broad class of network-based algorithms. We provide a useful resource for the SARS-CoV-2 community that implicates many previously undocumented proteins with putative functional relationships to viral infection. This resource includes potential drugs that can be opportunistically repositioned to target these proteins. We also discuss how our overall framework can be extended to other, newly emerging viruses.

DBI-1759858 - National Science Foundation; Boston University. Published version.
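Provenance tracing of the kind described above is possible because propagation operators are linear in their input: the final score vector is a weighted sum of per-seed contributions. A minimal sketch, assuming a random-walk-with-restart propagation scheme on a toy network (the graph, seed choices, and restart parameter are all illustrative, not the paper's actual setup):

```python
# Minimal sketch: provenance tracing under random-walk-with-restart (RWR).
# By linearity, each node's score decomposes exactly into per-seed parts.
import numpy as np

# Toy undirected protein-protein interaction network (adjacency matrix).
A = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0],
    [1, 1, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

W = A / A.sum(axis=0, keepdims=True)  # column-stochastic transition matrix
alpha = 0.5                           # restart probability (illustrative)
n = A.shape[0]

# Two "experimentally validated" seed proteins: nodes 0 and 3 (illustrative).
seeds = [0, 3]

# Closed-form RWR: scores = alpha * (I - (1 - alpha) * W)^{-1} @ p0
M = alpha * np.linalg.inv(np.eye(n) - (1 - alpha) * W)

p0 = np.zeros(n)
p0[seeds] = 1.0 / len(seeds)
scores = M @ p0

# Provenance: propagate each seed separately; by linearity the per-seed
# contribution vectors sum exactly to the combined score vector.
contributions = {s: M[:, s] / len(seeds) for s in seeds}
recombined = sum(contributions.values())
assert np.allclose(scores, recombined)
```

Because the resolvent matrix M is fixed by the network and the restart probability, each prediction's score splits exactly into the columns of M indexed by the seed proteins, which is what makes source-level provenance exact rather than heuristic.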
Faster Algorithms for Edge Connectivity via Random 2-Out Contractions
We provide a simple new randomized contraction approach to the global minimum
cut problem for simple undirected graphs. The contractions exploit 2-out edge
sampling from each vertex rather than the standard uniform edge sampling. We
demonstrate the power of our new approach by obtaining better algorithms for
sequential, distributed, and parallel models of computation. Our end results
include the following randomized algorithms for computing edge connectivity
with high probability:
-- Two sequential algorithms with complexities O(m log n) and O(m + n log^3 n). These improve on a long line of developments including a celebrated O(m log^3 n) algorithm of Karger [STOC'96] and the state-of-the-art O(m log^2 n (log log n)^2) algorithm of Henzinger et al. [SODA'17]. Moreover, our O(m + n log^3 n) algorithm is optimal whenever m = Ω(n log^3 n).
Within our new time bounds, whp, we can also construct the cactus
representation of all minimum cuts.
-- An Õ(n^{0.8} D^{0.2} + n^{0.9})-round distributed algorithm, where D denotes the graph diameter. This improves substantially on a recent breakthrough of Daga et al. [STOC'19], which achieved a round complexity of Õ(n^{1-1/353} D^{1/353} + n^{1-1/706}), hence providing the first sublinear distributed algorithm for exactly computing the edge connectivity.
-- The first O(1)-round algorithm for the massively parallel computation setting with linear memory per machine.

Keywords: algorithms and data structures, graph algorithms, edge connectivity, out-contractions, randomized algorithms, distributed algorithms, massively parallel computation
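The core contraction step can be illustrated in a few lines. A minimal sketch, assuming each vertex samples two incident edges with replacement and we contract the connected components of the sampled subgraph; the toy graph and trial count are illustrative, and the paper's full algorithm combines such contractions with further machinery:

```python
# Minimal sketch of one 2-out contraction step on a simple undirected graph.
# Every vertex merges with at least one neighbor, so the graph shrinks by at
# least half, while any fixed minimum cut survives with constant probability.
import random

def two_out_contract(adj, rng):
    """Return a mapping vertex -> component id after one 2-out contraction."""
    parent = {v: v for v in adj}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    def union(u, v):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv

    for v, nbrs in adj.items():
        for u in rng.choices(nbrs, k=2):  # 2-out sampling (with replacement)
            union(v, u)
    return {v: find(v) for v in adj}

# Two triangles joined by a single bridge edge (2,3): the unique minimum cut.
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}

rng = random.Random(7)
survived = 0
trials = 200
for _ in range(trials):
    comp = two_out_contract(adj, rng)
    # Every vertex merges with some neighbor, so at most n/2 components remain.
    assert len(set(comp.values())) <= len(adj) // 2
    if comp[2] != comp[3]:  # the bridge, i.e. the min cut, was not contracted
        survived += 1
assert survived > 0
```

On this toy instance the bridge survives a single contraction with probability 16/81, so repeating the experiment makes survival events routine; the actual algorithm exploits exactly this kind of constant survival probability.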
A Dynamic Shortest Paths Toolbox: Low-Congestion Vertex Sparsifiers and their Applications
We present a general toolbox, based on new vertex sparsifiers, for designing
data structures to maintain shortest paths in dynamic graphs.
In an m-edge graph undergoing edge insertions and deletions, our data structures give the first algorithms for maintaining (a) m^{o(1)}-approximate all-pairs shortest paths (APSP) with worst-case update time m^{o(1)} and query time Õ(1), and (b) a tree that has diameter no larger than a subpolynomial factor times the diameter of the underlying graph, where each update is handled in amortized subpolynomial time.
In graphs undergoing only edge deletions, we develop a simpler and more efficient data structure to maintain a (1+ε)-approximate single-source shortest paths (SSSP) tree in amortized m^{o(1)} time per update.
Our data structures are deterministic. The trees we can maintain are not subgraphs of G, but embed with small edge congestion into G. This is in stark contrast to previous approaches and is useful for algorithms that internally use trees to route flow.
To illustrate the power of our new toolbox, we show that our SSSP data
structure gives simple deterministic implementations of flow-routing MWU
methods in several contexts, where previously only randomized methods had been
known.
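As context for the flow-routing MWU methods mentioned above, here is a minimal sketch of a Garg-Konemann-style approximate max-flow scheme, in which every iteration queries a shortest-path oracle under evolving edge lengths; this is exactly the kind of oracle a dynamic SSSP structure can answer without recomputing from scratch. The function names, parameters, and test graph are illustrative, not taken from the paper:

```python
# Minimal sketch of a flow-routing multiplicative-weights (MWU) method:
# repeatedly route along the shortest path under current edge lengths, then
# multiplicatively increase the lengths of the edges just used.
import heapq
import math

def shortest_path(n, edges, length, s, t):
    """Dijkstra over an undirected edge list; returns (dist, edge-index path)."""
    graph = [[] for _ in range(n)]
    for i, (u, v, _) in enumerate(edges):
        graph[u].append((v, i))
        graph[v].append((u, i))
    dist = [math.inf] * n
    prev = [None] * n
    dist[s] = 0.0
    pq = [(0.0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v, i in graph[u]:
            if d + length[i] < dist[v]:
                dist[v] = d + length[i]
                prev[v] = (u, i)
                heapq.heappush(pq, (dist[v], v))
    path, v = [], t
    while prev[v] is not None:
        u, i = prev[v]
        path.append(i)
        v = u
    return dist[t], path

def mwu_max_flow(n, edges, s, t, eps=0.1):
    """Garg-Konemann-style (1 - O(eps))-approximate max s-t flow value."""
    m = len(edges)
    delta = (1 + eps) * ((1 + eps) * m) ** (-1.0 / eps)
    length = [delta / c for _, _, c in edges]   # initial edge lengths
    total = 0.0
    while True:
        d, path = shortest_path(n, edges, length, s, t)
        if d >= 1.0:
            break
        c = min(edges[i][2] for i in path)      # bottleneck capacity
        total += c
        for i in path:
            length[i] *= 1 + eps * c / edges[i][2]  # MWU length increase
    # Dividing by this factor makes the accumulated flow feasible.
    scale = math.log((1 + eps) / delta) / math.log(1 + eps)
    return total / scale

# Two disjoint unit-capacity s-t paths: the true max flow value is 2.
edges = [(0, 1, 1.0), (1, 3, 1.0), (0, 2, 1.0), (2, 3, 1.0)]
value = mwu_max_flow(4, edges, s=0, t=3, eps=0.1)
assert 1.3 <= value <= 2.0 + 1e-9
```

The only graph operation the loop needs is a shortest-path query under slowly increasing lengths, which is why replacing the from-scratch Dijkstra with a dynamic SSSP structure speeds up the whole method.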
To obtain our toolbox, we give the first algorithm that, given a graph G undergoing edge insertions and deletions and a dynamic terminal set A, maintains a vertex sparsifier H that approximately preserves distances between terminals in A, consists of at most |A| * m^{o(1)} vertices and edges, and can be updated in worst-case time m^{o(1)}. Crucially, our vertex sparsifier construction allows us to maintain a low edge-congestion embedding of H into G, which is needed for our applications.
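To make the sparsifier-plus-embedding object concrete, here is a minimal static sketch: a weighted clique on the terminals that preserves terminal distances, together with an explicit embedding of each clique edge as a path in the graph, whose edge congestion can then be measured. All names and the toy graph are illustrative; the paper's dynamic construction is far more involved:

```python
# Minimal static sketch of a terminal distance sparsifier with an explicit
# low-congestion embedding of its edges back into the underlying graph.
from collections import deque, Counter

def bfs(adj, src):
    """Unweighted shortest-path tree: returns (dist, parent) dicts."""
    dist, parent = {src: 0}, {src: None}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                parent[v] = u
                q.append(v)
    return dist, parent

def terminal_sparsifier(adj, terminals):
    """Weighted clique on terminals + embedding of each clique edge as a path."""
    sparsifier, embedding = {}, {}
    for s in terminals:
        dist, parent = bfs(adj, s)
        for t in terminals:
            if s < t:
                sparsifier[(s, t)] = dist[t]
                path, v = [], t            # walk parent pointers back to s
                while parent[v] is not None:
                    path.append(tuple(sorted((v, parent[v]))))
                    v = parent[v]
                embedding[(s, t)] = path
    return sparsifier, embedding

# A 6-cycle with terminals 0, 2, 4 (illustrative).
adj = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
terminals = [0, 2, 4]
sparsifier, embedding = terminal_sparsifier(adj, terminals)

# Terminal distances are preserved exactly in this static sketch.
assert sparsifier[(0, 2)] == sparsifier[(2, 4)] == sparsifier[(0, 4)] == 2

# Edge congestion: how many sparsifier edges route through each graph edge.
congestion = Counter(e for path in embedding.values() for e in path)
assert max(congestion.values()) == 1
```

The dynamic problem solved in the paper is to keep both the clique weights and such an embedding up to date under edge insertions and deletions, with subpolynomial update cost instead of recomputing shortest paths per update.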