20,364 research outputs found

    Connectivity Oracles for Graphs Subject to Vertex Failures

    We introduce new data structures for answering connectivity queries in graphs subject to batched vertex failures. A deterministic structure processes a batch of $d \leq d_\star$ failed vertices in $\tilde{O}(d^3)$ time and thereafter answers connectivity queries in $O(d)$ time. It occupies space $O(d_\star m \log n)$. We develop a randomized Monte Carlo version of our data structure with update time $\tilde{O}(d^2)$, query time $O(d)$, and space $\tilde{O}(m)$ for any failure bound $d \le n$. This is the first connectivity oracle for general graphs that can efficiently deal with an unbounded number of vertex failures. We also develop a more efficient Monte Carlo edge-failure connectivity oracle. Using space $O(n \log^2 n)$, $d$ edge failures are processed in $O(d \log d \log\log n)$ time and thereafter, connectivity queries are answered in $O(\log\log n)$ time, and the answers are correct w.h.p. Our data structures are based on a new decomposition theorem for an undirected graph $G = (V, E)$, which is of independent interest. It states that for any terminal set $U \subseteq V$ we can remove a set $B$ of $|U|/(s-2)$ vertices such that the remaining graph contains a Steiner forest for $U - B$ with maximum degree $s$.
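
    For reference, the decomposition theorem can be written out in LaTeX as below. This is only a transcription of the statement in the abstract, reading the degree bound as "at most $s$" and assuming the parameter satisfies $s > 2$ so that the bound on $|B|$ is meaningful.

        % Decomposition theorem, transcribed from the abstract above.
        \begin{theorem}
        Let $G = (V, E)$ be an undirected graph and $U \subseteq V$ a set of
        terminals. For a degree parameter $s > 2$, there is a set
        $B \subseteq V$ of at most $|U|/(s-2)$ vertices such that $G - B$
        contains a Steiner forest for $U - B$ with maximum degree at most $s$.
        \end{theorem}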

    Weighted Min-Cut: Sequential, Cut-Query and Streaming Algorithms

    Consider the following 2-respecting min-cut problem. Given a weighted graph $G$ and a spanning tree $T$ of $G$, find the minimum cut among the cuts that contain at most two edges of $T$. This problem is an important subroutine in Karger's celebrated randomized near-linear-time min-cut algorithm [STOC'96]. We present a new approach to this problem which can be easily implemented in many settings, leading to the following randomized min-cut algorithms for weighted graphs.

    * An $O(m\frac{\log^2 n}{\log\log n} + n\log^6 n)$-time sequential algorithm: this improves Karger's $O(m\log^3 n)$ and $O(m\frac{(\log^2 n)\log(n^2/m)}{\log\log n} + n\log^6 n)$ bounds when the input graph is not extremely sparse or dense. Improvements over Karger's bounds were previously known only under the rather strong assumption that the input graph is simple [Henzinger et al., SODA'17; Ghaffari et al., SODA'20]. For unweighted graphs with parallel edges, our bound can be improved to $O(m\frac{\log^{1.5} n}{\log\log n} + n\log^6 n)$.

    * An algorithm requiring $\tilde{O}(n)$ cut queries to compute the min-cut of a weighted graph: this answers an open problem of Rubinstein et al. [ITCS'18], who obtained a similar bound for simple graphs.

    * A streaming algorithm that requires $\tilde{O}(n)$ space and $O(\log n)$ passes to compute the min-cut: the only previous non-trivial exact min-cut algorithm in this setting is the 2-pass, $\tilde{O}(n)$-space algorithm on simple graphs [Rubinstein et al., ITCS'18] (observed by Assadi et al., STOC'19).

    In contrast to Karger's 2-respecting min-cut algorithm, which deploys sophisticated dynamic programming techniques, our approach exploits some cute structural properties, so that it only needs to compute the values of $\tilde{O}(n)$ cuts corresponding to removing $\tilde{O}(n)$ pairs of tree edges, an operation that can be done quickly in many settings; a naive brute-force version of the 2-respecting subproblem is sketched below.

    Comment: Updates on this version: (1) minor corrections in Sections 5.1 and 5.2; (2) references to newer results by GMW SOSA'21 (arXiv:2008.02060v2), DEMN STOC'21 (arXiv:2004.09129v2) and LMN '21 (arXiv:2102.06565v1).
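
    To make the 2-respecting subproblem concrete, here is a naive brute-force sketch in Python. It follows only the problem statement above (enumerate every single tree edge and every pair of tree edges, and evaluate the resulting cut), runs in roughly $O(n^2 m)$ time, and is in no way the paper's algorithm; all names and the input encoding are illustrative.

        import itertools

        def min_2_respecting_cut(n, edges, parent):
            # Brute-force 2-respecting min-cut: minimum over all cuts whose
            # edge set contains at most two edges of the spanning tree T.
            #   n      -- vertices are 0..n-1, with vertex 0 the root of T
            #   edges  -- weighted edge list [(u, v, w), ...] of the graph G
            #   parent -- parent[v], for v >= 1, gives the tree edge (parent[v], v)
            children = [[] for _ in range(n)]
            for v in range(1, n):
                children[parent[v]].append(v)

            # subtree[v] = vertices below (and including) v; cutting the tree
            # edge (parent[v], v) separates exactly subtree[v] from the rest.
            subtree = [set() for _ in range(n)]
            def collect(v):
                subtree[v].add(v)
                for c in children[v]:
                    collect(c)
                    subtree[v] |= subtree[c]
            collect(0)

            def cut_weight(side):
                # Total weight of G-edges crossing the bipartition (side, V - side).
                return sum(w for u, v, w in edges if (u in side) != (v in side))

            best = float('inf')
            # Cuts containing exactly one tree edge.
            for v in range(1, n):
                best = min(best, cut_weight(subtree[v]))
            # Cuts containing exactly two tree edges: one side of the cut is
            # determined by how the two subtrees nest (laminar, so nested or disjoint).
            for u, v in itertools.combinations(range(1, n), 2):
                a, b = subtree[u], subtree[v]
                if a < b:
                    side = b - a      # u's tree edge lies inside v's subtree
                elif b < a:
                    side = a - b
                else:
                    side = a | b      # disjoint subtrees
                best = min(best, cut_weight(side))
            return best

        # Example: a weighted 4-cycle with the path 0-1-2-3 as spanning tree.
        print(min_2_respecting_cut(
            4, [(0, 1, 2), (1, 2, 3), (2, 3, 2), (3, 0, 1)],
            parent=[None, 0, 1, 2]))  # -> 3, the cut {0} vs {1, 2, 3}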

    The study of probability model for compound similarity searching

    The main task of an Information Retrieval (IR) system is to retrieve documents relevant to a user's query. One of the most popular retrieval models in IR is the Vector Space Model. This model assumes relevance based on similarity, defined as the distance between query and document in the concept space. All currently existing chemical compound database systems have adopted the vector space model to calculate the similarity of a database entry to a query compound. However, the model assumes that the fragments represented by the bits are independent of one another, which is not necessarily true. Hence, the possibility of applying another IR model, the Probabilistic Model, to chemical compound searching is explored. This model estimates the probability that a chemical structure has the same bioactivity as a target compound. It is envisioned that by ranking chemical structures in decreasing order of their probability of relevance to the query structure, the effectiveness of a molecular similarity searching system can be increased. Both the fragment-dependence and fragment-independence assumptions are taken into consideration in improving the compound similarity searching system. After conducting a series of simulated similarity searches, it is concluded that the Probabilistic Model approaches do perform better than the existing similarity searching, giving better results on all evaluation criteria. As for which probability model performs better, the BD model showed an improvement over the BIR model.
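
    As a concrete picture of the vector-space baseline the thesis compares against, the sketch below ranks fingerprint bit vectors by Tanimoto similarity to a query, a common instantiation of the similarity-as-distance idea in compound searching. This is an illustrative sketch only; the fingerprints and names are made up, and it does not reproduce the thesis's probabilistic model.

        def tanimoto(a, b):
            # Tanimoto coefficient of two fingerprints given as sets of
            # 'on' bit positions: |A & B| / |A | B|.
            union = len(a | b)
            return len(a & b) / union if union else 0.0

        def rank_by_similarity(query_fp, database):
            # Vector-space-style retrieval: compounds in decreasing order
            # of similarity to the query fingerprint.
            return sorted(database.items(),
                          key=lambda item: tanimoto(query_fp, item[1]),
                          reverse=True)

        # Hypothetical fingerprint database.
        db = {"cmpd_a": {1, 4, 7, 9}, "cmpd_b": {1, 2, 4}, "cmpd_c": {3, 5, 8}}
        for name, fp in rank_by_similarity({1, 4, 9}, db):
            print(name, round(tanimoto({1, 4, 9}, fp), 3))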