    Distance Oracles for Time-Dependent Networks

    We present the first approximate distance oracle for sparse directed networks with time-dependent arc-travel-times determined by continuous, piecewise linear, positive functions possessing the FIFO property. Our approach precomputes $(1+\epsilon)$-approximate distance summaries from selected landmark vertices to all other vertices in the network. Our oracle uses subquadratic space and time preprocessing, and provides two sublinear-time query algorithms that deliver constant and $(1+\sigma)$-approximate shortest-travel-times, respectively, for arbitrary origin-destination pairs in the network, for any constant $\sigma > \epsilon$. Our oracle is based only on the sparsity of the network, along with two quite natural assumptions about travel-time functions which allow the smooth transition towards asymmetric and time-dependent distance metrics. Comment: A preliminary version appeared as Technical Report ECOMPASS-TR-025 of EU funded research project eCOMPASS (http://www.ecompass-project.eu/). An extended abstract also appeared in the 41st International Colloquium on Automata, Languages, and Programming (ICALP 2014, track-A
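    For orientation, the exact quantity that such an oracle approximates can be computed with a time-dependent variant of Dijkstra's algorithm, which is correct when arc travel times satisfy the FIFO property. The sketch below is a baseline illustration only, not the oracle from the abstract; the graph layout, the breakpoint representation, and the names `travel_time` and `earliest_arrival` are assumptions made for this example.

```python
import heapq
from bisect import bisect_right


def travel_time(breakpoints, t):
    """Evaluate a continuous piecewise-linear travel-time function at time t.

    `breakpoints` is a sorted list of (time, travel_time) pairs; the function
    is assumed constant before the first and after the last breakpoint.
    """
    times = [b[0] for b in breakpoints]
    i = bisect_right(times, t)
    if i == 0:
        return breakpoints[0][1]
    if i == len(breakpoints):
        return breakpoints[-1][1]
    (t0, v0), (t1, v1) = breakpoints[i - 1], breakpoints[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def earliest_arrival(graph, source, departure):
    """Time-dependent Dijkstra: earliest arrival time at every reachable vertex.

    `graph` maps a vertex to a list of (neighbour, breakpoints) arcs.
    Correctness relies on FIFO arcs (leaving later never means arriving earlier).
    """
    arrival = {source: departure}
    heap = [(departure, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if t > arrival.get(u, float("inf")):
            continue  # stale heap entry
        for v, bp in graph.get(u, []):
            candidate = t + travel_time(bp, t)
            if candidate < arrival.get(v, float("inf")):
                arrival[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return arrival


# Hypothetical three-vertex network: the direct arc a->b is slow early on
# and fast later (slope -0.8, so FIFO holds).
graph = {
    "a": [("b", [(0, 10), (10, 2)]), ("c", [(0, 1)])],
    "c": [("b", [(0, 3)])],
}
early = earliest_arrival(graph, "a", 0)
late = earliest_arrival(graph, "a", 10)
```

    In this toy network the best route from `a` to `b` depends on the departure time: the detour through `c` is faster when leaving at time 0, and the direct arc is faster when leaving at time 10.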

    Near-Optimal Density Estimation in Near-Linear Time Using Variable-Width Histograms

    Let $p$ be an unknown and arbitrary probability distribution over $[0,1)$. We consider the problem of {\em density estimation}, in which a learning algorithm is given i.i.d. draws from $p$ and must (with high probability) output a hypothesis distribution that is close to $p$. The main contribution of this paper is a highly efficient density estimation algorithm for learning using a variable-width histogram, i.e., a hypothesis distribution with a piecewise constant probability density function. In more detail, for any $k$ and $\epsilon$, we give an algorithm that makes $\tilde{O}(k/\epsilon^2)$ draws from $p$, runs in $\tilde{O}(k/\epsilon^2)$ time, and outputs a hypothesis distribution $h$ that is piecewise constant with $O(k \log^2(1/\epsilon))$ pieces. With high probability the hypothesis $h$ satisfies $d_{\mathrm{TV}}(p,h) \leq C \cdot \mathrm{opt}_k(p) + \epsilon$, where $d_{\mathrm{TV}}$ denotes the total variation distance (statistical distance), $C$ is a universal constant, and $\mathrm{opt}_k(p)$ is the smallest total variation distance between $p$ and any $k$-piecewise constant distribution. The sample size and running time of our algorithm are optimal up to logarithmic factors. The "approximation factor" $C$ in our result is inherent in the problem, as we prove that no algorithm with sample size bounded in terms of $k$ and $\epsilon$ can achieve $C<2$ regardless of what kind of hypothesis distribution it uses. Comment: conference version appears in NIPS 201
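    To make the error measure concrete, the total variation distance between two piecewise constant densities on $[0,1)$ is half the integral of their absolute difference, which reduces to a finite sum over the merged breakpoints. The following is a minimal sketch of that computation, not the learning algorithm from the abstract; the `(edges, heights)` representation and the function names are assumptions made for this example.

```python
from bisect import bisect_right


def density_at(edges, heights, x):
    """Value at x of a piecewise constant density.

    `edges` is a sorted list of breakpoints and `heights[i]` is the density
    on [edges[i], edges[i + 1]).
    """
    return heights[bisect_right(edges, x) - 1]


def tv_distance(edges_p, heights_p, edges_h, heights_h):
    """Total variation distance: half the L1 distance between the densities."""
    cuts = sorted(set(edges_p) | set(edges_h))
    total = 0.0
    for a, b in zip(cuts, cuts[1:]):
        mid = (a + b) / 2  # both densities are constant on [a, b)
        gap = abs(density_at(edges_p, heights_p, mid)
                  - density_at(edges_h, heights_h, mid))
        total += gap * (b - a)
    return total / 2


# A two-piece density compared with the uniform density on [0, 1).
distance = tv_distance([0, 0.5, 1], [1.5, 0.5], [0, 1], [1.0])
```

    Here the two densities differ by 0.5 on each half of the interval, so the distance is 0.25.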

    New efficient algorithms for multiple change-point detection with kernels

    Several statistical approaches based on reproducing kernels have been proposed to detect abrupt changes arising in the full distribution of the observations and not only in the mean or variance. Some of these approaches enjoy good statistical properties (oracle inequality, \ldots). Nonetheless, they have a high computational cost both in terms of time and memory. This makes their application difficult even for small and medium sample sizes ($n < 10^4$). This computational issue is addressed by first describing a new efficient and exact algorithm for kernel multiple change-point detection with an improved worst-case complexity that is quadratic in time and linear in space. It allows dealing with medium size signals (up to $n \approx 10^5$). Second, a faster but approximation algorithm is described. It is based on a low-rank approximation to the Gram matrix. It is linear in time and space. This approximation algorithm can be applied to large-scale signals ($n \geq 10^6$). These exact and approximation algorithms have been implemented in \texttt{R} and \texttt{C} for various kernels. The computational and statistical performances of these new algorithms have been assessed through empirical experiments. The runtime of the new algorithms is observed to be faster than that of other considered procedures. Finally, simulations confirmed the higher statistical accuracy of kernel-based approaches to detect changes that are not only in the mean. These simulations also illustrate the flexibility of kernel-based approaches to analyze complex biological profiles made of DNA copy number and allele B frequencies. An R package implementing the approach will be made available on github
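    As a point of reference, kernel multiple change-point detection is commonly posed as a dynamic program that minimises a sum of per-segment kernel least-squares costs. The sketch below is a naive baseline under that assumption: it stores the full Gram matrix as 2D prefix sums, so it uses quadratic memory and is not the linear-space algorithm or the low-rank approximation described in the abstract. The Gaussian kernel, its bandwidth, and all function names are choices made for this example.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    return math.exp(-((x - y) ** 2) / (2 * bandwidth ** 2))


def kernel_change_points(signal, n_segments, kernel=gaussian_kernel):
    """Split `signal` into `n_segments` pieces minimising the kernel cost.

    Returns the sorted start indices of segments 2..n_segments.
    """
    n = len(signal)
    # prefix[i][j] = sum of kernel(signal[a], signal[b]) over a < i, b < j.
    prefix = [[0.0] * (n + 1) for _ in range(n + 1)]
    diag = [0.0] * (n + 1)
    for i in range(n):
        diag[i + 1] = diag[i] + kernel(signal[i], signal[i])
        for j in range(n):
            prefix[i + 1][j + 1] = (kernel(signal[i], signal[j])
                                    + prefix[i][j + 1] + prefix[i + 1][j]
                                    - prefix[i][j])

    def cost(a, b):
        """Kernel least-squares cost of the segment signal[a:b]."""
        block = prefix[b][b] - prefix[a][b] - prefix[b][a] + prefix[a][a]
        return (diag[b] - diag[a]) - block / (b - a)

    inf = float("inf")
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    back = [[0] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0
    for d in range(1, n_segments + 1):
        for b in range(d, n + 1):
            for a in range(d - 1, b):
                candidate = best[d - 1][a] + cost(a, b)
                if candidate < best[d][b]:
                    best[d][b] = candidate
                    back[d][b] = a
    # Walk the back-pointers from the end of the signal.
    change_points = []
    b = n
    for d in range(n_segments, 1, -1):
        b = back[d][b]
        change_points.append(b)
    return sorted(change_points)


# Hypothetical signal with one abrupt change at index 5.
found = kernel_change_points([0.0] * 5 + [5.0] * 5, n_segments=2)
```

    On this toy signal the only split with near-zero cost is at index 5, where both segments are constant.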

    Small space and streaming pattern matching with k edits

    In this work, we revisit the fundamental and well-studied problem of approximate pattern matching under edit distance. Given an integer $k$, a pattern $P$ of length $m$, and a text $T$ of length $n \ge m$, the task is to find substrings of $T$ that are within edit distance $k$ from $P$. Our main result is a streaming algorithm that solves the problem in $\tilde{O}(k^5)$ space and $\tilde{O}(k^8)$ amortised time per character of the text, providing answers correct with high probability. (Hereafter, $\tilde{O}(\cdot)$ hides a $\mathrm{poly}(\log n)$ factor.) This answers a decade-old question: since the discovery of a $\mathrm{poly}(k\log n)$-space streaming algorithm for pattern matching under Hamming distance by Porat and Porat [FOCS 2009], the existence of an analogous result for edit distance remained open. Up to this work, no $\mathrm{poly}(k\log n)$-space algorithm was known even in the simpler semi-streaming model, where $T$ comes as a stream but $P$ is available for read-only access. In this model, we give a deterministic algorithm that achieves slightly better complexity. In order to develop the fully streaming algorithm, we introduce a new edit distance sketch parametrised by integers $n\ge k$. For any string of length at most $n$, the sketch is of size $\tilde{O}(k^2)$ and it can be computed with an $\tilde{O}(k^2)$-space streaming algorithm. Given the sketches of two strings, in $\tilde{O}(k^3)$ time we can compute their edit distance or certify that it is larger than $k$. This result improves upon $\tilde{O}(k^8)$-size sketches of Belazzougui and Zhu [FOCS 2016] and very recent $\tilde{O}(k^3)$-size sketches of Jin, Nelson, and Wu [STACS 2021]
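    The problem statement can be illustrated with the classical column-by-column dynamic program (often attributed to Sellers), which reads the text one character at a time but keeps $O(m)$ working space and the whole pattern in memory. It is a textbook baseline, not the small-space streaming algorithm or the sketch from the abstract; the function name and the choice to report end positions are assumptions made for this example.

```python
def k_edit_match_ends(pattern, text, k):
    """Return every index j such that some substring of `text` ending at j
    is within edit distance k of `pattern`.

    Runs in O(len(pattern) * len(text)) time and O(len(pattern)) space.
    """
    m = len(pattern)
    # column[i] = min edit distance between pattern[:i] and any substring of
    # the text read so far that ends at the current position.
    column = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text):
        prev_diag = column[0]  # column[0] stays 0: a match may start anywhere
        for i in range(1, m + 1):
            old = column[i]
            column[i] = min(
                prev_diag + (pattern[i - 1] != ch),  # match or substitution
                old + 1,                             # extra text character
                column[i - 1] + 1,                   # skipped pattern character
            )
            prev_diag = old
        if column[m] <= k:
            ends.append(j)
    return ends


exact = k_edit_match_ends("abc", "xxabcxx", 0)
one_edit = k_edit_match_ends("abc", "abd", 1)
```

    With `k = 0` this reduces to exact matching; with `k = 1` the text `abd` yields matches ending at indices 1 and 2 (`ab` with one missing character, and `abd` with one substitution).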