    Maximum Matchings via Glauber Dynamics

    In this paper we study the classic problem of computing a maximum cardinality matching in general graphs $G = (V, E)$. The best known algorithm for this problem to date runs in $O(m \sqrt{n})$ time and is due to Micali and Vazirani \cite{MV80}. Even for general bipartite graphs this is the best known running time (the algorithm of Karp and Hopcroft \cite{HK73} also achieves this bound). For regular bipartite graphs one can achieve an $O(m)$ time algorithm which, following a series of papers, has recently been improved to $O(n \log n)$ by Goel, Kapralov and Khanna (STOC 2010) \cite{GKK10}. In this paper we present a randomized algorithm based on the Markov Chain Monte Carlo paradigm which runs in $O(m \log^2 n)$ time, thereby obtaining a significant improvement over \cite{MV80}. We use a Markov chain similar to the \emph{hard-core model} for Glauber Dynamics with \emph{fugacity} parameter $\lambda$, which is used to sample independent sets in a graph from the Gibbs Distribution \cite{V99}, to design a faster algorithm for finding maximum matchings in general graphs. Our result crucially relies on the fact that the mixing time of our Markov Chain is independent of $\lambda$, a significant deviation from the recent series of works \cite{GGSVY11,MWW09, RSVVY10, S10, W06} which achieve computational transition (for estimating the partition function) on a threshold value of $\lambda$. As a result we are able to design a randomized algorithm which runs in $O(m \log^2 n)$ time that provides a major improvement over the running time of the algorithm due to Micali and Vazirani.
    Using the conductance bound, we also prove that mixing takes $\Omega(\frac{m}{k})$ time where $k$ is the size of the maximum matching. Comment: It has been pointed out to us independently by Yuval Peres, Jonah Sherman, Piyush Srivastava and other anonymous reviewers that the coupling used in this paper doesn't have the right marginals, because of which the mixing time bound doesn't hold, and neither does the main result presented in the paper. We thank them for reading the paper with interest and promptly pointing out this mistake
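For orientation, the kind of chain the abstract alludes to can be sketched as the textbook edge-based Glauber (heat-bath) dynamics on matchings, whose stationary distribution weights a matching $M$ proportionally to $\lambda^{|M|}$. This is not the authors' chain, and it carries no running-time guarantee (the comment above notes that the paper's mixing bound does not hold). The names `glauber_matching_step` and `run_chain` are hypothetical, chosen only for illustration.

```python
import random


def glauber_matching_step(edges, matching, matched, lam, rng):
    """One heat-bath update: resample the state of a uniformly random edge.

    Illustrative textbook chain (an assumption, not the paper's chain).
    `matching` is a set of edges, `matched` the set of covered vertices.
    """
    u, v = rng.choice(edges)
    if (u, v) in matching:
        # Edge is present: drop it with probability 1 / (1 + lam).
        if rng.random() < 1.0 / (1.0 + lam):
            matching.remove((u, v))
            matched.discard(u)
            matched.discard(v)
    elif u not in matched and v not in matched:
        # Edge can be added: add it with probability lam / (1 + lam).
        if rng.random() < lam / (1.0 + lam):
            matching.add((u, v))
            matched.add(u)
            matched.add(v)
    # Otherwise the edge conflicts with the current matching; do nothing.


def run_chain(edges, lam, steps, seed=0):
    """Run the chain from the empty matching for a fixed number of steps."""
    rng = random.Random(seed)
    matching, matched = set(), set()
    for _ in range(steps):
        glauber_matching_step(edges, matching, matched, lam, rng)
    return matching
```

With a large fugacity the chain favours large matchings, but every state it visits is a valid matching regardless of `lam`; how long it takes to reach a maximum one is exactly the question the retracted analysis left open.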

    Finding Motif Sets in Time Series

    Time-series motifs are representative subsequences that occur frequently in a time series; a motif set is the set of subsequences deemed to be instances of a given motif. We focus on finding motif sets. Our motivation is to detect motif sets in household electricity-usage profiles, representing repeated patterns of household usage. We propose three algorithms for finding motif sets. Two are greedy algorithms based on pairwise comparison, and the third uses a heuristic measure of set quality to find the motif set directly. We compare these algorithms on simulated datasets and on electricity-usage data. We show that Scan MK, the simplest way of using the best-matching pair to find motif sets, is less accurate on our synthetic data than Set Finder and Cluster MK, although the latter is very sensitive to parameter settings. We qualitatively analyse the outputs for the electricity-usage data and demonstrate that both Scan MK and Set Finder can discover useful motif sets in such data.
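The idea of growing a motif set from the best-matching pair can be illustrated with a small sketch. It is only a guess at the general shape of such a procedure, not the paper's Scan MK: the best pair is found here by brute force rather than by the MK algorithm, distances are plain Euclidean without normalisation, and the radius `r` and the function names are assumptions introduced for illustration.

```python
import math


def subsequences(series, m):
    """All length-m sliding windows of the series."""
    return [series[i:i + m] for i in range(len(series) - m + 1)]


def dist(a, b):
    """Euclidean distance between two equal-length windows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_pair(series, m):
    """Brute-force closest pair of non-overlapping windows: (distance, i, j)."""
    subs = subsequences(series, m)
    best = (math.inf, None, None)
    for i in range(len(subs)):
        for j in range(i + m, len(subs)):  # j >= i + m avoids trivial matches
            d = dist(subs[i], subs[j])
            if d < best[0]:
                best = (d, i, j)
    return best


def scan_motif_set(series, m, r):
    """Start from the best pair, then add windows within r of either member."""
    _, i, j = best_pair(series, m)
    if i is None:
        return []
    subs = subsequences(series, m)
    members = [i, j]
    for k in range(len(subs)):
        # Skip windows overlapping anything already in the set.
        if all(abs(k - p) >= m for p in members):
            if dist(subs[k], subs[i]) <= r or dist(subs[k], subs[j]) <= r:
                members.append(k)
    return sorted(members)
```

For example, `scan_motif_set([0, 1, 2, 0, 5, 5, 0, 1, 2, 0, 9, 9, 0, 1, 2, 0], 4, 0.5)` returns the start indices of the three repeated `0, 1, 2, 0` patterns.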

    Cross Recurrence Plot Based Synchronization of Time Series

    The method of recurrence plots is extended to cross recurrence plots (CRP), which, among other things, enable the study of synchronization or time differences between two time series. These show up as a distorted main diagonal in the cross recurrence plot, the line of synchronization (LOS). A nonparametric fit of this LOS can be used to rescale the time axis of the two data series (whereby one of them is, e.g., compressed or stretched) so that they are synchronized. An application of this method to geophysical sediment core data illustrates its suitability for real data. The rock magnetic data of two different sediment cores from the Makarov Basin can be adjusted to each other by using this method, so that they are comparable. Comment: Nonlinear Processes in Geophysics, 9, 2002, in press
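A cross recurrence plot can be sketched in a few lines under simplifying assumptions: the series are scalar (no time-delay embedding), closeness is measured by absolute difference, and the threshold `eps` and the function name `cross_recurrence_plot` are hypothetical. Entry `(i, j)` is 1 when sample `i` of the first series is within `eps` of sample `j` of the second.

```python
def cross_recurrence_plot(x, y, eps):
    """Binary matrix with crp[i][j] = 1 iff |x[i] - y[j]| <= eps.

    Scalar, unembedded variant used purely for illustration.
    """
    return [[1 if abs(xi - yj) <= eps else 0 for yj in y] for xi in x]


# The second series is the first one sampled twice as densely, so the
# recurrence points follow the line j = 2 * i instead of the main diagonal.
x = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
crp = cross_recurrence_plot(x, y, eps=0.1)
los = [(i, row.index(1)) for i, row in enumerate(crp)]
```

In this toy case the bent line of recurrence points plays the role of the LOS: reading off `j` as a function of `i` gives the rescaling that maps one time axis onto the other. Fitting the LOS on real, noisy data is more involved and is not attempted here.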

    DRSP : Dimension Reduction For Similarity Matching And Pruning Of Time Series Data Streams

    Similarity matching and join of time series data streams has gained a lot of relevance in today's world of large streaming data. This process finds wide-scale application in areas such as location tracking, sensor networks, and object positioning and monitoring. However, as the size of the data stream increases, the cost of retaining all the data in order to aid the process of similarity matching also increases. We develop a novel framework to address the following objectives. Firstly, dimension reduction is performed in the preprocessing stage, where large stream data is segmented and reduced into a compact representation that retains all the crucial information, by a technique called Multi-level Segment Means (MSM). This reduces the space complexity associated with the storage of large time-series data streams. Secondly, it incorporates an effective similarity matching technique to analyze whether the new data objects are symmetric to the existing data stream. Finally, a pruning technique filters out the pseudo data object pairs and joins only the relevant pairs. The computational cost for MSM is O(l*ni) and the cost for pruning is O(DRF*wsize*d), where DRF is the Dimension Reduction Factor. We have performed exhaustive experimental trials to show that the proposed framework is both efficient and competent in comparison with earlier works. Comment: 20 pages, 8 figures, 6 tables
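The abstract does not spell out how MSM is computed, so the following is only a plausible sketch of a multi-level segment-mean representation: level 0 holds the mean of the whole window, and each further level doubles the number of equal-length segments. The function name, the `levels` parameter, and the doubling scheme are assumptions, not details taken from the paper.

```python
def multilevel_segment_means(window, levels):
    """Return one list of segment means per level (1, 2, 4, ... segments).

    Assumes len(window) is divisible by 2 ** (levels - 1).
    """
    n = len(window)
    result = []
    for level in range(levels):
        segments = 2 ** level
        if n % segments != 0:
            raise ValueError("window length must be divisible by 2**level")
        size = n // segments
        result.append(
            [sum(window[k * size:(k + 1) * size]) / size for k in range(segments)]
        )
    return result
```

For instance, `multilevel_segment_means([1, 2, 3, 4, 5, 6, 7, 8], 3)` gives `[[4.5], [2.5, 6.5], [1.5, 3.5, 5.5, 7.5]]`. Coarse levels can be compared first and finer ones consulted only for candidate pairs that survive, which is the general idea behind pruning with such a representation.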