Maximum Matchings via Glauber Dynamics
In this paper we study the classic problem of computing a maximum cardinality
matching in general graphs. The best known algorithm for this problem to date
runs in $O(m\sqrt{n})$ time and is due to Micali and Vazirani \cite{MV80}. Even
for general bipartite graphs this is the best known running time (the algorithm
of Hopcroft and Karp \cite{HK73} also achieves this bound). For regular
bipartite graphs faster algorithms are known and, following a series of papers,
the running time has recently been improved to $O(n \log n)$ by Goel, Kapralov
and Khanna (STOC 2010) \cite{GKK10}. In this paper we present a randomized
algorithm based on the Markov Chain Monte Carlo paradigm which runs in time
asymptotically faster than the bound of \cite{MV80}, thereby obtaining a
significant improvement.
We use a Markov chain similar to Glauber dynamics for the \emph{hard-core
model} with \emph{fugacity} parameter $\lambda$, which is used to sample
independent sets in a graph from the Gibbs distribution \cite{V99}, to design a
faster algorithm for finding maximum matchings in general graphs. Our result
crucially relies on the fact that the mixing time of our Markov chain is
independent of $\lambda$, a significant deviation from the recent series of
works \cite{GGSVY11, MWW09, RSVVY10, S10, W06} which establish a computational
transition (for estimating the partition function) at a threshold value of
$\lambda$. As a result we are able to design a randomized algorithm whose
running time provides a major improvement over that of the algorithm due to
Micali and Vazirani. Using a conductance bound, we also prove a lower bound on
the mixing time in terms of $k$, the size of the maximum matching.
Comment: It has been pointed out to us independently by Yuval Peres, Jonah
Sherman, Piyush Srivastava and other anonymous reviewers that the coupling
used in this paper does not have the right marginals, because of which the
mixing time bound does not hold, and neither does the main result presented in
the paper. We thank them for reading the paper with interest and promptly
pointing out this mistake.
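The single-site Glauber dynamics for the hard-core model that the abstract builds on is standard and can be sketched as follows. This is a minimal illustration of the sampler itself, not of the paper's (retracted) matching algorithm; the graph representation and all names here are our own:

```python
import random

def sample_independent_set(adj, lam, steps, seed=0):
    """Run single-site Glauber dynamics for the hard-core model with
    fugacity lam on the graph adj: vertex -> set of neighbours.
    Returns the occupied set (always an independent set) after `steps` updates."""
    rng = random.Random(seed)
    vertices = list(adj)
    occupied = set()
    for _ in range(steps):
        v = rng.choice(vertices)       # pick a vertex uniformly at random
        occupied.discard(v)            # resample v's state given its neighbours
        if not (adj[v] & occupied):    # v may be occupied only if no neighbour is
            if rng.random() < lam / (1.0 + lam):
                occupied.add(v)
    return occupied
```

Each update is reversible with respect to the Gibbs distribution $\pi(I) \propto \lambda^{|I|}$ over independent sets $I$, which is what makes the chain a sampler for that distribution.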
Finding Motif Sets in Time Series
Time-series motifs are representative subsequences that occur frequently in a time series; a motif set is the set of subsequences deemed to be instances of a given motif. We focus on finding motif sets. Our motivation is to detect motif sets in household electricity-usage profiles, representing repeated patterns of household usage. We propose three algorithms for finding motif sets. Two are greedy algorithms based on pairwise comparison, and the third uses a heuristic measure of set quality to find the motif set directly. We compare these algorithms on simulated datasets and on electricity-usage data. We show that Scan MK, the simplest way of using the best-matching pair to find motif sets, is less accurate on our synthetic data than Set Finder and Cluster MK, although the latter is very sensitive to parameter settings. We qualitatively analyse the outputs for the electricity-usage data and demonstrate that both Scan MK and Set Finder can discover useful motif sets in such data.
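The best-matching-pair idea behind the greedy algorithms can be sketched as follows. This is a simplified illustration, not the paper's Scan MK: the actual trivial-match exclusion rules, distance normalization, and parameters are as defined in the paper, and the function and parameter names here are ours:

```python
import math

def _dist(a, b):
    """Euclidean distance between two equal-length subsequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def greedy_motif_set(series, w, r):
    """Find one motif set: take the best-matching non-overlapping pair of
    length-w subsequences, then collect the indices of every subsequence
    within distance r of the pair's first member."""
    subs = [series[i:i + w] for i in range(len(series) - w + 1)]
    best = None
    for i in range(len(subs)):
        for j in range(i + w, len(subs)):     # skip overlapping (trivial) matches
            d = _dist(subs[i], subs[j])
            if best is None or d < best[0]:
                best = (d, i)
    if best is None:
        return []
    _, i = best
    return [k for k in range(len(subs)) if _dist(subs[i], subs[k]) <= r]
```

For example, on a series in which the pattern `[0, 1, 0]` recurs three times, the function returns the three starting indices of that pattern.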
Cross Recurrence Plot Based Synchronization of Time Series
The method of recurrence plots is extended to cross recurrence plots (CRPs),
which, among other things, enable the study of synchronization or time
differences between two time series. Such a difference appears as a distorted
main diagonal in the cross recurrence plot, the line of synchronization (LOS).
A non-parametric fit of this LOS can be used to rescale the time axis of the
two data series (whereby one of them is, e.g., compressed or stretched) so that
they are synchronized. An application of this method to geophysical sediment
core data illustrates its suitability for real data: the rock magnetic data of
two different sediment cores from the Makarov Basin can be adjusted to each
other by using this method, so that they are comparable.
Comment: Nonlinear Processes in Geophysics, 9, 2002, in press
DRSP: Dimension Reduction For Similarity Matching And Pruning Of Time Series Data Streams
Similarity matching and join of time series data streams has gained a lot of
relevance in today's world of large streaming data. This process finds
wide-scale application in areas such as location tracking, sensor networks,
and object positioning and monitoring. However, as the size of the data stream
increases, so does the cost of retaining all the data needed for similarity
matching. We develop a novel framework that addresses the following
objectives. First, dimension reduction is performed in the preprocessing
stage, where the large stream data is segmented and reduced into a compact
representation that retains all the crucial information, by a technique called
Multi-level Segment Means (MSM). This reduces the space complexity associated
with storing large time-series data streams. Second, the framework
incorporates an effective similarity matching technique to analyze whether new
data objects are similar to the existing data stream. Finally, a pruning
technique filters out pseudo data object pairs and joins only the relevant
pairs. The computational cost of MSM is O(l*ni) and the cost of pruning is
O(DRF*wsize*d), where DRF is the Dimension Reduction Factor. We have performed
exhaustive experimental trials showing that the proposed framework is both
efficient and competitive with earlier works.
Comment: 20 pages, 8 figures, 6 tables
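The segment-means style of reduction behind MSM can be sketched as follows. The exact MSM scheme is defined in the paper; this is a generic multi-level segment-means reduction in that spirit, with function names and the halving schedule chosen by us for illustration:

```python
def segment_means(series, num_segments):
    """Split `series` into num_segments contiguous pieces and keep each
    piece's mean, reducing length while preserving coarse shape."""
    n = len(series)
    means = []
    for s in range(num_segments):
        lo, hi = s * n // num_segments, (s + 1) * n // num_segments
        seg = series[lo:hi]
        means.append(sum(seg) / len(seg))
    return means

def multi_level_segment_means(series, levels):
    """Build successively coarser representations: each level halves the
    number of retained segment means (down to at least one)."""
    reps, cur = [], list(series)
    for _ in range(levels):
        cur = segment_means(cur, max(1, len(cur) // 2))
        reps.append(cur)
    return reps
```

Coarser levels answer cheap approximate similarity queries first, and only candidates that survive are compared at finer levels, which is the usual way such hierarchies cut the matching cost.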