Dynamic Ordered Sets with Exponential Search Trees

Author: Andersson Arne
Thorup Mikkel
Publication date: 01/01/2002

We introduce exponential search trees as a novel technique for converting static polynomial space search structures for ordered sets into fully-dynamic linear space data structures. This leads to an optimal bound of O(sqrt(log n/loglog n)) for searching and updating a dynamic set of n integer keys in linear space. Here, searching for an integer y means finding the maximum key in the set which is smaller than or equal to y. This problem is equivalent to the standard textbook problem of maintaining an ordered set (see, e.g., Cormen, Leiserson, Rivest, and Stein: Introduction to Algorithms, 2nd ed., MIT Press, 2001). The best previous deterministic linear space bound was O(log n/loglog n), due to Fredman and Willard from STOC 1990. No better deterministic search bound was known using polynomial space. We also get the following worst-case linear space trade-offs between the number n, the word length w, and the maximal key U < 2^w: O(min{loglog n+log n/log w, (loglog n)(loglog U)/(logloglog U)}). These trade-offs are, however, not likely to be optimal. Our results are generalized to finger searching and string searching, providing optimal results for both in terms of n.

Comment: Revision corrects some typos and states things better for applications in a subsequent paper
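The search operation defined in the abstract above (find the maximum key smaller than or equal to y) can be illustrated with a minimal sketch. This is not the exponential search tree from the paper: it assumes a static, already sorted Python list and uses plain binary search, which takes O(log n) time. The function name `predecessor` is a hypothetical choice made for this illustration.

```python
from bisect import bisect_right


def predecessor(sorted_keys, y):
    """Return the largest key <= y, or None if every key is greater than y.

    Assumes sorted_keys is sorted in ascending order. This is ordinary
    binary search, NOT the exponential search tree from the paper.
    """
    # bisect_right gives the number of keys that are <= y.
    i = bisect_right(sorted_keys, y)
    return sorted_keys[i - 1] if i > 0 else None


keys = [3, 8, 15, 42]
print(predecessor(keys, 10))  # 8
print(predecessor(keys, 42))  # 42
print(predecessor(keys, 2))   # None
```

A sorted list supports this query well but makes insertions and deletions cost linear time, which is the gap that dynamic structures such as the one described in the abstract aim to close.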

arXiv.org e-Print Archive

CiteSeerX

New Guarantees for Blind Compressed Sensing

Author: Aghagolzadeh Mohammad
Radha Hayder
Publication date: 07/08/2015

Blind Compressed Sensing (BCS) is an extension of Compressed Sensing (CS) where the optimal sparsifying dictionary is assumed to be unknown and subject to estimation (in addition to the CS sparse coefficients). Since the emergence of BCS, dictionary learning, a.k.a. sparse coding, has been studied as a matrix factorization problem where its sample complexity, uniqueness and identifiability have been addressed thoroughly. However, in spite of the strong connections between BCS and sparse coding, recent results from the sparse coding problem area have not been exploited within the context of BCS. In particular, prior BCS efforts have focused on learning constrained and complete dictionaries that limit the scope and utility of these efforts. In this paper, we develop new theoretical bounds for perfect recovery for the general unconstrained BCS problem. These unconstrained BCS bounds cover the case of overcomplete dictionaries, and hence, they go well beyond the existing BCS theory. Our perfect recovery results integrate the combinatorial theories of sparse coding with some of the recent results from low-rank matrix recovery. In particular, we propose an efficient CS measurement scheme that results in practical recovery bounds for BCS. Moreover, we discuss the performance of BCS under polynomial-time sparse coding algorithms.

Comment: To appear in the 53rd Annual Allerton Conference on Communication, Control and Computing, University of Illinois at Urbana-Champaign, IL, USA, 201

arXiv.org e-Print Archive

Crossref

Pattern Matching in Multiple Streams

Author: A. Amir
D. Breslauer
F. Ergun
G.M. Landau
H. Karloff
K. Abrahamson
M. Ružić
R. Clifford
T.S. Jayram
Z. Galil
Publication date: 01/01/2012

We investigate the problem of deterministic pattern matching in multiple streams. In this model, one symbol arrives at a time and is associated with one of s streaming texts. The task at each time step is to report whether there is a new match between a fixed pattern of length m and a newly updated stream. As is usual in the streaming context, the goal is to use as little space as possible while still reporting matches quickly. We give almost matching upper and lower space bounds for three distinct pattern matching problems. For exact matching we show that the problem can be solved in constant time per arriving symbol and O(m+s) words of space. For the k-mismatch and k-difference problems we give O(k) time solutions that require O(m+ks) words of space. In all three cases we also give space lower bounds which show our methods are optimal up to a single logarithmic factor. Finally we set out a number of open problems related to this new model for pattern matching.

Comment: 13 pages, 1 figure
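The O(m+s) space figure for exact matching can be made concrete with a rough sketch, under the assumption that one shares a single Knuth-Morris-Pratt failure table (O(m) words) across all streams and keeps one integer of matcher state per stream (O(s) words). The abstract does not say this is the paper's construction, and plain KMP gives only amortized constant time per symbol rather than a worst-case guarantee. The names `build_failure`, `MultiStreamMatcher`, and `feed` are hypothetical.

```python
def build_failure(pattern):
    """Standard KMP failure table: fail[i] is the length of the longest
    proper border of pattern[:i + 1]."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


class MultiStreamMatcher:
    """One shared failure table plus one state integer per stream."""

    def __init__(self, pattern, num_streams):
        if not pattern:
            raise ValueError("pattern must be non-empty")
        self.pattern = pattern
        self.fail = build_failure(pattern)   # O(m) words, shared
        self.state = [0] * num_streams       # O(s) words, one per stream

    def feed(self, stream_id, symbol):
        """Append symbol to the given stream; return True if the stream
        now ends with an occurrence of the pattern."""
        k = self.state[stream_id]
        if k == len(self.pattern):
            # A match ended at the previous symbol; fall back to its border.
            k = self.fail[k - 1]
        while k > 0 and symbol != self.pattern[k]:
            k = self.fail[k - 1]
        if symbol == self.pattern[k]:
            k += 1
        self.state[stream_id] = k
        return k == len(self.pattern)


matcher = MultiStreamMatcher("aba", num_streams=2)
# Symbols arrive one at a time, each tagged with its stream.
arrivals = [(0, "a"), (1, "a"), (0, "b"), (1, "a"), (0, "a"), (1, "b"), (1, "a")]
print([matcher.feed(s, c) for s, c in arrivals])
# [False, False, False, False, True, False, True]
```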

arXiv.org e-Print Archive

Crossref

Warwick Research Archives Portal Repository

A practical index for approximate dictionary matching with few mismatches

Author: Cisłak Aleksander
Grabowski Szymon
Publication date: 11/02/2016

Approximate dictionary matching is a classic string matching problem (checking if a query string occurs in a collection of strings) with applications in, e.g., spellchecking, online catalogs, geolocation, and web searchers. We present a surprisingly simple solution called a split index, which is based on the Dirichlet principle, for matching a keyword with few mismatches, and experimentally show that it offers competitive space-time tradeoffs. Our implementation in the C++ language is focused mostly on data compaction, which is beneficial for the search speed (e.g., by being cache friendly). We compare our solution with other algorithms and we show that it performs better for the Hamming distance. Query times in the order of 1 microsecond were reported for one mismatch for the dictionary size of a few megabytes on a medium-end PC. We also demonstrate that a basic compression technique consisting in q-gram substitution can significantly reduce the index size (up to 50% of the input text size for the DNA), while still keeping the query time relatively low.
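The Dirichlet (pigeonhole) idea behind a split index can be sketched as follows for a single mismatch: if two equal-length words differ in at most one position and each is cut into two pieces at the same point, at least one piece must match exactly. The sketch below is an illustrative assumption, not the authors' C++ implementation; it omits their data compaction and q-gram compression, and the names `SplitIndex`, `_keys`, and `query` are hypothetical.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


class SplitIndex:
    """Index each word under its left half and its right half."""

    def __init__(self, words):
        self.buckets = defaultdict(set)
        for word in words:
            for key in self._keys(word):
                self.buckets[key].add(word)

    @staticmethod
    def _keys(word):
        # Include the length and the side so that only pieces from the same
        # position of equal-length words can collide.
        mid = len(word) // 2
        return [(len(word), 0, word[:mid]), (len(word), 1, word[mid:])]

    def query(self, q):
        """Return all indexed words within Hamming distance 1 of q."""
        candidates = set()
        for key in self._keys(q):
            candidates |= self.buckets.get(key, set())
        # Verification step: sharing one half is necessary, not sufficient.
        return sorted(w for w in candidates if hamming(w, q) <= 1)


index = SplitIndex(["house", "mouse", "horse", "hose"])
print(index.query("house"))  # ['horse', 'house', 'mouse']
print(index.query("louse"))  # ['house', 'mouse']
```

Only words that share an exact half with the query are ever verified, which is what keeps the candidate set small compared with scanning the whole dictionary.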

arXiv.org e-Print Archive

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)