
1,123 research outputs found

Compressed and Practical Data Structures for Strings

Author: Christiansen Anders Roy
Publication venue: DTU Compute
Publication date: 01/01/2018

Finger Search in Grammar-Compressed Strings

Author: Bille Philip
Christiansen Anders Roy
Cording Patrick Hagge
Gørtz Inge Li
Publication date: 01/01/2016

Grammar-based compression, where one replaces a long string by a small context-free grammar that generates the string, is a simple and powerful paradigm that captures many popular compression schemes. Given a grammar, the random access problem is to compactly represent the grammar while supporting random access, that is, given a position in the original uncompressed string, report the character at that position. In this paper we study the random access problem with the finger search property, that is, the time for a random access query should depend on the distance between a specified index $f$, called the finger, and the query index $i$. We consider both a static variant, where we first place a finger and subsequently access indices near the finger efficiently, and a dynamic variant that also supports moving the finger in time that depends on the distance moved. Let $n$ be the size of the grammar, and let $N$ be the size of the string. For the static variant we give a linear space representation that supports placing the finger in $O(\log N)$ time and subsequently accessing in $O(\log D)$ time, where $D$ is the distance between the finger and the accessed index. For the dynamic variant we give a linear space representation that supports placing the finger in $O(\log N)$ time and accessing and moving the finger in $O(\log D + \log \log N)$ time. Compared to the best linear space solution to random access, we improve an $O(\log N)$ query bound to $O(\log D)$ for the static variant and to $O(\log D + \log \log N)$ for the dynamic variant, while maintaining linear space. As an application of our results we obtain an improved solution to the longest common extension problem in grammar compressed strings. To obtain our results, we introduce several new techniques of independent interest, including a novel van Emde Boas style decomposition of grammars.
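For orientation, the random access problem defined in the abstract above can be illustrated with a naive baseline: store the expansion length of every nonterminal and walk down from the start symbol. This is only a sketch under assumptions, not the paper's finger search structure. The toy grammar `RULES` and the helpers `length` and `access` are hypothetical names introduced here, and the query time is proportional to the grammar height rather than $O(\log N)$ or $O(\log D)$.

```python
from functools import lru_cache

# Hypothetical toy grammar in straight-line-program form: each nonterminal
# maps either to a single character or to a pair of symbols.
RULES = {
    "A": "a",
    "B": "b",
    "C": ("A", "B"),  # expands to "ab"
    "D": ("C", "C"),  # expands to "abab"
    "E": ("D", "C"),  # expands to "ababab"
}


@lru_cache(maxsize=None)
def length(sym):
    """Length of the string generated by nonterminal `sym`."""
    rhs = RULES[sym]
    if isinstance(rhs, str):
        return 1
    left, right = rhs
    return length(left) + length(right)


def access(sym, i):
    """Return character i (0-based) of the expansion of `sym` without expanding it."""
    if not 0 <= i < length(sym):
        raise IndexError(i)
    while True:
        rhs = RULES[sym]
        if isinstance(rhs, str):
            return rhs
        left, right = rhs
        if i < length(left):
            sym = left            # the position lies in the left child
        else:
            i -= length(left)     # skip the left child and descend right
            sym = right


text = "".join(access("E", i) for i in range(length("E")))
print(text)
```

Each query here restarts from the root of the grammar; the finger search property described in the abstract is precisely about avoiding that restart when the next query lies close to the finger.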

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Online Research Database In Technology

Fast Dynamic Arrays

Author: Bille Philip
Christiansen Anders Roy
Ettienne Mikko Berggren
Gørtz Inge Li
Publication date: 01/01/2017

We present a highly optimized implementation of tiered vectors, a data structure for maintaining a sequence of $n$ elements supporting access in time $O(1)$ and insertion and deletion in time $O(n^\epsilon)$ for $\epsilon > 0$ while using $o(n)$ extra space. We consider several different implementation optimizations in C++ and compare their performance to that of vector and multiset from the standard library on sequences with up to $10^8$ elements. Our fastest implementation uses much less space than multiset while providing speedups of $40\times$ for access operations compared to multiset and speedups of $10{,}000\times$ compared to vector for insertion and deletion operations while being competitive with both data structures for all other operations.
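The idea behind a tiered vector can be sketched in its simplest two-level form: the sequence is split into fixed-capacity circular buffers, so an access is two index computations, and an insertion shifts elements inside one buffer and then passes a single overflow element through each later buffer. The sketch below is an illustration under assumptions, not the authors' optimized C++ implementation; the class names `_Bucket` and `TieredVector` and the parameter `bucket_size` are hypothetical. With a bucket size near $\sqrt{n}$ this corresponds to $\epsilon = 1/2$; deletion is omitted for brevity.

```python
class _Bucket:
    """Fixed-capacity circular buffer."""

    def __init__(self, cap):
        self.cap = cap
        self.data = [None] * cap
        self.head = 0   # physical slot of logical position 0
        self.size = 0

    def get(self, i):
        return self.data[(self.head + i) % self.cap]

    def push_front(self, x):
        # O(1): move the head back one slot instead of shifting elements.
        self.head = (self.head - 1) % self.cap
        self.data[self.head] = x
        self.size += 1

    def pop_back(self):
        self.size -= 1
        idx = (self.head + self.size) % self.cap
        x, self.data[idx] = self.data[idx], None
        return x

    def insert(self, i, x):
        # O(cap): shift logical positions i..size-1 one slot to the right.
        for j in range(self.size, i, -1):
            self.data[(self.head + j) % self.cap] = self.data[(self.head + j - 1) % self.cap]
        self.data[(self.head + i) % self.cap] = x
        self.size += 1


class TieredVector:
    """Two-level sketch; every bucket except possibly the last is kept full."""

    def __init__(self, bucket_size=4):
        self.b = bucket_size
        self.buckets = []
        self.n = 0

    def __len__(self):
        return self.n

    def __getitem__(self, i):
        if not 0 <= i < self.n:
            raise IndexError(i)
        return self.buckets[i // self.b].get(i % self.b)

    def insert(self, i, x):
        if not 0 <= i <= self.n:
            raise IndexError(i)
        k, r = divmod(i, self.b)
        if k == len(self.buckets):
            self.buckets.append(_Bucket(self.b))
        bucket = self.buckets[k]
        overflow = bucket.size == self.b
        carry = bucket.pop_back() if overflow else None
        bucket.insert(r, x)
        j = k + 1
        while overflow:
            # Push the displaced element onto the front of the next bucket.
            if j == len(self.buckets):
                self.buckets.append(_Bucket(self.b))
            nxt = self.buckets[j]
            overflow = nxt.size == self.b
            next_carry = nxt.pop_back() if overflow else None
            nxt.push_front(carry)
            carry = next_carry
            j += 1
        self.n += 1


tv = TieredVector(bucket_size=2)
for ch in "abc":
    tv.insert(0, ch)          # repeated insertion at the front
print([tv[i] for i in range(len(tv))])
```

The circular layout is what makes the cascade cheap: each later bucket gives up its last element and receives a new first element in constant time, so only the bucket that holds the insertion point pays for shifting.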

arXiv.org e-Print Archive

Online Research Database In Technology

Fast Dynamic Arrays

Author: Bille Philip
Christiansen Anders Roy
Ettienne Mikko Berggren
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 25th Annual European Symposium on Algorithms (ESA 2017)
Publication date: 01/01/2017

We present a highly optimized implementation of tiered vectors, a data structure for maintaining a sequence of n elements supporting access in time O(1) and insertion and deletion in time O(n^ε) for ε > 0 while using o(n) extra space. We consider several different implementation optimizations in C++ and compare their performance to that of vector and set from the standard library on sequences with up to 10^8 elements. Our fastest implementation uses much less space than set while providing speedups of 40x for access operations compared to set and speedups of 10,000x compared to vector for insertion and deletion operations while being competitive with both data structures for all other operations.

Dagstuhl Research Online Publication Server

Compressed Indexing with Signature Grammars

Author: Christiansen Anders Roy
Ettienne Mikko Berggren
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

The compressed indexing problem is to preprocess a string $S$ of length $n$ into a compressed representation that supports pattern matching queries. That is, given a string $P$ of length $m$, report all occurrences of $P$ in $S$. We present a data structure that supports pattern matching queries in $O(m + occ (\lg\lg n + \lg^\epsilon z))$ time using $O(z \lg(n / z))$ space, where $z$ is the size of the LZ77 parse of $S$ and $\epsilon > 0$ is an arbitrarily small constant, when the alphabet is small or $z = O(n^{1 - \delta})$ for any constant $\delta > 0$. We also present two data structures for the general case: one where the space is increased by $O(z\lg\lg z)$, and one where the query time changes from worst-case to expected. These results improve the previously best known solutions. Notably, this is the first data structure that decides if $P$ occurs in $S$ in $O(m)$ time using $O(z\lg(n/z))$ space. Our results are mainly obtained by a novel combination of a randomized grammar construction algorithm with well known techniques relating pattern matching to 2D-range reporting.
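To make the parameter $z$ concrete, here is a deliberately naive sketch of a greedy LZ77 factorization. It assumes one common variant of the parse, which may differ in detail from the definition used in the paper: each phrase is either a single character that has not been matched before, or the longest prefix of the remaining text that also occurs starting at an earlier position, with overlap between source and phrase allowed. The function name `lz77_parse` is hypothetical and the running time is far from what a real index construction would use.

```python
def lz77_parse(s):
    """Greedy LZ77 factorization of s; returns the list of phrases."""
    phrases = []
    i, n = 0, len(s)
    while i < n:
        best = 0
        for j in range(i):                  # candidate earlier start position
            l = 0
            while i + l < n and s[j + l] == s[i + l]:
                l += 1                      # the source may overlap the phrase
            best = max(best, l)
        step = max(best, 1)                 # no earlier match: emit one character
        phrases.append(s[i:i + step])
        i += step
    return phrases


parse = lz77_parse("abababab")
print(parse, "z =", len(parse))
```

Highly repetitive strings produce few phrases, which is why space bounds stated in terms of $z$, such as $O(z \lg(n/z))$, can be far below $n$.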

arXiv.org e-Print Archive

Crossref

Online Research Database In Technology

Optimal-Time Dictionary-Compressed Indexes

Author: Christiansen Anders Roy
Ettienne Mikko Berggren
Kociumaka Tomasz
Navarro Gonzalo
Prezza Nicola
Publication date: 04/09/2019

We describe the first self-indexes able to count and locate pattern occurrences in optimal time within a space bounded by the size of the most popular dictionary compressors. To achieve this result we combine several recent findings, including string attractors (new combinatorial objects encompassing most known compressibility measures for highly repetitive texts) and grammars based on locally-consistent parsing. In more detail, let $\gamma$ be the size of the smallest attractor for a text $T$ of length $n$. The measure $\gamma$ is an (asymptotic) lower bound to the size of dictionary compressors based on Lempel-Ziv, context-free grammars, and many others. The smallest known text representations in terms of attractors use space $O(\gamma\log(n/\gamma))$, and our lightest indexes work within the same asymptotic space. Let $\epsilon>0$ be a suitably small constant fixed at construction time, $m$ be the pattern length, and $occ$ be the number of its text occurrences. Our index counts pattern occurrences in $O(m+\log^{2+\epsilon}n)$ time, and locates them in $O(m+(occ+1)\log^\epsilon n)$ time. These times already outperform those of most dictionary-compressed indexes, while obtaining the least asymptotic space for any index searching within $O((m+occ)\,\textrm{polylog}\,n)$ time. Further, by increasing the space to $O(\gamma\log(n/\gamma)\log^\epsilon n)$, we reduce the locating time to the optimal $O(m+occ)$, and within $O(\gamma\log(n/\gamma)\log n)$ space we can also count in optimal $O(m)$ time. No dictionary-compressed index had obtained this time before. All our indexes can be constructed in $O(n)$ space and $O(n\log n)$ expected time. As a byproduct of independent interest...

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Online Research Database In Technology

Finger Search in Grammar-Compressed Strings

Author: Bille Philip
Christiansen Anders Roy
Cording Patrick Hagge
Gørtz Inge Li
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Crossref

Online Research Database In Technology

Dynamic Relative Compression, Dynamic Partial Sums, and Substring Concatenation

Author: Bille Philip
Christiansen Anders Roy
Cording Patrick Hagge
Gørtz Inge Li
Skjoldjensen Frederik Rye
Vildhøj Hjalte Wedel
Vind Søren
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 16/10/2017

Crossref

Online Research Database In Technology

Active megadetachment beneath the western United States

Author: Brian Wernicke
James L. Davis
Nathan A. Niemi
Peter Luffi
Sunil Bisnath
Publication venue: 'American Geophysical Union (AGU)'
Publication date: 01/11/2008

Geodetic data, interpreted in light of seismic imaging, seismicity, xenolith studies, and the late Quaternary geologic history of the northern Great Basin, suggest that a subcontinental-scale extensional detachment is localized near the Moho. To first order, seismic yielding in the upper crust at any given latitude in this region occurs via an M7 earthquake every 100 years. Here we develop the hypothesis that since 1996, the region has undergone a cycle of strain accumulation and release similar to “slow slip events” observed on subduction megathrusts, but yielding occurred on a subhorizontal surface 5–10 times larger in the slip direction, and at temperatures >800°C. Net slip was variable, ranging from 5 to 10 mm over most of the region. Strain energy with moment magnitude equivalent to an M7 earthquake was released along this “megadetachment,” primarily between 2000.0 and 2005.5. Slip initiated in late 1998 to mid-1999 in northeastern Nevada and is best expressed in late 2003 during a magma injection event at Moho depth beneath the Sierra Nevada, accompanied by more rapid eastward relative displacement across the entire region. The event ended in the east at 2004.0 and in the remainder of the network at about 2005.5. Strain energy thus appears to have been transmitted from the Cordilleran interior toward the plate boundary, from high gravitational potential to low, via yielding on the megadetachment. The size and kinematic function of the proposed structure, in light of various proxies for lithospheric thickness, imply that the subcrustal lithosphere beneath Nevada is a strong, thin plate, even though it resides in a high heat flow tectonic regime. A strong lowermost crust and upper mantle is consistent with patterns of postseismic relaxation in the southern Great Basin, deformation microstructures and low water content in dunite xenoliths in young lavas in central Nevada, and high-temperature microstructures in analog surface exposures of deformed lower crust. 
Large-scale decoupling between crust and upper mantle is consistent with the broad distribution of strain in the upper crust versus the more localized distribution in the subcrustal lithosphere, as inferred by such proxies as low P wave velocity and mafic magmatism

Crossref

Caltech Authors

Deep Blue Documents at the University of Michigan

Mapping localised freshwater anomalies in the brackish paleo-lake sediments of the Machile–Zambezi Basin with transient electromagnetic sounding, geoelectrical imaging and induced polarisation

Crossref