Online Pattern Matching for String Edit Distance with Moves
Edit distance with moves (EDM) is a string-to-string distance measure that
includes substring moves in addition to ordinary editing operations to turn one
string into the other. Although optimizing EDM is intractable, it has many
applications, especially in error detection. Edit sensitive parsing (ESP) is an
efficient parsing algorithm that guarantees an upper bound of parsing
discrepancies between different appearances of the same substrings in a string.
ESP can be used for computing an approximate EDM as the L1 distance between
characteristic vectors built from node labels in parse trees. However, ESP is
not applicable to streaming text data, where the whole text is unknown in
advance. We present an online ESP (OESP) that enables online pattern
matching for EDM. OESP builds a parse tree for a streaming text and computes
the L1 distance between characteristic vectors in an online manner. For the
space-efficient computation of EDM, OESP directly encodes the parse tree into a
succinct representation by leveraging the idea behind recent results of a
dynamic succinct tree. We experimentally test OESP on the ability to compute
EDM in an online manner on benchmark datasets, and we show OESP's efficiency.
Comment: This paper has been accepted to the 21st edition of the International
Symposium on String Processing and Information Retrieval (SPIRE2014)
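The L1 distance between characteristic vectors mentioned in this abstract can be illustrated with a minimal sketch. It assumes the node labels of each parse tree are already available as a plain list; the label names and both helper functions are hypothetical, and the sketch does not implement ESP or OESP parsing itself.

```python
from collections import Counter


def characteristic_vector(node_labels):
    """Count how often each label occurs among the nodes of a parse tree."""
    return Counter(node_labels)


def l1_distance(vec_a, vec_b):
    """Sum the absolute differences of label counts over all labels."""
    labels = set(vec_a) | set(vec_b)
    # A Counter returns 0 for labels it has never seen.
    return sum(abs(vec_a[label] - vec_b[label]) for label in labels)


# Hypothetical node labels of two parse trees (assumed, not from the paper).
labels_s = ["a", "b", "X1", "X1", "X2"]
labels_t = ["a", "X1", "X2", "X3"]

dist = l1_distance(characteristic_vector(labels_s),
                   characteristic_vector(labels_t))
print(dist)
```

Here the two vectors differ by one in each of the labels `b`, `X1` and `X3`, so the distance is the sum of those three differences.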
Online Self-Indexed Grammar Compression
Although several grammar-based self-indexes have been proposed thus far,
their applicability is limited to offline settings where whole input texts are
prepared in advance, so index structures must be rebuilt whenever additional
input arrives, which is often the case in the big data era. In this paper, we present
the first online self-indexed grammar compression named OESP-index that can
gradually build the index structure by reading input characters one-by-one.
This property has a further advantage: it saves working space during
construction, because the input text need not be stored in memory. We
experimentally test OESP-index on the ability to build index structures and
search query texts, and we show OESP-index's efficiency, especially
space-efficiency for building index structures.
Comment: To appear in the Proceedings of the 22nd edition of the International
Symposium on String Processing and Information Retrieval (SPIRE2015)
Damage detection and quantification in composite beam structure using strain energy and vibration data
DOI: 10.1088/1742-6596/842/1/012027. Journal of Physics: Conference Series 842(1), 012027.
Danes' expectations of self-driving cars.
The Danish Road Directorate (Vejdirektoratet), together with Wilke, has studied Danes' expectations of self-driving cars, with emphasis on acceptance, drivers, concerns, and expected use. In line with other studies, Danes expect to travel more by car once fully self-driving cars are available. This expectation covers more driving on weekdays and on weekends/holidays, longer travel distances, and the fact that various groups who today either cannot drive or do not like driving expect to be able to travel more by car.
The benefits the population expects from self-driving cars are chiefly increased safety, relief from driving tasks, and the opportunity to relax along the way. The concerns relate to losing skills and losing control, as well as questions of liability and the potential for misuse/hacking. Regarding acceptance and who expects to use self-driving cars once that becomes possible, the study's results reflect a relationship with the need for car transport and the experience of congestion; a general effect of age and technology orientation; and a possible effect of the population's experience with driver-assistance systems in cars (partial automation).
Linear-Time Computation of Cyclic Roots and Cyclic Covers of a String
Cyclic versions of covers and roots of a string are considered in this paper. A prefix V of a string S is a cyclic root of S if S is a concatenation of cyclic rotations of V. A prefix V of S is a cyclic cover of S if the occurrences of the cyclic rotations of V cover all positions of S. We present O(n)-time algorithms computing all cyclic roots (using number-theoretic tools) and all cyclic covers (using tools related to seeds) of a length-n string over an integer alphabet. Our results improve upon O(n log log n) and O(n log n) time complexities of recent algorithms of Grossi et al. (WALCOM 2023) for the respective problems and provide novel approaches to the problems. As a by-product, we obtain an optimal data structure for Internal Circular Pattern Matching queries that generalize Internal Pattern Matching and Cyclic Equivalence queries of Kociumaka et al. (SODA 2015).
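The two definitions in this abstract can be checked directly with a brute-force sketch. This is not the linear-time method of the paper; it is a naive quadratic-or-worse check, and the function names are hypothetical. It assumes that every rotation has the same length as V, so a cyclic root requires |V| to divide |S|.

```python
def rotations(v):
    """Return the set of all cyclic rotations of v."""
    return {v[i:] + v[:i] for i in range(len(v))}


def is_cyclic_root(s, k):
    """Check whether the prefix s[:k] is a cyclic root of s."""
    if k == 0 or len(s) % k != 0:
        return False
    rots = rotations(s[:k])
    # Every aligned block of length k must be a rotation of the prefix.
    return all(s[i:i + k] in rots for i in range(0, len(s), k))


def is_cyclic_cover(s, k):
    """Check whether the prefix s[:k] is a cyclic cover of s."""
    if k == 0 or k > len(s):
        return False
    rots = rotations(s[:k])
    covered = [False] * len(s)
    # Mark every position touched by an occurrence of some rotation.
    for i in range(len(s) - k + 1):
        if s[i:i + k] in rots:
            for j in range(i, i + k):
                covered[j] = True
    return all(covered)


s = "abbaab"
root_lengths = [k for k in range(1, len(s) + 1) if is_cyclic_root(s, k)]
cover_lengths = [k for k in range(1, len(s) + 1) if is_cyclic_cover(s, k)]
print(root_lengths, cover_lengths)
```

For the example string, the prefix `ab` is a cyclic root because the blocks `ab`, `ba`, `ab` are all rotations of it, while the prefix `abba` is a cyclic cover but not a cyclic root because 4 does not divide 6.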
Composite repetition-aware data structures
In highly repetitive strings, like collections of genomes from the same
species, distinct measures of repetition all grow sublinearly in the length of
the text, and indexes targeted to such strings typically depend only on one of
these measures. We describe two data structures whose size depends on multiple
measures of repetition at once, and that provide competitive tradeoffs between
the time for counting and reporting all the exact occurrences of a pattern, and
the space taken by the structure. The key component of our constructions is the
run-length encoded BWT (RLBWT), which takes space proportional to the number of
BWT runs: rather than augmenting RLBWT with suffix array samples, we combine it
with data structures from LZ77 indexes, which take space proportional to the
number of LZ77 factors, and with the compact directed acyclic word graph
(CDAWG), which takes space proportional to the number of extensions of maximal
repeats. The combination of CDAWG and RLBWT also enables a new representation
of the suffix tree, whose size depends again on the number of extensions of
maximal repeats, and that is powerful enough to support matching statistics and
constant-space traversal.
Comment: (the name of the third co-author was inadvertently omitted from
previous version)
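The run-length encoded BWT named in this abstract can be illustrated with a small sketch. It builds the BWT naively by sorting all rotations, which is far from what a practical index would do, and it assumes a sentinel character `$` that does not occur in the text and sorts before every other character. The function names are hypothetical.

```python
def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler transform: last column of the sorted rotations."""
    s = text + sentinel
    rots = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rots)


def run_length_encode(s):
    """Collapse s into a list of (character, run length) pairs."""
    runs = []
    for ch in s:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


transformed = bwt("banana")
runs = run_length_encode(transformed)
print(transformed, runs)
```

The number of pairs in `runs` is the number of BWT runs, the quantity to which the space of an RLBWT is proportional according to the abstract.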
Compressed Subsequence Matching and Packed Tree Coloring
We present a new algorithm for subsequence matching in grammar compressed
strings. Given a grammar of size compressing a string of size and a
pattern string of size over an alphabet of size , our algorithm
uses space and or time. Here
is the word size and is the number of occurrences of the pattern. Our
algorithm uses less space than previous algorithms and is also faster for
occurrences. The algorithm uses a new data structure
that allows us to efficiently find the next occurrence of a given character
after a given position in a compressed string. This data structure in turn is
based on a new data structure for the tree color problem, where the node colors
are packed in bit strings.
Comment: To appear at CPM '1
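The query at the heart of this abstract, finding the next occurrence of a given character after a given position, can be sketched on an uncompressed string. This is only an illustration of the query and of how it drives subsequence matching; it does not work on grammar-compressed input and is not the data structure of the paper. All names are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def build_positions(text):
    """Map each character to the sorted list of positions where it occurs."""
    positions = defaultdict(list)
    for i, ch in enumerate(text):
        positions[ch].append(i)
    return positions


def next_occurrence(positions, ch, after):
    """Return the smallest index j > after holding ch, or None if there is none."""
    plist = positions.get(ch, [])
    k = bisect_right(plist, after)
    return plist[k] if k < len(plist) else None


def leftmost_embedding(positions, pattern, start=-1):
    """Greedily match pattern as a subsequence, returning the matched indices."""
    pos = start
    embedding = []
    for ch in pattern:
        pos = next_occurrence(positions, ch, pos)
        if pos is None:
            return None
        embedding.append(pos)
    return embedding


positions = build_positions("abracadabra")
found = leftmost_embedding(positions, "aca")
missing = leftmost_embedding(positions, "dc")
print(found, missing)
```

Each pattern character costs one next-occurrence query, so the speed of that query determines the speed of the matching loop.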