Composite repetition-aware data structures
In highly repetitive strings, like collections of genomes from the same
species, distinct measures of repetition all grow sublinearly in the length of
the text, and indexes targeted to such strings typically depend only on one of
these measures. We describe two data structures whose size depends on multiple
measures of repetition at once, and that provide competitive tradeoffs between
the time for counting and reporting all the exact occurrences of a pattern, and
the space taken by the structure. The key component of our constructions is the
run-length encoded BWT (RLBWT), which takes space proportional to the number of
BWT runs: rather than augmenting RLBWT with suffix array samples, we combine it
with data structures from LZ77 indexes, which take space proportional to the
number of LZ77 factors, and with the compact directed acyclic word graph
(CDAWG), which takes space proportional to the number of extensions of maximal
repeats. The combination of CDAWG and RLBWT also enables a new representation
of the suffix tree, whose size again depends on the number of extensions of
maximal repeats, and that is powerful enough to support matching statistics and
constant-space traversal.
Comment: (the name of the third co-author was inadvertently omitted from
previous version
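As a rough illustration of two of the repetition measures named in the abstract above, the following sketch counts BWT runs and LZ77 factors for a small string. It is not the data structure described in the paper: the function names are hypothetical, the BWT is built naively by sorting rotations, and the LZ77 variant assumed here lets each factor be either a single new character or the longest prefix that also starts at an earlier position (overlaps allowed).

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler transform via sorted rotations (illustration only)."""
    assert sentinel not in text, "sentinel must not occur in the text"
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def count_runs(s: str) -> int:
    """Number of maximal runs of equal characters in s."""
    if not s:
        return 0
    return 1 + sum(1 for a, b in zip(s, s[1:]) if a != b)


def lz77_factor_count(text: str) -> int:
    """Number of factors in a greedy LZ77 parse (assumed variant, see lead-in)."""
    i, factors, n = 0, 0, len(text)
    while i < n:
        length = 0
        # Extend while text[i:i+length+1] also occurs starting before position i.
        # The search window ends at i+length so that the earlier copy starts < i.
        while i + length < n and text.find(text[i:i + length + 1], 0, i + length) != -1:
            length += 1
        i += max(length, 1)  # a character never seen before forms its own factor
        factors += 1
    return factors


if __name__ == "__main__":
    for sample in ("banana", "abababab"):
        transformed = bwt(sample)
        print(sample, transformed, count_runs(transformed), lz77_factor_count(sample))
```

On repetitive inputs both counts stay small relative to the string length, which is the property that indexes of this kind rely on.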
No Effect of Steady Rotation on Solid He in a Torsional Oscillator
We have measured the response of a torsional oscillator containing
polycrystalline hcp solid He to applied steady rotation in an attempt to
verify the observations of several other groups that were initially interpreted
as evidence for macroscopic quantum effects. The geometry of the cell was that
of a simple annulus, with a fill line of relatively narrow diameter in the
centre of the torsion rod. Varying the angular velocity of rotation up to
2 rad/s showed that there were no step-like features in the resonant
frequency or dissipation of the oscillator and no history dependence, even
though we achieved the sensitivity required to detect the various effects seen
in earlier experiments on other rotating cryostats. All small changes during
rotation were consistent with those occurring with an empty cell. We thus
observed no effects on the samples of solid He attributable to steady
rotation.
Comment: 8 pages, 3 figures, accepted in J. Low Temp. Phy
Fast Label Extraction in the CDAWG
The compact directed acyclic word graph (CDAWG) of a string takes space
proportional just to the number of right extensions of the maximal repeats of
the string, and it is thus an appealing index for highly repetitive datasets,
like collections of genomes from similar species, in which this number grows
significantly more slowly than the length of the string. We reduce the time
needed to count the number of occurrences of a pattern, using an existing data
structure that takes an amount of space proportional to the size of the CDAWG.
This implies a corresponding reduction in the time needed to locate all the
occurrences of the pattern. We also reduce the time needed to read the
characters of the label of an edge of the suffix tree of the string, and the
time needed to compute the matching statistics between a query and the string,
using an existing representation of the suffix tree based on the CDAWG. All
such improvements derive from extracting the label of a vertex or of an arc of
the CDAWG using a straight-line program induced by the reversed CDAWG.
Comment: 16 pages, 1 figure. In proceedings of the 24th International
Symposium on String Processing and Information Retrieval (SPIRE 2017). arXiv
admin note: text overlap with arXiv:1705.0864
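To make the notion of matching statistics used in the abstract above concrete, here is a minimal brute-force sketch. It assumes the common definition in which entry i is the length of the longest prefix of the query suffix starting at i that occurs somewhere in the text. The function name is hypothetical, and the code uses plain substring search rather than a CDAWG or suffix tree.

```python
def matching_statistics(query: str, text: str) -> list[int]:
    """Brute-force matching statistics of query against text (illustration only)."""
    result = []
    length = 0
    for i in range(len(query)):
        # If query[i-1:i-1+k] occurs in text, then query[i:i-1+k] does too,
        # so the previous value minus one is a valid starting point.
        length = max(length - 1, 0)
        while i + length < len(query) and query[i:i + length + 1] in text:
            length += 1
        result.append(length)
    return result


if __name__ == "__main__":
    print(matching_statistics("nab", "banana"))
```

Index-based approaches compute the same array while replacing each substring test with navigation in the index.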
A Faster Implementation of Online Run-Length Burrows-Wheeler Transform
Run-length encoding Burrows-Wheeler Transformed strings, resulting in
Run-Length BWT (RLBWT), is a powerful tool for processing highly repetitive
strings. We propose a new algorithm for online RLBWT working in run-compressed
space, whose time and space bounds depend on the length of the input string
received so far and on the number of runs in the BWT of the reversed input
string. We improve the state-of-the-art algorithm for
online RLBWT in terms of empirical construction time. Adopting the dynamic list
for maintaining a total order, we can replace rank queries in a dynamic wavelet
tree on a run-length compressed string by the direct comparison of labels in a
dynamic list. The empirical results for various benchmarks show the efficiency
of our algorithm, especially for highly repetitive strings.
Comment: In Proc. IWOCA201
Developing LCA-based benchmarks for sustainable consumption - for and with users
This article presents the development process of a consumer-oriented, illustrative benchmarking tool enabling consumers to use the results of environmental life cycle assessment (LCA) to make informed decisions. Active and environmentally conscious consumers and environmental communicators were identified as key target groups for this type of information. A brochure presenting the benchmarking tool was developed through a participatory, iterative process involving consumer focus groups, stakeholder workshops and questionnaire-based feedback. In addition to learning what works and what does not, detailed suggestions on improved wording and figures were obtained, as well as a wealth of ideas for future applications.
Suffix Tree of Alignment: An Efficient Index for Similar Data
We consider an index data structure for similar strings. The generalized
suffix tree can be a solution for this. The generalized suffix tree of two
strings is a compacted trie representing all suffixes of both strings. Its
number of leaves and its construction time both grow with the combined length
of the two strings. However, if the two strings are similar, the generalized
suffix tree is not efficient because it does not exploit the similarity, which
is usually represented as an alignment of the two strings.
In this paper we propose a space/time-efficient suffix tree of alignment
which exploits the similarity in an alignment. The number of leaves of our
suffix tree for an alignment of two strings depends on the sum of the lengths
of all parts of one string that differ from the other, and on the sum of the
lengths of some common parts of the two strings. We did not compromise the
pattern search to reduce the space. Our suffix tree can be searched for a
pattern in time that depends, among other things, on the number of occurrences
of the pattern in the two strings. We also present an efficient algorithm to
construct the suffix tree of alignment. When the suffix tree is constructed
from scratch, the time the algorithm requires depends in part on the sum of
the lengths of other common substrings of the two strings. When the suffix
tree of one of the strings is already given, a separate time bound applies.
Comment: 12 page
The Role of Marketing in the Cultural Sphere
Today we all sense the end of yet another stage in the development of our society, which manifests itself in numerous crises (political, economic, ecological, etc.), and which artistic culture fully reflects
- …