289,661 research outputs found
Sub-fingerprint masking for a robust audio fingerprinting system in a real-noise environment for portable consumer devices
The file attached is author's final draft.The robustness of audio fingerprinting system in a noisy environment
is a principal challenge in the area of content-based audio retrieval,
especially for use in portable consumer devices. Our new audio
fingerprint method using sub-fingerprint masking based on the
predominant pitch extraction increases the accuracy of the audio
fingerprinting system in a noisy environment dramatically, while
requiring much less computing power compared to the expanded
hash table lookup method, where the searching complexity increases
by the factor of 33 times the degree of expansion.Supported by Seoul R&BD Program
# 10581
RasBhari: optimizing spaced seeds for database searching, read mapping and alignment-free sequence comparison
Many algorithms for sequence analysis rely on word matching or word
statistics. Often, these approaches can be improved if binary patterns
representing match and don't-care positions are used as a filter, such that
only those positions of words are considered that correspond to the match
positions of the patterns. The performance of these approaches, however,
depends on the underlying patterns. Herein, we show that the overlap complexity
of a pattern set that was introduced by Ilie and Ilie is closely related to the
variance of the number of matches between two evolutionarily related sequences
with respect to this pattern set. We propose a modified hill-climbing algorithm
to optimize pattern sets for database searching, read mapping and
alignment-free sequence comparison of nucleic-acid sequences; our
implementation of this algorithm is called rasbhari. Depending on the
application at hand, rasbhari can either minimize the overlap complexity of
pattern sets, maximize their sensitivity in database searching or minimize the
variance of the number of pattern-based matches in alignment-free sequence
comparison. We show that, for database searching, rasbhari generates pattern
sets with slightly higher sensitivity than existing approaches. In our Spaced
Words approach to alignment-free sequence comparison, pattern sets calculated
with rasbhari led to more accurate estimates of phylogenetic distances than the
randomly generated pattern sets that we previously used. Finally, we used
rasbhari to generate patterns for short read classification with CLARK-S. Here
too, the sensitivity of the results could be improved, compared to the default
patterns of the program. We integrated rasbhari into Spaced Words; the source
code of rasbhari is freely available at http://rasbhari.gobics.de
Using Program Synthesis for Program Analysis
In this paper, we identify a fragment of second-order logic with restricted
quantification that is expressive enough to capture numerous static analysis
problems (e.g. safety proving, bug finding, termination and non-termination
proving, superoptimisation). We call this fragment the {\it synthesis
fragment}. Satisfiability of a formula in the synthesis fragment is decidable
over finite domains; specifically the decision problem is NEXPTIME-complete. If
a formula in this fragment is satisfiable, a solution consists of a satisfying
assignment from the second order variables to \emph{functions over finite
domains}. To concretely find these solutions, we synthesise \emph{programs}
that compute the functions. Our program synthesis algorithm is complete for
finite state programs, i.e. every \emph{function} over finite domains is
computed by some \emph{program} that we can synthesise. We can therefore use
our synthesiser as a decision procedure for the synthesis fragment of
second-order logic, which in turn allows us to use it as a powerful backend for
many program analysis tasks. To show the tractability of our approach, we
evaluate the program synthesiser on several static analysis problems.Comment: 19 pages, to appear in LPAR 2015. arXiv admin note: text overlap with
arXiv:1409.492
Regular Expression Search on Compressed Text
We present an algorithm for searching regular expression matches in
compressed text. The algorithm reports the number of matching lines in the
uncompressed text in time linear in the size of its compressed version. We
define efficient data structures that yield nearly optimal complexity bounds
and provide a sequential implementation --zearch-- that requires up to 25% less
time than the state of the art.Comment: 10 pages, published in Data Compression Conference (DCC'19
- …