Search CORE

2,666 research outputs found

Recommended from our members

A constraint based structure description language for Biosequences

Author: Eidhammer I
Gilbert D
Grindhaug SH
Jonassen J
Ratnayake R
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2001
Field of study

Brunel University Research Archive

Optimum Search Schemes for Approximate String Matching Using Bidirectional FM-Index

Author: Kianfar Kiavash
Luo Haochen
Pockrandt Christopher
Reinert Knut
Torkamandi Bahman
Publication venue
Publication date: 05/03/2018
Field of study

Finding approximate occurrences of a pattern in a text using a full-text index is a central problem in bioinformatics and has been extensively researched. Bidirectional indices have opened new possibilities in this regard allowing the search to start from anywhere within the pattern and extend in both directions. In particular, use of search schemes (partitioning the pattern and searching the pieces in certain orders with given bounds on errors) can yield significant speed-ups. However, finding optimal search schemes is a difficult combinatorial optimization problem. Here for the first time, we propose a mixed integer program (MIP) capable to solve this optimization problem for Hamming distance with given number of pieces. Our experiments show that the optimal search schemes found by our MIP significantly improve the performance of search in bidirectional FM-index upon previous ad-hoc solutions. For example, approximate matching of 101-bp Illumina reads (with two errors) becomes 35 times faster than standard backtracking. Moreover, despite being performed purely in the index, the running time of search using our optimal schemes (for up to two errors) is comparable to the best state-of-the-art aligners, which benefit from combining search in index with in-text verification using dynamic programming. As a result, we anticipate a full-fledged aligner that employs an intelligent combination of search in the bidirectional FM-index using our optimal search schemes and in-text verification using dynamic programming outperforms today's best aligners. The development of such an aligner, called FAMOUS (Fast Approximate string Matching using OptimUm search Schemes), is ongoing as our future work

arXiv.org e-Print Archive

Repository: Freie Universität Berlin (FU), Math Department (fu_mi_publications)

Model transformations in Converge

Author: Clark Tony
Tratt Laurence
Publication venue
Publication date
Field of study

Model transformations are currently the focus of much interest and research due to the OMG’s QVT initiative. Current proposals for model transformation languages can be divided into two main camps: those taking a ‘declarative’ approach, and those opting for an ‘imperative’ approach. In this paper we detail an imperative, meta-circular, object orientated, pattern matching programming language Converge which is enriched with features pioneered by the Icon programming language, amongst them: success/failure, generators and goal-directed evaluation. By presenting these features in a language suitable for representing models, we show that we are able to gain some of the advantages of declarative approaches in an imperative setting

Bournemouth University Research Online

Ada as an implementation language for knowledge based systems

Author: Rochowiak Daniel
Publication venue
Publication date
Field of study

Debates about the selection of programming languages often produce cultural collisions that are not easily resolved. This is especially true in the case of Ada and knowledge based programming. The construction of programming tools provides a desirable alternative for resolving the conflict

NASA Technical Reports Server

Measuring the Propagation of Information in Partial Evaluation

Author: Rohde Henning Korsholm
Publication venue: 'Aarhus University Library'
Publication date: 11/08/2005
Field of study

We present the first measurement-based analysis of the information propagated by a partial evaluator. Our analysis is based on measuring implementations of string-matching algorithms, based on the observation that the sequence of character comparisons accurately reflects maintained information. Notably, we can easily prove matchers to be different and we show that they display more variety and finesse than previously believed. As a consequence, we are able to pinpoint differences and inaccuracies in many results previously considered equivalent. Our analysis includes a framework that lets us obtain string matchers - notably the family of Boyer-Moore algorithms - in a systematic formalism-independent way from a few information-propagation primitives. By leveraging the existing research in string matching, we show that the landscape of information propagation is non-trivial in the sense that small changes in information propagation may dramatically change the properties of the resulting string matchers. We thus expect that this work will prove useful as a test and feedback mechanism for information propagation in the development of advanced program transformations, such as GPC or Supercompilation

Tidsskrift.dk (Det Kongelige Bibliotek)

Optimising Unicode Regular Expression Evaluation with Previews

Author: Chivers Howard Robert
Publication venue
Publication date: 15/09/2016
Field of study

The jsre regular expression library was designed to provide fast matching of complex expressions over large input streams using user-selectable character encodings. An established design approach was used: a simulated non-deterministic automaton (NFA) implemented as a virtual machine, avoiding exponential cost functions in either space or time. A deterministic automaton (DFA) was chosen as a general dispatching mechanism for Unicode character classes and this also provided the opportunity to use compact DFAs in various optimization strategies. The result was the development of a regular expression Preview which provides a summary of all the matches possible from a given point in a regular expression in a form that can be implemented as a compact DFA and can be used to further improve the performance of the standard NFA simulation algorithm. This paper formally defines a preview and describes and evaluates several optimizations using this construct. They provide significant speed improvements accrued from fast scanning of anchor positions, avoiding retesting of repeated strings in unanchored searches, and efficient searching of multiple alternate expressions which in the case of keyword searching has a time complexity which is logarithmic in the number of words to be searched

White Rose Research Online

Generating trails automatically, to aid navigation when you revisit an environment

Author: Ruddle R.A.
Publication venue: 'MIT Press - Journals'
Publication date: 01/12/2008
Field of study

A new method for generating trails from a person’s movement through a virtual environment (VE) is described. The method is entirely automatic (no user input is needed), and uses string-matching to identify similar sequences of movement and derive the person’s primary trail. The method was evaluated in a virtual building, and generated trails that substantially reduced the distance participants traveled when they searched for target objects in the building 5-8 weeks after a set of familiarization sessions. Only a modest amount of data (typically five traversals of the building) was required to generate trails that were both effective and stable, and the method was not affected by the order in which objects were visited. The trail generation method models an environment as a graph and, therefore, may be applied to aiding navigation in the real world and information spaces, as well as VEs

Crossref

White Rose Research Online

On the efficient evaluation of relaxed queries in biological databases

Author: Duren Che
Karl Aberer
Yangjun Chen
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2004
Field of study

Infoscience - École polytechnique fédérale de Lausanne

Crossref