Languages of lossless seeds

Author: Břinda Karel
Publication venue: 'Open Publishing Association'
Publication date: 21/05/2014

Several algorithms for similarity search employ seeding techniques to quickly discard very dissimilar regions. In this paper, we study theoretical properties of lossless seeds, i.e., spaced seeds having full sensitivity. We prove that lossless seeds coincide with languages of certain sofic subshifts, hence they can be recognized by finite automata. Moreover, we show that these subshifts are fully given by the number of allowed errors k and the seed margin l. We also show that for a fixed k, optimal seeds must asymptotically satisfy l ~ m^(k/(k+1)). Comment: In Proceedings AFL 2014, arXiv:1405.527
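To make "full sensitivity" concrete, here is a minimal brute-force sketch in Python. It is not the automaton construction from the paper. It assumes a seed is written as a string over '#' (position must match) and '-' (don't care), that m is the length of the two compared strings, and that k is the number of allowed mismatches; the function name `is_lossless` is hypothetical.

```python
from itertools import combinations


def is_lossless(seed: str, m: int, k: int) -> bool:
    """Brute-force check that `seed` detects every pair of length-m strings
    differing in at most k positions (assumed '#'/'-' seed notation)."""
    span = len(seed)
    if span > m:
        return False
    # Offsets inside the seed that are required to match.
    care = [i for i, c in enumerate(seed) if c == "#"]
    # For each placement of the seed, the set of positions it inspects.
    masks = [{offset + c for c in care} for offset in range(m - span + 1)]
    # Checking exactly k mismatches suffices: any smaller mismatch set is a
    # subset of some k-set, and a placement avoiding the k-set avoids it too.
    for mismatches in combinations(range(m), k):
        positions = set(mismatches)
        if not any(positions.isdisjoint(mask) for mask in masks):
            return False
    return True


if __name__ == "__main__":
    print(is_lossless("###", 11, 2))
    print(is_lossless("####", 11, 2))
```

The check enumerates all C(m, k) mismatch sets, so it is only practical for small m and k. For contiguous seeds the result agrees with the pigeonhole argument: two mismatches split 11 positions into three runs of matches, one of which has length at least 3 but not necessarily 4.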

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Protein Sequencing with an Adaptive Genetic Algorithm from Tandem Mass Spectrometry

Author: Boisson Jean-Charles
Jourdan Laetitia
Rolando Christian
Talbi El-Ghazali
Publication date: 06/02/2008

In proteomics, only the de novo peptide sequencing approach allows a partial amino acid sequence of a peptide to be found from an MS/MS spectrum. This article presents preliminary work on discovering a complete protein sequence from spectral data (MS and MS/MS spectra). For the moment, our approach uses only MS spectra. A Genetic Algorithm (GA) has been designed with a new evaluation function that works directly with a complete MS spectrum as input rather than with a mass list, as other methods using this kind of data do. Thus the monoisotopic peak extraction step, which requires human intervention, is eliminated. The goal of this approach is to discover the sequence of unknown proteins and to allow a better understanding of the differences between experimental proteins and proteins from databases.

arXiv.org e-Print Archive

CiteSeerX

A backward procedure for change-point detection with applications to copy number variation detection

Author: Hao Ning
Shin Seung Jun
Wu Yichao
Publication date: 18/08/2019

Change-point detection has recently regained much attention for analyzing array or sequencing data for copy number variation (CNV) detection. In such applications, the true signals are typically very short and buried in the long data sequence, which makes it challenging to identify the variations efficiently and accurately. In this article, we propose a new change-point detection method, a backward procedure, which is not only fast and simple enough to exploit high-dimensional data but also performs very well for detecting short signals. Although motivated by CNV detection, the backward procedure is generally applicable to assorted change-point problems that arise in a variety of scientific applications. Both simulated and real CNV data illustrate that backward detection has clear advantages over competing methods, especially when the true signal is short.
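The abstract does not spell out the procedure, so the following Python sketch shows only a generic bottom-up (merge-based) segmentation in the same spirit. It is not the authors' algorithm. It assumes a piecewise-constant mean model, a known target number of segments, and a squared-error merge cost; the function name `bottom_up_change_points` is hypothetical.

```python
from math import inf


def bottom_up_change_points(y, n_segments):
    """Start with one segment per observation and repeatedly merge the
    adjacent pair whose merge increases the squared error the least.
    Returns the start indices of all segments except the first."""
    # Each segment is stored as (start index, count, sum of values).
    segs = [(i, 1, float(v)) for i, v in enumerate(y)]
    while len(segs) > n_segments:
        best, best_cost = 0, inf
        for i in range(len(segs) - 1):
            _, n1, s1 = segs[i]
            _, n2, s2 = segs[i + 1]
            # Increase in the sum of squared errors caused by merging.
            cost = n1 * n2 / (n1 + n2) * (s1 / n1 - s2 / n2) ** 2
            if cost < best_cost:
                best, best_cost = i, cost
        a, b = segs[best], segs[best + 1]
        segs[best:best + 2] = [(a[0], a[1] + b[1], a[2] + b[2])]
    return [start for start, _, _ in segs[1:]]


if __name__ == "__main__":
    signal = [0, 0, 0, 0, 5, 5, 0, 0, 0, 0]
    print(bottom_up_change_points(signal, 3))  # [4, 6]
```

In this toy signal the short two-point bump survives because merging it into either neighbor is the most expensive step. A practical method would also need a stopping rule in place of the fixed `n_segments`, which this sketch leaves out.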

arXiv.org e-Print Archive

Crossref

University of Illinois at Chicago: UIC INDIGO (INtellectual property in DIGital form available online in an Open environment)

Optimum Search Schemes for Approximate String Matching Using Bidirectional FM-Index

Author: Kianfar Kiavash
Luo Haochen
Pockrandt Christopher
Reinert Knut
Torkamandi Bahman
Publication date: 05/03/2018

Finding approximate occurrences of a pattern in a text using a full-text index is a central problem in bioinformatics and has been extensively researched. Bidirectional indices have opened new possibilities in this regard, allowing the search to start from anywhere within the pattern and extend in both directions. In particular, the use of search schemes (partitioning the pattern and searching the pieces in certain orders with given bounds on errors) can yield significant speed-ups. However, finding optimal search schemes is a difficult combinatorial optimization problem. Here, for the first time, we propose a mixed integer program (MIP) capable of solving this optimization problem for Hamming distance with a given number of pieces. Our experiments show that the optimal search schemes found by our MIP significantly improve the performance of search in a bidirectional FM-index over previous ad hoc solutions. For example, approximate matching of 101-bp Illumina reads (with two errors) becomes 35 times faster than standard backtracking. Moreover, despite being performed purely in the index, the running time of search using our optimal schemes (for up to two errors) is comparable to that of the best state-of-the-art aligners, which benefit from combining search in the index with in-text verification using dynamic programming. As a result, we anticipate that a full-fledged aligner that employs an intelligent combination of search in the bidirectional FM-index using our optimal search schemes and in-text verification using dynamic programming will outperform today's best aligners. The development of such an aligner, called FAMOUS (Fast Approximate string Matching using OptimUm search Schemes), is ongoing as our future work.
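As an illustration of what a search scheme has to guarantee, the Python sketch below checks two properties of a candidate scheme. It does not implement the MIP from the paper. It assumes each search is a triple (order, lower, upper): the order in which pieces are visited, plus lower and upper bounds on the cumulative number of mismatches after each step. The function names and the two-piece example are illustrative assumptions.

```python
from itertools import product


def covers(search, errors):
    """True if the search accepts the given per-piece mismatch counts."""
    order, lower, upper = search
    total = 0
    for step, piece in enumerate(order):
        total += errors[piece]
        if not lower[step] <= total <= upper[step]:
            return False
    return True


def is_connected(order):
    """A bidirectional index can only extend the current match to the left
    or right, so every visited piece must be adjacent to the earlier ones."""
    lo = hi = order[0]
    for piece in order[1:]:
        if piece == lo - 1:
            lo = piece
        elif piece == hi + 1:
            hi = piece
        else:
            return False
    return True


def scheme_is_complete(scheme, pieces, k):
    """True if every distribution of at most k mismatches over the pieces
    is accepted by at least one search in the scheme."""
    for errors in product(range(k + 1), repeat=pieces):
        if sum(errors) <= k and not any(covers(s, errors) for s in scheme):
            return False
    return True


if __name__ == "__main__":
    # Two pieces (indexed 0 and 1), at most one mismatch.
    scheme = [
        ((0, 1), (0, 0), (0, 1)),  # piece 0 exact, then piece 1 with <= 1 error
        ((1, 0), (0, 1), (0, 1)),  # piece 1 exact, then piece 0 with exactly 1 error
    ]
    print(all(is_connected(s[0]) for s in scheme))
    print(scheme_is_complete(scheme, pieces=2, k=1))
```

In this example the second search requires exactly one mismatch in piece 0, so the two searches never report the same occurrence twice. An optimizer would search over many such schemes and pick the one with the lowest expected work; that objective is not modeled here.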

arXiv.org e-Print Archive

Crossref

Repository: Freie Universität Berlin (FU), Math Department (fu_mi_publications)

A Sample Average Approximation Approach for Event-Driven Probabilistic Constraint Programming

Author: Hnich B.
Prestwich S.
Rossi R.
Tarim S.A.
Publication venue: Cork Constraint Computation Centre, University College

Wageningen University & Research Publications

Qualitative Analysis of POMDPs with Temporal Logic Specifications for Robotics Applications

Author: Chatterjee Krishnendu
Chmelík Martin
Gupta Raghav
Kanodia Ayush
Publication date: 01/01/2015

We consider partially observable Markov decision processes (POMDPs), which are a standard framework for robotics applications to model uncertainties present in the real world, with temporal logic specifications. All temporal logic specifications in linear-time temporal logic (LTL) can be expressed as parity objectives. We study the qualitative analysis problem for POMDPs with parity objectives, which asks whether there is a controller (policy) that ensures the objective holds with probability 1 (almost-surely). While the qualitative analysis of POMDPs with parity objectives is undecidable, recent results show that when restricted to finite-memory policies the problem is EXPTIME-complete. While the problem is intractable in theory, we present a practical approach to solve the qualitative analysis problem. We designed several heuristics to deal with the exponential complexity, and have used our implementation on a number of well-known POMDP examples for robotics applications. Our results provide the first practical approach to solve the qualitative analysis of robot motion planning with LTL properties in the presence of uncertainty.

arXiv.org e-Print Archive

Crossref

IST PubRep

IST Austria: PubRep (Institute of Science and Technology)