4,926 research outputs found
An efficient algorithm for sequence comparison with block reversals
AbstractGiven two sequences X and Y that are strings over some alphabet set, we consider the distance d(X,Y) between them defined to be minimum number of character replacements and block (substring) reversals needed to transform X to Y (or vice versa). The operations are required to be disjoint. This is the “simplest” sequence comparison problem we know of that allows natural block edit operations. Block reversals arise naturally in genomic sequence comparison; they are also of interest in matching music data. We present an algorithm for exactly computing the distance d(X,Y); it takes time O(|X|log2|X|), and hence, is near-linear. Trivial approach takes quadratic time
Parking functions, labeled trees and DCJ sorting scenarios
In genome rearrangement theory, one of the elusive questions raised in recent
years is the enumeration of rearrangement scenarios between two genomes. This
problem is related to the uniform generation of rearrangement scenarios, and
the derivation of tests of statistical significance of the properties of these
scenarios. Here we give an exact formula for the number of double-cut-and-join
(DCJ) rearrangement scenarios of co-tailed genomes. We also construct effective
bijections between the set of scenarios that sort a cycle and well studied
combinatorial objects such as parking functions and labeled trees.Comment: 12 pages, 3 figure
Online Pattern Matching for String Edit Distance with Moves
Edit distance with moves (EDM) is a string-to-string distance measure that
includes substring moves in addition to ordinal editing operations to turn one
string to the other. Although optimizing EDM is intractable, it has many
applications especially in error detections. Edit sensitive parsing (ESP) is an
efficient parsing algorithm that guarantees an upper bound of parsing
discrepancies between different appearances of the same substrings in a string.
ESP can be used for computing an approximate EDM as the L1 distance between
characteristic vectors built by node labels in parsing trees. However, ESP is
not applicable to a streaming text data where a whole text is unknown in
advance. We present an online ESP (OESP) that enables an online pattern
matching for EDM. OESP builds a parse tree for a streaming text and computes
the L1 distance between characteristic vectors in an online manner. For the
space-efficient computation of EDM, OESP directly encodes the parse tree into a
succinct representation by leveraging the idea behind recent results of a
dynamic succinct tree. We experimentally test OESP on the ability to compute
EDM in an online manner on benchmark datasets, and we show OESP's efficiency.Comment: This paper has been accepted to the 21st edition of the International
Symposium on String Processing and Information Retrieval (SPIRE2014
On Weighting Schemes for Gene Order Analysis
Gene order analysis aims at extracting phylogenetic information from the comparison of the order and orientation of the genes on the genomes of different species. This can be achieved by computing parsimonious rearrangement scenarios, i.e. to determine a sequence of rearrangements events that transforms one given gene order into another such that the sum of weights of the included rearrangement events is minimal. In this sequence only certain types of rearrangements, given by the rearrangement model, are admissible and weights are assigned with respect to the rearrangement type. The choice of a suitable rearrangement model and corresponding weights for the included rearrangement types is important for the meaningful reconstruction. So far the analysis of weighting schemes for gene order analysis has not been considered sufficiently. In this paper weighting schemes for gene order analysis are considered for two
rearrangement models: 1) inversions, transpositions, and inverse
transpositions; 2) inversions, block interchanges, and inverse transpositions. For both rearrangement models we determined properties of the weighting functions that exclude certain types of rearrangements from parsimonious rearrangement scenarios
GNSS signal acquisition in the presence of sign transitions
The next generation of Global Navigation Satellite Systems (GNSS), such as Galileo [1] and GPS modernization [2], will use signals with equal code and bit periods, which will introduce a potential sign transition in each segment of the signal processed in the acquisition block. If FFT is used to perform the correlations a sign transition occurring within the integration time may cause a splitting of the main peak of the Cross Ambiguity Function (CAF) into two smaller lobes along the Doppler shift axis [3]. In this paper a method to overcome the possible impairments due to the lobe splitting is proposed and validated by simulatio
- …