Efficient chaining of seeds in ordered trees
We consider here the problem of chaining seeds in ordered trees. Seeds are
mappings between two trees Q and T, and a chain is a subset of non-overlapping
seeds that is consistent with respect to postfix order and ancestrality. This
problem is a natural extension of a similar problem for sequences, and has
applications in computational biology, such as mining a database of RNA
secondary structures. For the chaining problem with a set of m constant-size
seeds, we describe an algorithm with complexity O(m^2 log(m)) in time and O(m^2)
in space.
On morphological hierarchical representations for image processing and spatial data clustering
Hierarchical data representations in the context of classification and data
clustering were put forward during the fifties. Recently, hierarchical image
representations have gained renewed interest for segmentation purposes. In this
paper, we briefly survey fundamental results on hierarchical clustering and
then detail recent paradigms developed for the hierarchical representation of
images in the framework of mathematical morphology: constrained connectivity
and ultrametric watersheds. Constrained connectivity can be viewed as a way to
constrain an initial hierarchy in such a way that a set of desired constraints
are satisfied. The framework of ultrametric watersheds provides a generic scheme
for computing any hierarchical connected clustering, in particular when such a
hierarchy is constrained. The suitability of this framework for solving
practical problems is illustrated with applications in remote sensing.
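As a small illustration of the link between hierarchical clustering and ultrametrics mentioned in this abstract, the following sketch computes the single-linkage ultrametric on a weighted graph: the distance between two nodes is the smallest edge-weight threshold at which they fall into the same connected component. This is plain single linkage via a Kruskal-style merge, not the constrained connectivity or ultrametric watershed constructions of the paper; the function name and the edge format are assumptions made for the example.

```python
from typing import Dict, List, Tuple

Edge = Tuple[float, int, int]  # (weight, u, v); a hypothetical edge format


def single_linkage_ultrametric(n: int, edges: List[Edge]) -> Dict[Tuple[int, int], float]:
    """Return {(a, b): d} with a < b, where d is the weight of the edge
    that first merges the components of a and b (Kruskal-style)."""
    label = list(range(n))              # label[i] = current component of node i
    members = [[i] for i in range(n)]   # members[c] = nodes in component c
    dist: Dict[Tuple[int, int], float] = {}
    for w, u, v in sorted(edges):       # process edges by increasing weight
        cu, cv = label[u], label[v]
        if cu == cv:
            continue                    # already connected at a lower level
        # Every cross pair becomes connected exactly at threshold w.
        for a in members[cu]:
            for b in members[cv]:
                dist[(min(a, b), max(a, b))] = w
        # Merge the smaller component into the larger one.
        if len(members[cu]) < len(members[cv]):
            cu, cv = cv, cu
        for b in members[cv]:
            label[b] = cu
        members[cu].extend(members[cv])
        members[cv] = []
    return dist


if __name__ == "__main__":
    d = single_linkage_ultrametric(
        4, [(1.0, 0, 1), (4.0, 1, 2), (2.0, 2, 3), (5.0, 0, 3)]
    )
    print(d[(0, 1)], d[(2, 3)], d[(0, 3)])
```

The resulting distances satisfy the ultrametric inequality d(a, c) <= max(d(a, b), d(b, c)), which is what makes them equivalent to a hierarchy of nested partitions.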
Fast local fragment chaining using sum-of-pair gap costs
Background: Fast seed-based alignment heuristics such as BLAST and BLAT have become indispensable tools in comparative genomics for all studies aiming at the evolutionary relations of proteins, genes, and non-coding RNAs. This is true in particular for the large mammalian genomes. The sensitivity and specificity of these tools, however, crucially depend on parameters such as seed sizes or maximum expectation values. In settings that require high sensitivity, the number of short local match fragments easily becomes intractable. Fragment chaining is then a powerful way to quickly connect, score, and rank the fragments and thereby improve specificity. Results: Here we present a fast and flexible fragment chainer that for the first time also supports a sum-of-pair gap cost model. This model has proven to achieve a higher accuracy and sensitivity in its own field of application. Due to a highly time-efficient index structure, our method outperforms the only existing tool for fragment chaining under the linear gap cost model. It can easily be applied to the output generated by alignment tools such as segemehl or BLAST. As an example, we consider homology-based searches for human and mouse snoRNAs, demonstrating that a highly sensitive BLAST search with subsequent chaining is an attractive option. The sum-of-pair gap costs provide a substantial advantage in this context. Conclusions: Chaining of short match fragments helps to quickly and accurately identify regions of homology that may not be found using local alignment heuristics alone. By providing both the linear and the sum-of-pair gap cost model, a wider range of applications can be covered. The software clasp is available at http://www.bioinf.uni-leipzig.de/Software/clasp/.
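To make the idea of fragment chaining concrete, here is a minimal sketch of a quadratic-time dynamic program that selects the best-scoring collinear chain of match fragments under a simple linear gap penalty. It is not the clasp implementation, which relies on an index structure for speed. The fragment tuple layout, the function names, and the exact gap formula (the number of unaligned positions on both sequences times a factor lam) are assumptions chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical fragment layout: (q_start, q_end, t_start, t_end, score),
# with inclusive end coordinates on the query (q) and target (t).
Fragment = Tuple[int, int, int, int, float]


def linear_gap_cost(a: Fragment, b: Fragment, lam: float = 1.0) -> float:
    """Assumed linear gap cost: lam times the number of uncovered
    positions between fragment a and its successor b on both sequences."""
    return lam * ((b[0] - a[1] - 1) + (b[2] - a[3] - 1))


def chain_fragments(frags: List[Fragment], lam: float = 1.0) -> Tuple[float, List[int]]:
    """Return (best chain score, fragment indices in chain order)
    using a simple O(m^2) dynamic program over m fragments."""
    # Any valid predecessor ends before the successor starts on the query,
    # so sorting by query start visits predecessors first.
    order = sorted(range(len(frags)), key=lambda i: (frags[i][0], frags[i][2]))
    best: Dict[int, float] = {}
    prev: Dict[int, Optional[int]] = {}
    for pos, i in enumerate(order):
        f = frags[i]
        best[i], prev[i] = f[4], None   # chain consisting of f alone
        for j in order[:pos]:
            g = frags[j]
            # g may precede f only if it ends before f on both sequences.
            if g[1] < f[0] and g[3] < f[2]:
                cand = best[j] + f[4] - linear_gap_cost(g, f, lam)
                if cand > best[i]:
                    best[i], prev[i] = cand, j
    if not best:
        return 0.0, []
    end = max(best, key=best.get)
    chain: List[int] = []
    cur: Optional[int] = end
    while cur is not None:              # walk back through the predecessors
        chain.append(cur)
        cur = prev[cur]
    return best[end], chain[::-1]


if __name__ == "__main__":
    fragments = [
        (0, 4, 0, 4, 5.0),
        (6, 9, 6, 9, 4.0),
        (2, 7, 20, 25, 6.0),    # off-diagonal hit that does not fit the chain
        (11, 14, 11, 14, 4.0),
    ]
    print(chain_fragments(fragments))
```

Replacing linear_gap_cost with another function of the two fragments is enough to experiment with other gap models in this sketch, at the price of the quadratic running time.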
Rule-based Test Generation with Mind Maps
This paper introduces basic concepts of rule-based test generation with mind
maps, and reports experience gained from the industrial application of this
technique in the domain of smart card testing by Giesecke & Devrient GmbH over
recent years. It describes the formalization of test selection criteria used
by our test generator, our test generation architecture and test generation
framework. Comment: In Proceedings MBT 2012, arXiv:1202.582
Mapping Topographic Structure in White Matter Pathways with Level Set Trees
Fiber tractography on diffusion imaging data offers rich potential for
describing white matter pathways in the human brain, but characterizing the
spatial organization in these large and complex data sets remains a challenge.
We show that level set trees---which provide a concise representation of the
hierarchical mode structure of probability density functions---offer a
statistically-principled framework for visualizing and analyzing topography in
fiber streamlines. Using diffusion spectrum imaging data collected on
neurologically healthy controls (N=30), we mapped white matter pathways from
the cortex into the striatum using a deterministic tractography algorithm that
estimates fiber bundles as dimensionless streamlines. Level set trees were used
for interactive exploration of patterns in the endpoint distributions of the
mapped fiber tracks and an efficient segmentation of the tracks that has
empirical accuracy comparable to standard nonparametric clustering methods. We
show that level set trees can also be generalized to model pseudo-density
functions in order to analyze a broader array of data types, including entire
fiber streamlines. Finally, resampling methods show the reliability of the
level set tree as a descriptive measure of topographic structure, illustrating
its potential as a statistical descriptor in brain imaging analysis. These
results highlight the broad applicability of level set trees for visualizing
and analyzing high-dimensional data like fiber tractography output.
Fast and sensitive multiple alignment of large genomic sequences.
BACKGROUND: Genomic sequence alignment is a powerful method for genome analysis and annotation, as alignments are routinely used to identify functional sites such as genes or regulatory elements. With a growing number of partially or completely sequenced genomes, multiple alignment is playing an increasingly important role in these studies. In recent years, various tools for pair-wise and multiple genomic alignment have been proposed. Some of them are extremely fast, but often efficiency is achieved at the expense of sensitivity. One way of combining speed and sensitivity is to use an anchored-alignment approach. In a first step, a fast search program identifies a chain of strong local sequence similarities. In a second step, regions between these anchor points are aligned using a slower but more accurate method. RESULTS: Herein, we present CHAOS, a novel algorithm for rapid identification of chains of local pair-wise sequence similarities. Local alignments calculated by CHAOS are used as anchor points to improve the running time of DIALIGN, a slow but sensitive multiple-alignment tool. We show that this way, the running time of DIALIGN can be reduced by more than 95% for BAC-sized and longer sequences, without affecting the quality of the resulting alignments. We apply our approach to a set of five genomic sequences around the stem-cell-leukemia (SCL) gene and demonstrate that exons and small regulatory elements can be identified by our multiple-alignment procedure. CONCLUSION: We conclude that the novel CHAOS local alignment tool is an effective way to significantly speed up global alignment tools such as DIALIGN without reducing the alignment quality. We likewise demonstrate that the DIALIGN/CHAOS combination is able to accurately align short regulatory sequences in distant orthologues. RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'.