16,888 research outputs found
Sparse approaches for the exact distribution of patterns in long state sequences generated by a Markov source
We present two novel approaches for the computation of the exact distribution
of a pattern in a long sequence. Both approaches take into account the sparse
structure of the problem and are two-part algorithms. The first approach relies
on a partial recursion after a fast computation of the second largest
eigenvalue of the transition matrix of a Markov chain embedding. The second
approach uses fast Taylor expansions of an exact bivariate rational
reconstruction of the distribution. We illustrate the interest of both
approaches on a simple toy-example and two biological applications: the
transcription factors of the Human Chromosome 5 and the PROSITE signatures of
functional motifs in proteins. On these example our methods demonstrate their
complementarity and their hability to extend the domain of feasibility for
exact computations in pattern problems to a new level
Texture fusion for batik motif retrieval system
This paper systematically investigates the effect of image texture features on batik motif retrieval performance. The retrieval process uses a query motif image to find matching motif images in a database. In this study, feature fusion of various image texture features such as Gabor, Log-Gabor, Grey Level Co-Occurrence Matrices (GLCM), and Local Binary Pattern (LBP) features are attempted in motif image retrieval. With regards to performance evaluation, both individual features and fused feature sets are applied. Experimental results show that optimal feature fusion outperforms individual features in batik motif retrieval. Among the individual features tested, Log-Gabor features provide the best result. The proposed approach is best used in a scenario where a query image containing multiple basic motif objects is applied to a dataset in which retrieved images also contain multiple motif objects. The retrieval rate achieves 84.54% for the rank 3 precision when the feature space is fused with Gabor, GLCM and Log-Gabor features. The investigation also shows that the proposed method does not work well for a retrieval scenario where the query image contains multiple basic motif objects being applied to a dataset in which the retrieved images only contain one basic motif object
Recommended from our members
A Haystack Heuristic for Autoimmune Disease Biomarker Discovery Using Next-Gen Immune Repertoire Sequencing Data.
Large-scale DNA sequencing of immunological repertoires offers an opportunity for the discovery of novel biomarkers for autoimmune disease. Available bioinformatics techniques however, are not adequately suited for elucidating possible biomarker candidates from within large immunosequencing datasets due to unsatisfactory scalability and sensitivity. Here, we present the Haystack Heuristic, an algorithm customized to computationally extract disease-associated motifs from next-generation-sequenced repertoires by contrasting disease and healthy subjects. This technique employs a local-search graph-theory approach to discover novel motifs in patient data. We apply the Haystack Heuristic to nine million B-cell receptor sequences obtained from nearly 100 individuals in order to elucidate a new motif that is significantly associated with multiple sclerosis. Our results demonstrate the effectiveness of the Haystack Heuristic in computing possible biomarker candidates from high throughput sequencing data and could be generalized to other datasets
Analysis of Three-Dimensional Protein Images
A fundamental goal of research in molecular biology is to understand protein
structure. Protein crystallography is currently the most successful method for
determining the three-dimensional (3D) conformation of a protein, yet it
remains labor intensive and relies on an expert's ability to derive and
evaluate a protein scene model. In this paper, the problem of protein structure
determination is formulated as an exercise in scene analysis. A computational
methodology is presented in which a 3D image of a protein is segmented into a
graph of critical points. Bayesian and certainty factor approaches are
described and used to analyze critical point graphs and identify meaningful
substructures, such as alpha-helices and beta-sheets. Results of applying the
methodologies to protein images at low and medium resolution are reported. The
research is related to approaches to representation, segmentation and
classification in vision, as well as to top-down approaches to protein
structure prediction.Comment: See http://www.jair.org/ for any accompanying file
- …