7,781 research outputs found

Truncated Profile Hidden Markov Models

Author: Smith Jennifer A.
Publication venue: 'IUScholarWorks'
Publication date: 01/11/2005

The profile hidden Markov model (HMM) is a powerful method for remote homolog database search. However, evaluating the score of each database sequence against a profile HMM is computationally demanding. The computation time required for score evaluation is proportional to the number of states in the profile HMM. This paper examines whether the number of states can be truncated without reducing the ability of the HMM to find proteins containing members of a protein domain family. A genetic algorithm (GA) is presented which finds a good truncation of the HMM states. The results of using truncation on searches of the yeast, E. coli, and pig genomes for several different protein domain families are shown.
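As a rough illustration of why scoring cost grows with the number of model states, the sketch below scores a sequence against a toy profile that has match states only (a real profile HMM also has insert and delete states). The function names, the log-odds table layout, the floor score of -4.0, and the `keep` mask are all hypothetical; the mask is hand-picked here and is not the GA-selected truncation the paper describes.

```python
import math

def score_window(profile, seq, start, keep=None):
    """Sum log-odds of seq[start:start+len(profile)] over the kept match states."""
    total = 0.0
    for i, column in enumerate(profile):
        if keep is not None and not keep[i]:
            continue  # truncated state: skipped, so it adds no work
        # Assumed floor score for residues the column does not list.
        total += column.get(seq[start + i], -4.0)
    return total

def best_score(profile, seq, keep=None):
    """Best ungapped score; work is (number of windows) x (number of kept states)."""
    n_windows = len(seq) - len(profile) + 1
    if n_windows <= 0:
        return -math.inf
    return max(score_window(profile, seq, s, keep) for s in range(n_windows))

# Toy four-state profile: one dict of residue log-odds per match state.
profile = [{"A": 2.0}, {"C": 1.5}, {"D": 0.1}, {"E": 2.5}]
seq = "GGACDEGG"

full = best_score(profile, seq)
# Drop the third, weakly informative state; the best window is still found.
truncated = best_score(profile, seq, keep=[True, True, False, True])
```

In this toy case the dropped state contributes almost nothing to the score, so the truncated model still ranks the same window first while evaluating three states per window instead of four.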

Boise State University - ScholarWorks

Application of protein structure alignments to iterated hidden Markov model protocols for structure prediction.

Author: Bourne Philip E
Scheeff Eric D
Publication venue: eScholarship, University of California
Publication date: 01/09/2006

Background: One of the most powerful methods for the prediction of protein structure from sequence information alone is the iterative construction of profile-type models. Because profiles are built from sequence alignments, the sequences included in the alignment and the method used to align them will be important to the sensitivity of the resulting profile. The inclusion of highly diverse sequences will presumably produce a more powerful profile, but distantly related sequences can be difficult to align accurately using only sequence information. Therefore, it would be expected that the use of protein structure alignments to improve the selection and alignment of diverse sequence homologs might yield improved profiles. However, the actual utility of such an approach has remained unclear. Results: We explored several iterative protocols for the generation of profile hidden Markov models. These protocols were tailored to allow the inclusion of protein structure alignments in the process, and were used for large-scale creation and benchmarking of structure alignment-enhanced models. We found that models using structure alignments did not provide an overall improvement over sequence-only models for superfamily-level structure predictions. However, the results also revealed that the structure alignment-enhanced models were complementary to the sequence-only models, particularly at the edge of the "twilight zone". When the two sets of models were combined, they provided improved results over sequence-only models alone. In addition, we found that the beneficial effects of the structure alignment-enhanced models could not be realized if the structure-based alignments were replaced with sequence-based alignments. Our experiments with different iterative protocols for sequence-only models also suggested that simple protocol modifications were unable to yield equivalent improvements to those provided by the structure alignment-enhanced models. Finally, we found that models using structure alignments provided fold-level structure assignments that were superior to those produced by sequence-only models. Conclusion: When attempting to predict the structure of remote homologs, we advocate a combined approach in which both traditional models and models incorporating structure alignments are used.

PubMed Central

eScholarship - University of California

High-Rate Vector Quantization for the Neyman-Pearson Detection of Correlated Processes

Author: Bianchi Pascal
Villard Joffrey
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011

This paper investigates the effect of quantization on the performance of the Neyman-Pearson test. It is assumed that a sensing unit observes samples of a correlated stationary ergodic multivariate process. Each sample is passed through an N-point quantizer and transmitted to a decision device which performs a binary hypothesis test. For any false alarm level, it is shown that the miss probability of the Neyman-Pearson test converges to zero exponentially as the number of samples tends to infinity, assuming that the observed process satisfies certain mixing conditions. The main contribution of this paper is to provide a compact closed-form expression of the error exponent in the high-rate regime, i.e., when the number N of quantization levels tends to infinity, generalizing previous results of Gupta and Hero to the case of non-independent observations. If d represents the dimension of one sample, it is proved that the error exponent converges at rate N^{2/d} to the one obtained in the absence of quantization. As an application, relevant high-rate quantization strategies which lead to a large error exponent are determined. Numerical results indicate that the proposed quantization rule can yield better performance than existing ones in terms of detection error. Comment: 47 pages, 7 figures, 1 table. To appear in the IEEE Transactions on Information Theor

arXiv.org e-Print Archive

HAL-CentraleSupelec

Crossref

HAL-Rennes 1

Learning a Hybrid Architecture for Sequence Regression and Annotation

Author: Carin Lawrence
Hartemink Alexander J.
Henao Ricardo
Zhang Yizhe
Zhong Jianling
Publication date: 16/12/2015

When learning a hidden Markov model (HMM), sequential observations can often be complemented by real-valued summary response variables generated from the path of hidden states. Such settings arise in numerous domains, including many applications in biology, like motif discovery and genome annotation. In this paper, we present a flexible framework for jointly modeling both latent sequence features and the functional mapping that relates the summary response variables to the hidden state sequence. The algorithm is compatible with a rich set of mapping functions. Results show that the availability of additional continuous response variables can simultaneously improve the annotation of the sequential observations and yield good prediction performance in both synthetic data and real-world datasets. Comment: AAAI 201

arXiv.org e-Print Archive

DukeSpace

Association for the Advancement of Artificial Intelligence: AAAI Publications

Hierarchical Bayesian sparse image reconstruction with application to MRFM

Author: Dobigeon Nicolas
Hero Alfred O.
Tourneret Jean-Yves
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008

This paper presents a hierarchical Bayesian model to reconstruct sparse images when the observations are obtained from linear transformations and corrupted by additive white Gaussian noise. Our hierarchical Bayes model is well suited to such naturally sparse image applications as it seamlessly accounts for properties such as sparsity and positivity of the image via appropriate Bayes priors. We propose a prior that is based on a weighted mixture of a positive exponential distribution and a mass at zero. The prior has hyperparameters that are tuned automatically by marginalization over the hierarchical Bayesian model. To overcome the complexity of the posterior distribution, a Gibbs sampling strategy is proposed. The Gibbs samples can be used to estimate the image to be recovered, e.g. by maximizing the estimated posterior distribution. In our fully Bayesian approach the posteriors of all the parameters are available. Thus our algorithm provides more information than other previously proposed sparse reconstruction methods that only give a point estimate. The performance of our hierarchical Bayesian sparse reconstruction method is illustrated on synthetic and real data collected from a tobacco virus sample using a prototype MRFM instrument. Comment: v2: final version; IEEE Trans. Image Processing, 200
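To make the prior in the abstract above concrete, here is a minimal sketch that draws pixels from a weighted mixture of a point mass at zero and a positive exponential distribution. The parameter names `w` (probability that a pixel is non-zero) and `a` (mean of the exponential part), and both function names, are assumptions for illustration. The sketch only samples from the prior; it does not implement the paper's Gibbs sampler or its hyperparameter marginalization.

```python
import random

def sample_sparse_pixel(w, a, rng):
    """Draw one pixel: exactly 0 with probability 1 - w, else Exponential with mean a."""
    if rng.random() >= w:
        return 0.0  # mass at zero enforces sparsity
    # expovariate takes the rate (1 / mean); its support is non-negative.
    return rng.expovariate(1.0 / a)

def sample_sparse_image(n_pixels, w, a, seed=0):
    """Draw n_pixels independent pixels from the mixture prior."""
    rng = random.Random(seed)
    return [sample_sparse_pixel(w, a, rng) for _ in range(n_pixels)]

# Roughly 10% of pixels are expected to be non-zero with w = 0.1.
image = sample_sparse_image(n_pixels=1000, w=0.1, a=2.0)
nonzero = [x for x in image if x > 0.0]
```

Every draw is non-negative and most are exactly zero, which are the two image properties (positivity and sparsity) the abstract says the prior is meant to encode.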

arXiv.org e-Print Archive

CiteSeerX

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

HAL Descartes

Bayesian separation of spectral sources under non-negativity and full additivity constraints

Author: Cédric Carteret
Jean-Yves Tourneret
Nicolas Dobigeon
Saïd Moussaoui
Publication venue: 'Elsevier BV'
Publication date: 23/09/2009

This paper addresses the problem of separating spectral sources which are linearly mixed with unknown proportions. The main difficulty of the problem is to ensure the full additivity (sum-to-one) of the mixing coefficients and non-negativity of sources and mixing coefficients. A Bayesian estimation approach based on Gamma priors was recently proposed to handle the non-negativity constraints in a linear mixture model. However, incorporating the full additivity constraint requires further developments. This paper studies a new hierarchical Bayesian model appropriate to the non-negativity and sum-to-one constraints associated with the regressors and regression coefficients of linear mixtures. The unknown parameters of this model are estimated from samples generated by an appropriate Gibbs sampler. The performance of the proposed algorithm is evaluated through simulation results conducted on synthetic mixture models. The proposed approach is also applied to the processing of multicomponent chemical mixtures resulting from Raman spectroscopy. Comment: v4: minor grammatical changes; Signal Processing, 200

arXiv.org e-Print Archive

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

HAL Descartes

SATCHMO-JS: a webserver for simultaneous protein multiple sequence alignment and phylogenetic tree construction.

Author: Datta Ruchira S
Davidson John R
Hagopian Raffi
Jarvis Glen R
Samad Bushra
Sjölander Kimmen
Publication venue: eScholarship, University of California
Publication date: 29/04/2010

We present the jump-start simultaneous alignment and tree construction using hidden Markov models (SATCHMO-JS) web server for simultaneous estimation of protein multiple sequence alignments (MSAs) and phylogenetic trees. The server takes as input a set of sequences in FASTA format, and outputs a phylogenetic tree and MSA; these can be viewed online or downloaded from the website. SATCHMO-JS is an extension of the SATCHMO algorithm, and employs a divide-and-conquer strategy to jump-start SATCHMO at a higher point in the phylogenetic tree, reducing the computational complexity of the progressive all-versus-all HMM-HMM scoring and alignment. Results on a benchmark dataset of 983 structurally aligned pairs from the PREFAB benchmark dataset show that SATCHMO-JS provides a statistically significant improvement in alignment accuracy over MUSCLE, Multiple Alignment using Fast Fourier Transform (MAFFT), ClustalW and the original SATCHMO algorithm. The SATCHMO-JS webserver is available at http://phylogenomics.berkeley.edu/satchmo-js. The datasets used in these experiments are available for download at http://phylogenomics.berkeley.edu/satchmo-js/supplementary/

PubMed Central

eScholarship - University of California

Conditioned stochastic particle systems and integrable quantum spin systems

Author: A Ayyer
A Blumen
A Borodin
A Szabo
AA Lushnikov
B Derrida
B Meerson
C Appert
C Bahadoran
C Giardinà
CP Espigares
D Kim
F Colomo
F Rezakhanlou
FC Alcaraz
G Schütz
G Tripathy
GM Schütz
H Spohn
J Szavits-Nossan
JE Santos
JL Lebowitz
L Santen
L-H Tang
LH Gwa
M Ha
M Henkel
M Katori
ME Fisher
PI Hurtado
R Arratia
RJ Harris
RL Jack
SA Janowsky
SF Burlatsky
T Bodineau
T Imamura
T Seppäläinen
TM Liggett
V Belitsky
V Lecomte
V Popkov
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/10/2014

We consider from a microscopic perspective large deviation properties of several stochastic interacting particle systems, using their mapping to integrable quantum spin systems. A brief review of recent work is given and several new results are presented: (i) For the general disordered symmetric exclusion process (SEP) on some finite lattice conditioned on no jumps into some absorbing sublattice and with initial Bernoulli product measure with density $\rho$ we prove that the probability $S_\rho(t)$ of no absorption event up to microscopic time $t$ can be expressed in terms of the generating function for the particle number of a SEP with particle injection and empty initial lattice. Specifically, for the symmetric simple exclusion process on $\mathbb Z$ conditioned on no jumps into the origin we obtain the explicit first and second order expansion in $\rho$ of $S_\rho(t)$ and also to first order in $\rho$ the optimal microscopic density profile under this conditioning. For the disordered ASEP on the finite torus conditioned on a very large current we show that the effective dynamics that optimally realizes this rare event does not depend on the disorder, except for the time scale. For annihilating and coalescing random walkers we obtain the generating function of the number of annihilated particles up to time $t$, which turns out to exhibit some universal features. Comment: 25 page

arXiv.org e-Print Archive

Crossref

Juelich Shared Electronic Resources