    HMM with auxiliary memory: a new tool for modeling RNA structures

    For a long time, proteins have been believed to perform most of the important functions in all cells. However, recent results in genomics have revealed that many RNAs that do not encode proteins play crucial roles in the cell machinery. The so-called ncRNA genes, which are transcribed into RNAs but not translated into proteins, frequently conserve their secondary structures more than they conserve their primary sequences. Therefore, in order to identify ncRNA genes, we have to take the secondary structure of RNAs into consideration. Traditional approaches that are mainly based on base-composition statistics cannot be used for modeling and identifying such structures, so models with more descriptive power are required. In this paper, we introduce the concept of context-sensitive HMMs, which are capable of describing pairwise interactions between distant symbols. It is demonstrated that the proposed model can efficiently model various RNA secondary structures that are frequently observed

    Modeling and identification of alternative folding in regulatory RNAs using context-sensitive HMMs

    Recent research on gene regulation has revealed that many noncoding RNAs (ncRNAs) are actively involved in controlling various gene-regulatory networks. For such ncRNAs, their secondary structures play crucial roles in carrying out their functions. Interestingly, many regulatory RNAs can choose between two alternative structures based on external factors, which enables the RNAs to regulate the expression of certain genes in an environment-dependent manner. The existence of alternative structures gives rise to complex correlations in the primary sequence of the RNA. In this paper, we propose an efficient method for modeling alternative secondary structures in regulatory RNAs. The proposed method can be applied to the prediction of novel regulatory RNAs in genome sequences.

    Discrete pdf estimation in the presence of noise

    The problem of estimating a pdf from measurements has been widely studied. However, most of the work has focused on estimating the probability density function of continuous random variables, especially in the absence of noise. In this paper, we consider a model for representing discrete probability density functions based on multirate DSP models. Using this model, we propose an efficient and stable scheme for pdf estimation when the measurements are corrupted by independent additive noise. This approach makes use of well-known results from multirate DSP theory, especially that of biorthogonal partners. Simulation results are given, which clearly show the advantage of the proposed method.

    Profile Context-Sensitive HMMs for Probabilistic Modeling of Sequences With Complex Correlations

    The profile hidden Markov model is a specific type of HMM that is well suited for describing the common features of a set of related sequences. It has been extensively used in computational biology, where it is still one of the most popular tools. In this paper, we propose a new model called the profile context-sensitive HMM (profile-csHMM). Unlike traditional profile-HMMs, the proposed model is capable of describing complex long-range correlations between distant symbols in a consensus sequence. We also introduce a general algorithm that can be used for finding the optimal state sequence of an observed symbol sequence based on the given profile-csHMM. The proposed model has an important application in RNA sequence analysis, especially in modeling and analyzing RNA pseudoknots.

    Identification of CpG islands using a bank of IIR lowpass filters

    It is known that biological sequences such as DNA sequences display different kinds of patterns depending on their biological functions. This statistical difference can be exploited for identifying regions of interest, such as protein-coding regions or CpG islands, in a new biological sequence that has not been annotated yet. A region of particular interest is the CpG island, which is a region in a DNA sequence that is rich in the dinucleotide CpG, since such islands can be used as gene markers. There have been several computational methods for identifying CpG islands, each with its own strengths and weaknesses. In this paper, we propose a novel scheme for detecting CpG islands in a genomic sequence, which is based on a bank of IIR lowpass filters. The proposed method is capable of identifying CpG islands efficiently at a low computational expense. Simulation results are included where appropriate to demonstrate the idea.
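
    The abstract does not give the filter design, so the following is only a rough sketch of the general idea under stated assumptions: a 0/1 indicator of the dinucleotide CG is smoothed by several one-pole IIR lowpass filters with different pole locations. The function names, the indicator signal, and the pole values are all hypothetical choices made for illustration, not the scheme proposed in the paper.

```python
def cpg_indicator(seq):
    """Return 1.0 at each position where the dinucleotide 'CG' starts, else 0.0."""
    seq = seq.upper()
    return [1.0 if seq[i:i + 2] == "CG" else 0.0 for i in range(len(seq) - 1)]


def one_pole_lowpass(x, alpha):
    """One-pole IIR lowpass: y[n] = (1 - alpha) * x[n] + alpha * y[n - 1].

    A pole closer to 1 (larger alpha) gives a narrower passband,
    i.e. a longer effective averaging window.
    """
    y, prev = [], 0.0
    for v in x:
        prev = (1.0 - alpha) * v + alpha * prev
        y.append(prev)
    return y


def filter_bank(x, alphas=(0.9, 0.95, 0.99)):
    """Apply several lowpass filters (assumed pole values) to the same signal."""
    return {a: one_pole_lowpass(x, a) for a in alphas}


if __name__ == "__main__":
    # Toy sequence: a CG-rich stretch flanked by AT repeats.
    seq = "AT" * 50 + "CG" * 50 + "AT" * 50
    bank = filter_bank(cpg_indicator(seq))
    for alpha, y in bank.items():
        print(alpha, round(max(y), 3))
```

    Regions where the smoothed outputs stay high are candidate CG-rich stretches; using several bandwidths trades off localization against noise suppression.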

    Wavelet-based denoising by customized thresholding

    The problem of estimating a signal that is corrupted by additive noise has been of interest to many researchers for practical as well as theoretical reasons. Many of the traditional denoising methods have relied on linear methods such as Wiener filtering. Recently, nonlinear methods, especially those based on wavelets, have become increasingly popular due to a number of advantages over the linear methods. It has been shown that wavelet thresholding has near-optimal properties in the minimax sense and guarantees a better rate of convergence, despite its simplicity. Even though much work has been done in the field of wavelet thresholding, most of it has focused on statistical modeling of the wavelet coefficients and the optimal choice of the thresholds. In this paper, we propose a custom thresholding function which can improve the denoised results significantly. Simulation results are given to demonstrate the advantage of the new thresholding function.
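
    For orientation, the sketch below shows the two standard thresholding rules (hard and soft) that custom thresholding functions are usually compared against, applied to the detail coefficients of a one-level Haar transform. It does not implement the custom function proposed in the paper, which the abstract does not specify; the function names and the choice of the Haar wavelet are assumptions made for illustration.

```python
import math


def hard_threshold(w, t):
    """Keep the coefficient unchanged if it exceeds the threshold, else zero it."""
    return w if abs(w) > t else 0.0


def soft_threshold(w, t):
    """Shrink the coefficient toward zero by t; zero it if |w| <= t."""
    return math.copysign(max(abs(w) - t, 0.0), w)


def haar_forward(x):
    """One-level orthonormal Haar transform of an even-length list."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / s)
        x.append((a - d) / s)
    return x


def denoise(x, t, rule=soft_threshold):
    """Threshold only the detail coefficients, then reconstruct."""
    approx, detail = haar_forward(x)
    return haar_inverse(approx, [rule(d, t) for d in detail])


if __name__ == "__main__":
    noisy = [1.1, 0.9, 1.0, 1.2, 5.0, 5.1, 4.9, 5.0]
    print([round(v, 3) for v in denoise(noisy, t=0.2)])
```

    Any other thresholding function with the same `(w, t)` signature can be passed as `rule`, which is where a customized function would plug in.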

    Discrete probability density estimation using multirate DSP models

    We propose a model-based approach for estimating probability mass functions of discrete random variables. The model is based on tools from multirate signal processing. Similar in principle to kernel-based methods, the approach takes advantage of well-known results from multirate signal processing theory. Similarities to and differences from wavelet-based approaches are also indicated where appropriate. In the final form, the probability estimates are obtained by filtering the square root of the histogram through a multirate system whose components are biorthogonal partners of each other.
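
    The outer steps named in the abstract (take the square root of the histogram, then filter it) can be sketched very loosely as follows. The smoothing kernel here is a plain placeholder FIR filter, not a multirate system built from biorthogonal partners, and squaring and renormalizing the filtered output is an assumption made here so that the result is a valid probability mass function. All names are hypothetical.

```python
import math


def estimate_pmf(counts, kernel=(0.25, 0.5, 0.25)):
    """Smooth the square root of a normalized histogram, then square and renormalize.

    counts : histogram counts, one per discrete value (total must be > 0)
    kernel : placeholder symmetric FIR smoothing kernel (an assumption, standing
             in for the multirate system mentioned in the abstract)
    """
    total = sum(counts)
    root = [math.sqrt(c / total) for c in counts]

    half = len(kernel) // 2
    smoothed = []
    for n in range(len(root)):
        acc = 0.0
        for k, h in enumerate(kernel):
            j = n + k - half
            if 0 <= j < len(root):  # treat samples outside the support as zero
                acc += h * root[j]
        smoothed.append(acc)

    squared = [v * v for v in smoothed]  # squaring keeps every estimate nonnegative
    z = sum(squared)
    return [v / z for v in squared]


if __name__ == "__main__":
    pmf = estimate_pmf([0, 3, 9, 4, 0, 1])
    print([round(p, 3) for p in pmf], round(sum(pmf), 6))
```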

    Optimal alignment algorithm for context-sensitive hidden Markov models

    The hidden Markov model is well known for its efficiency in modeling short-term dependencies between adjacent samples. However, it cannot be used for modeling longer-range interactions between symbols that are distant from each other. In this paper, we introduce the concept of the context-sensitive HMM, which is capable of modeling strong pairwise correlations between distant symbols. Based on this model, we propose a polynomial-time algorithm that can be used for finding the optimal state sequence of an observed symbol string. The proposed model is especially useful in modeling palindromes, which has an important application in RNA secondary structure analysis.
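
    To illustrate why palindromes need memory beyond an ordinary HMM, the toy generator below emits symbols in a first phase while pushing them onto a stack, then emits the complement of each popped symbol in a second phase, so every symbol in the second half depends on a distant symbol in the first half. The stack, the two phases, and all names are assumptions made for illustration; the abstract does not describe the model's internals, and this sketch does not implement the alignment algorithm.

```python
import random

# Watson-Crick base pairs for RNA.
COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}


def generate_stem(length, rng):
    """Generate a reverse-complement palindrome of 2 * length bases.

    Phase 1 emits random bases and stores each one on a stack.
    Phase 2 pops the stack and emits the complementary base, so the
    i-th base from the end always pairs with the i-th base from the start.
    """
    stack, out = [], []
    for _ in range(length):
        base = rng.choice("ACGU")
        stack.append(base)
        out.append(base)
    while stack:
        out.append(COMPLEMENT[stack.pop()])
    return "".join(out)


def is_reverse_complement_palindrome(seq):
    """Check that base i pairs with base n - 1 - i for every i."""
    n = len(seq)
    return n % 2 == 0 and all(
        COMPLEMENT[seq[i]] == seq[n - 1 - i] for i in range(n // 2)
    )


if __name__ == "__main__":
    stem = generate_stem(6, random.Random(0))
    print(stem, is_reverse_complement_palindrome(stem))
```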

    An overview of the role of context-sensitive HMMs in the prediction of ncRNA genes

    Non-coding RNAs (ncRNAs) are RNA molecules that function in cells without being translated into proteins. In recent years, much evidence has been found that ncRNAs play a crucial role in various biological processes. As a result, there has been increasing interest in the prediction of ncRNA genes. Due to the conserved secondary structure in ncRNAs, there exist pairwise dependencies between distant bases. These dependencies cannot be effectively modeled using traditional HMMs, and we need a more complex model such as the context-sensitive HMM (csHMM). In this paper, we give an overview of the role of csHMMs in RNA secondary structure analysis and the prediction of ncRNA genes. It is demonstrated that context-sensitive HMMs can serve as an efficient framework for these purposes.