Computational identification and analysis of noncoding RNAs - Unearthing the buried treasures in the genome
The central dogma of molecular biology states that genetic information flows from DNA to RNA to protein. This dogma has exerted a substantial influence on our understanding of genetic activity in the cell. Under this influence, the prevailing assumption until recently was that genes are essentially repositories of protein-coding information, and that proteins carry out most of the important biological functions in all cells. Meanwhile, the importance of RNA remained rather obscure, and RNA was viewed mainly as a passive intermediary bridging the gap between DNA and protein. Except for classic examples such as tRNAs (transfer RNAs) and rRNAs (ribosomal RNAs), functional noncoding RNAs were considered rare.
However, this view has experienced a dramatic change during the last decade, as systematic screening of various genomes identified myriads of noncoding RNAs (ncRNAs), which are RNA molecules that function without being translated into proteins [11], [40]. It has been realized that many ncRNAs play important roles in various biological processes. As RNAs can interact with other RNAs and DNAs in a sequence-specific manner, they are especially useful in tasks that require highly specific nucleotide recognition [11]. Good examples are the miRNAs (microRNAs) that regulate gene expression by targeting mRNAs (messenger RNAs) [4], [20], and the siRNAs (small interfering RNAs) that take part in the RNAi (RNA interference) pathways for gene silencing [29], [30]. Recent developments show that ncRNAs are extensively involved in many gene regulatory mechanisms [14], [17].
The roles of ncRNAs known to this day are truly diverse. These include transcription and translation control, chromosome replication, RNA processing and modification, and protein degradation and translocation [40], to name just a few. It is now even claimed that ncRNAs dominate the genomic output of higher organisms such as mammals, and it has been suggested that the greater portion of their genomes (which does not encode proteins) is dedicated to the control and regulation of cell development [27]. As more and more evidence piles up, greater attention is being paid to ncRNAs, which were neglected for a long time. Researchers have begun to realize that the vast majority of the genome that was regarded as “junk,” mainly because it was not well understood, may indeed hold the key to some of the best-kept secrets of life, such as the mechanism of alternative splicing and the control of epigenetic variation [27]. The complete range and extent of the roles of ncRNAs are not yet obvious, but it is certain that a comprehensive understanding of cellular processes is not possible without understanding the functions of ncRNAs [47].
Learning the Structure for Structured Sparsity
Structured sparsity has recently emerged in statistics, machine learning and signal processing as a promising paradigm for learning in high-dimensional settings. All existing methods for learning under the assumption of structured sparsity rely on prior knowledge on how to weight (or how to penalize) individual subsets of variables during the subset selection process, which is not available in general. Inferring group weights from data is a key open research problem in structured sparsity. In this paper, we propose a Bayesian approach to the problem of group weight learning. We model the group weights as hyperparameters of heavy-tailed priors on groups of variables and derive an approximate inference scheme to infer these hyperparameters. We empirically show that we are able to recover the model hyperparameters when the data are generated from the model, and we demonstrate the utility of learning weights in synthetic and real denoising problems.
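To make the role of the group weights concrete: the paper learns them as hyperparameters of heavy-tailed priors via approximate inference, which is beyond a short sketch. The fragment below only illustrates the deterministic counterpart, a proximal-gradient solver for a group-lasso penalty in which given per-group weights scale each group's penalty. All function names, the synthetic data, and the choice of weights are illustrative, not taken from the paper.

```python
import numpy as np

def group_soft_threshold(v, weight, step):
    """Proximal operator of weight * ||v||_2: shrinks the whole group toward zero."""
    norm = np.linalg.norm(v)
    if norm <= weight * step:
        return np.zeros_like(v)
    return (1.0 - weight * step / norm) * v

def weighted_group_lasso(X, y, groups, weights, lam=0.1, n_iter=500):
    """Proximal gradient (ISTA) for 0.5*||y - X b||^2 + lam * sum_g w_g ||b_g||_2.

    `groups` is a list of index arrays partitioning the coefficients;
    `weights` holds one penalty weight w_g per group.
    """
    step = 1.0 / np.linalg.norm(X, 2) ** 2        # 1/L, L = Lipschitz const of the gradient
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = b - step * (X.T @ (X @ b - y))        # gradient step on the quadratic loss
        for g, w in zip(groups, weights):
            b[g] = group_soft_threshold(z[g], lam * w, step)
    return b

# Illustrative use: two groups of three variables, only the first group active.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
b_true = np.array([1.0, -1.0, 0.5, 0.0, 0.0, 0.0])
y = X @ b_true
groups = [np.arange(0, 3), np.arange(3, 6)]
# A large weight on the second group drives it to exactly zero.
b = weighted_group_lasso(X, y, groups, weights=[1.0, 10.0], lam=0.5)
```

A heavier weight acts exactly like the prior knowledge the paper wants to avoid assuming: it tells the selector which groups to suppress. The Bayesian scheme in the paper replaces the hand-set `weights` above with quantities inferred from the data.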
Sparse Modeling for Image and Vision Processing
In recent years, a large amount of multi-disciplinary research has been conducted on sparse models and their applications. In statistics and machine learning, the sparsity principle is used to perform model selection, that is, automatically selecting a simple model among a large collection of them. In signal processing, sparse coding consists of representing data with linear combinations of a few dictionary elements. Subsequently, the corresponding tools have been widely adopted by several scientific communities such as neuroscience, bioinformatics, or computer vision. The goal of this monograph is to offer a self-contained view of sparse modeling for visual recognition and image processing. More specifically, we focus on applications where the dictionary is learned and adapted to data, yielding a compact representation that has been successful in various contexts.
Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics and Vision
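The abstract's definition of sparse coding, representing data with linear combinations of a few dictionary elements, can be illustrated with a standard greedy algorithm, orthogonal matching pursuit. The monograph covers a range of encoders and learned dictionaries; the sketch below uses a fixed random dictionary and an exactly sparse synthetic signal purely for illustration.

```python
import numpy as np

def omp(D, x, k, tol=1e-10):
    """Orthogonal matching pursuit: greedily select up to k atoms of D.

    D has unit-norm columns (dictionary atoms); returns a sparse code
    such that D @ code approximates x.
    """
    residual = x.astype(float).copy()
    support = []
    code = np.zeros(D.shape[1])
    for _ in range(k):
        if np.linalg.norm(residual) < tol:
            break
        # atom most correlated with the current residual
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # re-fit all selected coefficients jointly by least squares
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        code[:] = 0.0
        code[support] = coef
        residual = x - D @ code
    return code

# Illustrative use: a signal built from two atoms of a random dictionary.
rng = np.random.default_rng(1)
D = rng.normal(size=(100, 200))
D /= np.linalg.norm(D, axis=0)            # unit-norm atoms
x = 1.5 * D[:, 3] - 0.7 * D[:, 17]        # exactly 2-sparse signal
code = omp(D, x, k=2)
```

Here the dictionary is fixed; in the applications the monograph focuses on, `D` itself is learned from data so that natural signals admit such few-atom codes.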
Regression trees for non-parametric modeling and time series prediction
We present a non-parametric approach to nonlinear modeling and prediction based on adaptive partitioning of the reconstructed phase space associated with the process. The partitioning method is implemented with a recursive tree-structured algorithm which successively refines the partition by binary splitting, where the splitting threshold is determined by a penalized maximum-entropy criterion. An analysis of the statistical behavior of the splitting rule suggests a criterion for determining the depth of the tree. The effectiveness of this method is illustrated through comparisons with classical approaches for nonlinear system analysis, on the basis of reconstruction error and computational complexity. An important relation between our tree-structured model for the process and the generalized nonlinear thresholded AR model (ART) is established. We illustrate our method on cases where classical linear prediction is known to be rather ineffective: chaotic signals (measured at the output of a Chua-type electronic circuit) and second-order ART signals.
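The pipeline this abstract describes, delay-embedding a scalar series into a reconstructed phase space and growing a tree-structured partition of it, can be sketched briefly. Two simplifications relative to the paper, flagged in the comments: splits below minimize within-cell variance rather than the penalized maximum-entropy criterion, and the tree depth is fixed rather than chosen by the paper's statistical stopping rule. All names are illustrative.

```python
import numpy as np

def embed(series, dim):
    """Delay-embed a scalar series: row t is (x_t, ..., x_{t+dim-1}),
    a reconstructed phase-space point, with target x_{t+dim}."""
    X = np.array([series[t:t + dim] for t in range(len(series) - dim)])
    return X, np.asarray(series[dim:])

def build_tree(X, y, depth, min_leaf=5):
    """Recursive binary partition of the reconstructed phase space.
    NOTE: splits minimize within-cell variance, a simple stand-in for the
    penalized maximum-entropy criterion of the paper, and `depth` is fixed
    rather than set by the paper's statistical stopping rule."""
    if depth == 0 or len(y) < 2 * min_leaf:
        return float(np.mean(y))                 # leaf: local constant predictor
    best = None
    for j in range(X.shape[1]):                  # candidate split coordinate
        for thr in np.quantile(X[:, j], [0.25, 0.5, 0.75]):
            left = X[:, j] <= thr
            if left.sum() < min_leaf or (~left).sum() < min_leaf:
                continue
            cost = y[left].var() * left.sum() + y[~left].var() * (~left).sum()
            if best is None or cost < best[0]:
                best = (cost, j, thr, left)
    if best is None:
        return float(np.mean(y))
    _, j, thr, left = best
    return (j, thr,
            build_tree(X[left], y[left], depth - 1, min_leaf),
            build_tree(X[~left], y[~left], depth - 1, min_leaf))

def predict(tree, point):
    """Descend to the cell containing the phase-space point."""
    while isinstance(tree, tuple):
        j, thr, lo, hi = tree
        tree = lo if point[j] <= thr else hi
    return tree
```

As a usage illustration (the paper uses Chua-circuit measurements and ART series; the logistic map below is merely a convenient chaotic signal), one would generate a series, embed it, fit the tree on a training prefix, and predict one step ahead on the rest, where linear prediction indeed fails badly.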