17 research outputs found

    Distribution of Mutual Information

    The mutual information of two random variables i and j with joint probabilities t_ij is commonly used in learning Bayesian nets as well as in many other fields. The chances t_ij are usually estimated by the empirical sampling frequency n_ij/n, leading to a point estimate I(n_ij/n) for the mutual information. To answer questions like "is I(n_ij/n) consistent with zero?" or "what is the probability that the true mutual information is much larger than the point estimate?" one has to go beyond the point estimate. In the Bayesian framework one can answer these questions by utilizing a (second-order) prior distribution p(t) comprising prior information about t. From the prior p(t) one can compute the posterior p(t|n), from which the distribution p(I|n) of the mutual information can be calculated. We derive reliable and quickly computable approximations for p(I|n). We concentrate on the mean, variance, skewness, and kurtosis, and on non-informative priors. For the mean we also give an exact expression. Numerical issues and the range of validity are discussed.
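    The point estimate I(n_ij/n) that the abstract takes as its starting point is straightforward to compute. A minimal plug-in estimator from a contingency table of counts (function name illustrative, not from the paper):

    ```python
    import numpy as np

    def plugin_mutual_information(counts):
        """Plug-in (point) estimate I(n_ij/n) of the mutual information
        from a contingency table of counts n_ij."""
        counts = np.asarray(counts, dtype=float)
        n = counts.sum()
        p = counts / n                      # empirical joint frequencies n_ij / n
        pi = p.sum(axis=1, keepdims=True)   # marginal over i
        pj = p.sum(axis=0, keepdims=True)   # marginal over j
        nz = p > 0                          # skip zero cells: 0 * log 0 = 0
        return float((p[nz] * np.log(p[nz] / (pi @ pj)[nz])).sum())

    # Independent variables give MI of zero; a diagonal table gives log 2.
    print(plugin_mutual_information([[10, 10], [10, 10]]))  # 0.0
    ```

    As the abstract argues, this point estimate alone says nothing about its own uncertainty; the paper's contribution is the posterior distribution p(I|n) around it.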

    Model Selection for Gaussian Mixture Models

    This paper is concerned with an important issue in finite mixture modelling, the selection of the number of mixing components. We propose a new penalized likelihood method for model selection of finite multivariate Gaussian mixture models. The proposed method is shown to be statistically consistent in determining the number of components. A modified EM algorithm is developed to simultaneously select the number of components and to estimate the mixing weights, i.e. the mixing probabilities, and the unknown parameters of the Gaussian distributions. Simulations and a real-data analysis are presented to illustrate the performance of the proposed method.
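    The paper's specific penalty and modified EM algorithm are not reproduced here; as a rough stand-in illustrating the penalized-likelihood idea, the sketch below fits one-dimensional Gaussian mixtures by plain EM and selects the component count with BIC, a related (but different) penalized-likelihood criterion. All names are illustrative.

    ```python
    import numpy as np

    def gmm_log_likelihood(x, w, mu, var):
        """Per-point log-likelihood of a 1-D Gaussian mixture."""
        d = x[:, None] - mu[None, :]
        log_pdf = -0.5 * (d**2 / var + np.log(2 * np.pi * var))
        return np.logaddexp.reduce(np.log(w) + log_pdf, axis=1)

    def fit_gmm_1d(x, k, iters=100, seed=0):
        """Plain EM for a k-component 1-D mixture; returns total log-likelihood."""
        rng = np.random.default_rng(seed)
        w = np.full(k, 1.0 / k)
        mu = rng.choice(x, size=k, replace=False)
        var = np.full(k, x.var())
        for _ in range(iters):
            # E-step: responsibilities r_nk = P(component k | x_n)
            d = x[:, None] - mu[None, :]
            log_joint = np.log(w) - 0.5 * (d**2 / var + np.log(2 * np.pi * var))
            r = np.exp(log_joint - np.logaddexp.reduce(log_joint, axis=1, keepdims=True))
            # M-step: re-estimate weights, means, and variances
            nk = r.sum(axis=0)
            w = nk / len(x)
            mu = (r * x[:, None]).sum(axis=0) / nk
            var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-3  # variance floor
        return gmm_log_likelihood(x, w, mu, var).sum()

    def select_k(x, kmax=4):
        """Choose the component count minimizing BIC = -2 logL + p log n."""
        n = len(x)
        bics = {}
        for k in range(1, kmax + 1):
            p = 3 * k - 1  # (k-1) weights + k means + k variances
            bics[k] = -2 * fit_gmm_1d(x, k) + p * np.log(n)
        return min(bics, key=bics.get)

    rng = np.random.default_rng(1)
    x = np.concatenate([rng.normal(-4, 1, 300), rng.normal(4, 1, 300)])
    print(select_k(x))
    ```

    On well-separated data such as this, BIC typically recovers the true two components; the paper's proposed penalty plays the role of the `p * log(n)` term, with consistency guarantees BIC lacks for mixtures.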

    Intentional Motion On-line Learning and Prediction

    Predicting the motion of humans, animals, and other objects that move according to internal plans is a challenging problem. Most existing approaches operate in two stages: (a) learning typical motion patterns by observing an environment, and (b) predicting future motion on the basis of the learned patterns. In existing techniques, learning is performed off-line, so it is impossible to refine the existing knowledge on the basis of new observations obtained during the prediction phase. We propose an approach that uses Hidden Markov Models to represent motion patterns. It differs from similar approaches in that it is able to learn and predict concurrently, thanks to a novel approximate learning approach, based on the Growing Neural Gas algorithm, which estimates both HMM parameters and structure. The resulting structure has the property of being a planar graph, thus enabling exact inference in time linear in the number of states in the model. Our experiments demonstrate that the technique works in real time and is able to produce sound long-term predictions of people's motion.
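    The Growing Neural Gas learner itself is the paper's novelty and is not sketched here, but the prediction side is standard HMM machinery: filter the observation sequence to a belief over states, then push that belief forward through the transition matrix. A minimal sketch with an illustrative two-state model (all numbers and names are assumptions, not from the paper):

    ```python
    import numpy as np

    def forward_filter(A, B, pi, obs):
        """HMM forward filtering: returns P(state_t | o_1..o_t)."""
        alpha = pi * B[:, obs[0]]
        alpha /= alpha.sum()
        for o in obs[1:]:
            alpha = (alpha @ A) * B[:, o]
            alpha /= alpha.sum()
        return alpha

    def predict_belief(A, belief, steps):
        """Push the filtered belief `steps` transitions ahead (no new observations)."""
        for _ in range(steps):
            belief = belief @ A
        return belief

    # Toy 2-state model: state 0 tends to emit symbol 0, state 1 symbol 1.
    A = np.array([[0.9, 0.1], [0.2, 0.8]])   # transition matrix
    B = np.array([[0.9, 0.1], [0.1, 0.9]])   # emission matrix
    pi = np.array([0.5, 0.5])                # initial state distribution

    belief = forward_filter(A, B, pi, [0, 0, 0])
    print(belief)                            # concentrated on state 0
    print(predict_belief(A, belief, 200))    # approaches the stationary distribution
    ```

    The planar-graph structure the paper learns is what keeps this filtering step linear in the number of states rather than quadratic.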

    Learning from Partial Labels with Minimum Entropy

    This paper introduces the minimum entropy regularizer for learning from partial labels. This learning problem encompasses the semi-supervised setting, where a decision rule is to be learned from labeled and unlabeled examples. The minimum entropy regularizer applies to diagnosis models, i.e. models of the posterior probabilities of classes. It is shown to include other approaches to the semi-supervised problem as particular or limiting cases. A series of experiments illustrates that the proposed criterion provides solutions that take advantage of unlabeled examples when the latter convey information. Even when the data are sampled from the distribution class spanned by a generative model, the proposed approach improves on the estimated generative model when the number of features is of the order of the sample size. The results are clearly in favor of minimum entropy when the generative model is slightly misspecified. Finally, the robustness of the learning scheme is demonstrated: in situations where unlabeled examples do not convey information, minimum entropy returns a solution discarding unlabeled examples and performs as well as supervised learning. Keywords: discriminant learning, semi-supervised learning, minimum entropy.
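    A minimal sketch of the shape of such a criterion (not the paper's exact formulation): labeled cross-entropy plus a weight times the mean Shannon entropy of the model's predictions on unlabeled points, so that confident predictions on unlabeled data are rewarded. Names (`minent_loss`, `lam`) are illustrative.

    ```python
    import numpy as np

    def softmax(z):
        z = z - z.max(axis=1, keepdims=True)  # shift for numerical stability
        e = np.exp(z)
        return e / e.sum(axis=1, keepdims=True)

    def minent_loss(logits_lab, y, logits_unl, lam=0.5):
        """Labeled cross-entropy plus lam * mean prediction entropy on
        unlabeled examples (a minimum-entropy-style regularized loss)."""
        p_lab = softmax(logits_lab)
        ce = -np.log(p_lab[np.arange(len(y)), y]).mean()
        p_unl = softmax(logits_unl)
        ent = -(p_unl * np.log(p_unl + 1e-12)).sum(axis=1).mean()
        return ce + lam * ent

    # Uniform (maximally uncertain) unlabeled predictions are penalized most:
    confident = np.array([[4.0, -4.0]])
    uniform = np.zeros((1, 2))
    print(minent_loss(confident, np.array([0]), uniform))
    print(minent_loss(confident, np.array([0]), confident))
    ```

    Minimizing the entropy term pushes the decision boundary away from dense regions of unlabeled data, which is why, as the abstract notes, the criterion helps only when the unlabeled examples actually convey information about the class structure.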

    Learning Model Structure from Data : an Application to On-Line Handwriting

    We present a learning strategy for Hidden Markov Models that may be used to cluster handwriting sequences or to learn a character model by identifying its main writing styles. Our approach aims at learning both the structure and the parameters of a Hidden Markov Model (HMM) from the data. A byproduct of this learning strategy is the ability to cluster signals and identify allographs. We provide experimental results on artificial data that demonstrate the possibility of learning HMM parameters and topology from data. For a given topology, our approach outperforms the standard Maximum Likelihood learning scheme in some cases that we identify. We also apply our unsupervised learning scheme to on-line handwritten signals for allograph clustering as well as for learning HMM models for handwritten digit recognition.

    Model Predictive Control and Fault Detection and Diagnostics of a Building Heating, Ventilation, and Air Conditioning System

    The paper presents Model Predictive Control (MPC) and Fault Detection and Diagnostics (FDD) technologies, their on-line implementation, and results from several demonstrations conducted for a large HVAC system. The two technologies are executed at the supervisory level in a hierarchical control architecture as extensions of a baseline Building Management System (BMS). The MPC algorithm generates optimal set points for the HVAC actuator loops which minimize energy consumption while meeting equipment operational constraints and occupant comfort constraints. The MPC algorithm is implemented using a new tool, the Berkeley Library for Optimization Modeling (BLOM), which automatically generates an efficient optimization formulation directly from a simulation model. The FDD algorithm detects and classifies potential faults of the HVAC actuators in real time, based on data from multiple sensors. The performance and limitations of the FDD and MPC algorithms are illustrated and discussed based on measurement data recorded from multiple tests.

    On Separation Between Learning and Control in Partially Observed Markov Decision Processes

    Cyber-physical systems (CPS) encounter a large volume of data which is added to the system gradually in real time, not all at once in advance. As the volume of data increases, the domain of the control strategies also grows, and it becomes challenging to search for an optimal strategy. Even if an optimal control strategy is found, implementing such strategies over growing domains is burdensome. To derive an optimal control strategy in CPS, we typically assume an ideal model of the system. Such model-based control approaches cannot effectively facilitate optimal solutions with performance guarantees due to the discrepancy between the model and the actual CPS. Alternatively, traditional supervised learning approaches cannot always facilitate robust solutions using data derived offline. Similarly, applying reinforcement learning approaches directly to the actual CPS might have significant implications for the safety and robust operation of the system. The goal of this chapter is to provide a theoretical framework that aims at separating the control and learning tasks, which allows us to combine offline model-based control with online learning approaches and thus circumvent the challenges in deriving optimal control strategies for CPS.

    Audio/visual mapping with cross-modal hidden Markov models


    Structure Learning in Conditional Probability Models via an Entropic Prior and Parameter Extinction

    We introduce an entropic prior for multinomial parameter estimation problems and solve for its maximum…