
    On estimation of entropy and mutual information of continuous distributions

    Mutual information is used in a procedure to estimate time delays between recordings of electroencephalogram (EEG) signals originating from epileptic animals and patients. We present a simple and reliable histogram-based method to estimate mutual information. The accuracies of this mutual information estimator and of a similar entropy estimator are discussed. The bias and variance calculations presented can also be applied to discrete-valued systems. Finally, we present some simulation results, which are compared with earlier work.
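
    As an illustration, here is a minimal sketch of a histogram (plug-in) mutual information estimator applied to a toy time-delay problem of the kind described above. It does not implement the paper's bias and variance corrections; the function name, bin count, and synthetic signals are illustrative assumptions.

        import numpy as np

        def histogram_mutual_information(x, y, bins=16):
            """Plug-in estimate of I(X;Y) in nats from paired samples,
            using a 2-D histogram of the joint distribution."""
            joint, _, _ = np.histogram2d(x, y, bins=bins)
            p_xy = joint / joint.sum()                 # joint probabilities
            p_x = p_xy.sum(axis=1, keepdims=True)      # marginal of X
            p_y = p_xy.sum(axis=0, keepdims=True)      # marginal of Y
            nz = p_xy > 0                              # skip empty bins to avoid log(0)
            return float(np.sum(p_xy[nz] * np.log(p_xy[nz] / (p_x @ p_y)[nz])))

        # Toy example: y is a delayed, noisy copy of x; the delay that maximizes
        # the estimated mutual information recovers the true lag (5 samples).
        rng = np.random.default_rng(0)
        x = rng.standard_normal(10_000)
        y = np.roll(x, 5) + 0.5 * rng.standard_normal(10_000)
        mi = [histogram_mutual_information(x, np.roll(y, -d)) for d in range(11)]
        print("estimated delay:", int(np.argmax(mi)))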

    Forest Density Estimation

    We study graph estimation and density estimation in high dimensions, using a family of density estimators based on forest-structured undirected graphical models. For density estimation, we do not assume the true distribution corresponds to a forest; rather, we form kernel density estimates of the bivariate and univariate marginals, and apply Kruskal's algorithm to estimate the optimal forest on held-out data. We prove an oracle inequality on the excess risk of the resulting estimator relative to the risk of the best forest. For graph estimation, we consider the problem of estimating forests with restricted tree sizes. We prove that finding a maximum-weight spanning forest with restricted tree size is NP-hard, and develop an approximation algorithm for this problem. Viewing the tree size as a complexity parameter, we then select a forest using data splitting, and prove bounds on excess risk and structure selection consistency of the procedure. Experiments with simulated data and microarray data indicate that the methods are a practical alternative to Gaussian graphical models. Comment: extended version of an earlier paper titled "Tree density estimation".
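
    Under stated assumptions, the two steps above can be sketched as follows: edge weights are held-out mutual information estimates built from kernel density estimates of the bivariate and univariate marginals, and Kruskal's algorithm greedily adds the heaviest admissible edges. The early stop by edge count is only a crude illustrative proxy for the paper's restricted-tree-size procedure (which is NP-hard in general and handled there with an approximation algorithm); function names and bandwidth choices are assumptions.

        import numpy as np
        from scipy.stats import gaussian_kde

        def kde_mutual_information(x_fit, y_fit, x_held, y_held):
            """Held-out plug-in estimate of I(X;Y): fit kernel density estimates of the
            bivariate and univariate marginals on one split, then average the log
            density ratio over the held-out split."""
            p_xy = gaussian_kde(np.vstack([x_fit, y_fit]))
            p_x, p_y = gaussian_kde(x_fit), gaussian_kde(y_fit)
            ratio = p_xy(np.vstack([x_held, y_held])) / (p_x(x_held) * p_y(y_held))
            return float(np.mean(np.log(ratio)))

        def kruskal_max_forest(n_vars, edges, max_edges=None):
            """Kruskal's algorithm over (weight, i, j) edges: add the heaviest edges
            that do not create a cycle, yielding a maximum-weight spanning forest."""
            parent = list(range(n_vars))
            def find(i):
                while parent[i] != i:
                    parent[i] = parent[parent[i]]   # path halving
                    i = parent[i]
                return i
            forest = []
            for w, i, j in sorted(edges, reverse=True):
                ri, rj = find(i), find(j)
                if ri != rj and w > 0:
                    parent[ri] = rj
                    forest.append((i, j, w))
                    if max_edges is not None and len(forest) >= max_edges:
                        break
            return forest

        # Usage sketch (hypothetical data): Xf and Xh are fit and held-out splits
        # of an (n x d) data matrix, one column per variable.
        # edges = [(kde_mutual_information(Xf[:, i], Xf[:, j], Xh[:, i], Xh[:, j]), i, j)
        #          for i in range(d) for j in range(i + 1, d)]
        # forest = kruskal_max_forest(d, edges)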

    A theoretical model of neuronal population coding of stimuli with both continuous and discrete dimensions

    In a recent study, the initial rise of the mutual information between the firing rates of N neurons and a set of p discrete stimuli has been analytically evaluated, under the assumption that neurons fire independently of one another to each stimulus and that each conditional distribution of firing rates is Gaussian. Yet real stimuli or behavioural correlates are high-dimensional, with both discrete and continuously varying features. Moreover, the Gaussian approximation implies negative firing rates, which is biologically implausible. Here, we generalize the analysis to the case where the stimulus or behavioural correlate has both a discrete and a continuous dimension. In the case of large noise we evaluate the mutual information up to the quadratic approximation as a function of population size. Then we consider a more realistic distribution of firing rates, truncated at zero, and we prove that the resulting correction, with respect to the Gaussian firing rates, can be expressed simply as a renormalization of the noise parameter. Finally, we demonstrate the effect of averaging the distribution across the discrete dimension, evaluating the mutual information only with respect to the continuously varying correlate. Comment: 20 pages, 10 figures.
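
    The quantity being analysed can be illustrated with a small Monte Carlo estimate of the mutual information between a discrete stimulus and the firing rates of N neurons that respond independently with Gaussian conditional distributions. The tuning values, noise level, and population size below are illustrative assumptions, and the zero-truncated case discussed above (handled in the paper as a renormalization of the noise parameter) is not implemented.

        import numpy as np

        rng = np.random.default_rng(1)
        N, p, sigma, n_samples = 10, 4, 1.0, 100_000
        means = rng.uniform(1.0, 5.0, size=(p, N))   # mean rate of each neuron per stimulus

        def log_cond_density(rates, stim):
            """log P(rates | stimulus) for N neurons firing independently with
            Gaussian conditional distributions of standard deviation sigma."""
            sq = np.sum((rates - means[stim]) ** 2, axis=1)
            return -0.5 * sq / sigma**2 - 0.5 * N * np.log(2 * np.pi * sigma**2)

        # Sample (stimulus, rates) pairs from the model.
        stim = rng.integers(0, p, n_samples)
        rates = means[stim] + sigma * rng.standard_normal((n_samples, N))

        # I(S; R) = E[ log P(R|S) - log P(R) ], with P(R) = (1/p) * sum_s P(R|s).
        log_joint = log_cond_density(rates, stim)
        per_stim = np.stack([log_cond_density(rates, np.full(n_samples, s)) for s in range(p)])
        log_marg = np.logaddexp.reduce(per_stim, axis=0) - np.log(p)
        print(f"I(S; R) ~ {np.mean(log_joint - log_marg):.3f} nats "
              f"(upper bound log p = {np.log(p):.3f})")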

    Distribution of Mutual Information

    The mutual information of two random variables i and j with joint probabilities t_ij is commonly used in learning Bayesian nets as well as in many other fields. The chances t_ij are usually estimated by the empirical sampling frequency n_ij/n, leading to a point estimate I(n_ij/n) for the mutual information. To answer questions like "is I(n_ij/n) consistent with zero?" or "what is the probability that the true mutual information is much larger than the point estimate?" one has to go beyond the point estimate. In the Bayesian framework one can answer these questions by utilizing a (second-order) prior distribution p(t) comprising prior information about t. From the prior p(t) one can compute the posterior p(t|n), from which the distribution p(I|n) of the mutual information can be calculated. We derive reliable and quickly computable approximations for p(I|n). We concentrate on the mean, variance, skewness, and kurtosis, and on non-informative priors. For the mean we also give an exact expression. Numerical issues and the range of validity are discussed. Comment: 8 pages.
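
    A brute-force counterpart to the analytic approximations described above: sample joint probability tables t from a Dirichlet posterior (here under a uniform prior, i.e. all pseudo-counts equal to one, which is one possible non-informative choice) and evaluate I(t) on each draw to approximate p(I|n). The counts and prior below are illustrative assumptions; the paper's closed-form expressions for the mean and higher moments are not reproduced.

        import numpy as np

        rng = np.random.default_rng(2)
        n_ij = np.array([[12.0, 3.0], [4.0, 11.0]])   # illustrative contingency counts
        alpha = (n_ij + 1.0).ravel()                   # Dirichlet posterior parameters (uniform prior)

        def mutual_information(t):
            """I(t) in nats for a joint probability table t."""
            ti = t.sum(axis=1, keepdims=True)
            tj = t.sum(axis=0, keepdims=True)
            nz = t > 0
            return float(np.sum(t[nz] * np.log(t[nz] / (ti @ tj)[nz])))

        # Monte Carlo approximation of p(I | n): one I(t) value per posterior draw of t.
        draws = rng.dirichlet(alpha, size=50_000).reshape(-1, *n_ij.shape)
        samples = np.array([mutual_information(t) for t in draws])

        point = mutual_information(n_ij / n_ij.sum())  # plug-in point estimate I(n_ij/n)
        print(f"point estimate {point:.3f}, posterior mean {samples.mean():.3f}, "
              f"posterior sd {samples.std():.3f}, P(I < 0.01 | n) = {(samples < 0.01).mean():.3f}")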