The equivalence of information-theoretic and likelihood-based methods for neural dimensionality reduction

Authors: Pillow, Jonathan W.; Sahani, Maneesh; Williamson, Ross S.
Publication date: 24/02/2015

Stimulus dimensionality-reduction methods in neuroscience seek to identify a low-dimensional space of stimulus features that affect a neuron's probability of spiking. One popular method, known as maximally informative dimensions (MID), uses an information-theoretic quantity known as "single-spike information" to identify this space. Here we examine MID from a model-based perspective. We show that MID is a maximum-likelihood estimator for the parameters of a linear-nonlinear-Poisson (LNP) model, and that the empirical single-spike information corresponds to the normalized log-likelihood under a Poisson model. This equivalence implies that MID does not necessarily find maximally informative stimulus dimensions when spiking is not well described as Poisson. We provide several examples to illustrate this shortcoming, and derive a lower bound on the information lost when spiking is Bernoulli in discrete time bins. To overcome this limitation, we introduce model-based dimensionality reduction methods for neurons with non-Poisson firing statistics, and show that they can be framed equivalently in likelihood-based or information-theoretic terms. Finally, we show how to overcome practical limitations on the number of stimulus dimensions that MID can estimate by constraining the form of the non-parametric nonlinearity in an LNP model. We illustrate these methods with simulations and data from primate visual cortex.
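The stated correspondence between empirical single-spike information and the normalized Poisson log-likelihood can be checked numerically for a fixed projection. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes the nonlinearity is a histogram over pre-binned projected stimuli, fits each bin's rate by maximum likelihood, and normalizes the log-likelihood gain over a constant-rate model by the spike count. All function and variable names are hypothetical.

```python
import math
import random


def bin_totals(bins, counts, n_bins):
    """Number of samples and number of spikes falling in each histogram bin."""
    samples = [0] * n_bins
    spikes = [0] * n_bins
    for b, r in zip(bins, counts):
        samples[b] += 1
        spikes[b] += r
    return samples, spikes


def single_spike_information(bins, counts, n_bins):
    """Plug-in single-spike information in nats per spike."""
    samples, spikes = bin_totals(bins, counts, n_bins)
    n_total, n_spikes = len(bins), sum(counts)
    info = 0.0
    for n_j, s_j in zip(samples, spikes):
        if s_j > 0:  # empty bins contribute 0 * log(0) = 0
            p_given_spike = s_j / n_spikes
            p_prior = n_j / n_total
            info += p_given_spike * math.log(p_given_spike / p_prior)
    return info


def normalized_poisson_loglik(bins, counts, n_bins):
    """(L_model - L_null) / n_spikes for a histogram nonlinearity fit by ML."""
    samples, spikes = bin_totals(bins, counts, n_bins)
    n_total, n_spikes = len(bins), sum(counts)
    # ML expected spike count per time step, per histogram bin and for the null
    mean_per_bin = [s / n if n > 0 else 0.0 for s, n in zip(spikes, samples)]
    mean_null = n_spikes / n_total

    def loglik(means):
        # Poisson log-likelihood; the -log(r!) term cancels in the difference
        total = 0.0
        for mu, r in zip(means, counts):
            if r > 0:
                total += r * math.log(mu)
            total -= mu
        return total

    l_model = loglik([mean_per_bin[b] for b in bins])
    l_null = loglik([mean_null] * n_total)
    return (l_model - l_null) / n_spikes


# Hypothetical data: 5 projection bins, spike counts that depend on the bin.
rng = random.Random(0)
demo_bins = [rng.randrange(5) for _ in range(2000)]
demo_counts = [sum(rng.random() < 0.1 * (b + 1) for _ in range(3)) for b in demo_bins]
print(single_spike_information(demo_bins, demo_counts, 5))
print(normalized_poisson_loglik(demo_bins, demo_counts, 5))
```

For these histogram estimators the two numbers agree up to floating-point error, whatever process generated the counts. The demo counts are in fact binomial rather than Poisson, which fits the abstract's point that the information estimate carries an implicit Poisson assumption.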

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

UCL Discovery

Practical targeted learning from large data sets by survey sampling

Authors: Bertail, Patrice; Chambaz, Antoine; Joly, Emilien
Publication date: 29/06/2016

We address the practical construction of asymptotic confidence intervals for smooth (i.e., path-wise differentiable), real-valued statistical parameters by targeted learning from independent and identically distributed data in contexts where sample size is so large that it poses computational challenges. We observe some summary measure of all data and select a sub-sample from the complete data set by Poisson rejective sampling with unequal inclusion probabilities based on the summary measures. Targeted learning is carried out from the easier-to-handle sub-sample. We derive a central limit theorem for the targeted minimum loss estimator (TMLE) which enables the construction of the confidence intervals. The inclusion probabilities can be optimized to reduce the asymptotic variance of the TMLE. We illustrate the procedure with two examples where the parameters of interest are variable importance measures of an exposure (binary or continuous) on an outcome. We also conduct a simulation study and comment on its results. Keywords: semiparametric inference; survey sampling; targeted minimum loss estimation (TMLE).
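The sub-sampling step can be sketched as follows, assuming the usual definition of rejective sampling: Poisson sampling (independent inclusion of each unit with its own probability) conditioned on a fixed sample size, implemented here by simple redrawing. The way the working probabilities are derived from the summary measures is a hypothetical choice for illustration. The abstract does not give the paper's construction or its variance-optimal probabilities, so they are not reproduced here.

```python
import random


def poisson_sample(probs, rng):
    """Poisson sampling: include unit i independently with probability probs[i]."""
    return [i for i, p in enumerate(probs) if rng.random() < p]


def rejective_sample(probs, size, rng, max_tries=100_000):
    """Redraw Poisson samples until one contains exactly `size` units."""
    for _ in range(max_tries):
        sample = poisson_sample(probs, rng)
        if len(sample) == size:
            return sample
    raise RuntimeError("no sample of the requested size was drawn")


# Hypothetical summary measures for 50 units; larger values are sampled more often.
summaries = [1 + (i % 5) for i in range(50)]
target_size = 10
scale = target_size / sum(summaries)
# Working probabilities proportional to the summary measure, capped at 1.
working_probs = [min(1.0, scale * s) for s in summaries]

rng = random.Random(0)
subsample = rejective_sample(working_probs, target_size, rng)
print(sorted(subsample))
```

After conditioning on the sample size, the realized inclusion probabilities generally differ somewhat from the working probabilities passed in, so any downstream weighting would need to account for that.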

arXiv.org e-Print Archive

Collection Of Biostatistics Research Archive

HAL-Polytechnique

Demystifying Fixed k-Nearest Neighbor Information Estimators

Authors: Gao, Weihao; Oh, Sewoong; Viswanath, Pramod
Publication date: 10/08/2016

Estimating mutual information from i.i.d. samples drawn from an unknown joint density function is a basic statistical problem of broad interest with multitudinous applications. The most popular estimator is one proposed by Kraskov, Stögbauer, and Grassberger (KSG) in 2004; it is nonparametric and based on the distances of each sample to its $k^{\rm th}$ nearest neighboring sample, where $k$ is a fixed small integer. Despite its widespread use (it is part of scientific software packages), theoretical properties of this estimator have been largely unexplored. In this paper we demonstrate that the estimator is consistent and also identify an upper bound on the rate of convergence of the bias as a function of the number of samples. We argue that the superior performance of the KSG estimator stems from a curious "correlation boosting" effect and build on this intuition to modify the KSG estimator in novel ways to construct a superior estimator. As a byproduct of our investigations, we obtain nearly tight rates of convergence of the $\ell_2$ error of the well-known fixed $k$ nearest neighbor estimator of differential entropy by Kozachenko and Leonenko. Comment: 55 pages, 8 figures
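For orientation, a fixed-k nearest neighbor entropy estimate of the Kozachenko-Leonenko type can be sketched in a few lines. This is the commonly quoted textbook form with Euclidean distances and brute-force neighbor search, and it assumes all sample points are distinct. It is not the modified estimator proposed in the paper, and the function names are hypothetical.

```python
import heapq
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def kl_entropy(points, k=3):
    """Fixed-k nearest neighbor estimate of differential entropy, in nats.

    points: list of equal-length tuples of floats (assumed distinct).
    """
    n = len(points)
    d = len(points[0])
    # Log-volume of the unit Euclidean ball in d dimensions.
    log_unit_ball = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)
    log_dist_sum = 0.0
    for i, p in enumerate(points):
        dists = (math.dist(p, q) for j, q in enumerate(points) if j != i)
        kth_dist = heapq.nsmallest(k, dists)[-1]  # distance to k-th neighbor
        log_dist_sum += math.log(kth_dist)
    return digamma_int(n) - digamma_int(k) + log_unit_ball + d * log_dist_sum / n


# Demo: a standard normal has entropy 0.5 * log(2 * pi * e), about 1.419 nats.
rng = random.Random(1)
sample = [(rng.gauss(0.0, 1.0),) for _ in range(800)]
print(kl_entropy(sample, k=3), 0.5 * math.log(2 * math.pi * math.e))
```

The brute-force search costs O(n^2) distance evaluations, which is acceptable for a small demonstration. Larger samples would typically use a spatial index such as a k-d tree.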

arXiv.org e-Print Archive

Crossref