
    The Mutual Information in Random Linear Estimation Beyond i.i.d. Matrices

    There has been considerable recent progress in proving the variational single-letter formula given by the heuristic replica method for various estimation problems. In particular, the replica formula for the mutual information in the case of noisy linear estimation with random i.i.d. matrices, a problem with applications ranging from compressed sensing to statistics, has been proven rigorously. In this contribution we go beyond the restrictive i.i.d. matrix assumption and discuss the formula proposed by Takeda, Uda, and Kabashima, and later by Tulino, Verdú, Caire, and Shamai, who used the replica method. Using the recently introduced adaptive interpolation method and random matrix theory, we prove this formula for a relevant large sub-class of rotationally invariant matrices. Comment: Presented at the 2018 IEEE International Symposium on Information Theory (ISIT).
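    As a minimal numerical aside (not the paper's replica formula, which covers general priors): for a Gaussian prior the mutual information of the linear model reduces to a log-determinant and depends on the measurement matrix only through its spectrum, which is the structural feature the rotationally invariant setting exploits. A hedged sketch of that special case:

```python
import numpy as np

# Minimal sketch, assuming a Gaussian prior (not the paper's setting of general
# priors): for x ~ N(0, I_n), z ~ N(0, Delta*I_m) and y = A x + z, the mutual
# information is exactly I(x; y) = (1/2) * log det(I_n + A^T A / Delta), so it
# depends on A only through its spectrum. The check below builds a rotationally
# invariant A = U diag(s) V^T and evaluates I both from the matrix and from the
# singular values alone.
rng = np.random.default_rng(0)
n, m, Delta = 400, 600, 0.5

# Random orthogonal factors and an arbitrary illustrative spectrum.
U, _ = np.linalg.qr(rng.standard_normal((m, m)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = rng.uniform(0.5, 1.5, size=n)
A = U[:, :n] @ np.diag(s) @ V.T

_, logdet = np.linalg.slogdet(np.eye(n) + A.T @ A / Delta)
I_matrix = 0.5 * logdet                                  # log-det from A itself
I_spectrum = 0.5 * np.sum(np.log1p(s**2 / Delta))        # same quantity from the spectrum

print(f"I/n from log-det:  {I_matrix / n:.6f} nats")
print(f"I/n from spectrum: {I_spectrum / n:.6f} nats")
```

    For non-Gaussian priors no such closed form exists; that is the regime addressed by the replica prediction and the adaptive interpolation proof.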

    A Rate-Splitting Approach to Fading Channels with Imperfect Channel-State Information

    As shown by Médard, the capacity of fading channels with imperfect channel-state information (CSI) can be lower-bounded by assuming a Gaussian channel input $X$ with power $P$ and by upper-bounding the conditional entropy $h(X|Y,\hat{H})$ by the entropy of a Gaussian random variable with variance equal to the linear minimum mean-square error in estimating $X$ from $(Y,\hat{H})$. We demonstrate that, using a rate-splitting approach, this lower bound can be sharpened: by expressing the Gaussian input $X$ as the sum of two independent Gaussian variables $X_1$ and $X_2$, by applying Médard's lower bound first to bound the mutual information between $X_1$ and $Y$ while treating $X_2$ as noise, and by applying it a second time to the mutual information between $X_2$ and $Y$ while assuming $X_1$ to be known, we obtain a capacity lower bound that is strictly larger than Médard's lower bound. We then generalize this approach to an arbitrary number $L$ of layers, where $X$ is expressed as the sum of $L$ independent Gaussian random variables of respective variances $P_\ell$, $\ell = 1,\dotsc,L$, summing up to $P$. Among all such rate-splitting bounds, we determine the supremum over power allocations $P_\ell$ and total number of layers $L$. This supremum is achieved as $L\to\infty$ and gives rise to an analytically expressible capacity lower bound. For Gaussian fading, this novel bound is shown to converge to the Gaussian-input mutual information as the signal-to-noise ratio (SNR) grows, provided that the variance of the channel estimation error $H-\hat{H}$ tends to zero as the SNR tends to infinity. Comment: 28 pages, 8 figures, submitted to IEEE Transactions on Information Theory. Revised according to first round of review.
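    A rough Monte Carlo sketch of the layering idea, under assumptions not taken from the paper: Rayleigh fading, a fixed estimation-error variance, equal power allocation across layers, and per-layer terms of the Médard form with earlier layers known and later layers treated as noise.

```python
import numpy as np

# Hedged sketch: Monte Carlo evaluation of Medard's lower bound and an L-layer
# rate-splitting bound for a scalar Rayleigh-fading channel Y = H*X + Z with an
# imperfect CSI estimate Hhat, H = Hhat + Htilde.  The per-layer terms assume
# the Medard-style form with earlier layers known and later layers treated as
# noise; equal power allocation across layers is an arbitrary choice, not the
# paper's optimized allocation.
rng = np.random.default_rng(1)
num_samples = 200_000
P = 10.0            # total transmit power (SNR, with unit noise variance)
sigma_z2 = 1.0      # additive-noise variance
sigma_e2 = 0.01     # variance of the channel estimation error Htilde

# CSCG channel estimate chosen so that E[|H|^2] = E[|Hhat|^2] + sigma_e2 = 1.
Hhat = np.sqrt((1 - sigma_e2) / 2) * (rng.standard_normal(num_samples)
                                      + 1j * rng.standard_normal(num_samples))
g = np.abs(Hhat) ** 2

def medard_bound():
    """Medard's lower bound (nats/channel use) for a Gaussian input of power P."""
    return np.mean(np.log1p(g * P / (sigma_z2 + sigma_e2 * P)))

def rate_splitting_bound(L):
    """L-layer bound with equal powers; layer l treats later layers as noise."""
    P_layer = P / L
    rate = 0.0
    for l in range(L):
        remaining = P_layer * (L - 1 - l)          # power of not-yet-decoded layers
        denom = sigma_z2 + sigma_e2 * P + g * remaining
        rate += np.mean(np.log1p(g * P_layer / denom))
    return rate

print(f"Medard bound (L=1):      {medard_bound():.4f} nats")
for L in (2, 4, 16, 64):
    print(f"rate splitting, L={L:>2}:    {rate_splitting_bound(L):.4f} nats")
```

    Increasing L in this sketch only illustrates the layering mechanism; the paper derives the supremum over power allocations and its closed-form limit as the number of layers grows.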

    Max-Sliced Mutual Information

    Quantifying the dependence between high-dimensional random variables is central to statistical learning and inference. Two classical methods are canonical correlation analysis (CCA), which identifies maximally correlated projected versions of the original variables, and Shannon's mutual information, which is a universal dependence measure that also captures high-order dependencies. However, CCA only accounts for linear dependence, which may be insufficient for certain applications, while mutual information is often infeasible to compute/estimate in high dimensions. This work proposes a middle ground in the form of a scalable information-theoretic generalization of CCA, termed max-sliced mutual information (mSMI). mSMI equals the maximal mutual information between low-dimensional projections of the high-dimensional variables, which reduces back to CCA in the Gaussian case. It enjoys the best of both worlds: capturing intricate dependencies in the data while being amenable to fast computation and scalable estimation from samples. We show that mSMI retains favorable structural properties of Shannon's mutual information, like variational forms and identification of independence. We then study statistical estimation of mSMI, propose an efficiently computable neural estimator, and couple it with formal non-asymptotic error bounds. We present experiments that demonstrate the utility of mSMI for several tasks, encompassing independence testing, multi-view representation learning, algorithmic fairness, and generative modeling. We observe that mSMI consistently outperforms competing methods with little-to-no computational overhead. Comment: Accepted at NeurIPS 202
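    A small illustration of the Gaussian reduction mentioned in the abstract: for jointly Gaussian $(X, Y)$, the mutual information between one-dimensional projections is $-\tfrac{1}{2}\log(1-\rho^2)$, where $\rho$ is their correlation, so maximizing over projections amounts to finding the top canonical correlation. The sketch below (plain NumPy, not the paper's neural estimator) estimates this quantity from samples.

```python
import numpy as np

# Illustration of the Gaussian special case: for jointly Gaussian (X, Y), the
# mutual information between projections theta^T X and phi^T Y equals
# -0.5 * log(1 - rho^2), where rho is their correlation.  Maximizing over unit
# vectors theta, phi therefore reduces to the top canonical correlation (CCA),
# which is what max-sliced mutual information recovers in this case.
rng = np.random.default_rng(0)
n, dx, dy = 20_000, 5, 4

# Jointly Gaussian data with a shared latent component.
z = rng.standard_normal((n, 1))
X = 0.8 * z + rng.standard_normal((n, dx))
Y = 0.6 * z + rng.standard_normal((n, dy))

def top_canonical_correlation(X, Y, eps=1e-9):
    """Largest canonical correlation, via SVD of the whitened cross-covariance."""
    n_samples = X.shape[0]
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    Cxx = Xc.T @ Xc / n_samples + eps * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n_samples + eps * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n_samples
    Wx = np.linalg.inv(np.linalg.cholesky(Cxx))   # whitening transform for X
    Wy = np.linalg.inv(np.linalg.cholesky(Cyy))   # whitening transform for Y
    rho = np.linalg.svd(Wx @ Cxy @ Wy.T, compute_uv=False)
    return rho[0]

rho1 = top_canonical_correlation(X, Y)
msmi_gaussian = -0.5 * np.log(1.0 - rho1**2)
print(f"top canonical correlation: {rho1:.3f}")
print(f"Gaussian mSMI estimate:    {msmi_gaussian:.3f} nats")
```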

    Estimating mutual information using B-spline functions – an improved similarity measure for analysing gene expression data

    BACKGROUND: The information-theoretic concept of mutual information provides a general framework for evaluating dependencies between variables. In the context of clustering genes with similar patterns of expression, it has been suggested as a general measure of similarity that extends the commonly used linear measures. Since mutual information is defined in terms of discrete variables, its application to continuous data requires the use of binning procedures, which can lead to significant numerical errors for datasets of small or moderate size. RESULTS: In this work, we propose a method for the numerical estimation of mutual information from continuous data. We investigate the characteristic properties arising from the application of our algorithm and show that our approach outperforms commonly used algorithms: the significance, a measure of the power to distinguish genuine dependencies from random correlation, is markedly increased. This concept is subsequently illustrated on two large-scale gene expression datasets, and the results are compared to those obtained using other similarity measures. C++ source code of our algorithm is available for non-commercial use from [email protected] upon request. CONCLUSION: The utilisation of mutual information as a similarity measure enables the detection of non-linear correlations in gene expression datasets and thereby extends the frequently applied linear correlation measures, which are often used on an ad-hoc basis without further justification.
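    A hedged sketch of the soft-binning idea behind the method: each sample spreads its weight over neighbouring bins according to a B-spline basis instead of falling into a single hard bin. For brevity the sketch uses order-2 (linear "hat") B-splines; it is an illustration, not the authors' C++ implementation.

```python
import numpy as np

# Hedged sketch of B-spline-weighted mutual information estimation: instead of
# assigning each sample to a single bin, each sample spreads its weight over
# neighbouring bins according to a B-spline basis.  Order-2 (linear "hat")
# B-splines are used here, i.e. linear interpolation between the two nearest
# bins; the method generalizes to higher spline orders.
def soft_bin_weights(x, num_bins):
    """Per-sample weights over bins using linear (order-2) B-spline basis."""
    x = (x - x.min()) / (x.max() - x.min() + 1e-12)   # rescale to [0, 1]
    pos = x * (num_bins - 1)                          # continuous bin position
    left = np.floor(pos).astype(int)
    frac = pos - left
    w = np.zeros((x.size, num_bins))
    w[np.arange(x.size), left] = 1.0 - frac
    w[np.arange(x.size), np.minimum(left + 1, num_bins - 1)] += frac
    return w

def mutual_information_bspline(x, y, num_bins=10):
    """MI estimate (nats) from soft joint and marginal histograms."""
    wx, wy = soft_bin_weights(x, num_bins), soft_bin_weights(y, num_bins)
    p_xy = wx.T @ wy / x.size                         # soft joint histogram
    p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
    mask = p_xy > 0
    return np.sum(p_xy[mask] * np.log(p_xy[mask] / np.outer(p_x, p_y)[mask]))

# Example: a noisy quadratic dependence that a linear correlation coefficient
# would largely miss, but which the MI estimate detects.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 5000)
y = x**2 + 0.1 * rng.standard_normal(5000)
print(f"estimated MI: {mutual_information_bspline(x, y):.3f} nats")
print(f"Pearson r:    {np.corrcoef(x, y)[0, 1]:.3f}")
```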

    Detecting and Estimating Signals in Noisy Cable Structures, II: Information Theoretical Analysis

    This is the second in a series of articles that seek to recast classical single-neuron biophysics in information-theoretical terms. Classical cable theory focuses on analyzing the voltage or current attenuation of a synaptic signal as it propagates from its dendritic input location to the spike initiation zone. On the other hand, we are interested in analyzing the amount of information lost about the signal in this process due to the presence of various noise sources distributed throughout the neuronal membrane. We use a stochastic version of the linear one-dimensional cable equation to derive closed-form expressions for the second-order moments of the fluctuations of the membrane potential associated with different membrane current noise sources: thermal noise, noise due to the random opening and closing of sodium and potassium channels, and noise due to the presence of “spontaneous” synaptic input. We consider two different scenarios. In the signal estimation paradigm, the time course of the membrane potential at a location on the cable is used to reconstruct the detailed time course of a random, band-limited current injected some distance away. Estimation performance is characterized in terms of the coding fraction and the mutual information. In the signal detection paradigm, the membrane potential is used to determine whether a distant synaptic event occurred within a given observation interval. In the light of our analytical results, we speculate that the length of weakly active apical dendrites might be limited by the information loss due to the accumulated noise between distal synaptic input sites and the soma and that the presence of dendritic nonlinearities probably serves to increase dendritic information transfer.
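    A simplified sketch of the signal-estimation quantities (information rate and coding fraction), assuming an idealized additive Gaussian channel with flat signal and noise spectra rather than the paper's cable-equation noise model:

```python
import numpy as np

# Simplified sketch of the "signal estimation" quantities in the abstract:
# a band-limited Gaussian current with flat spectral density S over bandwidth B
# is observed through additive Gaussian noise with flat spectral density N.
# (An idealization: the paper derives the relevant spectra from a stochastic
# cable equation with thermal, channel, and synaptic noise sources.)
B = 100.0     # signal bandwidth, Hz
S = 4e-3      # signal power spectral density (units^2 per Hz), illustrative value
N = 1e-3      # effective noise power spectral density at the recording site

# Mutual information rate for jointly Gaussian signal and noise, in bits/second:
#   I = integral over the band of log2(1 + S(f)/N(f)) df, here with flat spectra.
info_rate = B * np.log2(1.0 + S / N)

# Optimal (Wiener) reconstruction error and the coding fraction xi = 1 - rmse/std.
signal_power = B * S
mmse = B * S * N / (S + N)
coding_fraction = 1.0 - np.sqrt(mmse / signal_power)

print(f"information rate: {info_rate:.1f} bits/s")
print(f"coding fraction:  {coding_fraction:.3f}")
```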