On estimation of entropy and mutual information of continuous distributions
Mutual information is used in a procedure to estimate time-delays between recordings of electroencephalogram (EEG) signals originating from epileptic animals and patients. We present a simple and reliable histogram-based method to estimate mutual information. The accuracies of this mutual-information estimator and of a similar entropy estimator are discussed. The bias and variance calculations presented can also be applied to discrete-valued systems. Finally, we present simulation results, which are compared with earlier work.
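As a concrete companion to this abstract, here is a minimal sketch of a plug-in histogram estimator of mutual information in Python; the bin count, the variable names, and the Gaussian sanity check are illustrative assumptions, and the paper's bias and variance corrections are not implemented. In the time-delay application, such an estimate would be computed as a function of the lag between two EEG channels and maximized over the lag.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Plug-in MI estimate (in nats) from a 2-D histogram of the samples."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()                      # joint probability mass function
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x (column vector)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y (row vector)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

# Sanity check on correlated Gaussians, where the exact MI is -0.5*log(1 - rho**2).
rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)
y = 0.8 * x + 0.6 * rng.standard_normal(10_000)   # correlation rho = 0.8
print(mutual_information(x, y), -0.5 * np.log(1 - 0.8**2))
```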
Pointwise Relations between Information and Estimation in Gaussian Noise
Many of the classical and recent relations between information and estimation
in the presence of Gaussian noise can be viewed as identities between
expectations of random quantities. These include the I-MMSE relationship of Guo
et al.; the relative entropy and mismatched estimation relationship of
Verdú; the relationship between causal estimation and mutual information of
Duncan, and its extension to the presence of feedback by Kadota et al.; the
relationship between causal and non-causal estimation of Guo et al., and its
mismatched version of Weissman. We dispense with the expectations and explore
the nature of the pointwise relations between the respective random quantities.
The pointwise relations that we find are as succinctly stated as - and give
considerable insight into - the original expectation identities.
As an illustration of our results, consider Duncan's 1970 discovery that the
mutual information is equal to the causal MMSE in the AWGN channel, which can
equivalently be expressed saying that the difference between the input-output
information density and half the causal estimation error is a zero mean random
variable (regardless of the distribution of the channel input). We characterize
this random variable explicitly, rather than merely its expectation. Classical
estimation and information theoretic quantities emerge with new and surprising
roles. For example, the variance of this random variable turns out to be given
by the causal MMSE (which, in turn, is equal to the mutual information by
Duncan's result).

Comment: 31 pages, 2 figures, submitted to IEEE Transactions on Information Theory
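To make the statement above concrete, Duncan's identity and the pointwise refinement described in the abstract can be written as follows; this is a sketch in generic notation, assuming unit SNR, with i denoting the input-output information density:

```latex
% Duncan (1970): mutual information equals half the time-integrated causal
% MMSE in the AWGN channel (unit SNR assumed here for brevity).
\[
  I\bigl(X_0^T; Y_0^T\bigr)
    = \frac{1}{2}\,\mathbb{E}\!\int_0^T
      \bigl(X_t - \mathbb{E}[X_t \mid Y_0^t]\bigr)^2\,dt .
\]
% Pointwise form discussed in the abstract: the random variable below has
% zero mean for any input distribution, and its variance equals the causal
% MMSE (which, by Duncan's identity, equals the mutual information).
\[
  i\bigl(X_0^T; Y_0^T\bigr)
    - \frac{1}{2}\int_0^T \bigl(X_t - \mathbb{E}[X_t \mid Y_0^t]\bigr)^2\,dt .
\]
```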
Optimal model-free prediction from multivariate time series
Forecasting a time series from multivariate predictors constitutes a
challenging problem, especially using model-free approaches. Most techniques,
such as nearest-neighbor prediction, quickly suffer from the curse of
dimensionality and overfitting for more than a few predictors which has limited
their application mostly to the univariate case. Therefore, selection
strategies are needed that harness the available information as efficiently as
possible. Since often the right combination of predictors matters, ideally all
subsets of possible predictors should be tested for their predictive power, but
the exponentially growing number of combinations makes such an approach
computationally prohibitive. Here a prediction scheme that overcomes this
strong limitation is introduced utilizing a causal pre-selection step which
drastically reduces the number of possible predictors to the most predictive
set of causal drivers making a globally optimal search scheme tractable. The
information-theoretic optimality is derived and practical selection criteria
are discussed. As demonstrated for multivariate nonlinear stochastic delay
processes, the optimal scheme can even be less computationally expensive than
commonly used sub-optimal schemes like forward selection. The method suggests a
general framework to apply the optimal model-free approach to select variables
and subsequently fit a model to further improve a prediction or learn
statistical dependencies. The performance of this framework is illustrated on a
climatological index of the El Niño Southern Oscillation.

Comment: 14 pages, 9 figures
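A minimal sketch of the two-stage idea in Python: a pre-selection step keeps only a few strong lagged predictors, after which an exhaustive subset search becomes tractable. As stand-ins for the paper's machinery, plain lagged correlation replaces the causal-driver detection and in-sample least squares replaces nearest-neighbor prediction; all names, lags, and thresholds are illustrative assumptions.

```python
import itertools
import numpy as np

def preselect(target, candidates, max_lag=5, keep=5):
    """Stage 1: score every (variable, lag) pair, keep only the strongest.

    Lagged correlation stands in for the paper's causal pre-selection.
    """
    scores = []
    for j, lag in itertools.product(range(candidates.shape[1]),
                                    range(1, max_lag + 1)):
        r = np.corrcoef(candidates[:-lag, j], target[lag:])[0, 1]
        scores.append((abs(r), j, lag))
    return [(j, lag) for _, j, lag in sorted(scores, reverse=True)[:keep]]

def best_subset(target, candidates, selected, max_lag=5):
    """Stage 2: exhaustive search over subsets of the small pre-selected
    set, scored here by in-sample least-squares error (illustrative)."""
    y = target[max_lag:]
    best_err, best_sub = np.inf, ()
    for r in range(1, len(selected) + 1):
        for subset in itertools.combinations(selected, r):
            X = np.column_stack([candidates[max_lag - lag:-lag, j]
                                 for j, lag in subset])
            coef, *_ = np.linalg.lstsq(X, y, rcond=None)
            err = np.mean((y - X @ coef) ** 2)
            if err < best_err:
                best_err, best_sub = err, subset
    return best_sub, best_err

# Usage on an (N, k) array Z of candidate series and a length-N target y:
#   sel = preselect(y, Z); subset, err = best_subset(y, Z, sel)
```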
Maximum Entropy Vector Kernels for MIMO system identification
Recent contributions have framed linear system identification as a
nonparametric regularized inverse problem. Relying on $\ell_2$-type
regularization which accounts for the stability and smoothness of the impulse
response to be estimated, these approaches have been shown to be competitive
w.r.t. classical parametric methods. In this paper, adopting Maximum Entropy
arguments, we derive a new $\ell_2$ penalty stemming from a vector-valued
kernel; to do so we exploit the structure of the Hankel matrix, thus
simultaneously controlling the complexity (measured by the McMillan degree),
stability, and smoothness of the identified models. As a special case we recover
the nuclear norm penalty on the squared block Hankel matrix. In contrast with
previous literature on reweighted nuclear norm penalties, our kernel is
described by a small number of hyper-parameters, which are iteratively updated
through marginal likelihood maximization; constraining the structure of the
kernel acts as a (hyper)regularizer which helps control the effective
degrees of freedom of our estimator. To optimize the marginal likelihood we
adapt a Scaled Gradient Projection (SGP) algorithm which is proved to be
significantly computationally cheaper than other first- and second-order
off-the-shelf optimization methods. The paper also contains an extensive
comparison with many state-of-the-art methods on several Monte Carlo studies,
which confirms the effectiveness of our procedure.
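As a small illustration of the Hankel structure the penalty exploits, here is a sketch that builds a block Hankel matrix from MIMO impulse-response coefficients and evaluates a plain nuclear-norm penalty; the random system, shapes, and names are illustrative assumptions, not the paper's reweighted kernel-based implementation (whose special case applies the nuclear norm to the squared block Hankel matrix).

```python
import numpy as np

def block_hankel(g, rows):
    """Stack impulse-response blocks g[0], g[1], ... (each p x m) into a
    block Hankel matrix with the given number of block rows."""
    cols = len(g) - rows + 1
    return np.block([[g[i + j] for j in range(cols)] for i in range(rows)])

def nuclear_norm(M):
    """Sum of singular values: a convex surrogate for the rank of the Hankel
    matrix, and hence for the McMillan degree of the underlying system."""
    return np.linalg.svd(M, compute_uv=False).sum()

# A random stable 2-input/2-output state-space model of order 3: its block
# Hankel matrix has numerical rank 3, so the penalty stays small.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
A *= 0.9 / np.abs(np.linalg.eigvals(A)).max()   # rescale to spectral radius 0.9
B, C = rng.standard_normal((3, 2)), rng.standard_normal((2, 3))
g = [C @ np.linalg.matrix_power(A, k) @ B for k in range(20)]
H = block_hankel(g, rows=10)
print(H.shape, np.linalg.matrix_rank(H), nuclear_norm(H))
```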