The equivalence of information-theoretic and likelihood-based methods for neural dimensionality reduction
Stimulus dimensionality-reduction methods in neuroscience seek to identify a
low-dimensional space of stimulus features that affect a neuron's probability
of spiking. One popular method, known as maximally informative dimensions
(MID), uses an information-theoretic quantity known as "single-spike
information" to identify this space. Here we examine MID from a model-based
perspective. We show that MID is a maximum-likelihood estimator for the
parameters of a linear-nonlinear-Poisson (LNP) model, and that the empirical
single-spike information corresponds to the normalized log-likelihood under a
Poisson model. This equivalence implies that MID does not necessarily find
maximally informative stimulus dimensions when spiking is not well described as
Poisson. We provide several examples to illustrate this shortcoming, and derive
a lower bound on the information lost when spiking is Bernoulli in discrete
time bins. To overcome this limitation, we introduce model-based dimensionality
reduction methods for neurons with non-Poisson firing statistics, and show that
they can be framed equivalently in likelihood-based or information-theoretic
terms. Finally, we show how to overcome practical limitations on the number of
stimulus dimensions that MID can estimate by constraining the form of the
non-parametric nonlinearity in an LNP model. We illustrate these methods with
simulations and data from primate visual cortex.
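The claimed equivalence can be illustrated with a small simulation: under an LNP model with Poisson spiking, the filter that maximizes the Poisson log-likelihood is the maximally informative dimension. A minimal sketch, with an illustrative exponential nonlinearity and simulated data (not the paper's code or data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated 1D LNP model: rate = f(k . x) with an exponential nonlinearity.
T, D = 5000, 10          # time bins, stimulus dimensions
dt = 0.01                # bin width (s)
k_true = rng.normal(size=D)
k_true /= np.linalg.norm(k_true)
X = rng.normal(size=(T, D))               # white-noise stimulus
rate = np.exp(X @ k_true)                 # conditional intensity (spikes/s)
spikes = rng.poisson(rate * dt)           # Poisson spike counts per bin

def poisson_loglik(k):
    """Log-likelihood of the spike counts under an LNP model with filter k,
    up to a constant (the log-factorial term) independent of k."""
    lam = np.exp(X @ k) * dt
    return np.sum(spikes * np.log(lam) - lam)

# Per the paper's result, maximizing this likelihood over k recovers the
# maximally informative dimension when spiking is Poisson; here the true
# filter should beat the homogeneous (zero-filter) model.
print(poisson_loglik(k_true) - poisson_loglik(np.zeros(D)))
```

Maximizing `poisson_loglik` over `k` (e.g. by gradient ascent) would then play the role of the MID search, with the normalized log-likelihood standing in for the empirical single-spike information.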
Methods for Estimation of Intrinsic Dimensionality
Dimension reduction is an important tool used to describe the structure of complex data (explicitly or implicitly) through a small but sufficient number of variables, and
thereby make data analysis more efficient. It is also useful for visualization purposes. Dimension reduction helps statisticians to overcome the "curse of dimensionality". However, most dimension reduction techniques require the intrinsic dimension of the low-dimensional subspace to be fixed in advance.
The availability of reliable intrinsic dimension (ID) estimation techniques is of major importance. The main goal of this thesis is to develop algorithms for determining the intrinsic dimensions of recorded data sets in a nonlinear context. Whilst this is a well-researched topic for linear subspaces, based mainly on principal components analysis, relatively little attention has been paid to ways of estimating this number for non-linear variable interrelationships. The proposed algorithms here are based on existing concepts that can be categorized into local methods, relying on randomly selected subsets of a recorded variable set, and global methods, utilizing the entire data set.
This thesis provides an overview of ID estimation techniques, with special consideration given to recent developments in non-linear techniques, such as manifold charting and fractal-based methods. Although these techniques nominally exist, their practical implementation is far from straightforward.
The intrinsic dimension is estimated via Brand's algorithm by examining the growth of a point process that counts the number of points in hyper-spheres. The estimation requires a starting point for each hyper-sphere. In this thesis we provide settings for selecting starting points which work well for most data sets. Additionally, we propose approaches for estimating dimensionality via Brand's algorithm, the Dip method and the Regression method.
Other approaches are proposed for estimating the intrinsic dimension by fractal dimension estimation methods, which exploit the intrinsic geometry of a data set. The most popular concept from this family of methods is the correlation dimension, which requires the estimation of the correlation integral for a ball of radius tending to 0. In this thesis we propose new approaches to approximate the correlation integral in this limit. The new approaches are the Intercept method, the Slope method and the Polynomial method.
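The correlation-dimension idea can be sketched numerically: estimate the correlation integral C(r), the fraction of point pairs closer than r, over a range of small radii, and read the dimension off as the slope of log C(r) against log r. This plain slope fit is a simple stand-in for the Intercept, Slope and Polynomial refinements proposed in the thesis; the data set and radii below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Sample points from a 2D manifold (a planar patch) embedded in 3D,
# so the intrinsic dimension is 2 while the ambient dimension is 3.
n = 1000
u, v = rng.random(n), rng.random(n)
pts = np.column_stack([u, v, u + v])

# Correlation integral C(r): fraction of point pairs within distance r.
diffs = pts[:, None, :] - pts[None, :, :]
d = np.linalg.norm(diffs, axis=-1)
pairs = d[np.triu_indices(n, k=1)]

radii = np.logspace(-1.5, -0.5, 10)
C = np.array([np.mean(pairs < r) for r in radii])

# Correlation dimension: slope of log C(r) vs log r as r -> 0, here
# approximated by a least-squares fit over a small range of radii.
slope, _ = np.polyfit(np.log(radii), np.log(C), 1)
print(slope)   # close to the intrinsic dimension, 2
```

In practice the choice of radius range matters, since the limit r -> 0 cannot be reached with finite data, which is exactly the difficulty the thesis's new approximations target.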
In addition we propose a new approach, a localized global method, which could be defined as a local version of global ID methods. The objective of the localized global approach is to improve on algorithms based on local ID methods, which could significantly reduce their negative bias.
Experimental results on real-world and simulated data are used to demonstrate the algorithms and compare them to other methodologies. A simulation study which verifies the effectiveness of the proposed methods is also provided. Finally, these algorithms are contrasted using a recorded data set from an industrial melter process.
OFTER: An Online Pipeline for Time Series Forecasting
We introduce OFTER, a time series forecasting pipeline tailored for mid-sized
multivariate time series. OFTER utilizes the non-parametric models of k-nearest
neighbors and Generalized Regression Neural Networks, integrated with a
dimensionality reduction component. To circumvent the curse of dimensionality,
we employ a weighted norm based on a modified version of the maximal
correlation coefficient. The pipeline we introduce is specifically designed for
online tasks, has an interpretable output, and is able to outperform several
state-of-the-art baselines. The computational efficiency of the algorithm, its
online nature, and its ability to operate in low signal-to-noise regimes,
render OFTER an ideal approach for financial multivariate time series problems,
such as daily equity forecasting. Our work demonstrates that while deep
learning models hold significant promise for time series forecasting,
traditional methods carefully integrating mainstream tools remain very
competitive alternatives with the added benefits of scalability and
interpretability.
Comment: 26 pages, 12 figures
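The forecasting core described above can be sketched as k-nearest-neighbour regression under a feature-weighted norm. Here the weights come from plain correlations with the target, as a simple stand-in for the paper's modified maximal correlation coefficient, and the data are synthetic; this is a sketch of the idea, not the OFTER pipeline itself:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy multivariate series: the target depends mostly on feature 0.
T = 500
X = rng.normal(size=(T, 5))
y = 0.9 * X[:, 0] + 0.1 * rng.normal(size=T)

# Weight each feature by |correlation with the target|, a simple
# stand-in for the modified maximal correlation coefficient.
w = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])

def knn_forecast(x_new, k=10):
    """Predict y for x_new as the mean of its k nearest neighbours
    under the weighted Euclidean norm."""
    dist = np.sqrt(((X - x_new) ** 2 * w).sum(axis=1))
    nearest = np.argsort(dist)[:k]
    return y[nearest].mean()

x_new = rng.normal(size=5)
print(knn_forecast(x_new))
```

Down-weighting weakly related features in the distance is what lets a neighbourhood method of this kind cope with moderate dimensionality; an online version would simply append new (x, y) pairs and refresh the weights.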
Sensitivity analysis of circadian entrainment in the space of phase response curves
Sensitivity analysis is a classical and fundamental tool to evaluate the role
of a given parameter in a given system characteristic. Because the phase
response curve is a fundamental input--output characteristic of oscillators, we
developed a sensitivity analysis for oscillator models in the space of phase
response curves. The proposed tool can be applied to high-dimensional
oscillator models without facing the curse of dimensionality obstacle
associated with numerical exploration of the parameter space. Application of
this tool to a state-of-the-art model of circadian rhythms suggests that it can
be useful and instrumental to biological investigations.
Comment: 22 pages, 8 figures. Correction of a mistake in Definition 2.1. arXiv admin note: text overlap with arXiv:1206.414
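The idea of differentiating a phase response curve with respect to a model parameter can be sketched on a toy oscillator whose PRC has a closed form: the Stuart-Landau normal form with radial isochrons, an illustrative stand-in rather than the circadian model studied in the paper. The sensitivity is taken by central finite differences over a grid of phases, i.e. in the space of PRCs:

```python
import numpy as np

# The Stuart-Landau oscillator dz/dt = (mu + i*omega) z - |z|^2 z has a
# circular limit cycle of radius sqrt(mu) with radial isochrons, so the
# asymptotic phase shift caused by a small horizontal kick eps applied
# at phase phi has the closed form below.
def prc(phi, mu, eps=0.05):
    r = np.sqrt(mu)
    z = r * np.exp(1j * phi)
    # angle of (z + eps) relative to z, wrapped to (-pi, pi]
    return np.angle((z + eps) * np.conj(z))

# Sensitivity of the PRC to the parameter mu, by central finite
# differences, evaluated on a grid of phases: a curve describing how
# the whole PRC deforms as mu varies.
phis = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
dmu = 1e-4
sens = (prc(phis, 1.0 + dmu) - prc(phis, 1.0 - dmu)) / (2.0 * dmu)
print(sens.shape)
```

For high-dimensional models with no closed-form PRC, `prc` would instead be evaluated by simulating the perturbed oscillator to convergence, but the finite-difference sensitivity in PRC space is computed the same way.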
Mini-Workshop: Semiparametric Modelling of Multivariate Economic Time Series With Changing Dynamics
Modelling multivariate time series of possibly high dimension calls for appropriate dimension-reduction, e.g. by some factor modelling, additive modelling, or some simplified parametric structure for the dynamics (i.e. the serial dependence) of the time series. This workshop aimed to bring together experts in this field in order to discuss recent methodology for multivariate time series dynamics which are changing over time: by an abrupt switch between two (or more) different regimes, or rather smoothly evolving over time. The emphasis has been on mathematical methods for semiparametric modelling and estimation, where "semiparametric" is to be understood in a rather broad sense: parametric models where the parameters are themselves nonparametric functions (of time), regime-switching nonparametric models with a parametric specification of the transition mechanism, and the like. An ultimate goal of these models, when applied to economic and financial time series, is prediction. Another emphasis has been on comparing Bayesian with frequentist approaches, and on covering both theoretical aspects of estimation, such as consistency and efficiency, and computational aspects.