Online sufficient dimensionality reduction for sequential high-dimensional time-series
In this thesis, we present the Online Sufficient Dimensionality Reduction (OSDR) algorithm for real-time analysis of high-dimensional sequential data. (M.S. thesis)
Conditional Density Estimation with Dimensionality Reduction via Squared-Loss Conditional Entropy Minimization
Regression aims to estimate the conditional mean of the output given the
input. However, regression is not informative enough when the conditional
density is multimodal, heteroscedastic, or asymmetric. In such cases, estimating the
conditional density itself is preferable, but conditional density estimation
(CDE) is challenging in high-dimensional space. A naive approach to coping with
high-dimensionality is to first perform dimensionality reduction (DR) and then
execute CDE. However, such a two-step process does not perform well in practice
because the error incurred in the first DR step can be magnified in the second
CDE step. In this paper, we propose a novel single-shot procedure that performs
CDE and DR simultaneously in an integrated way. Our key idea is to formulate DR
as the problem of minimizing a squared-loss variant of conditional entropy, and
this is solved via CDE. Thus, an additional CDE step is not needed after DR. We
demonstrate the usefulness of the proposed method through extensive experiments
on various datasets, including humanoid robot transition and computer art.
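To make the single-shot idea concrete, here is a minimal sketch, not the paper's estimator: a crude Nadaraya-Watson plug-in scores each candidate projection W with a squared-loss conditional-entropy proxy, and random search over orthonormal projections stands in for a proper optimizer; all names and bandwidths are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sce_proxy(W, X, y, sig_z=0.5, sig_y=0.5):
    # Toy proxy for the squared-loss conditional entropy,
    # SCE(Y|Z) ~ -(1/2) E[ p(y|z)^2 ], with p(y|z) estimated by an
    # unnormalised Nadaraya-Watson conditional density on Z = X W.
    Z = X @ W
    Dz = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    Kz = np.exp(-Dz / (2 * sig_z**2))
    Ky = np.exp(-((y[:, None] - y[None, :]) ** 2) / (2 * sig_y**2))
    p_hat = (Kz * Ky).sum(1) / (Kz.sum(1) + 1e-12)  # p(y_i | z_i)
    return -0.5 * np.mean(p_hat**2)

def fit_projection(X, y, d=1, n_candidates=300):
    # Stand-in for the paper's optimisation: random search over
    # orthonormal projections, keeping the SCE-minimising one.
    best_W, best_val = None, np.inf
    for _ in range(n_candidates):
        W, _ = np.linalg.qr(rng.standard_normal((X.shape[1], d)))
        val = sce_proxy(W, X, y)
        if val < best_val:
            best_W, best_val = W, val
    return best_W
```

Minimizing the proxy maximizes the average squared conditional density, so projections under which y stays predictable win, and the conditional density estimate on the learned subspace comes out of the same computation, with no second CDE step.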
Bayesian Compressed Regression
As an alternative to variable selection or shrinkage in high dimensional
regression, we propose to randomly compress the predictors prior to analysis.
This dramatically reduces storage and computational bottlenecks, performing
well when the predictors can be projected to a low dimensional linear subspace
with minimal loss of information about the response. As opposed to existing
Bayesian dimensionality reduction approaches, the exact posterior distribution
conditional on the compressed data is available analytically, speeding up
computation by many orders of magnitude while also bypassing robustness issues
due to convergence and mixing problems with MCMC. Model averaging is used to
reduce sensitivity to the random projection matrix, while accommodating
uncertainty in the subspace dimension. Strong theoretical support is provided
for the approach by showing near parametric convergence rates for the
predictive density in the large p small n asymptotic paradigm. Practical
performance relative to competitors is illustrated in simulations and real data
applications.
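A minimal sketch of this workflow, under illustrative conjugate assumptions (a Gaussian ridge-type prior with variance tau2 and known noise variance sigma2); the equal-weight averaging below is a simplification of the paper's model averaging:

```python
import numpy as np

rng = np.random.default_rng(1)

def compressed_posterior_mean(X, y, m, tau2=1.0, sigma2=1.0):
    # Randomly compress p predictors down to m, then use the conjugate
    # Gaussian posterior, which is available in closed form.
    n, p = X.shape
    Phi = rng.standard_normal((m, p)) / np.sqrt(m)  # random projection
    Z = X @ Phi.T                                   # n x m compressed design
    A = Z.T @ Z / sigma2 + np.eye(m) / tau2         # posterior precision
    beta = np.linalg.solve(A, Z.T @ y / sigma2)     # posterior mean
    return Phi, beta

def averaged_prediction(X, y, X_new, dims=(2, 4, 8), n_proj=20):
    # Average predictions over projections and subspace dimensions to
    # reduce sensitivity to any single random compression matrix.
    preds = []
    for m in dims:
        for _ in range(n_proj):
            Phi, beta = compressed_posterior_mean(X, y, m)
            preds.append((X_new @ Phi.T) @ beta)
    return np.mean(preds, axis=0)
```

Because each compressed model has a closed-form posterior, no MCMC is needed, which is the source of both the speedup and the robustness to convergence and mixing problems.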
Deformed Statistics Kullback-Leibler Divergence Minimization within a Scaled Bregman Framework
The generalized Kullback-Leibler divergence (K-Ld) in Tsallis statistics,
constrained by the additive duality of generalized statistics (the dual
generalized K-Ld), is here reconciled with the theory of Bregman divergences
for expectations defined by normal averages, within a measure-theoretic
framework. Specifically, it is demonstrated that the dual generalized K-Ld is a
scaled Bregman divergence. The Pythagorean theorem is derived from the minimum
discrimination information principle, using the dual generalized K-Ld as the
measure of uncertainty, with constraints defined by normal averages. The
minimization of the dual generalized K-Ld under normal-averages constraints is
shown to exhibit distinctive features.
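For orientation, here is one common convention for the objects involved; this is a hedged sketch of standard Tsallis-statistics definitions, and the paper's exact notation and normalizations may differ.

```latex
% q-deformed logarithm and a common form of the generalized K-Ld:
\[
  \ln_q x = \frac{x^{1-q} - 1}{1-q}, \qquad
  D_q(p \,\|\, r) = -\sum_x p(x)\, \ln_q \frac{r(x)}{p(x)} ,
\]
% which recovers the ordinary K-Ld as q -> 1. The additive duality
% sends q to q^* = 2 - q; the dual generalized K-Ld is D_{q^*}(p||r).
% A scaled Bregman divergence with convex generator f and scaling
% measure m has the form
\[
  B_f(p, r \mid m) = \sum_x m(x)\Big[
      f\!\Big(\tfrac{p(x)}{m(x)}\Big) - f\!\Big(\tfrac{r(x)}{m(x)}\Big)
      - f'\!\Big(\tfrac{r(x)}{m(x)}\Big)
        \Big(\tfrac{p(x)}{m(x)} - \tfrac{r(x)}{m(x)}\Big)\Big].
\]
```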
On Stein's Identity and Near-Optimal Estimation in High-dimensional Index Models
We consider estimating the parametric components of semi-parametric multiple
index models in a high-dimensional and non-Gaussian setting. Such models form a
rich class of non-linear models with applications to signal processing, machine
learning, and statistics. Our estimators leverage score-function-based first-
and second-order Stein identities and do not require the covariates to satisfy
the Gaussian or elliptical symmetry assumptions common in the literature.
Moreover, to handle heavy-tailed score functions and responses, our estimators
are constructed by carefully thresholding their empirical counterparts. We show
that our estimator achieves a near-optimal statistical rate of convergence in
several settings, and we supplement our theoretical results with simulation
experiments that confirm the theory.
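To make the first-order construction concrete, here is a minimal sketch for a single-index model y = f(<beta, x>) + noise, assuming the covariate score function s(x) = -grad log p(x) is known; the clipping level `trunc` and the hard threshold `sparse_thresh` are illustrative stand-ins for the paper's calibrated thresholding, and all names are hypothetical.

```python
import numpy as np

def stein_index_estimate(X, y, score, trunc=10.0, sparse_thresh=0.05):
    # First-order Stein estimate: for y = f(<beta, x>) + noise,
    # E[y * s(x)] is proportional to beta, where s(x) = -grad log p(x)
    # is the covariate score. Clipping and hard thresholding are crude
    # stand-ins for the paper's careful treatment of heavy tails.
    S = score(X)                       # n x p score evaluations
    T = y[:, None] * S                 # summands y_i * s(x_i)
    T = np.clip(T, -trunc, trunc)      # truncate heavy-tailed summands
    beta = T.mean(axis=0)
    beta[np.abs(beta) < sparse_thresh] = 0.0  # hard-threshold small entries
    nrm = np.linalg.norm(beta)
    return beta / nrm if nrm > 0 else beta    # only the direction is identifiable

# Toy check with standard Gaussian covariates, where s(x) = x:
rng = np.random.default_rng(0)
n, p = 2000, 50
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = 1.0 / np.sqrt(3)
y = np.sin(X @ beta_true) + 0.1 * rng.standard_normal(n)
beta_hat = stein_index_estimate(X, y, score=lambda X: X)
```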