488 research outputs found
Horizon-Independent Optimal Prediction with Log-Loss in Exponential Families
We study online learning under logarithmic loss with regular parametric
models. Hedayati and Bartlett (2012b) showed that a Bayesian prediction
strategy with Jeffreys prior and sequential normalized maximum likelihood
(SNML) coincide and are optimal if and only if the latter is exchangeable, and
if and only if the optimal strategy can be calculated without knowing the time
horizon in advance. They posed the question of which families have exchangeable
SNML strategies. This paper fully answers that open problem for
one-dimensional exponential families: exchangeability can occur only for
three classes of natural exponential family distributions, namely the Gaussian,
Gamma, and the Tweedie exponential family of order 3/2. Keywords: SNML
Exchangeability, Exponential Family, Online Learning, Logarithmic Loss,
Bayesian Strategy, Jeffreys Prior, Fisher Information. Comment: 23 pages
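To make the coincidence concrete, here is a minimal numerical sketch (a toy illustration, not taken from the paper) for the Gaussian location family with unit variance, where the Jeffreys prior is flat: the SNML predictive, obtained by normalizing the maximized joint likelihood over the next outcome, matches the Bayes predictive N(mean(x_1..x_t), (t+1)/t).

```python
# Minimal sketch (not from the paper): for the Gaussian location family with
# unit variance, the SNML predictive and the Bayes predictive under the
# Jeffreys (here flat) prior coincide; both are N(mean(x_1..x_t), (t+1)/t).
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=1.0, size=5)   # observed x_1, ..., x_t
t, xbar = len(x), x.mean()

def snml_unnormalized(y):
    # max over theta of prod_{i<=t+1} N(x_i; theta, 1), with x_{t+1} = y
    data = np.append(x, y)
    theta_hat = data.mean()
    return np.exp(-0.5 * np.sum((data - theta_hat) ** 2))

# Normalize the SNML density over the next outcome y by numerical integration.
Z, _ = quad(snml_unnormalized, -np.inf, np.inf)
snml = lambda y: snml_unnormalized(y) / Z

# Bayes predictive with flat prior on the mean: N(xbar, 1 + 1/t).
bayes = lambda y: norm.pdf(y, loc=xbar, scale=np.sqrt(1.0 + 1.0 / t))

for y in [-1.0, 0.0, 1.0, 2.5]:
    print(f"y={y:+.1f}  SNML={snml(y):.6f}  Bayes/Jeffreys={bayes(y):.6f}")
```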
On Stein's Identity and Near-Optimal Estimation in High-dimensional Index Models
We consider estimating the parametric components of semi-parametric multiple
index models in a high-dimensional and non-Gaussian setting. Such models form a
rich class of non-linear models with applications to signal processing, machine
learning and statistics. Our estimators leverage the score function based first
and second-order Stein's identities and do not require the covariates to
satisfy Gaussian or elliptical symmetry assumptions common in the literature.
Moreover, to handle score functions and responses that are heavy-tailed, our
estimators are constructed via carefully thresholding their empirical
counterparts. We show that our estimator achieves near-optimal statistical rates
of convergence in several settings. We supplement our theoretical results with
simulation experiments that confirm the theory.
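To illustrate the kind of construction described, here is a hedged sketch of a first-order Stein's-identity estimator for a sparse single-index model; the Gaussian design (so the score is S(x) = x), the truncation level tau, and the threshold lam are illustrative choices, not the paper's.

```python
# Minimal sketch (assumptions: standard Gaussian covariates, so the score is
# S(x) = x; the paper handles general non-Gaussian designs and a second-order
# identity as well).  Single-index model: y = f(x' beta) + noise.
import numpy as np

rng = np.random.default_rng(1)
n, d, s = 2000, 200, 5
beta = np.zeros(d)
beta[:s] = 1.0 / np.sqrt(s)                      # sparse unit-norm direction

X = rng.normal(size=(n, d))
y = np.tanh(X @ beta) + 0.1 * rng.standard_t(df=3, size=n)   # heavy-tailed noise

# First-order Stein's identity: E[y * S(x)] is proportional to beta.
tau = 2.0 * np.sqrt(np.log(d))                   # illustrative truncation level
y_trunc = np.clip(y, -tau, tau)                  # guard against heavy tails
beta_raw = (X * y_trunc[:, None]).mean(axis=0)   # empirical E[y * x]

# Hard-threshold small coordinates to exploit sparsity (illustrative level).
lam = 2.0 * np.sqrt(np.log(d) / n)
beta_hat = np.where(np.abs(beta_raw) > lam, beta_raw, 0.0)
beta_hat /= np.linalg.norm(beta_hat)             # direction identifiable up to scale

print("cosine similarity with true direction:", float(beta_hat @ beta))
```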
Minimax Estimation of the Scale Parameter of Laplace Distribution under Squared-Log Error Loss Function
In this paper, we obtain the minimax estimator of the scale parameter of the Laplace distribution under the squared-log error loss function by applying the theorem of Lehmann [1950], and compare it, in a Monte Carlo simulation study, with the minimax estimator under the quadratic loss function as well as with the maximum likelihood estimator. The performance of these estimators is compared in terms of mean squared error (MSE). Keywords: Minimax estimator, Laplace distribution, Bayes estimator, Squared-log error loss function, Jeffreys prior, Mean squared error
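A sketch of the type of Monte Carlo comparison described, under the simplifying assumption of a known location: T/n is the maximum likelihood estimator of the scale, and T*exp(-psi(n)) is the Jeffreys-prior Bayes rule under squared-log error loss (psi the digamma function); the paper's minimax estimator and comparison set-up may differ in detail.

```python
# Monte Carlo sketch (illustrative, with known location mu = 0): compares the
# MLE of the Laplace scale, T/n with T = sum |x_i|, against a Bayes-type
# estimator T * exp(-digamma(n)), the Jeffreys-prior Bayes rule under
# squared-log error loss when the location is known.  The estimator studied
# in the paper may differ in its exact constant.
import numpy as np
from scipy.special import digamma

rng = np.random.default_rng(2)
theta, n, reps = 2.0, 10, 20000          # true scale, sample size, MC replications

x = rng.laplace(loc=0.0, scale=theta, size=(reps, n))
T = np.abs(x).sum(axis=1)

mle = T / n
bayes_sqlog = T * np.exp(-digamma(n))

for name, est in [("MLE", mle), ("Bayes (squared-log loss)", bayes_sqlog)]:
    mse = np.mean((est - theta) ** 2)
    sqlog_risk = np.mean((np.log(est) - np.log(theta)) ** 2)
    print(f"{name:25s}  MSE={mse:.4f}  squared-log risk={sqlog_risk:.4f}")
```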
An admissible estimator for the rth power of a bounded scale parameter in a subclass of the exponential family under entropy loss function
We consider an admissible estimator for the rth power of a scale parameter that is lower- or upper-bounded, in a subclass of the scale-parameter exponential family, under the entropy loss function. An admissible estimator for a bounded parameter in the family of transformed chi-square distributions is also given.
Orthogonal Codes for Robust Low-Cost Communication
Orthogonal coding schemes, known to asymptotically achieve the capacity per
unit cost (CPUC) for single-user ergodic memoryless channels with a zero-cost
input symbol, are investigated for single-user compound memoryless channels,
which exhibit uncertainties in their input-output statistical relationships. A
minimax formulation is adopted to attain robustness. First, a class of
achievable rates per unit cost (ARPUC) is derived, and its utility is
demonstrated through several representative case studies. Second, when the
uncertainty set of channel transition statistics satisfies a convexity
property, optimization is performed over the class of ARPUC by utilizing
results of minimax robustness. The resulting CPUC lower bound indicates the
ultimate performance of the orthogonal coding scheme, and coincides with the
CPUC under certain restrictive conditions. Finally, still under the convexity
property, it is shown that the CPUC can generally be achieved by utilizing a
so-called mixed strategy, in which an orthogonal code contains an
appropriate composition of different nonzero-cost input symbols. Comment: 2nd revision, accepted for publication
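For context on the quantity being studied, the following sketch evaluates the classical single-user formula that orthogonal codes are known to achieve for a known memoryless channel with a zero-cost symbol, CPUC = max over x != 0 of D(P(.|x) || P(.|0)) / b(x); the channel matrix and costs are made up, and the paper's compound-channel ARPUC is not reproduced here.

```python
# Sketch of the classical CPUC formula for a known DMC with a zero-cost input
# symbol "0": CPUC = max_{x != 0} D(P(.|x) || P(.|0)) / b(x).
# This is the single-user quantity that orthogonal codes achieve; the paper's
# compound-channel ARPUC additionally takes a worst case over channel
# uncertainty, which this toy computation does not capture.
import numpy as np

def kl(p, q):
    """KL divergence D(p || q) in nats for discrete distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Rows = input symbols (index 0 is the zero-cost symbol), columns = outputs.
P = np.array([[0.90, 0.05, 0.05],    # P(y | x = 0), cost 0
              [0.10, 0.80, 0.10],    # P(y | x = 1), cost 1.0
              [0.05, 0.15, 0.80]])   # P(y | x = 2), cost 2.5
cost = np.array([0.0, 1.0, 2.5])

ratios = [kl(P[x], P[0]) / cost[x] for x in range(1, len(cost))]
print("per-symbol D/b ratios:", np.round(ratios, 4))
print("CPUC (nats per unit cost):", round(max(ratios), 4))
```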
Testing uniformity on high-dimensional spheres against monotone rotationally symmetric alternatives
We consider the problem of testing uniformity on high-dimensional unit
spheres. We are primarily interested in non-null issues. We show that
rotationally symmetric alternatives lead to two Local Asymptotic Normality
(LAN) structures. The first one is for a fixed modal location and allows us
to derive locally asymptotically most powerful tests under a specified modal location.
The second one, which addresses the Fisher-von Mises-Langevin (FvML) case,
relates to the unspecified-location problem and shows that the high-dimensional
Rayleigh test is locally asymptotically most powerful invariant. Under mild
assumptions, we derive the asymptotic non-null distribution of this test, which
allows us to extend, away from the FvML case, the asymptotic powers obtained there
from Le Cam's third lemma. Throughout, we allow the dimension to go to
infinity in an arbitrary way as a function of the sample size. Some of our
results also strengthen the local optimality properties of the Rayleigh test in
low dimensions. We perform a Monte Carlo study to illustrate our asymptotic
results. Finally, we treat an application related to testing for sphericity in
high dimensions.
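For reference, here is a minimal Monte Carlo sketch of the classical Rayleigh test of uniformity on the unit sphere (fixed-p chi-squared calibration; the paper's high-dimensional standardization and local power analysis are more refined), together with an illustrative rotationally symmetric alternative.

```python
# Monte Carlo sketch of the classical Rayleigh test of uniformity on the unit
# sphere S^{p-1}: reject when R_n = n * p * ||mean(X)||^2 exceeds the
# chi-squared(p) critical value.  The paper works with a standardized version
# suited to p growing with n; this toy check uses the fixed-p calibration.
import numpy as np
from scipy.stats import chi2

def sample_sphere(n, p, rng):
    z = rng.normal(size=(n, p))
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def rayleigh_stat(X):
    n, p = X.shape
    return n * p * np.sum(X.mean(axis=0) ** 2)

rng = np.random.default_rng(3)
n, p, reps, alpha = 200, 20, 2000, 0.05
crit = chi2.ppf(1 - alpha, df=p)

# Null: uniform on the sphere -> rejection rate should be close to alpha.
null_rej = np.mean([rayleigh_stat(sample_sphere(n, p, rng)) > crit for _ in range(reps)])

# Alternative: mild concentration around a modal direction (projected normal,
# which is rotationally symmetric about that direction).
mu = np.zeros(p); mu[0] = 1.0
def sample_alt(n, p, rng, kappa=0.8):
    z = rng.normal(size=(n, p)) + kappa * mu
    return z / np.linalg.norm(z, axis=1, keepdims=True)

alt_rej = np.mean([rayleigh_stat(sample_alt(n, p, rng)) > crit for _ in range(reps)])
print(f"null rejection rate ~ {null_rej:.3f} (target {alpha}), power ~ {alt_rej:.3f}")
```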
Analyzing sparse dictionaries for online learning with kernels
Many signal processing and machine learning methods share essentially the
same linear-in-the-parameter model, with as many parameters as available
samples, as in kernel-based machines. Sparse approximation is essential in many
disciplines, with new challenges emerging in online learning with kernels. To
this end, several sparsity measures have been proposed in the literature to
quantify sparse dictionaries and to construct relevant ones, the most widely
used being the distance, the approximation, the coherence, and the Babel
measures. In this paper, we analyze sparse dictionaries based on these
measures. By conducting an eigenvalue analysis, we show that these sparsity
measures share many properties, including guaranteeing the linear independence
condition and inducing a well-posed optimization problem. Furthermore, we prove that there
exists a quasi-isometry between the parameter (i.e., dual) space and the
dictionary's induced feature space. Comment: 10 pages
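As an illustration of one of the criteria analyzed, the sketch below implements the coherence rule for sparsifying a kernel dictionary in an online setting: a new sample is admitted only if its kernel coherence with every current atom stays below a threshold mu0. The Gaussian RBF kernel, the value of mu0, and the data stream are illustrative assumptions.

```python
# Minimal sketch of coherence-based dictionary sparsification for online
# learning with kernels: keep a new sample only if its coherence with every
# current dictionary atom is at most mu0.  The Gaussian RBF kernel and mu0 are
# illustrative choices; the paper also studies distance, approximation, and
# Babel criteria.
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    return np.exp(-gamma * np.sum((x - y) ** 2))

def update_dictionary(dictionary, x_new, mu0=0.5, gamma=1.0):
    """Add x_new to the dictionary unless it is too coherent with an atom."""
    # With a unit-norm kernel (k(x, x) = 1), coherence is just k(x_new, d_j).
    coherence = max((rbf_kernel(x_new, d, gamma) for d in dictionary), default=0.0)
    if coherence <= mu0:
        dictionary.append(x_new)
    return dictionary

rng = np.random.default_rng(4)
stream = rng.uniform(-3, 3, size=(500, 2))        # streaming 2-D samples

dictionary = []
for x in stream:
    update_dictionary(dictionary, x, mu0=0.5, gamma=1.0)

print(f"kept {len(dictionary)} of {len(stream)} samples as dictionary atoms")
```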