    Horizon-Independent Optimal Prediction with Log-Loss in Exponential Families

    We study online learning under logarithmic loss with regular parametric models. Hedayati and Bartlett (2012b) showed that a Bayesian prediction strategy with Jeffreys prior and sequential normalized maximum likelihood (SNML) coincide and are optimal if and only if the latter is exchangeable, and if and only if the optimal strategy can be calculated without knowing the time horizon in advance. They posed the question of which families have exchangeable SNML strategies. This paper fully answers this open problem for one-dimensional exponential families: exchangeability can occur only for three classes of natural exponential family distributions, namely the Gaussian, the Gamma, and the Tweedie exponential family of order 3/2. Keywords: SNML Exchangeability, Exponential Family, Online Learning, Logarithmic Loss, Bayesian Strategy, Jeffreys Prior, Fisher Information. Comment: 23 pages
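
    As a concrete illustration (not taken from the paper), a minimal numerical sketch of the one-step SNML predictive for a Gaussian model with known unit variance and unknown mean; the data, grid, and helper name snml_density_gaussian are hypothetical:

        import numpy as np

        def snml_density_gaussian(past, z_grid):
            """One-step SNML predictive: maximize the joint likelihood over
            the mean *including* the candidate next point z, then normalize."""
            def max_loglik(z):
                y = np.append(past, z)
                return -0.5 * np.sum((y - y.mean()) ** 2)  # ML mean uses all t+1 points
            unnorm = np.exp([max_loglik(z) for z in z_grid])
            dz = z_grid[1] - z_grid[0]
            return unnorm / (unnorm.sum() * dz)            # numerical normalization

        past = np.array([0.3, -0.1, 0.8, 0.4])
        t = len(past)
        z = np.linspace(-8, 8, 4001)
        p_snml = snml_density_gaussian(past, z)

        # For this toy model the SNML predictive can be checked in closed form:
        # Gaussian with mean equal to the running average and variance 1 + 1/t.
        v = 1 + 1 / t
        closed = np.exp(-0.5 * (z - past.mean()) ** 2 / v) / np.sqrt(2 * np.pi * v)
        print(np.max(np.abs(p_snml - closed)))             # ~0 up to grid error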

    On Stein's Identity and Near-Optimal Estimation in High-dimensional Index Models

    We consider estimating the parametric components of semi-parametric multiple index models in a high-dimensional and non-Gaussian setting. Such models form a rich class of non-linear models with applications to signal processing, machine learning, and statistics. Our estimators leverage the score-function-based first- and second-order Stein's identities and do not require the covariates to satisfy the Gaussian or elliptical symmetry assumptions common in the literature. Moreover, to handle score functions and responses that are heavy-tailed, our estimators are constructed by carefully thresholding their empirical counterparts. We show that our estimator achieves a near-optimal statistical rate of convergence in several settings. We supplement our theoretical results with simulation experiments that confirm the theory.
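
    A minimal sketch of the first-order Stein's identity idea, assuming Gaussian covariates so the score function is simply S(x) = x (the paper's setting is more general, and the model, threshold, and sizes below are illustrative):

        import numpy as np

        rng = np.random.default_rng(0)

        # Hypothetical sparse single-index model y = f(<beta, x>) + noise.
        n, d, s = 2000, 500, 5
        beta = np.zeros(d); beta[:s] = 1 / np.sqrt(s)
        X = rng.standard_normal((n, d))
        y = np.tanh(X @ beta) + 0.1 * rng.standard_normal(n)

        # First-order Stein's identity: E[y S(x)] is proportional to beta.
        g = (X * y[:, None]).mean(axis=0)

        # Hard-threshold the empirical average to exploit sparsity (toy choice).
        tau = 2 * np.sqrt(np.log(d) / n)
        g[np.abs(g) < tau] = 0.0
        beta_hat = g / np.linalg.norm(g)

        print(abs(beta_hat @ beta))   # close to 1: direction recovered up to sign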

    Minimax Estimation of the Scale Parameter of Laplace Distribution under Squared-Log Error Loss Function

    In this paper, we obtain the minimax estimator of the scale parameter of the Laplace distribution under the squared-log error loss function by applying the theorem of Lehmann [1950], and compare it with the minimax estimator under the quadratic loss function and with the maximum likelihood estimator in a Monte Carlo simulation study. The performance of these estimators is compared in terms of their mean squared errors (MSEs). Keywords: Minimax estimator, Laplace distribution, Bayes estimator, Squared-log error loss function, Jeffreys prior, Mean squared error
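
    The abstract does not state closed forms, but a standard calculation under the Jeffreys prior gives the Bayes estimator under squared-log error loss as T·exp(−ψ(n)) with T = Σ|x_i| and ψ the digamma function, while the MLE is T/n; the paper's exact estimator may differ. A hypothetical Monte Carlo comparison along the abstract's lines:

        import numpy as np
        from scipy.special import digamma

        rng = np.random.default_rng(1)

        def simulate_mse(theta=2.0, n=20, reps=20000):
            """Compare the MLE T/n with the squared-log-error Bayes estimator
            T * exp(-digamma(n)) for the Laplace scale (assumed closed form)."""
            x = rng.laplace(loc=0.0, scale=theta, size=(reps, n))
            T = np.abs(x).sum(axis=1)      # sufficient statistic for the scale
            mle = T / n
            sqlog = T * np.exp(-digamma(n))
            return ((mle - theta) ** 2).mean(), ((sqlog - theta) ** 2).mean()

        print(simulate_mse())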

    An admissible estimator for the rth power of a bounded scale parameter in a subclass of the exponential family under entropy loss function

    We consider an admissible estimator for the rth power of a scale parameter that is lower or upper bounded in a subclass of the scale-parameter exponential family under the entropy loss function. An admissible estimator for a bounded parameter in the family of transformed chi-square distributions is also given.
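
    For reference, a standard form of the entropy loss used in this line of work, written here for the target θ^r (an assumption about scaling, not the paper's exact definition):

        L(\delta, \theta) = \frac{\delta}{\theta^{r}} - \log\frac{\delta}{\theta^{r}} - 1

    This loss is nonnegative and vanishes only when δ = θ^r, and it penalizes underestimation and overestimation of the scale asymmetrically.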

    Orthogonal Codes for Robust Low-Cost Communication

    Orthogonal coding schemes, known to asymptotically achieve the capacity per unit cost (CPUC) for single-user ergodic memoryless channels with a zero-cost input symbol, are investigated for single-user compound memoryless channels, which exhibit uncertainty in their input-output statistical relationships. A minimax formulation is adopted to attain robustness. First, a class of achievable rates per unit cost (ARPUC) is derived, and its utility is demonstrated through several representative case studies. Second, when the uncertainty set of channel transition statistics satisfies a convexity property, optimization is performed over the class of ARPUC by utilizing results from minimax robustness. The resulting CPUC lower bound indicates the ultimate performance of the orthogonal coding scheme and coincides with the CPUC under certain restrictive conditions. Finally, still under the convexity property, it is shown that the CPUC can generally be achieved by a so-called mixed strategy, in which an orthogonal code contains an appropriate composition of different nonzero-cost input symbols. Comment: 2nd revision, accepted for publication
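
    For background (not stated in this abstract), the classical CPUC formula of Verdú for a single memoryless channel with a zero-cost input symbol 0 and cost function b(·), which orthogonal codes asymptotically achieve:

        \hat{C} = \sup_{x \neq 0} \frac{D\left(P_{Y|X=x} \,\|\, P_{Y|X=0}\right)}{b(x)}

    The paper's minimax formulation can be read as robustifying this divergence-per-cost quantity over the uncertainty set of channel transition statistics.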

    Testing uniformity on high-dimensional spheres against monotone rotationally symmetric alternatives

    We consider the problem of testing uniformity on high-dimensional unit spheres. We are primarily interested in non-null issues. We show that rotationally symmetric alternatives lead to two Local Asymptotic Normality (LAN) structures. The first is for fixed modal location θ and allows one to derive locally asymptotically most powerful tests under specified θ. The second, which addresses the Fisher-von Mises-Langevin (FvML) case, relates to the unspecified-θ problem and shows that the high-dimensional Rayleigh test is locally asymptotically most powerful invariant. Under mild assumptions, we derive the asymptotic non-null distribution of this test, which allows us to extend beyond the FvML case the asymptotic powers obtained there from Le Cam's third lemma. Throughout, we allow the dimension p to go to infinity in an arbitrary way as a function of the sample size n. Some of our results also strengthen the local optimality properties of the Rayleigh test in low dimensions. We perform a Monte Carlo study to illustrate our asymptotic results. Finally, we treat an application related to testing for sphericity in high dimensions.
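
    A minimal sketch of the classical Rayleigh statistic and one standard high-dimensional standardization (the paper's exact regularity conditions are more delicate; the function name and sizes are illustrative):

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(2)

        def rayleigh_test(X):
            """Rayleigh test of uniformity on the unit sphere S^{p-1}.
            X: (n, p) array of unit vectors. Returns the statistic, the
            chi^2_p p-value (fixed p), and a large-p standardization."""
            n, p = X.shape
            Rn = n * p * np.sum(X.mean(axis=0) ** 2)   # classical statistic
            pval = stats.chi2.sf(Rn, df=p)             # chi^2_p null approximation
            z = (Rn - p) / np.sqrt(2 * p)              # ~ N(0,1) as p grows
            return Rn, pval, z

        # Uniform sample on the sphere: normalize Gaussian vectors.
        n, p = 200, 400
        G = rng.standard_normal((n, p))
        X = G / np.linalg.norm(G, axis=1, keepdims=True)
        print(rayleigh_test(X))   # under uniformity, z is roughly standard normal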

    Analyzing sparse dictionaries for online learning with kernels

    Many signal processing and machine learning methods share essentially the same linear-in-the-parameters model, with as many parameters as available samples, as in kernel-based machines. Sparse approximation is essential in many disciplines, and new challenges emerge in online learning with kernels. To this end, several sparsity measures have been proposed in the literature to quantify sparse dictionaries and to construct relevant ones, the most widely used being the distance, approximation, coherence, and Babel measures. In this paper, we analyze sparse dictionaries based on these measures. By conducting an eigenvalue analysis, we show that these sparsity measures share many properties, including the linear independence condition and the inducing of a well-posed optimization problem. Furthermore, we prove that there exists a quasi-isometry between the parameter (i.e., dual) space and the dictionary's induced feature space. Comment: 10 pages
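
    A minimal sketch of two of these measures computed from a dictionary's Gram matrix (illustrative; in the kernel setting the inner products become kernel evaluations between dictionary elements):

        import numpy as np

        def coherence_and_babel(D, k):
            """Coherence mu = max off-diagonal |<d_i, d_j>|, and Babel measure
            mu1(k) = max over i of the sum of the k largest off-diagonal
            |<d_i, d_j>| in row i, for unit-norm dictionary columns (atoms)."""
            G = np.abs(D.T @ D)            # absolute Gram matrix
            np.fill_diagonal(G, 0.0)       # ignore <d_i, d_i> = 1
            mu = G.max()
            mu1_k = np.sort(G, axis=1)[:, -k:].sum(axis=1).max()
            return mu, mu1_k

        rng = np.random.default_rng(3)
        D = rng.standard_normal((64, 20))
        D /= np.linalg.norm(D, axis=0)     # unit-norm atoms
        mu, mu1 = coherence_and_babel(D, k=3)
        # Gershgorin bound: for any sub-dictionary of k+1 atoms, the Gram
        # eigenvalues lie in [1 - mu1(k), 1 + mu1(k)], so mu1(k) < 1
        # guarantees linear independence and a well-posed problem.
        print(mu, mu1)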