On nonparametric maximum likelihood for a class of stochastic inverse problems
We establish the consistency of a nonparametric maximum likelihood estimator
for a class of stochastic inverse problems. We proceed by embedding the
framework into the general setting of early results of Pfanzagl related to
mixtures.
Limits of Learning about a Categorical Latent Variable under Prior Near-Ignorance
In this paper, we consider the coherent theory of (epistemic) uncertainty of
Walley, in which beliefs are represented through sets of probability
distributions, and we focus on the problem of modeling prior ignorance about a
categorical random variable. In this setting, it is a known result that a state
of prior ignorance is not compatible with learning. To overcome this problem,
another state of beliefs, called \emph{near-ignorance}, has been proposed.
Near-ignorance resembles ignorance very closely, by satisfying some principles
that can arguably be regarded as necessary in a state of ignorance, and allows
learning to take place. This paper provides new and substantial evidence
that near-ignorance, too, cannot really be regarded as a way out of the
problem of starting statistical inference from a state of very weak beliefs.
The key to this result is focusing on a setting characterized by a variable of
interest that is \emph{latent}. We argue that such a setting is by far the most
common case in practice, and we provide, for the case of categorical latent
variables (and general \emph{manifest} variables) a condition that, if
satisfied, prevents learning from taking place under prior near-ignorance. This
condition is shown to be easily satisfied even in the most common statistical
problems. We regard these results as strong evidence against the
possibility of adopting a condition of prior near-ignorance in real statistical
problems.
Comment: 27 LaTeX pages
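The imprecise Dirichlet model (IDM) is a standard example of a near-ignorance prior for a categorical variable. The following minimal sketch is illustrative only (the function name and the choice s = 2 are ours, not the paper's); it shows the vacuous-to-informative behaviour that, per the abstract, breaks down once the variable of interest is latent:

```python
# Minimal sketch of the imprecise Dirichlet model (IDM), a standard
# near-ignorance prior for a categorical variable. The hyperparameter s
# and all names here are illustrative choices, not the paper's notation.

def idm_interval(counts, category, s=2.0):
    """Lower/upper posterior probability of a category under the IDM."""
    n = sum(counts.values())
    n_i = counts.get(category, 0)
    lower = n_i / (n + s)        # prior mass placed entirely on other categories
    upper = (n_i + s) / (n + s)  # prior mass placed entirely on this category
    return lower, upper

# Before any observation the interval is vacuous:
print(idm_interval({}, "a"))                # (0.0, 1.0)
# With direct (manifest) observations the interval tightens, i.e. learning occurs:
print(idm_interval({"a": 8, "b": 2}, "a"))  # ≈ (0.667, 0.833)
```

When only a noisy manifest variable is observed, the condition given in the paper implies that the analogous posterior intervals can remain vacuous, so no learning takes place.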
Fractional norms and quasinorms do not help to overcome the curse of dimensionality
The curse of dimensionality causes well-known and widely discussed
problems for machine learning methods. There is a hypothesis that using the
Manhattan distance and even fractional quasinorms lp (for p less than 1) can
help to overcome the curse of dimensionality in classification problems. In
this study, we systematically test this hypothesis. We confirm that fractional
quasinorms have a greater relative contrast or coefficient of variation than
the Euclidean norm l2, but we also demonstrate that the distance concentration
shows qualitatively the same behaviour for all tested norms and quasinorms and
the difference between them decays as dimension tends to infinity. Estimation
of classification quality for kNN based on different norms and quasinorms shows
that a greater relative contrast does not mean better classifier performance
and the worst performance for different databases was shown by different norms
(quasinorms). A systematic comparison shows that the difference in the
performance of kNN based on lp for p = 2, 1, and 0.5 is statistically
insignificant.
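The distance-concentration effect discussed above can be reproduced with a short Monte Carlo sketch (function and parameter names are illustrative, not the study's code): the relative contrast decays with dimension for every tested p.

```python
import random

def lp_dist(x, y, p):
    """Minkowski distance; for p < 1 this is a quasinorm (no triangle inequality)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def relative_contrast(dim, n_points=200, p=2.0, seed=0):
    """(D_max - D_min) / D_min from a query point to uniform random points."""
    rng = random.Random(seed)
    q = [rng.random() for _ in range(dim)]
    dists = [lp_dist(q, [rng.random() for _ in range(dim)], p)
             for _ in range(n_points)]
    return (max(dists) - min(dists)) / min(dists)

# Contrast decays with dimension for every p; smaller p gives a somewhat
# larger contrast at a fixed dimension, but the qualitative decay is the same.
for p in (0.5, 1.0, 2.0):
    print(p, [round(relative_contrast(d, p=p), 2) for d in (2, 20, 200)])
```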
Direct Ensemble Estimation of Density Functionals
Estimating density functionals of analog sources is an important problem in
statistical signal processing and information theory. Traditionally, estimating
these quantities requires either making parametric assumptions about the
underlying distributions or using non-parametric density estimation followed by
integration. In this paper we introduce a direct nonparametric approach which
bypasses the need for density estimation by using the error rates of k-NN
classifiers as data-driven basis functions that can be combined to estimate a
range of density functionals. However, this method is subject to a non-trivial
bias that dramatically slows the rate of convergence in higher dimensions. To
overcome this limitation, we develop an ensemble method for estimating the
value of the basis function which, under some minor constraints on the
smoothness of the underlying distributions, achieves the parametric rate of
convergence regardless of data dimension.
Comment: 5 pages
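As a toy illustration of using k-NN classifier error rates as data-driven functionals of the underlying densities, the sketch below (1-NN, one dimension; all names are ours, and the bias-correcting ensemble step of the paper is omitted) computes the leave-one-out 1-NN error between two samples, which asymptotically depends only on a functional of the two densities:

```python
import random

def nn_error_rate(xs, ys):
    """Leave-one-out 1-NN classification error on pooled 1-D samples from two
    sources. Asymptotically this error is a density functional of the two
    underlying densities; the plain 1-NN error computed here is the biased
    'basis' quantity that the paper's ensemble method de-biases."""
    data = [(x, 0) for x in xs] + [(y, 1) for y in ys]
    errors = 0
    for i, (v, label) in enumerate(data):
        # nearest neighbour excluding the point itself
        nbr = min((d for j, d in enumerate(data) if j != i),
                  key=lambda t: abs(t[0] - v))
        errors += (nbr[1] != label)
    return errors / len(data)

rng = random.Random(1)
same = nn_error_rate([rng.gauss(0, 1) for _ in range(300)],
                     [rng.gauss(0, 1) for _ in range(300)])
far = nn_error_rate([rng.gauss(0, 1) for _ in range(300)],
                    [rng.gauss(5, 1) for _ in range(300)])
print(same, far)  # near 0.5 for identical sources, near 0 for well-separated ones
```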
Partially-Latent Class Models (pLCM) for Case-Control Studies of Childhood Pneumonia Etiology
In population studies on the etiology of disease, one goal is the estimation
of the fraction of cases attributable to each of several causes. For example,
pneumonia is a clinical diagnosis of lung infection that may be caused by
viral, bacterial, fungal, or other pathogens. The study of pneumonia etiology
is challenging because directly sampling from the lung to identify the
etiologic pathogen is not standard clinical practice in most settings. Instead,
measurements from multiple peripheral specimens are made. This paper introduces
the statistical methodology designed for estimating the population etiology
distribution and the individual etiology probabilities in the Pneumonia
Etiology Research for Child Health (PERCH) study of 9,500 children from 7 sites
around the world. We formulate the scientific problem in statistical terms as
estimating the mixing weights and latent class indicators under a
partially-latent class model (pLCM) that combines heterogeneous measurements
with different error rates obtained from a case-control study. We introduce the
pLCM as an extension of the latent class model. We also introduce graphical
displays of the population data and inferred latent-class frequencies. The
methods are tested with simulated data, and then applied to PERCH data. The
paper closes with a brief description of extensions of the pLCM to the
regression setting and to the case where conditional independence among the
measures is relaxed.
Comment: 25 pages, 4 figures, 1 supplementary material
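As a much-simplified sketch of the estimation problem (not the paper's pLCM: we assume known true/false positive rates and conditional independence of binary measurements given the latent cause, and all variable names are ours), the etiology mixing weights can be recovered by EM:

```python
import random

def em_mixing_weights(obs, tpr, fpr, n_causes, n_iter=200):
    """obs: binary measurement vectors, one entry per candidate cause.
    tpr[j] / fpr[j]: P(measurement j positive | cause is j / is not j)."""
    pi = [1.0 / n_causes] * n_causes            # start from uniform weights
    for _ in range(n_iter):
        counts = [0.0] * n_causes
        for m in obs:
            # E-step: posterior over the latent cause for this case
            lik = []
            for c in range(n_causes):
                l = pi[c]
                for j, x in enumerate(m):
                    p = tpr[j] if j == c else fpr[j]
                    l *= p if x else (1.0 - p)
                lik.append(l)
            z = sum(lik)
            for c in range(n_causes):
                counts[c] += lik[c] / z
        # M-step: mixing weights are the mean posterior memberships
        pi = [c / len(obs) for c in counts]
    return pi

# Recover the weights from simulated data with two candidate causes.
rng = random.Random(0)
truth, tpr, fpr = [0.7, 0.3], [0.9, 0.9], [0.1, 0.1]
obs = []
for _ in range(500):
    c = 0 if rng.random() < truth[0] else 1
    obs.append([int(rng.random() < (tpr[j] if j == c else fpr[j]))
                for j in range(2)])
print(em_mixing_weights(obs, tpr, fpr, 2))  # close to [0.7, 0.3]
```

The full pLCM additionally estimates the error rates from case-control data and places priors on them; this sketch only shows the mixing-weight step.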