    Rho-estimators revisited: General theory and applications

    Following Baraud, Birgé and Sart (2017), we pursue our attempt to design a robust universal estimator of the joint distribution of $n$ independent (but not necessarily i.i.d.) observations for a Hellinger-type loss. Given such observations with an unknown joint distribution $\mathbf{P}$ and a dominated model $\mathscr{Q}$ for $\mathbf{P}$, we build an estimator $\widehat{\mathbf{P}}$ based on $\mathscr{Q}$ and measure its risk by a Hellinger-type distance. When $\mathbf{P}$ does belong to the model, this risk is bounded by a quantity that depends on the local complexity of the model in a vicinity of $\mathbf{P}$. In most situations this bound corresponds to the minimax risk over the model (up to a possible logarithmic factor). When $\mathbf{P}$ does not belong to the model, its risk involves an additional bias term proportional to the distance between $\mathbf{P}$ and $\mathscr{Q}$, whatever the true distribution $\mathbf{P}$. From this point of view, this new version of $\rho$-estimators improves upon the previous one described in Baraud, Birgé and Sart (2017), which required that $\mathbf{P}$ be absolutely continuous with respect to some known reference measure. Further improvements have been made compared to the former construction. In particular, the new framework provides a very general treatment of regression with random design as well as a computationally tractable procedure for aggregating estimators. We also give conditions for the maximum likelihood estimator to be a $\rho$-estimator. Finally, we consider the situation where the statistician has many different models at hand, and we build a penalized version of the $\rho$-estimator for model selection and adaptation purposes. In the regression setting, this penalized estimator allows one to estimate not only the regression function but also the distribution of the errors.
    Comment: 73 pages
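
    For context, a minimal reminder (our addition, not part of the abstract) of the classical Hellinger distance that this loss generalizes: for probability measures $P$ and $Q$ dominated by a measure $\mu$,
    \[
    h^{2}(P,Q) \;=\; \frac{1}{2}\int \left(\sqrt{\frac{dP}{d\mu}}-\sqrt{\frac{dQ}{d\mu}}\,\right)^{\!2} d\mu,
    \]
    and for joint laws of $n$ independent coordinates a natural product-type version sums the coordinate-wise squared distances, $\mathbf{h}^{2}(\mathbf{P},\mathbf{Q})=\sum_{i=1}^{n}h^{2}(P_i,Q_i)$. The risk bounds described above are stated for a distance of this Hellinger type.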

    Asymptotics of Fingerprinting and Group Testing: Capacity-Achieving Log-Likelihood Decoders

    We study the large-coalition asymptotics of fingerprinting and group testing, and derive explicit decoders that provably achieve capacity for many of the considered models. We do this both for simple decoders (fast but suboptimal) and for joint decoders (slow but optimal), and both for informed and uninformed settings. For fingerprinting, we show that if the pirate strategy is known, the Neyman-Pearson-based log-likelihood decoders provably achieve capacity, regardless of the strategy. The decoder built against the interleaving attack is further shown to be a universal decoder, able to deal with arbitrary attacks and achieving the uninformed capacity. This universal decoder is shown to be closely related to the Lagrange-optimized decoder of Oosterwijk et al. and the empirical mutual information decoder of Moulin. Joint decoders are also proposed, and we conjecture that these also achieve the corresponding joint capacities. For group testing, the simple decoder for the classical model is shown to be more efficient than the one of Chan et al., and it provably achieves the simple group testing capacity. For generalizations of this model such as noisy group testing, the resulting simple decoders also achieve the corresponding simple capacities.
    Comment: 14 pages, 2 figures
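
    To make the flavor of the simple decoders concrete, here is a minimal Python sketch of a symbol-wise log-likelihood decoder for the classical (noiseless) group testing model. This is our illustration under assumed parameters, not the paper's exact construction: each item is scored by $\sum_i \log\bigl(P(y_i\mid x_{j,i})/P(y_i)\bigr)$ over the tests, and the highest-scoring items are accused.

        import numpy as np

        # Noiseless group testing: n items, k defectives; each item joins each
        # of m pooled tests independently with probability p, and a test is
        # positive iff it contains at least one defective. (All values are
        # illustrative choices, not taken from the paper.)
        rng = np.random.default_rng(0)
        n, k, m, p = 1000, 10, 400, 0.1

        defective = np.zeros(n, dtype=bool)
        defective[rng.choice(n, k, replace=False)] = True
        X = rng.random((m, n)) < p            # X[i, j]: item j is in test i
        y = (X & defective).any(axis=1)       # test outcomes

        # Symbol-wise scores log P(y | x, item defective) / P(y), with
        # q = P(test negative) = (1 - p)^k.
        q = (1 - p) ** k
        llr = np.empty((2, 2))                # indexed as llr[x, y]
        llr[0, 0] = np.log((q / (1 - p)) / q)
        llr[0, 1] = np.log((1 - q / (1 - p)) / (1 - q))
        llr[1, 0] = -np.inf                   # defective in a negative test: impossible
        llr[1, 1] = np.log(1.0 / (1 - q))

        scores = llr[X.astype(int), y.astype(int)[:, None]].sum(axis=0)
        accused = np.argsort(scores)[-k:]     # accuse the k highest-scoring items
        print(sorted(accused.tolist()), np.flatnonzero(defective).tolist())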

    Spurious correlation in estimation of the health production function: A note

    In this paper, we address the issue of spurious correlation in the production of health in a systematic way. Spurious correlation entails the risk of linking health status to medical (and nonmedical) inputs when no such links exist. This note first presents the bounds testing procedure as a method to detect and avoid spurious correlation. It then applies this procedure to a recent contribution by Lichtenberg (2004), which relates longevity in the United States to pharmaceutical innovation and public health care expenditure. The results of the bounds testing procedure show longevity to be related to these two factors. Therefore, the estimates reported by Lichtenberg (2004) cannot be said to be the result of spurious correlation; on the contrary, they very likely reflect a genuine relationship, at least for the United States.
    Keywords: Health; Life expectancy; Innovation; Pharmaceuticals; Health care expenditure; Cointegration
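
    To illustrate the method the note builds on, here is a minimal Python sketch of the bounds testing (unrestricted error-correction) regression of Pesaran, Shin and Smith (2001). The function name, lag choice, and column names are our illustrative assumptions; the idea is to regress the first difference of longevity on lagged levels and lagged differences, then F-test the lagged levels jointly and compare the statistic with the tabulated lower/upper critical bounds.

        import pandas as pd
        import statsmodels.api as sm

        def bounds_f_stat(y, X, p=1):
            """F-statistic of the bounds test (sketch; identifier-safe column names assumed).

            y: pd.Series, dependent variable (e.g. longevity)
            X: pd.DataFrame, regressors (e.g. innovation, health expenditure)
            p: number of lagged differences capturing short-run dynamics
            """
            df = pd.concat([y.rename("y"), X], axis=1)
            d = df.diff()
            parts = {"const": 1.0}
            level_names = [f"{c}_lvl" for c in df.columns]    # terms under test
            for c in df.columns:
                parts[f"{c}_lvl"] = df[c].shift(1)            # lagged levels
            for lag in range(1, p + 1):
                for c in df.columns:
                    parts[f"d_{c}_l{lag}"] = d[c].shift(lag)  # lagged differences
            Z = pd.DataFrame(parts).dropna()
            res = sm.OLS(d["y"].loc[Z.index], Z).fit()
            # H0: all lagged-level coefficients are zero (no levels relationship).
            return res.f_test(", ".join(f"{c} = 0" for c in level_names)).fvalue

    An F-statistic above the upper critical bound for the relevant number of regressors indicates a genuine long-run (cointegrating) relationship, which is what rules out spurious correlation in the note's application.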

    Discussion of ``2004 IMS Medallion Lecture: Local Rademacher complexities and oracle inequalities in risk minimization'' by V. Koltchinskii

    Discussion of ``2004 IMS Medallion Lecture: Local Rademacher complexities and oracle inequalities in risk minimization'' by V. Koltchinskii [arXiv:0708.0083]
    Comment: Published at http://dx.doi.org/10.1214/009053606000001037 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    On the Brittleness of Bayesian Inference

    With the advent of high-performance computing, Bayesian methods are increasingly popular tools for the quantification of uncertainty throughout science and industry. Since these methods impact the making of sometimes critical decisions in increasingly complicated contexts, the sensitivity of their posterior conclusions with respect to the underlying models and prior beliefs is a pressing question, for which there currently exist both positive and negative results. We report new results suggesting that, although Bayesian methods are robust when the number of possible outcomes is finite or when only a finite number of marginals of the data-generating distribution are unknown, they could be generically brittle when applied to continuous systems (and their discretizations) with finite information on the data-generating distribution. If closeness is defined in terms of the total variation metric or the matching of a finite system of generalized moments, then (1) two practitioners who use arbitrarily close models and observe the same (possibly arbitrarily large amount of) data may reach opposite conclusions; and (2) any given prior and model can be slightly perturbed to achieve any desired posterior conclusions. The mechanism causing brittleness/robustness suggests that learning and robustness are antagonistic requirements, and raises the question of a missing stability condition for using Bayesian inference in a continuous world under finite information.
    Comment: 20 pages, 2 figures. To appear in SIAM Review (Research Spotlights). arXiv admin note: text overlap with arXiv:1304.677
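
    For reference (our addition), the total variation metric in which "arbitrarily close" is measured above is
    \[
    d_{\mathrm{TV}}(\mu,\nu) \;=\; \sup_{A}\,\lvert \mu(A)-\nu(A)\rvert,
    \]
    the supremum running over measurable sets. The brittleness statements say that two models within any prescribed $d_{\mathrm{TV}}$-distance of each other, or matching any finite system of generalized moments, can still lead to posterior conclusions at opposite extremes.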

    Low-Complexity Joint Channel Estimation and List Decoding of Short Codes

    A pilot-assisted transmission (PAT) scheme is proposed for short blocklengths, where the pilots are used only to derive an initial channel estimate for the list construction step. The final decision on the message is obtained by applying a non-coherent decoding metric to the codewords composing the list. This allows one to use very few pilots, thus reducing the channel estimation overhead. The method is applied to an ordered statistics decoder for communication over a Rayleigh block-fading channel. Gains of up to $1.2$ dB as compared to traditional PAT schemes are demonstrated for short codes with QPSK signaling. The approach can be generalized to other list decoders, e.g., to list decoding of polar codes.
    Comment: Accepted at the 12th International ITG Conference on Systems, Communications and Coding (SCC 2019), Rostock, Germany
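
    As a concrete illustration of the final non-coherent selection step, here is a small Python sketch (our stand-in, not necessarily the paper's exact metric or decoder): given a list of candidate codewords, pick the one maximizing the GLRT metric $|x^{H}y|^{2}/\lVert x\rVert^{2}$, which is invariant to the unknown complex block-fading gain. The list here is faked with random QPSK codewords containing the transmitted one; in the actual scheme it would come from the pilot-initialized ordered statistics decoder.

        import numpy as np

        rng = np.random.default_rng(1)
        n_sym, list_size = 64, 8              # illustrative sizes, not the paper's

        # Stand-in candidate list: random QPSK words, entry 0 is the true one.
        cands = np.exp(1j * (np.pi / 4 + np.pi / 2
                             * rng.integers(0, 4, (list_size, n_sym))))
        x_true = cands[0]

        # Rayleigh block fading: y = h * x + noise, h unknown to the receiver.
        h = (rng.normal() + 1j * rng.normal()) / np.sqrt(2)
        snr_db = 6.0
        sigma = np.sqrt(0.5 / 10 ** (snr_db / 10))
        noise = sigma * (rng.normal(size=n_sym) + 1j * rng.normal(size=n_sym))
        y = h * x_true + noise

        # Non-coherent GLRT metric |x^H y|^2 / ||x||^2 for each candidate.
        metrics = np.abs(cands.conj() @ y) ** 2 / np.sum(np.abs(cands) ** 2, axis=1)
        print("decided list index:", int(np.argmax(metrics)))   # 0 on success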