Rho-estimators revisited: General theory and applications
Following Baraud, Birg\'e and Sart (2017), we pursue our attempt to design a
robust universal estimator of the joint distribution of independent (but not
necessarily i.i.d.) observations for an Hellinger-type loss. Given such
observations with an unknown joint distribution $\mathbf{P}$ and a dominated
model $\mathscr{Q}$ for $\mathbf{P}$, we build an estimator
$\widehat{\mathbf{P}}$ based on $\mathscr{Q}$ and measure its risk by an
Hellinger-type distance. When $\mathbf{P}$ does belong to the model, this
risk is bounded by some quantity which relies on the local complexity of the
model in a vicinity of $\mathbf{P}$. In most situations this bound
corresponds to the minimax risk over the model (up to a possible logarithmic
factor). When $\mathbf{P}$ does not belong to the model, its risk involves an
additional bias term proportional to the distance between $\mathbf{P}$ and
$\mathscr{Q}$, whatever the true distribution $\mathbf{P}$. From this point
of view, this new version of $\rho$-estimators improves upon the previous one
described in Baraud, Birg\'e and Sart (2017), which required that $\mathbf{P}$
be absolutely continuous with respect to some known reference measure.
Further improvements have been brought as compared to the former
construction. In particular, the new version provides a very general
treatment of the regression framework with random design as well as a
computationally tractable procedure for aggregating estimators. We also give
some conditions for the Maximum Likelihood Estimator to be a
$\rho$-estimator. Finally, we consider the situation where the Statistician
has many different models at his or her disposal, and we build a penalized
version of the $\rho$-estimator for model selection and adaptation purposes.
In the regression setting, this penalized estimator allows one to estimate
not only the regression function but also the distribution of the errors.
Comment: 73 pages
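For reference (the notation below is ours, not necessarily the paper's), the
Hellinger-type loss mentioned above is built on the squared Hellinger
distance, which for probability measures with densities $p$ and $q$ relative
to a dominating measure $\mu$ reads
$$ h^2(P, Q) \;=\; \frac{1}{2} \int \big( \sqrt{p} - \sqrt{q}\, \big)^2 \, d\mu . $$
Read schematically, the risk bounds described in the abstract take the form
$$ C \, \mathbb{E}\big[ h^2(\mathbf{P}, \widehat{\mathbf{P}}) \big] \;\le\; \inf_{\mathbf{Q} \in \mathscr{Q}} h^2(\mathbf{P}, \mathbf{Q}) \, + \, D(\mathbf{P}), $$
where $C$ is a universal constant, the infimum is the bias term (vanishing
when $\mathbf{P}$ belongs to the model) and $D(\mathbf{P})$ stands for the
local-complexity term; the exact constants and complexity measure are those
of the paper, and this display is only a schematic reading of the abstract.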
Asymptotics of Fingerprinting and Group Testing: Capacity-Achieving Log-Likelihood Decoders
We study the large-coalition asymptotics of fingerprinting and group testing,
and derive explicit decoders that provably achieve capacity for many of the
considered models. We do this both for simple decoders (fast but suboptimal)
and for joint decoders (slow but optimal), and both for informed and uninformed
settings.
For fingerprinting, we show that if the pirate strategy is known, the
Neyman-Pearson-based log-likelihood decoders provably achieve capacity,
regardless of the strategy. The decoder built against the interleaving attack
is further shown to be a universal decoder, able to deal with arbitrary attacks
and achieving the uninformed capacity. This universal decoder is shown to be
closely related to the Lagrange-optimized decoder of Oosterwijk et al. and the
empirical mutual information decoder of Moulin. Joint decoders are also
proposed, and we conjecture that these also achieve the corresponding joint
capacities.
For group testing, the simple decoder for the classical model is shown to be
more efficient than that of Chan et al., and it provably achieves the simple
group testing capacity. For generalizations of this model, such as noisy
group testing, the resulting simple decoders also achieve the corresponding
simple capacities.
Comment: 14 pages, 2 figures
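To make the "simple decoder" concrete, here is a minimal sketch of a per-item
log-likelihood-ratio decoder for noisy group testing. It assumes an i.i.d.
Bernoulli test design with inclusion probability p, exactly k defectives, and
outcomes flipped with probability r; all names and parameters are
illustrative, and this is not the paper's exact decoder.

import numpy as np

def pos_prob(m, p, r):
    """P(test positive) when m defectives are each included w.p. p and
    the outcome is flipped w.p. r."""
    none_in = (1.0 - p) ** m
    return (1.0 - r) * (1.0 - none_in) + r * none_in

def simple_llr_scores(X, y, k, p, r):
    """Score each item by the log-likelihood ratio of the outcomes under
    'item defective' (H1) versus 'item non-defective' (H0).
    X: (T, n) 0/1 design matrix, y: (T,) noisy 0/1 outcomes."""
    p1_in = 1.0 - r                 # H1, item included in the test
    p1_out = pos_prob(k - 1, p, r)  # H1, item not in the test
    p0 = pos_prob(k, p, r)          # H0, same in both cases
    P1 = np.where(X == 1, p1_in, p1_out)
    llr = (y[:, None] * np.log(P1 / p0)
           + (1 - y[:, None]) * np.log((1.0 - P1) / (1.0 - p0)))
    return llr.sum(axis=0)          # declare the top-k scores defective

# Toy run: 200 items, 5 defectives, 150 tests, 5% flipped outcomes.
rng = np.random.default_rng(0)
n, k, T, p, r = 200, 5, 150, 0.3, 0.05
defective = rng.choice(n, size=k, replace=False)
X = (rng.random((T, n)) < p).astype(int)
clean = (X[:, defective].sum(axis=1) > 0).astype(int)
y = np.where(rng.random(T) < r, 1 - clean, clean)
top_k = np.argsort(simple_llr_scores(X, y, k, p, r))[-k:]

Per-item scoring is what makes such a decoder "simple": its cost is linear in
the number of items, in contrast with joint decoders that score subsets of
items.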
Spurious correlation in estimation of the health production function: A note
In this paper, we address the issue of spurious correlation in the production
of health in a systematic way. Spurious correlation entails the risk of
linking health status to medical (and nonmedical) inputs when no such links
exist. This note first presents the bounds testing procedure as a method to
detect and avoid spurious correlation. It then applies the procedure to a
recent contribution by Lichtenberg (2004), which relates longevity in the
United States to pharmaceutical innovation and public health care
expenditure. The results of the bounds testing procedure show longevity to be
related to these two factors. Therefore, the estimates reported by
Lichtenberg (2004) cannot be said to be the result of spurious correlation;
on the contrary, they very likely reflect a genuine relationship, at least
for the United States.
Keywords: Health; Life expectancy; Innovation; Pharmaceuticals; Health care
expenditure; Cointegration
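For readers unfamiliar with it, the bounds testing procedure can be sketched
in a few lines. The fragment below estimates the unrestricted error-correction
model of Pesaran, Shin and Smith (2001) by OLS and returns the F-statistic
for the joint significance of the lagged levels; the column names, lag length
and data are illustrative, and the statistic must be compared against the
tabulated I(0)/I(1) critical bounds, not against standard F critical values.

import pandas as pd
import statsmodels.api as sm

def bounds_test_fstat(df, y_col, x_cols, lags=1):
    """F-statistic of the ARDL bounds test (Pesaran et al., 2001).
    Estimates  dy_t = c + a*y_{t-1} + b'x_{t-1} + short-run lags + e_t
    and tests H0: a = 0, b = 0 (no long-run level relationship)."""
    d = df.copy()
    d["dy"] = d[y_col].diff()
    level_terms = [y_col] + list(x_cols)
    for c in level_terms:                      # lagged levels under test
        d[c + "_l1"] = d[c].shift(1)
    for c in level_terms:                      # short-run dynamics
        for l in range(1, lags + 1):
            d["d_" + c + "_l" + str(l)] = d[c].diff().shift(l)
    d = d.dropna()
    tested = [c + "_l1" for c in level_terms]
    controls = [c for c in d.columns if c.startswith("d_")]
    X = sm.add_constant(d[tested + controls])
    res = sm.OLS(d["dy"], X).fit()
    joint_h0 = ", ".join(c + " = 0" for c in tested)
    return res.f_test(joint_h0).fvalue

# Hypothetical usage with annual series, e.g.
#   f = bounds_test_fstat(data, "longevity", ["innovation", "public_hce"])
# A statistic above the upper (I(1)) bound supports a level relationship.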
Discussion of ``2004 IMS Medallion Lecture: Local Rademacher complexities and oracle inequalities in risk minimization'' by V. Koltchinskii
Discussion of ``2004 IMS Medallion Lecture: Local Rademacher complexities and
oracle inequalities in risk minimization'' by V. Koltchinskii
[arXiv:0708.0083]
Comment: Published at http://dx.doi.org/10.1214/009053606000001037 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
On the Brittleness of Bayesian Inference
With the advent of high-performance computing, Bayesian methods are
increasingly popular tools for the quantification of uncertainty throughout
science and industry. Since these methods impact the making of sometimes
critical decisions in increasingly complicated contexts, the sensitivity of
their posterior conclusions with respect to the underlying models and prior
beliefs is a pressing question for which there currently exist positive and
negative results. We report new results suggesting that, although Bayesian
methods are robust when the number of possible outcomes is finite or when only
a finite number of marginals of the data-generating distribution are unknown,
they could be generically brittle when applied to continuous systems (and their
discretizations) with finite information on the data-generating distribution.
If closeness is defined in terms of the total variation metric or the matching
of a finite system of generalized moments, then (1) two practitioners who use
arbitrarily close models and observe the same (possibly arbitrarily large
amount of) data may reach opposite conclusions; and (2) any given prior and
model can be slightly perturbed to achieve any desired posterior conclusions.
The mechanism causing brittleness/robustness suggests that learning and
robustness are antagonistic requirements and raises the question of a missing
stability condition for using Bayesian inference in a continuous world under
finite information.
Comment: 20 pages, 2 figures. To appear in SIAM Review (Research Spotlights).
arXiv admin note: text overlap with arXiv:1304.677
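For concreteness (the symbols below are ours, not the authors'), the total
variation metric referred to above is
$$ d_{\mathrm{TV}}(\mu, \nu) \;=\; \sup_{A} \, \big| \mu(A) - \nu(A) \big| , $$
and conclusion (2) says, roughly, that for any $\epsilon > 0$ and any value
$v$ in the range of the quantity of interest $\Phi$, one can find a perturbed
prior $\pi'$ with $d_{\mathrm{TV}}(\pi, \pi') \le \epsilon$ whose posterior
expectation of $\Phi$ given the observed data is arbitrarily close to $v$.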
Low-Complexity Joint Channel Estimation and List Decoding of Short Codes
A pilot-assisted transmission (PAT) scheme is proposed for short
blocklengths, where the pilots are used only to derive an initial channel
estimate for the list construction step. The final decision of the message is
obtained by applying a non-coherent decoding metric to the codewords composing
the list. This allows one to use very few pilots, thus reducing the channel
estimation overhead. The method is applied to an ordered statistics decoder for
communication over a Rayleigh block-fading channel. Gains of up to dB as
compared to traditional PAT schemes are demonstrated for short codes with QPSK
signaling. The approach can be generalized to other list decoders, e.g., to
list decoding of polar codes.
Comment: Accepted at the 12th International ITG Conference on Systems,
Communications and Coding (SCC 2019), Rostock, Germany
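The final decision step described above admits a compact sketch. Assuming a
single unknown complex gain per fading block, maximizing the likelihood over
that gain reduces the non-coherent metric for each candidate codeword to a
per-block correlation ratio; the function below is an illustrative GLRT-style
selection rule, not the paper's exact metric, and all names are ours.

import numpy as np

def noncoherent_select(y, candidates, block_len):
    """Return the list codeword maximizing a GLRT metric under unknown
    per-block complex gains (Rayleigh block fading).
    y: (N,) received samples; candidates: iterable of (N,) modulated
    codewords (e.g. QPSK); block_len: samples per fading block."""
    N = len(y)
    blocks = [(a, min(a + block_len, N)) for a in range(0, N, block_len)]
    best, best_metric = None, -np.inf
    for c in candidates:
        # Max over the gain h of each block likelihood reduces to
        # |c_b^H y_b|^2 / ||c_b||^2 (correlation normalized by energy).
        m = sum(abs(np.vdot(c[a:b], y[a:b])) ** 2
                / np.vdot(c[a:b], c[a:b]).real
                for a, b in blocks)
        if m > best_metric:
            best, best_metric = c, m
    return best

Only the list construction uses the pilot-based channel estimate; this final
selection needs no channel knowledge, which is what allows the pilot overhead
to stay small.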