A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity
We present a novel notion of complexity that interpolates between and
generalizes some classic existing complexity notions in learning theory: for
estimators like empirical risk minimization (ERM) with arbitrary bounded
losses, it is upper bounded in terms of data-independent Rademacher complexity;
for generalized Bayesian estimators, it is upper bounded by the data-dependent
information complexity (also known as stochastic or PAC-Bayesian complexity). For
(penalized) ERM, the new complexity reduces to (generalized) normalized maximum
likelihood (NML) complexity, i.e. a minimax log-loss individual-sequence
regret. Our first main result bounds excess risk in terms of the new
complexity. Our second main result links the new complexity via Rademacher
complexity to entropy, thereby generalizing earlier results of Opper,
Haussler, Lugosi, and Cesa-Bianchi, who treated the log-loss case.
Together, these results recover optimal bounds for VC- and large (polynomial
entropy) classes, replacing localized Rademacher complexity by a simpler
analysis which almost completely separates the two aspects that determine the
achievable rates: 'easiness' (Bernstein) conditions and model complexity.
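For readers unfamiliar with the Shtarkov/NML quantity mentioned above, a brief sketch in standard notation (the symbols below are illustrative and not taken from the paper): for a countable sample space, the NML complexity of a model class \mathcal{F} on samples of size n is the log of the Shtarkov sum,

\[
  \mathrm{COMP}_n(\mathcal{F}) \;=\; \log \sum_{z^n} \sup_{f \in \mathcal{F}} p_f(z^n),
\]

which equals the minimax individual-sequence regret under log-loss, i.e. the worst-case excess log-loss of the NML distribution relative to the best element of \mathcal{F} chosen in hindsight. The 'generalized' NML complexity referred to in the abstract presumably extends this quantity to penalized ERM and to losses other than log-loss.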
Kernel conditional quantile estimation via reduction revisited
Quantile regression refers to the process of estimating the quantiles of a conditional distribution and has many important applications within econometrics and data mining, among other domains. In this paper, we show how to estimate these conditional quantile functions within a Bayes risk minimization framework using a Gaussian process prior. The resulting non-parametric probabilistic model is easy to implement and allows non-crossing quantile functions to be enforced. Moreover, it can directly be used in combination with tools and extensions of standard Gaussian Processes such as principled hyperparameter estimation, sparsification, and quantile regression with input-dependent noise rates. No existing approach enjoys all of these desirable properties. Experiments on benchmark datasets show that our method is competitive with state-of-the-art approaches.
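As a minimal, self-contained illustration of the Bayes-risk view underlying quantile estimation (this is only the standard pinball loss, whose population minimizer is the tau-quantile; it is not the Gaussian-process model described above, and all names in the snippet are ours):

import numpy as np

def pinball_loss(y, f, tau):
    # Pinball (quantile) loss: tau*(y - f) if y >= f, else (tau - 1)*(y - f).
    d = y - f
    return np.mean(np.maximum(tau * d, (tau - 1) * d))

rng = np.random.default_rng(0)
y = rng.standard_normal(100_000)  # draws from N(0, 1)
tau = 0.9

# Minimize the empirical pinball loss over a grid of candidate constants f.
grid = np.linspace(-3.0, 3.0, 601)
losses = [pinball_loss(y, f, tau) for f in grid]
f_star = grid[int(np.argmin(losses))]

# The minimizer is close to the empirical 0.9-quantile (about 1.28 for N(0, 1)).
print(f_star, np.quantile(y, tau))

A conditional-quantile model replaces the constant f by a function of the input; the paper's approach places a Gaussian process prior over such functions and estimates the quantile functions within a Bayes risk minimization framework.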
On the Bayes-optimality of F-measure maximizers
The F-measure, which was originally introduced in information retrieval,
is nowadays routinely used as a performance metric for problems such as binary
classification, multi-label classification, and structured output prediction.
Optimizing this measure is a statistically and computationally challenging
problem, since no closed-form solution exists. Adopting a decision-theoretic
perspective, this article provides a formal and experimental analysis of
different approaches for maximizing the F-measure. We start with a Bayes-risk
analysis of related loss functions, such as Hamming loss and subset zero-one
loss, showing that optimizing such losses as a surrogate of the F-measure leads
to a high worst-case regret. Subsequently, we perform a similar type of
analysis for F-measure maximizing algorithms, showing that such algorithms are
approximate, while relying on additional assumptions regarding the statistical
distribution of the binary response variables. Furthermore, we present a new
algorithm which is not only computationally efficient but also Bayes-optimal,
regardless of the underlying distribution. To this end, the algorithm requires
only a quadratic (with respect to the number of binary responses) number of
parameters of the joint distribution. We illustrate the practical performance
of all analyzed methods by means of experiments with multi-label classification
problems.
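For concreteness, a small sketch of the instance-wise F-measure itself for a vector of binary responses (this is just the definition of the performance metric, not the Bayes-optimal maximization algorithm analyzed in the article; the helper name is ours):

import numpy as np

def f_measure(y_true, y_pred):
    # F = 2 * |y ∩ y_hat| / (|y| + |y_hat|) for binary label vectors,
    # conventionally set to 1 when both vectors are all zeros.
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    denom = y_true.sum() + y_pred.sum()
    if denom == 0:
        return 1.0
    return 2.0 * np.logical_and(y_true, y_pred).sum() / denom

# Multi-label example with five binary responses: F = 2*2 / (3 + 3) ≈ 0.67.
print(f_measure([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))

The non-smooth, non-decomposable form of this ratio is what makes maximizing its expectation harder than optimizing decomposable losses such as Hamming loss.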
Valuation Compressions in VCG-Based Combinatorial Auctions
The focus of classic mechanism design has been on truthful direct-revelation
mechanisms. In the context of combinatorial auctions the truthful
direct-revelation mechanism that maximizes social welfare is the VCG mechanism.
For many valuation spaces computing the allocation and payments of the VCG
mechanism, however, is a computationally hard problem. We thus study the
performance of the VCG mechanism when bidders are forced to choose bids from a
subspace of the valuation space for which the VCG outcome can be computed
efficiently. We prove improved upper bounds on the welfare loss for
restrictions to additive bids and upper and lower bounds for restrictions to
non-additive bids. These bounds show that the welfare loss increases in
expressiveness. All our bounds apply to equilibrium concepts that can be
computed in polynomial time as well as to learning outcomes.
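As an illustrative aside (a sketch under our own simplifying assumptions, not the paper's construction): when bids are restricted to be additive, the VCG allocation and payments decompose item by item, with each item going to its highest bidder at the second-highest bid, which is what makes the restricted outcome efficiently computable.

def vcg_additive(bids):
    # bids[i][j] is bidder i's reported (additive) value for item j.
    # With additive bids the welfare-maximizing allocation assigns each item
    # independently to its highest bidder, and the VCG payment for that item
    # is the second-highest bid (the externality imposed on the other bidders).
    n_bidders, n_items = len(bids), len(bids[0])
    allocation, payments = {}, [0.0] * n_bidders
    for j in range(n_items):
        item_bids = [bids[i][j] for i in range(n_bidders)]
        winner = max(range(n_bidders), key=lambda i: item_bids[i])
        others = [b for i, b in enumerate(item_bids) if i != winner]
        allocation[j] = winner
        payments[winner] += max(others) if others else 0.0
    return allocation, payments

# Two bidders, two items: bidder 0 wins item 0 and pays 3, bidder 1 wins item 1 and pays 2.
print(vcg_additive([[5.0, 2.0], [3.0, 4.0]]))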