An analysis of chaining in multi-label classification
The idea of classifier chains has recently been introduced as a promising technique for multi-label classification. However, despite being intuitively appealing and showing strong performance in empirical studies, very little is yet known about the main principles underlying this type of method. In this paper, we provide a detailed probabilistic analysis of classifier chains from a risk minimization perspective, thereby helping to gain a better understanding of this approach. As a main result, we clarify that the original chaining method seeks to approximate the joint mode of the conditional distribution of label vectors in a greedy manner. Based on a theoretical regret analysis, we conclude that this approach can perform quite poorly in terms of subset 0/1 loss. Therefore, we present an enhanced inference procedure for which the worst-case regret can be upper-bounded far more tightly. In addition, we show that a probabilistic variant of chaining, which can be utilized for any loss function, becomes tractable by using Monte Carlo sampling. Finally, we present experimental results confirming the validity of our theoretical findings.
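The gap between greedy chaining and the true joint mode can be illustrated with a toy sketch. It is not taken from the paper: the two-label distribution `JOINT` and the helper names `prefix_prob`, `greedy_chain` and `joint_mode` are hypothetical, and the exact conditional probabilities are assumed to be known rather than estimated by trained classifiers.

```python
from itertools import product

# Hypothetical conditional distribution P(y1, y2 | x) for a single instance.
JOINT = {(0, 0): 0.40, (0, 1): 0.00, (1, 0): 0.35, (1, 1): 0.25}


def prefix_prob(joint, prefix):
    """Probability that the label vector starts with `prefix`."""
    k = len(prefix)
    return sum(p for y, p in joint.items() if y[:k] == prefix)


def greedy_chain(joint, n_labels):
    """Fix labels one at a time along the chain.

    At each step the value with the larger conditional probability given the
    labels already chosen is taken. Comparing prefix probabilities is
    equivalent, because both candidates share the same normalizer.
    """
    prefix = ()
    for _ in range(n_labels):
        candidates = [prefix + (b,) for b in (0, 1)]
        prefix = max(candidates, key=lambda c: prefix_prob(joint, c))
    return prefix


def joint_mode(joint, n_labels):
    """Exhaustive search for the most probable full label vector."""
    return max(product((0, 1), repeat=n_labels), key=lambda y: joint.get(y, 0.0))


greedy = greedy_chain(JOINT, 2)
mode = joint_mode(JOINT, 2)
print("greedy:", greedy, "probability", JOINT[greedy])
print("mode:  ", mode, "probability", JOINT[mode])
```

In this toy case the first label is set to 1 because its marginal probability is 0.6, which rules out the most probable vector (0, 0). The exhaustive search used here grows exponentially with the number of labels, so it serves only as a reference point for small examples.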
Four Facets of Forecast Felicity: Calibration, Predictiveness, Randomness and Regret
Machine learning is about forecasting. Forecasts, however, obtain their
usefulness only through their evaluation. Machine learning has traditionally
focused on types of losses and their corresponding regret. Recently, the
machine learning community has regained interest in calibration. In this work, we
show the conceptual equivalence of calibration and regret in evaluating
forecasts. We frame the evaluation problem as a game between a forecaster, a
gambler and nature. Putting intuitive restrictions on gambler and forecaster,
calibration and regret naturally fall out of the framework. In addition, this
game links evaluation of forecasts to randomness of outcomes. Random outcomes
with respect to forecasts are equivalent to good forecasts with respect to
outcomes. We call those dual aspects, calibration and regret, predictiveness
and randomness, the four facets of forecast felicity.
Automatic Document Image Binarization using Bayesian Optimization
Document image binarization is often a challenging task due to various forms
of degradation. Although there exist several binarization techniques in
literature, the binarized image is typically sensitive to control parameter
settings of the employed technique. This paper presents an automatic document
image binarization algorithm to segment the text from heavily degraded document
images. The proposed technique uses a two band-pass filtering approach for
background noise removal, and Bayesian optimization for automatic
hyperparameter selection for optimal results. The effectiveness of the proposed
binarization technique is empirically demonstrated on the Document Image
Binarization Competition (DIBCO) and the Handwritten Document Image
Binarization Competition (H-DIBCO) datasets.
Statistical Learning Theory for Control: A Finite Sample Perspective
This tutorial survey provides an overview of recent non-asymptotic advances
in statistical learning theory as relevant to control and system
identification. While there has been substantial progress across all areas of
control, the theory is most well-developed when it comes to linear system
identification and learning for the linear quadratic regulator, which are the
focus of this manuscript. From a theoretical perspective, much of the labor
underlying these advances has been in adapting tools from modern
high-dimensional statistics and learning theory. While highly relevant to
control theorists interested in integrating tools from machine learning, the
foundational material has not always been easily accessible. To remedy this, we
provide a self-contained presentation of the relevant material, outlining all
the key ideas and the technical machinery that underpin recent results. We also
present a number of open problems and future directions.
Comment: Survey Paper, Submitted to Control Systems Magazine. Second version
contains additional motivation for finite sample statistics and more detailed
comparison with classical literature.