3 research outputs found
An Analysis of Active Learning With Uniform Feature Noise
In active learning, the user sequentially chooses values for feature $X$ and
an oracle returns the corresponding label $Y$. In this paper, we consider the
effect of feature noise in active learning, which could arise either because
$X$ itself is being measured, or it is corrupted in transmission to the oracle,
or the oracle returns the label of a noisy version of the query point. In
statistics, feature noise is known as "errors in variables" and has been
studied extensively in non-active settings. However, the effect of feature
noise in active learning has not been studied before. We consider the
well-known Berkson errors-in-variables model with additive uniform noise of
width $\sigma$.
Our simple but revealing setting is one-dimensional binary classification,
where the goal is to learn a threshold (the point where the probability of a
$+$ label crosses one half). We deal with regression functions
that are antisymmetric in a region of size $\sigma$ around the threshold and
also satisfy Tsybakov's margin condition around the threshold. We prove minimax
lower and upper bounds which demonstrate that when $\sigma$ is smaller than the
minimax active/passive noiseless error derived in \cite{CN07}, then noise has
no effect on the rates and one achieves the same noiseless rates. For larger
$\sigma$, the \textit{unflattening} of the regression function on convolution
with uniform noise, along with its local antisymmetry around the threshold,
together yield a behaviour where noise \textit{appears} to be beneficial. Our
key result is that active learning can buy significant improvement over a
passive strategy even in the presence of feature noise.
Comment: 24 pages, 2 figures, published in the proceedings of the 17th
International Conference on Artificial Intelligence and Statistics (AISTATS),
201
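The noise model described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the linear regression function, the threshold at 0.5, the slope, and all function names are assumptions chosen only for illustration. The sketch shows an oracle that labels a uniformly perturbed version of the query point, and it checks that averaging an antisymmetric regression function over the noise window leaves the value at the threshold at one half.

```python
import random


def regression(x, threshold=0.5, slope=2.0):
    """Hypothetical P(label = + | feature = x): linear near the threshold, clipped to [0, 1]."""
    return min(1.0, max(0.0, 0.5 + slope * (x - threshold)))


def noisy_oracle(query, sigma, rng, threshold=0.5):
    """Berkson-style oracle: the label is drawn at query + U, with U uniform of width sigma."""
    perturbed = query + rng.uniform(-sigma / 2, sigma / 2)
    return 1 if rng.random() < regression(perturbed, threshold) else 0


def convolved_regression(x, sigma, threshold=0.5, grid=1001):
    """Average of the regression function over [x - sigma/2, x + sigma/2] (midpoint rule)."""
    step = sigma / grid
    start = x - sigma / 2 + step / 2
    return sum(regression(start + i * step, threshold) for i in range(grid)) / grid


rng = random.Random(0)
sigma = 0.2

# Empirical frequency of + labels when querying exactly at the threshold.
freq_at_threshold = sum(noisy_oracle(0.5, sigma, rng) for _ in range(20000)) / 20000

# Noise-averaged regression function at the threshold and to its right.
at_threshold = convolved_regression(0.5, sigma)
right_of_threshold = convolved_regression(0.7, sigma)
```

Under these assumptions the noise-averaged function still crosses one half at the threshold, so the threshold remains identifiable from noisy queries.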
Noise simulation in classification with the noisemodel R package: Applications analyzing the impact of errors with chemical data
Classification datasets created from chemical processes can be affected by
errors, which impair the accuracy of the models built. This fact highlights the
importance of analyzing the robustness of classifiers against different types
and levels of noise to know their behavior against potential errors. In this
context, noise models have been proposed to study noise-related phenomenology
in a controlled environment, allowing errors to be introduced into the data in
a supervised manner. This paper introduces the noisemodel R package, which
contains the first extensive implementation of noise models for classification
datasets, proposing it as a support tool to analyze the impact of errors related to
chemical data. It provides 72 noise models found in the specialized literature
that allow errors to be introduced in different ways in classes and attributes.
Each of them is properly documented and referenced, unifying their results
through a specific S3 class, which benefits from customized print, summary
and plot methods. The usage of the package is illustrated through four
application examples considering real-world chemical datasets, where errors are
prone to occur. The software presented will help to deepen the understanding
of the problem of noisy chemical data, as well as to develop new robust
algorithms and noise preprocessing methods properly adapted to different types of
errors in this scenario.
University of Granada/CBU
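The idea of introducing errors into classes and attributes in a controlled way can be sketched in a few lines of Python. This sketch does not use or reproduce the noisemodel R package or its API: the two function names, the noise rates, and the toy data are hypothetical, and the two schemes shown (flipping a fixed fraction of labels to another class, and replacing a fixed fraction of attribute values with uniform draws from the observed range) are only simple examples of label and attribute noise.

```python
import random


def add_symmetric_label_noise(labels, rate, rng):
    """Flip a fraction `rate` of labels to a different class chosen uniformly at random."""
    classes = sorted(set(labels))
    n_noisy = round(rate * len(labels))
    noisy_idx = rng.sample(range(len(labels)), n_noisy)
    noisy = list(labels)
    for i in noisy_idx:
        noisy[i] = rng.choice([c for c in classes if c != labels[i]])
    return noisy, sorted(noisy_idx)


def add_uniform_attribute_noise(column, rate, rng):
    """Replace a fraction `rate` of values with uniform draws from the column's observed range."""
    lo, hi = min(column), max(column)
    n_noisy = round(rate * len(column))
    noisy_idx = rng.sample(range(len(column)), n_noisy)
    noisy = list(column)
    for i in noisy_idx:
        noisy[i] = rng.uniform(lo, hi)
    return noisy, sorted(noisy_idx)


rng = random.Random(42)

# Toy data (hypothetical): 30 labels over three classes and one numeric attribute.
labels = ["a", "b", "c"] * 10
column = [float(i) for i in range(20)]

noisy_labels, flipped = add_symmetric_label_noise(labels, 0.2, rng)
noisy_column, perturbed = add_uniform_attribute_noise(column, 0.25, rng)
```

Returning the indices of the corrupted samples alongside the noisy data makes it possible to evaluate how well a classifier or a noise filter copes with the injected errors.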
Noise Models in Classification: Unified Nomenclature, Extended Taxonomy and Pragmatic Categorization
This paper presents the first review of noise models in classification covering both label and
attribute noise. Their study reveals the lack of a unified nomenclature in this field. In order to address
this problem, a tripartite nomenclature based on the structural analysis of existing noise models is
proposed. Additionally, a revision of their current taxonomies is carried out, which are combined
and updated to better reflect the nature of any model. Finally, a categorization of noise models is
proposed from a practical point of view depending on the characteristics of noise and the study
purpose. These contributions provide a variety of models to introduce noise, their characteristics
according to the proposed taxonomy and a unified way of naming them, which will facilitate their
identification and study, as well as the reproducibility of future research.