523 research outputs found
Solving Multiclass Learning Problems via Error-Correcting Output Codes
Multiclass learning problems involve finding a definition for an unknown
function f(x) whose range is a discrete set containing k > 2 values (i.e., k
``classes''). The definition is acquired by studying collections of training
examples of the form [x_i, f (x_i)]. Existing approaches to multiclass learning
problems include direct application of multiclass algorithms such as the
decision-tree algorithms C4.5 and CART, application of binary concept learning
algorithms to learn individual binary functions for each of the k classes, and
application of binary concept learning algorithms with distributed output
representations. This paper compares these three approaches to a new technique
in which error-correcting codes are employed as a distributed output
representation. We show that these output representations improve the
generalization performance of both C4.5 and backpropagation on a wide range of
multiclass learning tasks. We also demonstrate that this approach is robust
with respect to changes in the size of the training sample, the assignment of
distributed representations to particular classes, and the application of
overfitting avoidance techniques such as decision-tree pruning. Finally, we
show that---like the other methods---the error-correcting code technique can
provide reliable class probability estimates. Taken together, these results
demonstrate that error-correcting output codes provide a general-purpose method
for improving the performance of inductive learning programs on multiclass
problems.
Comment: See http://www.jair.org/ for any accompanying files
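To make the technique concrete, here is a minimal sketch of ECOC training and decoding, assuming a hand-picked 4-class, 7-bit code matrix and scikit-learn logistic regression as the binary learner (both are illustrative choices, not the paper's exact setup):

```python
# Minimal ECOC sketch: one binary learner per code-matrix column,
# decoding by nearest codeword in Hamming distance.
import numpy as np
from sklearn.linear_model import LogisticRegression

# One row per class, one column per binary subproblem. Rows are pairwise
# Hamming distance 4 apart, so any single bit error is still corrected.
CODE_MATRIX = np.array([
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
    [1, 1, 0, 1, 0, 0, 1],
])  # 4 classes, 7-bit codewords

def train_ecoc(X, y):
    """Train one binary learner per code-matrix column."""
    learners = []
    for bit in range(CODE_MATRIX.shape[1]):
        targets = CODE_MATRIX[y, bit]   # relabel each example by its class's bit
        learners.append(LogisticRegression().fit(X, targets))
    return learners

def predict_ecoc(learners, X):
    """Decode by nearest codeword in Hamming distance."""
    bits = np.column_stack([clf.predict(X) for clf in learners])
    dists = (bits[:, None, :] != CODE_MATRIX[None, :, :]).sum(axis=2)
    return dists.argmin(axis=1)         # index of the closest class codeword
```

The redundancy is the point: even if a few of the seven binary learners predict the wrong bit, the nearest-codeword decoding can still recover the correct class.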
Integrating Learning from Examples into the Search for Diagnostic Policies
This paper studies the problem of learning diagnostic policies from training
examples. A diagnostic policy is a complete description of the decision-making
actions of a diagnostician (i.e., tests followed by a diagnostic decision) for
all possible combinations of test results. An optimal diagnostic policy is one
that minimizes the expected total cost, which is the sum of measurement costs
and misdiagnosis costs. In most diagnostic settings, there is a tradeoff
between these two kinds of costs. This paper formalizes diagnostic decision
making as a Markov Decision Process (MDP). The paper introduces a new family of
systematic search algorithms based on the AO* algorithm to solve this MDP. To
make AO* efficient, the paper describes an admissible heuristic that enables
AO* to prune large parts of the search space. The paper also introduces several
greedy algorithms, including some improvements over previously published
methods. The paper then addresses the question of learning diagnostic policies
from examples. When the probabilities of diseases and test results are computed
from training data, there is a great danger of overfitting. To reduce
overfitting, regularizers are integrated into the search algorithms. Finally,
the paper compares the proposed methods on five benchmark diagnostic data sets.
The studies show that in most cases the systematic search methods produce
better diagnostic policies than the greedy methods. In addition, the studies
show that for training sets of realistic size, the systematic search algorithms
are practical on today's desktop computers.
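Since the expected total cost drives the whole search, a toy computation may help. Below is a minimal sketch with a hypothetical two-disease, one-test model and made-up costs and probabilities; it evaluates the fixed policy "run the test, then diagnose the a-posteriori most likely disease", not the paper's AO*-based search itself:

```python
# Expected total cost of a diagnostic policy = measurement costs
# + probability-weighted misdiagnosis costs. All numbers are hypothetical.
TEST_COST = 10.0
MISDIAGNOSIS_COST = 100.0

# Hypothetical joint model P(disease, test_result).
JOINT = {
    ('flu', 'pos'): 0.27, ('flu', 'neg'): 0.03,
    ('cold', 'pos'): 0.07, ('cold', 'neg'): 0.63,
}

def posterior(result):
    """P(disease | test result), from the joint table."""
    p = {d: JOINT[(d, result)] for d in ('flu', 'cold')}
    z = sum(p.values())
    return {d: v / z for d, v in p.items()}

def expected_cost_of_policy():
    """Policy: run the test, then diagnose the most likely disease."""
    cost = TEST_COST                      # measurement cost is always paid
    for result in ('pos', 'neg'):
        p_result = sum(JOINT[(d, result)] for d in ('flu', 'cold'))
        post = posterior(result)
        best = max(post, key=post.get)
        # Residual misdiagnosis risk on this branch of the policy.
        cost += p_result * (1.0 - post[best]) * MISDIAGNOSIS_COST
    return cost

print(expected_cost_of_policy())          # 10 (test) + 10 (residual risk) = 20.0
```

Skipping the test would avoid the measurement cost but raise the misdiagnosis term; the tradeoff between the two is exactly what the learned policies must balance.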
The use of provenance in information retrieval
The volume of electronic information that users accumulate is steadily rising. A recent study [2] found that there were on average 32,000 pieces of information (e-mails, web pages, documents, etc.) for each user. The problem of organizing…
Deep Multi-instance Networks with Sparse Label Assignment for Whole Mammogram Classification
Mammogram classification is directly related to computer-aided diagnosis of
breast cancer. Traditional methods rely on regions of interest (ROIs) which
require great efforts to annotate. Inspired by the success of using deep
convolutional features for natural image analysis and multi-instance learning
(MIL) for labeling a set of instances/patches, we propose end-to-end trained
deep multi-instance networks for mass classification based on the whole
mammogram, without the aforementioned ROIs. We explore three different schemes to
construct deep multi-instance networks for whole mammogram classification.
Experimental results on the INbreast dataset demonstrate the robustness of the
proposed networks compared to previous work using segmentation and detection
annotations.
Comment: MICCAI 2017 Camera Ready
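As a rough illustration of the multi-instance setup (not the paper's exact architectures), the sketch below scores every spatial cell of a whole mammogram with a shared convolutional backbone and aggregates the per-patch logits into a single image label via max-pooling, one standard MIL scheme; the layer sizes are arbitrary:

```python
# Minimal MIL sketch: per-patch malignancy logits from a shared CNN,
# aggregated by max-pooling (image positive if any patch is positive).
import torch
import torch.nn as nn

class MILMammogramNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Applied convolutionally to the whole image, so each output
        # spatial cell acts as one instance/patch of the bag.
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1),                    # 1 logit per cell
        )

    def forward(self, x):                           # x: (batch, 1, H, W)
        instance_logits = self.backbone(x).flatten(1)  # (batch, num_patches)
        bag_logit, _ = instance_logits.max(dim=1)      # MIL max-pooling
        return bag_logit

model = MILMammogramNet()
image = torch.randn(2, 1, 224, 224)                 # dummy whole-mammogram batch
loss = nn.BCEWithLogitsLoss()(model(image), torch.tensor([1.0, 0.0]))
```

Because only the image-level label enters the loss, no patch-level ROI annotation is needed, which is the appeal of the whole-mammogram formulation.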
Local Modelling in Classification on Different Feature Subspaces
One may be confronted with classification problems in which classes consist of
several subclasses that possess different distributions, which undermines any accurate model treating an entire class as one homogeneous group. A remedy is to model the several subclasses locally.
In this paper, a method is presented for handling such classification problems in which the subclasses are furthermore characterized by different subsets of the variables. Situations are outlined and tested in which such local models on different variable subspaces dramatically reduce the classification error.
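A minimal sketch of the local-modelling idea under simplifying assumptions: subclass labels and per-subclass feature subsets are taken as given (in the paper these are part of the problem), one simple local model is fit per subclass on its own variables, and a point receives the class of its best-scoring local model. All names here are illustrative:

```python
# Local models on different feature subspaces: one one-vs-rest learner
# per subclass, each seeing only that subclass's variables.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_local_models(X, subclass_labels, feature_subsets):
    """feature_subsets: {subclass: list of column indices}."""
    models = {}
    for s, cols in feature_subsets.items():
        y = (subclass_labels == s).astype(int)       # this subclass vs. the rest
        models[s] = (cols, LogisticRegression().fit(X[:, cols], y))
    return models

def predict(models, class_of_subclass, X):
    """Assign the class whose local model claims the point most strongly."""
    subclasses = list(models)
    scores = np.column_stack([
        m.predict_proba(X[:, cols])[:, 1] for cols, m in
        (models[s] for s in subclasses)
    ])
    best = scores.argmax(axis=1)
    return np.array([class_of_subclass[subclasses[i]] for i in best])
```

Restricting each local model to its own subspace is what lets two subclasses of the same class live on entirely different variables without corrupting each other's fit.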
Fast Reinforcement Learning with Large Action Sets Using Error-Correcting Output Codes for MDP Factorization
The use of Reinforcement Learning in real-world scenarios is strongly limited by issues of scale. Most RL learning algorithms are unable to deal with problems composed of hundreds or sometimes even dozens of possible actions, and therefore cannot be applied to many real-world problems. We consider the RL problem in the supervised classification framework, where the optimal policy is obtained through a multiclass classifier whose set of classes is the set of actions of the problem. We introduce error-correcting output codes (ECOCs) in this setting and propose two new methods for reducing complexity when using rollout-based approaches. The first method uses an ECOC-based classifier as the multiclass classifier, reducing the learning complexity from O(A^2) to O(A log A). We then propose a novel method that exploits the ECOC's coding dictionary to split the initial MDP into O(log A) separate two-action MDPs. This second method reduces learning complexity even further, from O(A^2) to O(log A), thus rendering problems with large action sets tractable. We finish by experimentally demonstrating the advantages of our approach on a set of benchmark problems, in both speed and performance.
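To make the factorization concrete, here is a hedged sketch: each bit position of the coding dictionary defines one two-action sub-MDP, and the joint output of the resulting binary policies is decoded back to an original action by nearest codeword. For brevity it uses a bare log2(A)-bit binary code; a real ECOC would use longer codewords for error-correcting redundancy, and the stand-in policies below merely echo bits of the state:

```python
# MDP factorization via an action-coding dictionary: log(A) binary
# policies jointly select one of A actions by codeword decoding.
import numpy as np

A, BITS = 8, 3
# Codeword of action a = binary representation of a, least significant bit first.
CODEWORDS = np.array([[(a >> b) & 1 for b in range(BITS)] for a in range(A)])

def decode_action(binary_policies, state):
    """Query each two-action policy, then nearest-codeword decoding."""
    bits = np.array([pi(state) for pi in binary_policies])  # one bit per sub-MDP
    dists = (CODEWORDS != bits).sum(axis=1)                 # Hamming distances
    return int(dists.argmin())                              # original action index

# Usage with trivial stand-in policies (real ones come from solving each sub-MDP):
policies = [lambda s, b=b: (s >> b) & 1 for b in range(BITS)]
assert decode_action(policies, state=5) == 5
```

The complexity gain comes from never enumerating the A actions during learning: only the log(A) binary decisions are ever trained.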
Adapting Quality Assurance to Adaptive Systems: The Scenario Coevolution Paradigm
From formal and practical analysis, we identify new challenges that
self-adaptive systems pose to the process of quality assurance. When tackling
these, the effort spent on various tasks in the process of software engineering
is naturally re-distributed. We claim that all steps related to testing need to
become self-adaptive to match the capabilities of the self-adaptive
system-under-test. Otherwise, the adaptive system's behavior might elude
traditional variants of quality assurance. We thus propose the paradigm of
scenario coevolution, which describes a pool of test cases and other
constraints on system behavior that evolves in parallel to the (in part
autonomous) development of behavior in the system-under-test. Scenario
coevolution offers a simple structure for the organization of adaptive testing
that allows for both human-controlled and autonomous intervention, supporting
software engineering for adaptive systems on a procedural as well as technical
level.
Comment: 17 pages, published at ISOLA 2018
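Purely as a toy reading of the loop structure, nothing here is from the paper: the system adapts to the current scenario pool, then the pool coevolves so that scenarios the system now passes mutate into harder ones while failing scenarios persist as open obligations. The dynamics below are invented placeholders:

```python
# Toy scenario coevolution: the "system" is one tunable number, a
# scenario is a target value it must hit within a tolerance of 0.1.
import random

def coevolve(system_param, scenarios, generations=20):
    for _ in range(generations):
        # System adapts (autonomously) toward the demands of the pool.
        system_param += 0.5 * (sum(scenarios) / len(scenarios) - system_param)
        # Pool coevolves: passed scenarios mutate to become harder,
        # still-failing ones are kept unchanged as open obligations.
        scenarios = [s + random.uniform(0.0, 0.5)
                     if abs(system_param - s) <= 0.1 else s
                     for s in scenarios]
    return system_param, scenarios

print(coevolve(0.0, [1.0, 2.0, 3.0]))
```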
Over-Fitting in Model Selection with Gaussian Process Regression
Model selection in Gaussian Process Regression (GPR) seeks the optimal values of the hyper-parameters governing the covariance function, which allows flexible customization of the GP to the problem at hand. An oft-overlooked issue in this process is over-fitting of the model selection criterion, typically the marginal likelihood. Over-fitting in machine learning refers to fitting the random noise present in the model selection criterion in addition to the features that improve the generalisation performance of the statistical model. In this paper, we construct several Gaussian process regression models for a range of high-dimensional datasets from the UCI machine learning repository. We then compare the MSE on the test dataset with the negative log marginal likelihood (nlZ), used as the model selection criterion, to determine whether the problem of over-fitting in model selection also affects GPR. We find that the squared exponential covariance function with Automatic Relevance Determination (SEard) is better than other kernels, including the squared exponential covariance function with an isotropic distance measure (SEiso), according to the nlZ, but it is clearly not the best according to MSE on the test data; this is an indication of an over-fitting problem in model selection.
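A minimal sketch of this SEiso-vs-SEard comparison, using scikit-learn's GP implementation and synthetic data rather than the paper's UCI setup (the isotropic vs. per-dimension RBF length-scales stand in for SEiso and SEard):

```python
# Compare negative log marginal likelihood (nlZ) against test MSE for an
# isotropic vs. an ARD squared-exponential kernel on synthetic data.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))                      # 8-dimensional inputs
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)   # only one informative dimension
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

kernels = {
    'SEiso': RBF(length_scale=1.0),                # one shared length-scale
    'SEard': RBF(length_scale=np.ones(8)),         # one length-scale per input
}
for name, kernel in kernels.items():
    gp = GaussianProcessRegressor(kernel=kernel, alpha=0.01).fit(X_tr, y_tr)
    nlZ = -gp.log_marginal_likelihood_value_       # the model-selection criterion
    mse = mean_squared_error(y_te, gp.predict(X_te))
    print(f'{name}: nlZ={nlZ:.1f}, test MSE={mse:.4f}')
```

The point of the comparison: SEard's extra hyper-parameters can buy a better (lower) nlZ without a correspondingly better test MSE, which is the over-fitting-in-model-selection effect the abstract describes.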