Learning and reasoning
What is the relationship between learning and reasoning? Much recent work in machine learning has been criticized for focusing on learning and ignoring reasoning. This paper attempts to describe the various ways in which machine learning research has (and has not) incorporated reasoning. The paper argues that there are important computational, statistical, and engineering constraints that have produced the current state of affairs. These reasons are reviewed and assessed in the light of future research directions.
Proposed metrics for transfer learning
Summary: Four proposed metrics:
[1] average relative reduction in training time (sample size, number of training experiences)
[2] jumpstart (initial advantage of transfer algorithm)
[3] handicap (how long it takes the no-transfer algorithm to overcome the jumpstart)
[4] asymptotic advantage (how much better the transfer learning algorithm does in the limit of large sample sizes)
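A sketch of how metrics [2]-[4] might be computed from a pair of learning curves. The toy curves and the exact catch-up rule for the handicap are my own assumptions, not definitions taken from the paper:

```python
import numpy as np

def transfer_metrics(curve_transfer, curve_baseline):
    """Illustrative transfer-learning metrics from two learning curves
    (performance indexed by number of training experiences)."""
    t = np.asarray(curve_transfer, dtype=float)
    b = np.asarray(curve_baseline, dtype=float)
    jumpstart = t[0] - b[0]                 # [2] initial advantage
    # [3] handicap: first index where the no-transfer curve reaches the
    # transfer algorithm's initial performance (None if it never does)
    caught_up = np.nonzero(b >= t[0])[0]
    handicap = int(caught_up[0]) if caught_up.size else None
    asymptotic = t[-1] - b[-1]              # [4] advantage in the limit
    return jumpstart, handicap, asymptotic

# toy learning curves: accuracy after each training experience
transfer = [0.60, 0.70, 0.80, 0.85]
baseline = [0.40, 0.55, 0.72, 0.84]
print(transfer_metrics(transfer, baseline))
```

Metric [1] (relative reduction in training time) would additionally require the raw training times or sample sizes, which this sketch omits.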
Improving SVM accuracy by training on auxiliary data sources
The standard model of supervised learning assumes that training and test data are drawn from the same underlying distribution. This paper explores an application in which a second, auxiliary, source of data is available drawn from a different distribution. This auxiliary data is more plentiful, but of significantly lower quality, than the training and test data. In the SVM framework, a training example has two roles: (a) as a data point to constrain the learning process and (b) as a candidate support vector that can form part of the definition of the classifier. The paper considers using the auxiliary data in either (or both) of these roles. This auxiliary data framework is applied to a problem of classifying images of leaves of maple and oak trees using a kernel derived from the shapes of the leaves. Experiments show that when the training data set is very small, training with auxiliary data can produce large improvements in accuracy, even when the auxiliary data is significantly different from the training (and test) data. The paper also introduces techniques for adjusting the kernel scores of the auxiliary data points to make them more comparable to the training data points.
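As a rough illustration of role (a), auxiliary examples can enter an SVM as down-weighted training points. This sketch uses scikit-learn sample weights rather than the paper's kernel-score adjustment, and the synthetic data and the weight of 0.2 are invented for illustration:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def make_data(n, noise):
    """Two Gaussian classes with means (0,0) and (2,2)."""
    y = rng.integers(0, 2, n)
    X = rng.normal(0, noise, (n, 2)) + y[:, None] * 2.0
    return X, y

X_tr, y_tr = make_data(10, 0.5)      # scarce, clean training data
X_aux, y_aux = make_data(200, 1.5)   # plentiful but noisier auxiliary data
X_te, y_te = make_data(500, 0.5)

# Role (a): auxiliary points constrain learning, but with reduced weight.
X_all = np.vstack([X_tr, X_aux])
y_all = np.concatenate([y_tr, y_aux])
w = np.concatenate([np.ones(len(y_tr)), 0.2 * np.ones(len(y_aux))])

clf = SVC(kernel="rbf").fit(X_all, y_all, sample_weight=w)
print("test accuracy:", clf.score(X_te, y_te))
```

Down-weighting lets the plentiful-but-noisy source shape the decision boundary without drowning out the few high-quality examples.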
A POMDP approximation algorithm that anticipates the need to observe
This paper introduces the even-odd POMDP, an approximation to POMDPs in which the world is assumed to be fully observable every other time step. The even-odd POMDP can be converted into an equivalent MDP, the 2MDP, whose value function, V*[subscript 2MDP], can be combined online with a 2-step lookahead search to provide a good POMDP policy. We prove that this gives an approximation to the POMDP's optimal value function that is at least as good as methods based on the optimal value function of the underlying MDP. We present experimental evidence that the method gives better policies, and we show that it can find a good policy for a POMDP with 10,000 states and observations.
Keywords: Partially Observable Markov Decision Problem, even-odd POMDP, POMDP
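A minimal sketch of the general recipe this line of work builds on: belief-space lookahead with leaf nodes scored by an MDP value function. The two-state model below is entirely made up, and the leaf evaluation uses the plain underlying-MDP value function (the baseline the paper improves on), not the 2MDP construction itself:

```python
import numpy as np

# Toy POMDP (assumed for illustration): 2 states, 2 actions, 2 observations.
# T[a, s, s'] = transition probs, R[a, s] = rewards, O[a, s', o] = obs probs.
T = np.array([[[0.9, 0.1], [0.1, 0.9]],
              [[0.5, 0.5], [0.5, 0.5]]])
R = np.array([[1.0, 0.0],
              [0.2, 0.2]])
O = np.array([[[0.8, 0.2], [0.2, 0.8]],
              [[0.5, 0.5], [0.5, 0.5]]])
gamma = 0.95

def mdp_value(T, R, gamma, iters=500):
    """Value iteration on the underlying (fully observable) MDP."""
    V = np.zeros(T.shape[1])
    for _ in range(iters):
        V = np.max(R + gamma * T @ V, axis=0)
    return V

def lookahead(belief, depth, V):
    """Fixed-depth search in belief space; leaves are scored with the MDP
    value function via V(b) ~= sum_s b(s) V_MDP(s)."""
    if depth == 0:
        return float(belief @ V), None
    best, best_a = -np.inf, None
    for a in range(T.shape[0]):
        q = float(belief @ R[a])            # expected immediate reward
        pred = belief @ T[a]                # predicted next-state distribution
        for o in range(O.shape[2]):
            p_o = float(pred @ O[a, :, o])
            if p_o < 1e-12:
                continue
            b_next = pred * O[a, :, o] / p_o   # Bayes belief update
            v, _ = lookahead(b_next, depth - 1, V)
            q += gamma * p_o * v
        if q > best:
            best, best_a = q, a
    return best, best_a

V = mdp_value(T, R, gamma)
value, action = lookahead(np.array([0.5, 0.5]), 2, V)
print(action, value)
```

The even-odd construction would replace `V` here with V*[subscript 2MDP], which the paper proves yields an approximation at least as good as this baseline.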
A POMDP approximation algorithm that anticipates the need to observe
This paper introduces the even-odd POMDP, an approximation to POMDPs (Partially Observable Markov Decision Problems) in which the world is assumed to be fully observable every other time step. This approximation works well for problems with a delayed need to observe. The even-odd POMDP can be converted into an equivalent MDP, the 2MDP, whose value function, V*[subscript 2MDP], can be combined online with a 2-step lookahead search to provide a good POMDP policy. We prove that this gives an approximation to the POMDP's optimal value function that is at least as good as methods based on the optimal value function of the underlying MDP. We present experimental evidence that the method finds a good policy for a POMDP with 10,000 states and observations.
Keywords: Partially Observable Markov Decision Problems, even-odd POMDP, POMDP
Two heuristics for solving POMDPs having a delayed need to observe
A common heuristic for solving Partially Observable Markov Decision Problems (POMDPs) is to first solve the underlying Markov Decision Process (MDP) and then construct a POMDP policy by performing a fixed-depth lookahead search in the POMDP and evaluating the leaf nodes using the MDP value function. A problem with this approximation is that it does not account for the need to choose actions in order to gain information about the state of the world, particularly when those observation actions are needed at some point in the future. This paper proposes two heuristics that are better than the MDP approximation in POMDPs where there is a delayed need to observe. The first approximation, introduced in [2], is the even-odd POMDP, in which the world is assumed to be fully observable every other time step. The even-odd POMDP can be converted into an equivalent MDP, the even-MDP, whose value function captures some of the sensing costs of the original POMDP. An online policy consisting of a 2-step lookahead search combined with the value function of the even-MDP gives an approximation to the POMDP's value function that is at least as good as the method based on the value function of the underlying MDP. The second POMDP approximation is applicable to a special kind of POMDP which we call the Cost Observable Markov Decision Problem (COMDP). In a COMDP, the actions are partitioned into those that change the state of the world and those that are pure observation actions. For such problems, we describe the chain-MDP algorithm, which in many cases is able to capture more of the sensing costs than the even-odd POMDP approximation. We prove that both heuristics compute value functions that are upper bounded by (i.e., better than) the value function of the underlying MDP and, in the case of the even-MDP, also lower bounded by the POMDP's optimal value function. We show cases where the chain-MDP online policy is better than, equal to, or worse than the even-MDP online policy.
Keywords: Cost Observable Markov Decision Problem, POMDP, COMDP, Partially Observable Markov Decision Problems
Integrating learning from examples into the search for diagnostic policies
This paper studies the problem of learning diagnostic policies from training examples. A diagnostic policy is a complete description of the decision-making actions of a diagnostician (i.e., tests followed by a diagnostic decision) for all possible combinations of test results. An optimal diagnostic policy is one that minimizes the expected total cost of diagnosing a patient, where the cost is the sum of two components: (a) measurement costs (the costs of performing various diagnostic tests) and (b) misdiagnosis costs (the costs incurred when the patient is incorrectly diagnosed). In most diagnostic settings, there is a tradeoff between these two kinds of costs. A diagnostic policy that minimizes measurement costs usually performs fewer tests and tends to make more diagnostic errors, which are expensive. Conversely, a policy that minimizes misdiagnosis costs usually makes more measurements. This paper formalizes diagnostic decision making as a Markov Decision Process (MDP). It then presents a range of algorithms for solving this MDP. These algorithms can be divided into methods based on systematic search and methods based on greedy search. The paper introduces a new family of systematic algorithms based on the AO* algorithm. To make AO* efficient, the paper describes an admissible heuristic that enables AO* to prune large parts of the search space. The paper also introduces several greedy algorithms including some improvements over previously-published methods. The paper then addresses the question of learning diagnostic policies from examples. When the probabilities of diseases and test results are computed from training data, there is a great danger of overfitting. The paper introduces a range of regularization methods to reduce overfitting. An interesting aspect of these regularizers is that they are integrated into the search algorithms rather than being isolated in a separate learning step prior to searching for a good diagnostic policy. 
Finally, the paper compares the proposed methods on five benchmark diagnostic data sets. The studies show that in most cases the systematic search methods produce better diagnostic policies than the greedy methods. In addition, the studies show that for training sets of realistic size, the systematic search algorithms are practical on today's desktop computers. Hence, these AO*-based methods are recommended for learning diagnostic policies that seek to minimize the expected total cost of diagnosis.
Keywords: diagnostic policy, AO*, Markov decision process, diagnostic decision making
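The tradeoff between measurement costs and misdiagnosis costs can be made concrete with a one-test example. All probabilities and costs below are invented for illustration and are not taken from the paper's benchmarks:

```python
# Illustrative numbers: P(disease), a test with given sensitivity and
# specificity, a measurement cost, and asymmetric misdiagnosis costs.
p_disease = 0.1
sens, spec = 0.9, 0.8
c_test = 5.0
c_false_neg = 200.0   # missing a sick patient is expensive
c_false_pos = 20.0

def misdiag_cost(p_sick):
    """Expected misdiagnosis cost of the best diagnostic decision given
    belief p_sick: declare whichever label has lower expected cost."""
    return min(p_sick * c_false_neg,          # cost if we declare healthy
               (1 - p_sick) * c_false_pos)    # cost if we declare sick

# Policy A: diagnose immediately from the prior (no measurement cost).
cost_no_test = misdiag_cost(p_disease)

# Policy B: pay for the test, then diagnose from the Bayesian posterior.
p_pos = p_disease * sens + (1 - p_disease) * (1 - spec)
p_sick_given_pos = p_disease * sens / p_pos
p_sick_given_neg = p_disease * (1 - sens) / (1 - p_pos)
cost_test = c_test + p_pos * misdiag_cost(p_sick_given_pos) \
                   + (1 - p_pos) * misdiag_cost(p_sick_given_neg)

print(cost_no_test, cost_test)   # 18.0 vs 10.6: here the test pays for itself
```

With these numbers, testing first is cheaper in expectation; a full diagnostic policy extends this comparison over all test sequences, which is what the AO* search explores.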
Exploiting monotonicity via logistic regression in Bayesian network learning
An important challenge in machine learning is to find ways of learning quickly from very small amounts of training data. The only way to learn from small data samples is to constrain the learning process by exploiting background knowledge. In this report, we present a theoretical analysis of the use of constrained logistic regression for estimating conditional probability distributions in Bayesian Networks (BN) by using background knowledge in the form of qualitative monotonicity statements. Such background knowledge is treated as a set of constraints on the parameters of a logistic function during training. Our goal of finding the appropriate BN model is two-fold: (a) we want to exploit any monotonic relationship between random variables that may generally exist as domain knowledge and (b) we want to be able to address the problem of estimating the conditional distribution of a random variable with a large number of parents. We discuss variants of the logistic regression model and present an analysis of the corresponding constraints required to implement monotonicity. More importantly, we outline the problem in some of these variants in terms of the number of parameters and constraints which, in some cases, can grow exponentially with the number of parent variables. To address this problem, we present two variants of the constrained logistic regression model, M[superscript 2b][subscript CLR] and M[superscript 3][subscript CLR], in which the number of constraints required to implement monotonicity does not grow exponentially with the number of parents, hence providing a practicable method for estimating conditional probabilities with very sparse data.
Keywords: logistic regression, Bayesian network learning, monotonicity
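One simple way to encode a positive monotonicity statement in logistic regression is to constrain the corresponding weights to be nonnegative. The projected-gradient sketch below illustrates that idea on toy data; it is a much simpler constraint scheme than the M[superscript 2b] and M[superscript 3] models in the report:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data where y depends monotonically (positively) on both features.
X = rng.normal(size=(200, 2))
logits = 1.5 * X[:, 0] + 0.8 * X[:, 1]
y = (rng.random(200) < 1 / (1 + np.exp(-logits))).astype(float)

def fit_monotone_logreg(X, y, lr=0.1, iters=2000):
    """Projected gradient ascent on the logistic log-likelihood, with the
    statement 'P(y=1) is non-decreasing in each feature' encoded as the
    constraint w >= 0."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(iters):
        p = 1 / (1 + np.exp(-(X @ w + b)))
        g_w = X.T @ (y - p) / len(y)
        g_b = np.mean(y - p)
        w = np.maximum(w + lr * g_w, 0.0)   # project onto the constraint set
        b += lr * g_b
    return w, b

w, b = fit_monotone_logreg(X, y)
print(w, b)
```

Note that this fixed sign constraint is one per weight; the report's contribution is keeping the constraint count manageable when a node has many parents and richer interaction terms.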
Bootstrap methods for the cost-sensitive evaluation of classifiers
Many machine learning applications require classifiers that minimize an asymmetric cost function rather than the misclassification rate, and several recent papers have addressed this problem. However, these papers have either applied no statistical testing or have applied statistical methods that are not appropriate for the cost-sensitive setting. Without good statistical methods, it is difficult to tell whether these new cost-sensitive methods are better than existing methods that ignore costs, and it is also difficult to tell whether one cost-sensitive method is better than another. To rectify this problem, this paper presents two statistical methods for the cost-sensitive setting. The first constructs a confidence interval for the expected cost of a single classifier. The second constructs a confidence interval for the expected difference in costs of two classifiers. In both cases, the basic idea is to separate the problem of estimating the probabilities of each cell in the confusion matrix (which is independent of the cost matrix) from the problem of computing the expected cost. We show experimentally that these bootstrap tests work better than applying standard z tests based on the normal distribution.
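The separation described above (resample to re-estimate the confusion-matrix cell probabilities, then combine with the fixed cost matrix) can be sketched as a percentile bootstrap for the single-classifier case. The data and cost matrix are invented, and the paper's exact bootstrap procedure may differ in detail:

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_cost_ci(y_true, y_pred, cost, n_boot=2000, alpha=0.05):
    """Percentile-bootstrap confidence interval for expected cost.
    Each resample re-estimates the confusion-matrix cell probabilities;
    the cost matrix stays fixed throughout."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    n = len(y_true)
    costs = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)              # resample with replacement
        costs[i] = cost[y_true[idx], y_pred[idx]].mean()
    lo, hi = np.quantile(costs, [alpha / 2, 1 - alpha / 2])
    return lo, hi

# Asymmetric cost matrix: cost[true_label, predicted_label]
cost = np.array([[0.0, 1.0],
                 [10.0, 0.0]])    # false negatives are 10x worse

y_true = rng.integers(0, 2, 500)
y_pred = np.where(rng.random(500) < 0.85, y_true, 1 - y_true)  # ~85% accurate
print(expected_cost_ci(y_true, y_pred, cost))
```

The two-classifier test follows the same pattern, resampling paired predictions and bootstrapping the difference in mean cost.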
Toward harnessing user feedback for machine learning
There has been little research into how end users might be able to communicate advice to machine learning systems. If this resource, the users themselves, could somehow work hand-in-hand with machine learning systems, the accuracy of learning systems could be improved and the users' understanding and trust of the system could improve as well. We conducted a think-aloud study to see how willing users were to provide feedback and to understand what kinds of feedback users could give. Users were shown explanations of machine learning predictions and asked to provide feedback to improve the predictions. We found that users had no difficulty providing generous amounts of feedback. The kinds of feedback ranged from suggestions for reweighting of features to proposals for new features, feature combinations, relational features, and wholesale changes to the learning algorithm. The results show that user feedback has the potential to significantly improve machine learning systems, but that learning algorithms need to be extended in several ways to be able to assimilate this feedback.
Author Keywords: machine learning, explanations, user feedback for learning