Learning and reasoning
What is the relationship between learning and reasoning? Much recent work in machine learning has been criticized for focusing on learning and ignoring reasoning. This paper attempts to describe the various ways in which machine learning research has (and has not) incorporated reasoning. The paper argues that there are important computational, statistical, and engineering constraints that have produced the current state of affairs. These reasons are reviewed and assessed in the light of future research directions.
Proposed metrics for transfer learning
Summary: Four proposed metrics:
[1] average relative reduction in training time (sample size, number of training experiences)
[2] jumpstart (initial advantage of transfer algorithm)
[3] handicap (how long it takes the no-transfer algorithm to overcome the jumpstart)
[4] asymptotic advantage (how much better the transfer learning algorithm does in the limit of large sample sizes)
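A sketch of how metrics [2]-[4] might be computed from a pair of learning curves. The toy curves and the exact catch-up rule for the handicap are my own assumptions, not definitions taken from the paper:

```python
import numpy as np

def transfer_metrics(curve_transfer, curve_baseline):
    """Illustrative transfer-learning metrics from two learning curves
    (performance indexed by number of training experiences)."""
    t = np.asarray(curve_transfer, dtype=float)
    b = np.asarray(curve_baseline, dtype=float)
    jumpstart = t[0] - b[0]                 # [2] initial advantage
    # [3] handicap: first index where the no-transfer curve reaches the
    # transfer algorithm's initial performance (None if it never does)
    caught_up = np.nonzero(b >= t[0])[0]
    handicap = int(caught_up[0]) if caught_up.size else None
    asymptotic = t[-1] - b[-1]              # [4] advantage in the limit
    return jumpstart, handicap, asymptotic

# toy learning curves: accuracy after each training experience
transfer = [0.60, 0.70, 0.80, 0.85]
baseline = [0.40, 0.55, 0.72, 0.84]
print(transfer_metrics(transfer, baseline))
```

Metric [1] (relative reduction in training time) would additionally require the raw training times or sample sizes, which this sketch omits.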
Improving SVM accuracy by training on auxiliary data sources
The standard model of supervised learning assumes that training and test data are drawn from the same underlying distribution. This paper explores an application in which a second, auxiliary, source of data is available drawn from a different distribution. This auxiliary data is more plentiful, but of significantly lower quality, than the training and test data. In the SVM framework, a training example has two roles: (a) as a data point to constrain the learning process and (b) as a candidate support vector that can form part of the definition of the classifier. The paper considers using the auxiliary data in either (or both) of these roles. This auxiliary data framework is applied to a problem of classifying images of leaves of maple and oak trees using a kernel derived from the shapes of the leaves. Experiments show that when the training data set is very small, training with auxiliary data can produce large improvements in accuracy, even when the auxiliary data is significantly different from the training (and test) data. The paper also introduces techniques for adjusting the kernel scores of the auxiliary data points to make them more comparable to the training data points.
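As a rough illustration of role (a), auxiliary examples can enter an SVM as down-weighted training points. This sketch uses scikit-learn sample weights rather than the paper's kernel-score adjustment, and the synthetic data and the weight of 0.2 are invented for illustration:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def make_data(n, noise):
    """Two Gaussian classes with means (0,0) and (2,2)."""
    y = rng.integers(0, 2, n)
    X = rng.normal(0, noise, (n, 2)) + y[:, None] * 2.0
    return X, y

X_tr, y_tr = make_data(10, 0.5)      # scarce, clean training data
X_aux, y_aux = make_data(200, 1.5)   # plentiful but noisier auxiliary data
X_te, y_te = make_data(500, 0.5)

# Role (a): auxiliary points constrain learning, but with reduced weight.
X_all = np.vstack([X_tr, X_aux])
y_all = np.concatenate([y_tr, y_aux])
w = np.concatenate([np.ones(len(y_tr)), 0.2 * np.ones(len(y_aux))])

clf = SVC(kernel="rbf").fit(X_all, y_all, sample_weight=w)
print("test accuracy:", clf.score(X_te, y_te))
```

Down-weighting lets the plentiful-but-noisy source shape the decision boundary without drowning out the few high-quality examples.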
A POMDP approximation algorithm that anticipates the need to observe
This paper introduces the even-odd POMDP, an approximation to POMDPs in which the world is assumed to be fully observable every other time step. The even-odd POMDP can be converted into an equivalent MDP, the 2MDP, whose value function, V*[subscript 2MDP], can be combined online with a 2-step lookahead search to provide a good POMDP policy. We prove that this gives an approximation to the POMDP's optimal value function that is at least as good as methods based on the optimal value function of the underlying MDP. We present experimental evidence that the method gives better policies, and we show that it can find a good policy for a POMDP with 10,000 states and observations.
Keywords: Partially Observable Markov Decision Problem, even-odd POMDP, POMDP
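A minimal sketch of the general recipe this line of work builds on: belief-space lookahead with leaf nodes scored by an MDP value function. The two-state model below is entirely made up, and the leaf evaluation uses the plain underlying-MDP value function (the baseline the paper improves on), not the 2MDP construction itself:

```python
import numpy as np

# Toy POMDP (assumed for illustration): 2 states, 2 actions, 2 observations.
# T[a, s, s'] = transition probs, R[a, s] = rewards, O[a, s', o] = obs probs.
T = np.array([[[0.9, 0.1], [0.1, 0.9]],
              [[0.5, 0.5], [0.5, 0.5]]])
R = np.array([[1.0, 0.0],
              [0.2, 0.2]])
O = np.array([[[0.8, 0.2], [0.2, 0.8]],
              [[0.5, 0.5], [0.5, 0.5]]])
gamma = 0.95

def mdp_value(T, R, gamma, iters=500):
    """Value iteration on the underlying (fully observable) MDP."""
    V = np.zeros(T.shape[1])
    for _ in range(iters):
        V = np.max(R + gamma * T @ V, axis=0)
    return V

def lookahead(belief, depth, V):
    """Fixed-depth search in belief space; leaves are scored with the MDP
    value function via V(b) ~= sum_s b(s) V_MDP(s)."""
    if depth == 0:
        return float(belief @ V), None
    best, best_a = -np.inf, None
    for a in range(T.shape[0]):
        q = float(belief @ R[a])            # expected immediate reward
        pred = belief @ T[a]                # predicted next-state distribution
        for o in range(O.shape[2]):
            p_o = float(pred @ O[a, :, o])
            if p_o < 1e-12:
                continue
            b_next = pred * O[a, :, o] / p_o   # Bayes belief update
            v, _ = lookahead(b_next, depth - 1, V)
            q += gamma * p_o * v
        if q > best:
            best, best_a = q, a
    return best, best_a

V = mdp_value(T, R, gamma)
value, action = lookahead(np.array([0.5, 0.5]), 2, V)
print(action, value)
```

The even-odd construction would replace `V` here with V*[subscript 2MDP], which the paper proves yields an approximation at least as good as this baseline.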
A POMDP approximation algorithm that anticipates the need to observe
This paper introduces the even-odd POMDP, an approximation to POMDPs (Partially Observable Markov Decision Problems) in which the world is assumed to be fully observable every other time step. This approximation works well for problems with a delayed need to observe. The even-odd POMDP can be converted into an equivalent MDP, the 2MDP, whose value function, V*[subscript 2MDP], can be combined online with a 2-step lookahead search to provide a good POMDP policy. We prove that this gives an approximation to the POMDP's optimal value function that is at least as good as methods based on the optimal value function of the underlying MDP. We present experimental evidence that the method finds a good policy for a POMDP with 10,000 states and observations.
Keywords: Partially Observable Markov Decision Problems, even-odd POMDP, POMDP
Two heuristics for solving POMDPs having a delayed need to observe
A common heuristic for solving Partially Observable Markov Decision Problems (POMDPs) is to first solve the underlying Markov Decision Process (MDP) and then construct a POMDP policy by performing a fixed-depth lookahead search in the POMDP and evaluating the leaf nodes using the MDP value function. A problem with this approximation is that it does not account for the need to choose actions in order to gain information about the state of the world, particularly when those observation actions are needed at some point in the future. This paper proposes two heuristics that are better than the MDP approximation in POMDPs where there is a delayed need to observe. The first approximation, introduced in [2], is the even-odd POMDP, in which the world is assumed to be fully observable every other time step. The even-odd POMDP can be converted into an equivalent MDP, the even-MDP, whose value function captures some of the sensing costs of the original POMDP. An online policy consisting of a 2-step lookahead search combined with the value function of the even-MDP gives an approximation to the POMDP's value function that is at least as good as the method based on the value function of the underlying MDP. The second POMDP approximation is applicable to a special kind of POMDP which we call the Cost Observable Markov Decision Problem (COMDP). In a COMDP, the actions are partitioned into those that change the state of the world and those that are pure observation actions. For such problems, we describe the chain-MDP algorithm, which in many cases is able to capture more of the sensing costs than the even-odd POMDP approximation. We prove that both heuristics compute value functions that are upper bounded by (i.e., better than) the value function of the underlying MDP and, in the case of the even-MDP, also lower bounded by the POMDP's optimal value function. We show cases where the chain-MDP online policy is better than, equal to, or worse than the even-MDP online policy.
Keywords: Cost Observable Markov Decision Problem, POMDP, COMDP, Partially Observable Markov Decision Problems
Integrating learning from examples into the search for diagnostic policies
This paper studies the problem of learning diagnostic policies from training examples. A diagnostic policy is a complete description of the decision-making actions of a diagnostician (i.e., tests followed by a diagnostic decision) for all possible combinations of test results. An optimal diagnostic policy is one that minimizes the expected total cost of diagnosing a patient, where the cost is the sum of two components: (a) measurement costs (the costs of performing various diagnostic tests) and (b) misdiagnosis costs (the costs incurred when the patient is incorrectly diagnosed). In most diagnostic settings, there is a tradeoff between these two kinds of costs. A diagnostic policy that minimizes measurement costs usually performs fewer tests and tends to make more diagnostic errors, which are expensive. Conversely, a policy that minimizes misdiagnosis costs usually makes more measurements. This paper formalizes diagnostic decision making as a Markov Decision Process (MDP). It then presents a range of algorithms for solving this MDP. These algorithms can be divided into methods based on systematic search and methods based on greedy search. The paper introduces a new family of systematic algorithms based on the AO* algorithm. To make AO* efficient, the paper describes an admissible heuristic that enables AO* to prune large parts of the search space. The paper also introduces several greedy algorithms including some improvements over previously-published methods. The paper then addresses the question of learning diagnostic policies from examples. When the probabilities of diseases and test results are computed from training data, there is a great danger of overfitting. The paper introduces a range of regularization methods to reduce overfitting. An interesting aspect of these regularizers is that they are integrated into the search algorithms rather than being isolated in a separate learning step prior to searching for a good diagnostic policy. 
Finally, the paper compares the proposed methods on five benchmark diagnostic data sets. The studies show that in most cases the systematic search methods produce better diagnostic policies than the greedy methods. In addition, the studies show that for training sets of realistic size, the systematic search algorithms are practical on today's desktop computers. Hence, these AO*-based methods are recommended for learning diagnostic policies that seek to minimize the expected total cost of diagnosis.
Keywords: diagnostic policy, AO*, Markov decision process, diagnostic decision making
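The tradeoff between measurement costs and misdiagnosis costs can be made concrete with a one-test example. All probabilities and costs below are invented for illustration and are not taken from the paper's benchmarks:

```python
# Illustrative numbers: P(disease), a test with given sensitivity and
# specificity, a measurement cost, and asymmetric misdiagnosis costs.
p_disease = 0.1
sens, spec = 0.9, 0.8
c_test = 5.0
c_false_neg = 200.0   # missing a sick patient is expensive
c_false_pos = 20.0

def misdiag_cost(p_sick):
    """Expected misdiagnosis cost of the best diagnostic decision given
    belief p_sick: declare whichever label has lower expected cost."""
    return min(p_sick * c_false_neg,          # cost if we declare healthy
               (1 - p_sick) * c_false_pos)    # cost if we declare sick

# Policy A: diagnose immediately from the prior (no measurement cost).
cost_no_test = misdiag_cost(p_disease)

# Policy B: pay for the test, then diagnose from the Bayesian posterior.
p_pos = p_disease * sens + (1 - p_disease) * (1 - spec)
p_sick_given_pos = p_disease * sens / p_pos
p_sick_given_neg = p_disease * (1 - sens) / (1 - p_pos)
cost_test = c_test + p_pos * misdiag_cost(p_sick_given_pos) \
                   + (1 - p_pos) * misdiag_cost(p_sick_given_neg)

print(cost_no_test, cost_test)   # 18.0 vs 10.6: here the test pays for itself
```

With these numbers, testing first is cheaper in expectation; a full diagnostic policy extends this comparison over all test sequences, which is what the AO* search explores.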
Exploiting monotonicity via logistic regression in Bayesian network learning
An important challenge in machine learning is to find ways of learning quickly from very small amounts of training data. The only way to learn from small data samples is to constrain the learning process by exploiting background knowledge. In this report, we present a theoretical analysis of the use of constrained logistic regression for estimating conditional probability distributions in Bayesian Networks (BN) by using background knowledge in the form of qualitative monotonicity statements. Such background knowledge is treated as a set of constraints on the parameters of a logistic function during training. Our goal of finding the appropriate BN model is two-fold: (a) we want to exploit any monotonic relationship between random variables that may generally exist as domain knowledge and (b) we want to be able to address the problem of estimating the conditional distribution of a random variable with a large number of parents. We discuss variants of the logistic regression model and present an analysis of the corresponding constraints required to implement monotonicity. More importantly, we outline the problem in some of these variants in terms of the number of parameters and constraints which, in some cases, can grow exponentially with the number of parent variables. To address this problem, we present two variants of the constrained logistic regression model, M[superscript 2b][subscript CLR] and M[superscript 3][subscript CLR], in which the number of constraints required to implement monotonicity does not grow exponentially with the number of parents, hence providing a practicable method for estimating conditional probabilities with very sparse data.
Keywords: logistic regression, Bayesian network learning, monotonicity
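One simple way to encode a positive monotonicity statement in logistic regression is to constrain the corresponding weights to be nonnegative. The projected-gradient sketch below illustrates that idea on toy data; it is a much simpler constraint scheme than the M[superscript 2b] and M[superscript 3] models in the report:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data where y depends monotonically (positively) on both features.
X = rng.normal(size=(200, 2))
logits = 1.5 * X[:, 0] + 0.8 * X[:, 1]
y = (rng.random(200) < 1 / (1 + np.exp(-logits))).astype(float)

def fit_monotone_logreg(X, y, lr=0.1, iters=2000):
    """Projected gradient ascent on the logistic log-likelihood, with the
    statement 'P(y=1) is non-decreasing in each feature' encoded as the
    constraint w >= 0."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(iters):
        p = 1 / (1 + np.exp(-(X @ w + b)))
        g_w = X.T @ (y - p) / len(y)
        g_b = np.mean(y - p)
        w = np.maximum(w + lr * g_w, 0.0)   # project onto the constraint set
        b += lr * g_b
    return w, b

w, b = fit_monotone_logreg(X, y)
print(w, b)
```

Note that this fixed sign constraint is one per weight; the report's contribution is keeping the constraint count manageable when a node has many parents and richer interaction terms.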
Bootstrap methods for the cost-sensitive evaluation of classifiers
Many machine learning applications require classifiers that minimize an asymmetric cost function rather than the misclassification rate, and several recent papers have addressed this problem. However, these papers have either applied no statistical testing or have applied statistical methods that are not appropriate for the cost-sensitive setting. Without good statistical methods, it is difficult to tell whether these new cost-sensitive methods are better than existing methods that ignore costs, and it is also difficult to tell whether one cost-sensitive method is better than another. To rectify this problem, this paper presents two statistical methods for the cost-sensitive setting. The first constructs a confidence interval for the expected cost of a single classifier. The second constructs a confidence interval for the expected difference in costs of two classifiers. In both cases, the basic idea is to separate the problem of estimating the probabilities of each cell in the confusion matrix (which is independent of the cost matrix) from the problem of computing the expected cost. We show experimentally that these bootstrap tests work better than applying standard z tests based on the normal distribution.
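The separation described above (resample to re-estimate the confusion-matrix cell probabilities, then combine with the fixed cost matrix) can be sketched as a percentile bootstrap for the single-classifier case. The data and cost matrix are invented, and the paper's exact bootstrap procedure may differ in detail:

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_cost_ci(y_true, y_pred, cost, n_boot=2000, alpha=0.05):
    """Percentile-bootstrap confidence interval for expected cost.
    Each resample re-estimates the confusion-matrix cell probabilities;
    the cost matrix stays fixed throughout."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    n = len(y_true)
    costs = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)              # resample with replacement
        costs[i] = cost[y_true[idx], y_pred[idx]].mean()
    lo, hi = np.quantile(costs, [alpha / 2, 1 - alpha / 2])
    return lo, hi

# Asymmetric cost matrix: cost[true_label, predicted_label]
cost = np.array([[0.0, 1.0],
                 [10.0, 0.0]])    # false negatives are 10x worse

y_true = rng.integers(0, 2, 500)
y_pred = np.where(rng.random(500) < 0.85, y_true, 1 - y_true)  # ~85% accurate
print(expected_cost_ci(y_true, y_pred, cost))
```

The two-classifier test follows the same pattern, resampling paired predictions and bootstrapping the difference in mean cost.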
Toward harnessing user feedback for machine learning
There has been little research into how end users might be able to communicate advice to machine learning systems. If this resource, the users themselves, could somehow work hand-in-hand with machine learning systems, the accuracy of learning systems could be improved and the users' understanding and trust of the system could improve as well. We conducted a think-aloud study to see how willing users were to provide feedback and to understand what kinds of feedback users could give. Users were shown explanations of machine learning predictions and asked to provide feedback to improve the predictions. We found that users had no difficulty providing generous amounts of feedback. The kinds of feedback ranged from suggestions for reweighting of features to proposals for new features, feature combinations, relational features, and wholesale changes to the learning algorithm. The results show that user feedback has the potential to significantly improve machine learning systems, but that learning algorithms need to be extended in several ways to be able to assimilate this feedback.
Author Keywords: machine learning, explanations, user feedback for learning