Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search
Bayesian model-based reinforcement learning is a formally elegant approach to
learning optimal behaviour under model uncertainty, trading off exploration and
exploitation in an ideal way. Unfortunately, finding the resulting
Bayes-optimal policies is notoriously taxing, since the search space becomes
enormous. In this paper we introduce a tractable, sample-based method for
approximate Bayes-optimal planning which exploits Monte-Carlo tree search. Our
approach outperformed prior Bayesian model-based RL algorithms by a significant
margin on several well-known benchmark problems, because it avoids expensive
applications of Bayes' rule within the search tree, instead lazily sampling
models from the current beliefs. We illustrate the advantages of our approach
by showing that it works in an infinite state space domain that is qualitatively out
of reach of almost all previous work in Bayesian exploration.
Comment: 14 pages, 7 figures, includes supplementary material. Advances in
Neural Information Processing Systems (NIPS) 2012
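To make the lazy-sampling idea concrete, here is a minimal Python sketch of Monte-Carlo tree search with root sampling in the spirit the abstract describes. It is an illustration under stated assumptions, not the paper's algorithm: the `belief.sample_mdp()` method, the `mdp.step(state, action) -> (next_state, reward)` interface, and all constants are hypothetical.

```python
import math
import random
from collections import defaultdict

class Node:
    """Statistics for one (history, state) node of the search tree."""
    def __init__(self):
        self.visits = 0
        self.counts = defaultdict(int)    # action -> simulation count
        self.values = defaultdict(float)  # action -> mean return estimate

def plan(belief, state, actions, n_sims=1000, depth=15, gamma=0.95, c=1.0):
    tree = defaultdict(Node)

    def simulate(mdp, hist, s, d):
        if d == 0:
            return 0.0
        node = tree[(hist, s)]
        node.visits += 1
        untried = [a for a in actions if node.counts[a] == 0]
        if untried:
            a = random.choice(untried)    # expand untried actions first
        else:                             # otherwise pick by UCB1
            a = max(actions, key=lambda a: node.values[a]
                    + c * math.sqrt(math.log(node.visits) / node.counts[a]))
        s2, r = mdp.step(s, a)            # transition under the *sampled* model
        ret = r + gamma * simulate(mdp, hist + ((s, a),), s2, d - 1)
        node.counts[a] += 1               # incremental mean update
        node.values[a] += (ret - node.values[a]) / node.counts[a]
        return ret

    for _ in range(n_sims):
        # Lazily sample one complete model per simulation instead of
        # applying Bayes' rule at every node inside the tree.
        simulate(belief.sample_mdp(), (), state, depth)

    root = tree[((), state)]
    return max(actions, key=lambda a: root.values[a])
```

Because each simulation runs under a single sampled model, no posterior update is ever computed inside the tree; the belief enters only through the sampling at the root, which is what keeps the search tractable.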
Weka: A machine learning workbench for data mining
The Weka workbench is an organized collection of state-of-the-art machine learning algorithms and data preprocessing tools. The basic way of interacting with these methods is by invoking them from the command line. However, convenient interactive graphical user interfaces are provided for data exploration, for setting up large-scale experiments on distributed computing platforms, and for designing configurations for streamed data processing. These interfaces constitute an advanced environment for experimental data mining. The system is written in Java and distributed under the terms of the GNU General Public License.
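As a small usage illustration of the command-line route the abstract mentions, Weka's documented classifier interface can be scripted; a sketch in Python, assuming `weka.jar` and an ARFF dataset are available locally:

```python
import subprocess

# Invoke Weka's documented command-line interface: train the J48 decision
# tree on an ARFF dataset and report 10-fold cross-validation results
# (-t: training file, -x: number of folds). The paths "weka.jar" and
# "iris.arff" are placeholders for this sketch.
result = subprocess.run(
    ["java", "-cp", "weka.jar",
     "weka.classifiers.trees.J48", "-t", "iris.arff", "-x", "10"],
    capture_output=True, text=True, check=True,
)
print(result.stdout)  # evaluation summary and confusion matrix
```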
Methods for Ordinal Peer Grading
MOOCs have the potential to revolutionize higher education with their wide
outreach and accessibility, but they require instructors to come up with
scalable alternatives to traditional student evaluation. Peer grading -- having
students assess each other -- is a promising approach to tackling the problem
of evaluation at scale, since the number of "graders" naturally scales with the
number of students. However, students are not trained in grading, which means
that one cannot expect the same level of grading skills as in traditional
settings. Drawing on broad evidence that ordinal feedback is easier to provide
and more reliable than cardinal feedback, it is therefore desirable to allow
peer graders to make ordinal statements (e.g. "project X is better than project
Y") and not require them to make cardinal statements (e.g. "project X is a
B-"). Thus, in this paper we study the problem of automatically inferring
student grades from ordinal peer feedback, as opposed to existing methods that
require cardinal peer feedback. We formulate the ordinal peer grading problem
as a type of rank aggregation problem, and explore several probabilistic models
under which to estimate student grades and grader reliability. We study the
applicability of these methods using peer grading data collected from a real
class -- with instructor and TA grades as a baseline -- and demonstrate the
efficacy of ordinal feedback techniques in comparison to existing cardinal peer
grading methods. Finally, we compare these peer-grading techniques to
traditional evaluation.
Comment: Submitted to KDD 2014
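To give a flavour of the rank-aggregation formulation, the sketch below fits a Bradley-Terry-style model (one classic probabilistic choice for ordinal data) to pairwise "X is better than Y" judgments using minorization-maximization updates. This is an illustrative assumption, not the authors' implementation, and it omits the grader-reliability terms the paper also estimates.

```python
from collections import defaultdict

def bradley_terry(judgments, items, n_iters=200):
    """Infer latent quality scores from ordinal (winner, loser) judgments
    via minorization-maximization updates for the Bradley-Terry model.
    An item that never wins a comparison converges to score 0."""
    wins = defaultdict(int)   # item -> judgments won
    pairs = defaultdict(int)  # unordered pair -> times compared
    for winner, loser in judgments:
        wins[winner] += 1
        pairs[frozenset((winner, loser))] += 1

    score = {i: 1.0 for i in items}
    for _ in range(n_iters):
        new = {}
        for i in items:
            # MM update: wins_i / sum over opponents j of n_ij / (s_i + s_j)
            denom = sum(n / (score[i] + score[j])
                        for pair, n in pairs.items() if i in pair
                        for j in pair if j != i)
            new[i] = wins[i] / denom if denom > 0 else score[i]
        total = sum(new.values())
        score = {i: v * len(items) / total for i, v in new.items()}
    return score  # higher score -> higher inferred grade

# Toy example: three pairwise peer judgments over projects A, B, C.
print(bradley_terry([("A", "B"), ("A", "C"), ("B", "C")], ["A", "B", "C"]))
```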
- …