Sample Efficient Policy Search for Optimal Stopping Domains
Optimal stopping problems consider the question of deciding when to stop an
observation-generating process in order to maximize a return. We examine the
problem of simultaneously learning and planning in such domains, when data is
collected directly from the environment. We propose GFSE, a simple and flexible
model-free policy search method that reuses data for sample efficiency by
leveraging problem structure. We bound the sample complexity of our approach to
guarantee uniform convergence of policy value estimates, tightening existing
PAC bounds to achieve logarithmic dependence on horizon length for our setting.
We also examine the benefit of our method against prevalent model-based and
model-free approaches on 3 domains taken from diverse fields.
Comment: To appear in IJCAI-201
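The data-reuse idea in this abstract can be illustrated with a minimal sketch. It is not the GFSE method itself: it assumes a hypothetical family of threshold policies and a toy process of uniform random observations, and only shows that in a stopping problem one batch of full trajectories can score many candidate policies at once.

```python
import random


def stopped_return(trajectory, threshold):
    """Return earned by stopping at the first observation >= threshold.

    If the threshold is never reached, the policy is forced to stop at
    the final step (an assumption made for this toy example).
    """
    for value in trajectory:
        if value >= threshold:
            return value
    return trajectory[-1]


def evaluate_thresholds(trajectories, thresholds):
    """Estimate the value of every threshold policy from the SAME trajectories."""
    return {
        th: sum(stopped_return(t, th) for t in trajectories) / len(trajectories)
        for th in thresholds
    }


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical environment: 1000 trajectories of 10 uniform observations.
    trajectories = [[rng.random() for _ in range(10)] for _ in range(1000)]
    values = evaluate_thresholds(trajectories, [0.5, 0.7, 0.9])
    best = max(values, key=values.get)
    print("best threshold:", best, "estimated values:", values)
```

Because a stopping decision does not change the observations that would have followed, no new environment interaction is needed per candidate policy in this sketch.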
Training an adaptive dialogue policy for interactive learning of visually grounded word meanings
We present a multi-modal dialogue system for interactive learning of
perceptually grounded word meanings from a human tutor. The system integrates
an incremental, semantic parsing/generation framework - Dynamic Syntax and Type
Theory with Records (DS-TTR) - with a set of visual classifiers that are
learned throughout the interaction and which ground the meaning representations
that it produces. We use this system in interaction with a simulated human
tutor to study the effects of different dialogue policies and capabilities on
the accuracy of learned meanings, learning rates, and efforts/costs to the
tutor. We show that the overall performance of the learning agent is affected
by (1) who takes initiative in the dialogues; (2) the ability to express/use
their confidence level about visual attributes; and (3) the ability to process
elliptical and incrementally constructed dialogue turns. Ultimately, we train
an adaptive dialogue policy which optimises the trade-off between classifier
accuracy and tutoring costs.
Comment: 11 pages, SIGDIAL 2016 Conferenc
Synthesizing Imperative Programs from Examples Guided by Static Analysis
We present a novel algorithm that synthesizes imperative programs for
introductory programming courses. Given a set of input-output examples and a
partial program, our algorithm generates a complete program that is consistent
with every example. Our key idea is to combine enumerative program synthesis
and static analysis, which aggressively prunes a large search space while
guaranteeing that a correct solution is found if one exists. We have implemented our
algorithm in a tool, called SIMPL, and evaluated it on 30 problems used in
introductory programming courses. The results show that SIMPL is able to solve
the benchmark problems in 6.6 seconds on average.
Comment: The paper is accepted in Static Analysis Symposium (SAS) '17. The
submission version is somewhat different from the version in arxiv. The final
version will be uploaded after the camera-ready version is read
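As a rough illustration of enumerative synthesis from input-output examples, the sketch below searches arithmetic expressions over `x` by increasing size. It is not SIMPL: the tiny grammar and the function name `synthesize` are assumptions made here, and the pruning shown is plain observational equivalence (discarding expressions that behave identically on the examples), swapped in for the static analysis the abstract describes.

```python
import itertools


def synthesize(examples, max_size=3):
    """Return an expression string consistent with all (input, output) pairs.

    Expressions are built from x, 0, 1, 2 with + and *; size is the number
    of leaves. Returns None if nothing up to max_size fits.
    """
    inputs = [x for x, _ in examples]
    target = tuple(y for _, y in examples)

    # Map each output signature (values on all inputs) to one expression.
    by_size = {1: {}}
    terminals = [("x", lambda x: x), ("0", lambda x: 0),
                 ("1", lambda x: 1), ("2", lambda x: 2)]
    for name, fn in terminals:
        sig = tuple(fn(x) for x in inputs)
        by_size[1].setdefault(sig, name)
    if target in by_size[1]:
        return by_size[1][target]
    seen = set(by_size[1])

    operators = (("+", lambda a, b: a + b), ("*", lambda a, b: a * b))
    for size in range(2, max_size + 1):
        by_size[size] = {}
        for left_size in range(1, size):
            lefts = by_size[left_size].items()
            rights = by_size[size - left_size].items()
            for (lsig, lexpr), (rsig, rexpr) in itertools.product(lefts, rights):
                for symbol, apply_op in operators:
                    sig = tuple(apply_op(a, b) for a, b in zip(lsig, rsig))
                    if sig in seen:
                        continue  # behaves like an expression already kept
                    seen.add(sig)
                    expr = f"({lexpr} {symbol} {rexpr})"
                    if sig == target:
                        return expr
                    by_size[size][sig] = expr
    return None


if __name__ == "__main__":
    print(synthesize([(1, 3), (2, 5), (3, 7)]))
```

Pruning by output signature keeps the search complete with respect to the given examples, since any discarded expression is indistinguishable from a retained one on those inputs.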
Program Synthesis using Natural Language
Interacting with computers is a ubiquitous activity for millions of people.
Repetitive or specialized tasks often require creation of small, often one-off,
programs. End-users struggle with learning and using the myriad of
domain-specific languages (DSLs) to effectively accomplish these tasks.
We present a general framework for constructing program synthesizers that
take natural language (NL) inputs and produce expressions in a target DSL. The
framework takes as input a DSL definition and training data consisting of
NL/DSL pairs. From these it constructs a synthesizer by learning optimal
weights and classifiers (using NLP features) that rank the outputs of a
keyword-programming based translation. We applied our framework to three
domains: repetitive text editing, an intelligent tutoring system, and flight
information queries. On 1200+ English descriptions, the respective synthesizers
rank the desired program in the top-1 and top-3 for 80% and 90% of descriptions,
respectively.
Towards a quantitative evaluation of the relationship between the domain knowledge and the ability to assess peer work
In this work we present preliminary results from statistical modeling of the cognitive relationship between knowledge about a topic and the ability to assess peer achievements on the same topic. Our starting point is Bloom's taxonomy of educational objectives in the cognitive domain, and our outcomes confirm the hypothesized ranking. A further consideration that can be derived is that meta-cognitive abilities (e.g., assessment) require deeper domain knowledge.
Logistic Knowledge Tracing: A Constrained Framework for Learner Modeling
Adaptive learning technology solutions often use a learner model to trace
learning and make pedagogical decisions. The present research introduces a
formalized methodology for specifying learner models, Logistic Knowledge
Tracing (LKT), that consolidates many extant learner modeling methods. The
strength of LKT is the specification of a symbolic notation system for
alternative logistic regression models that is powerful enough to specify many
extant models in the literature and many new models. To demonstrate the
generality of LKT, we fit 12 models, some variants of well-known models and
some newly devised, to 6 learning technology datasets. The results indicated
that no single learner model was best in all cases, further justifying a broad
approach that considers multiple learner model features and the learning
context. The models presented here avoid student-level fixed parameters to
increase generalizability. We also introduce features to stand in for these
intercepts. We argue that to be maximally applicable, a learner model needs to
adapt to student differences, rather than needing to be pre-parameterized with
the level of each student's ability.
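To make the idea of a logistic-regression learner model concrete, here is a minimal sketch in the style of Performance Factors Analysis, where the log-odds of a correct answer depend on counts of prior successes and failures. It does not use the LKT notation or any fitted values from the paper; the coefficients and function names are placeholders chosen for illustration. Consistent with the abstract, it has no per-student intercept, so predictions adapt only through the student's observed history.

```python
import math


def predict_correct(successes, failures,
                    intercept=-0.5, gain_success=0.3, gain_failure=0.1):
    """Probability of a correct response given prior practice counts.

    All three coefficients are illustrative placeholders, not fitted values.
    """
    logit = intercept + gain_success * successes + gain_failure * failures
    return 1.0 / (1.0 + math.exp(-logit))


def trace(outcomes, **params):
    """Return the predicted probability before each attempt in a sequence.

    outcomes: iterable of booleans (True = correct) for one skill.
    """
    successes = failures = 0
    predictions = []
    for correct in outcomes:
        predictions.append(predict_correct(successes, failures, **params))
        if correct:
            successes += 1
        else:
            failures += 1
    return predictions


if __name__ == "__main__":
    for step, p in enumerate(trace([True, True, False, True]), start=1):
        print(f"attempt {step}: predicted P(correct) = {p:.3f}")
```

In practice the coefficients would be estimated by fitting a logistic regression to logged attempt data rather than set by hand.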