7,138 research outputs found

Sample Efficient Policy Search for Optimal Stopping Domains

Author: Brunskill Emma
Dann Christoph
Goel Karan
Publication date: 24/05/2017

Optimal stopping problems consider the question of deciding when to stop an observation-generating process in order to maximize a return. We examine the problem of simultaneously learning and planning in such domains, when data is collected directly from the environment. We propose GFSE, a simple and flexible model-free policy search method that reuses data for sample efficiency by leveraging problem structure. We bound the sample complexity of our approach to guarantee uniform convergence of policy value estimates, tightening existing PAC bounds to achieve logarithmic dependence on horizon length for our setting. We also examine the benefit of our method against prevalent model-based and model-free approaches on 3 domains taken from diverse fields.

Comment: To appear in IJCAI-201

arXiv.org e-Print Archive

Crossref
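
The data reuse mentioned in the abstract above can be illustrated with a small sketch. It is not the paper's GFSE method. It only relies on a general property of optimal stopping: the stop decision does not change how the observations evolve, so one batch of full-length trajectories can score many candidate policies. The random-walk process, the threshold policy class, and all function names here are assumptions made for illustration.

```python
import random


def simulate_path(horizon, rng):
    """Hypothetical observation process: a Gaussian random walk run to the full horizon."""
    x, path = 0.0, []
    for _ in range(horizon):
        x += rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def stopped_return(path, threshold):
    """Return of the policy 'stop at the first observation >= threshold'.

    If the threshold is never reached, the policy is forced to stop at the
    final observation.
    """
    for x in path:
        if x >= threshold:
            return x
    return path[-1]


def evaluate_policies(paths, thresholds):
    """Estimate the value of every threshold policy from the SAME trajectories."""
    return {
        t: sum(stopped_return(p, t) for p in paths) / len(paths)
        for t in thresholds
    }


rng = random.Random(0)
paths = [simulate_path(20, rng) for _ in range(500)]  # collected once
values = evaluate_policies(paths, [0.0, 1.0, 2.0, 3.0])  # reused for every policy
best = max(values, key=values.get)
```

Because no new environment interaction is needed per policy, the cost of comparing additional thresholds is purely computational in this sketch.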

Training an adaptive dialogue policy for interactive learning of visually grounded word meanings

Author: Eshghi Arash
Lemon Oliver
Yu Yanchao
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 15/09/2016

We present a multi-modal dialogue system for interactive learning of perceptually grounded word meanings from a human tutor. The system integrates an incremental, semantic parsing/generation framework - Dynamic Syntax and Type Theory with Records (DS-TTR) - with a set of visual classifiers that are learned throughout the interaction and which ground the meaning representations that it produces. We use this system in interaction with a simulated human tutor to study the effects of different dialogue policies and capabilities on the accuracy of learned meanings, learning rates, and efforts/costs to the tutor. We show that the overall performance of the learning agent is affected by (1) who takes initiative in the dialogues; (2) the ability to express/use their confidence level about visual attributes; and (3) the ability to process elliptical and incrementally constructed dialogue turns. Ultimately, we train an adaptive dialogue policy which optimises the trade-off between classifier accuracy and tutoring costs.

Comment: 11 pages, SIGDIAL 2016 Conference

arXiv.org e-Print Archive

Heriot Watt Pure

Synthesizing Imperative Programs from Examples Guided by Static Analysis

Author: A Albarghouthi
A Ireland
AV Aho
G Katz
MA Colón
R Singh
T Gvero
Publication date: 13/06/2017

We present a novel algorithm that synthesizes imperative programs for introductory programming courses. Given a set of input-output examples and a partial program, our algorithm generates a complete program that is consistent with every example. Our key idea is to combine enumerative program synthesis and static analysis, which aggressively prunes a large search space while guaranteeing to find a correct solution if one exists. We have implemented our algorithm in a tool, called SIMPL, and evaluated it on 30 problems used in introductory programming courses. The results show that SIMPL is able to solve the benchmark problems in 6.6 seconds on average.

Comment: The paper is accepted at the Static Analysis Symposium (SAS) '17. The submission version is somewhat different from the version on arXiv. The final version will be uploaded after the camera-ready version is ready.

arXiv.org e-Print Archive

Crossref
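
As a rough illustration of enumerative synthesis from input-output examples, the sketch below searches arithmetic expressions bottom-up by size. It is not SIMPL: the abstract describes pruning by static analysis over imperative programs, whereas this sketch swaps in a simpler, well-known pruning rule (discarding any expression whose outputs on the examples duplicate an earlier one). The tiny expression grammar, the example set, and the function name are all assumptions.

```python
import operator

# Hypothetical grammar: expr ::= x | 1 | 2 | (expr + expr) | (expr * expr)
OPS = {"+": operator.add, "*": operator.mul}

# Hypothetical specification, consistent with f(x) = 2*x + 1.
EXAMPLES = [(1, 3), (2, 5), (4, 9)]


def synthesize(examples, max_size=7):
    """Return the text of an expression matching every example, or None."""
    inputs = [x for x, _ in examples]
    target = tuple(y for _, y in examples)
    terminals = {
        "x": tuple(inputs),
        "1": tuple(1 for _ in inputs),
        "2": tuple(2 for _ in inputs),
    }
    seen = set()            # output signatures already represented
    by_size = {1: []}       # size -> list of (text, signature)
    for text, sig in terminals.items():
        if sig == target:
            return text
        if sig not in seen:
            seen.add(sig)
            by_size[1].append((text, sig))
    for size in range(2, max_size + 1):
        by_size[size] = []
        # A binary node costs 1, so the children sizes sum to size - 1.
        for left_size in range(1, size - 1):
            right_size = size - 1 - left_size
            for ltext, lsig in by_size[left_size]:
                for rtext, rsig in by_size[right_size]:
                    for sym, fn in OPS.items():
                        sig = tuple(fn(a, b) for a, b in zip(lsig, rsig))
                        if sig in seen:
                            continue  # prune: behaves like an earlier candidate
                        seen.add(sig)
                        text = f"({ltext} {sym} {rtext})"
                        if sig == target:
                            return text
                        by_size[size].append((text, sig))
    return None


program = synthesize(EXAMPLES)
```

The pruning step keeps the search complete with respect to the given examples within the size bound: a discarded candidate is indistinguishable, on those examples, from one that is kept.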

Program Synthesis using Natural Language

Author: Aditya Desai
Amey Karkare
Mark Marron
Nidhi Jain
Sailesh R
Subhajit Roy
Sumit Gulwani
Vineet Hingorani
Publication date: 01/09/2015

Interacting with computers is a ubiquitous activity for millions of people. Repetitive or specialized tasks often require creation of small, often one-off, programs. End-users struggle with learning and using the myriad of domain-specific languages (DSLs) to effectively accomplish these tasks. We present a general framework for constructing program synthesizers that take natural language (NL) inputs and produce expressions in a target DSL. The framework takes as input a DSL definition and training data consisting of NL/DSL pairs. From these it constructs a synthesizer by learning optimal weights and classifiers (using NLP features) that rank the outputs of a keyword-programming based translation. We applied our framework to three domains: repetitive text editing, an intelligent tutoring system, and flight information queries. On 1200+ English descriptions, the respective synthesizers rank the desired program as the top-1 and top-3 for 80% and 90% of descriptions, respectively.

arXiv.org e-Print Archive

CiteSeerX

Towards a quantitative evaluation of the relationship between the domain knowledge and the ability to assess peer work

Author: De Marsico Maria
Sterbini Andrea
Temperini Marco
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

In this work we present the preliminary results provided by the statistical modeling of the cognitive relationship between the knowledge about a topic and the ability to assess peer achievements on the same topic. Our starting point is Bloom's taxonomy of educational objectives in the cognitive domain, and our outcomes confirm the hypothesized ranking. A further consideration that can be derived is that meta-cognitive abilities (e.g., assessment) require deeper domain knowledge.

Archivio della ricerca- Università di Roma La Sapienza

Logistic Knowledge Tracing: A Constrained Framework for Learner Modeling

Author: Eglington Luke G.
Harrell-Williams Leigh M.
Pavlik Jr., Philip I.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/01/2021

Adaptive learning technology solutions often use a learner model to trace learning and make pedagogical decisions. The present research introduces a formalized methodology for specifying learner models, Logistic Knowledge Tracing (LKT), that consolidates many extant learner modeling methods. The strength of LKT is the specification of a symbolic notation system for alternative logistic regression models that is powerful enough to specify many extant models in the literature and many new models. To demonstrate the generality of LKT, we fit 12 models, some variants of well-known models and some newly devised, to 6 learning technology datasets. The results indicated that no single learner model was best in all cases, further justifying a broad approach that considers multiple learner model features and the learning context. The models presented here avoid student-level fixed parameters to increase generalizability. We also introduce features to stand in for these intercepts. We argue that to be maximally applicable, a learner model needs to adapt to student differences, rather than needing to be pre-parameterized with the level of each student's ability.

arXiv.org e-Print Archive

University of Memphis Digital Commons
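
To make the idea of a logistic-regression learner model concrete, here is a minimal sketch of one model of the general kind the abstract above describes: the predicted probability of a correct answer is a logistic function of a per-skill intercept plus counts of the learner's prior successes and failures on that skill. This is not the LKT notation or software. The feature choice, the coefficient values, and the function name are assumptions, and the coefficients would normally be fitted to data rather than set by hand. In line with the abstract, the sketch uses no per-student parameter.

```python
import math
from collections import defaultdict


def predict_sequence(attempts, intercept, gain_success, gain_failure):
    """Predict P(correct) before each attempt in a time-ordered sequence.

    attempts: list of (skill, correct) pairs for one learner.
    intercept: dict mapping skill -> baseline logit (hypothetical values).
    gain_success / gain_failure: logit change per prior success / failure.
    """
    successes = defaultdict(int)
    failures = defaultdict(int)
    predictions = []
    for skill, correct in attempts:
        logit = (
            intercept[skill]
            + gain_success * successes[skill]
            + gain_failure * failures[skill]
        )
        predictions.append(1.0 / (1.0 + math.exp(-logit)))
        # Update the practice history only after predicting this attempt.
        if correct:
            successes[skill] += 1
        else:
            failures[skill] += 1
    return predictions


history = [("addition", 1), ("addition", 0), ("addition", 1)]
probs = predict_sequence(
    history, intercept={"addition": 0.0}, gain_success=0.5, gain_failure=0.1
)
```

Adaptation to the individual learner comes entirely from the running success and failure counts, which is one way a model can respond to student differences without a pre-set ability level.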