Learning Task Specifications from Demonstrations

Author: Ho Mark K.
Jha Susmit
Seshia Sanjit A.
Tiwari Ashish
Vazquez-Chanlatte Marcell
Publication date: 01/01/2018

Real-world applications often naturally decompose into several sub-tasks. In many settings (e.g., robotics), demonstrations provide a natural way to specify the sub-tasks. However, most methods for learning from demonstrations either do not provide guarantees that the artifacts learned for the sub-tasks can be safely recombined or limit the types of composition available. Motivated by this deficit, we consider the problem of inferring Boolean non-Markovian rewards (also known as logical trace properties or specifications) from demonstrations provided by an agent operating in an uncertain, stochastic environment. Crucially, specifications admit well-defined composition rules that are typically easy to interpret. In this paper, we formulate the specification inference task as a maximum a posteriori (MAP) probability inference problem, apply the principle of maximum entropy to derive an analytic demonstration likelihood model, and give an efficient approach to search for the most likely specification in a large candidate pool of specifications. In our experiments, we demonstrate how learning specifications can help avoid common problems that often arise due to ad-hoc reward composition.

Comment: NIPS 201

arXiv.org e-Print Archive

eScholarship - University of California

Hybrid Reinforcement Learning with Expert State Sequences

Author: Campbell Murray
Chang Shiyu
Guo Xiaoxiao
Tesauro Gerald
Yu Mo
Publication date: 10/03/2019

Existing imitation learning approaches often require that the complete demonstration data, including sequences of actions and states, are available. In this paper, we consider a more realistic and difficult scenario where a reinforcement learning agent only has access to the state sequences of an expert, while the expert actions are unobserved. We propose a novel tensor-based model to infer the unobserved actions of the expert state sequences. The policy of the agent is then optimized via a hybrid objective combining reinforcement learning and imitation learning. We evaluated our hybrid approach on an illustrative domain and Atari games. The empirical results show that (1) the agents are able to leverage expert state sequences to learn faster than pure reinforcement learning baselines, (2) our tensor-based action inference model is advantageous compared to standard deep neural networks in inferring expert actions, and (3) the hybrid policy optimization objective is robust against noise in expert state sequences.

Comment: AAAI 2019; https://github.com/XiaoxiaoGuo/tensor4r

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Neuro-fuzzy knowledge processing in intelligent learning environments for improved student diagnosis

Author: Grigoriadou M.
Magoulas George D.
Samarakou M.
Stathacopoulou R.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2005

In this paper, a neural network implementation for a fuzzy logic-based model of the diagnostic process is proposed as a means to achieve accurate student diagnosis and updates of the student model in Intelligent Learning Environments. The neuro-fuzzy synergy allows the diagnostic model to "imitate", to some extent, teachers in diagnosing students' characteristics, and equips the intelligent learning environment with reasoning capabilities that can be further used to drive pedagogical decisions depending on the student learning style. The neuro-fuzzy implementation helps to encode both structured and non-structured teachers' knowledge: when teachers' reasoning is available and well defined, it can be encoded in the form of fuzzy rules; when teachers' reasoning is not well defined but is available through practical examples illustrating their experience, then the networks can be trained to represent this experience. The proposed approach has been tested in diagnosing aspects of students' learning style in a discovery-learning environment that aims to help students construct the concepts of vectors in physics and mathematics. The diagnosis outcomes of the model have been compared against the recommendations of a group of five experienced teachers and the results produced by two alternative soft computing methods. The results of our pilot study show that the neuro-fuzzy model successfully manages the inherent uncertainty of the diagnostic process, especially for marginal cases, i.e. where it is very difficult, even for human tutors, to diagnose and accurately evaluate students by directly synthesizing subjective and sometimes conflicting judgments.

CiteSeerX

Crossref

Birkbeck Institutional Research Online

Sciduction: Combining Induction, Deduction, and Structure for Verification and Synthesis

Author: Seshia Sanjit A.
Publication date: 01/01/2011

Even with impressive advances in automated formal methods, certain problems in system verification and synthesis remain challenging. Examples include the verification of quantitative properties of software involving constraints on timing and energy consumption, and the automatic synthesis of systems from specifications. The major challenges include environment modeling, incompleteness in specifications, and the complexity of underlying decision problems. This position paper proposes sciduction, an approach to tackle these challenges by integrating inductive inference, deductive reasoning, and structure hypotheses. Deductive reasoning, which leads from general rules or concepts to conclusions about specific problem instances, includes techniques such as logical inference and constraint solving. Inductive inference, which generalizes from specific instances to yield a concept, includes algorithmic learning from examples. Structure hypotheses are used to define the class of artifacts, such as invariants or program fragments, generated during verification or synthesis. Sciduction constrains inductive and deductive reasoning using structure hypotheses, and actively combines inductive and deductive reasoning: for instance, deductive techniques generate examples for learning, and inductive reasoning is used to guide the deductive engines. We illustrate this approach with three applications: (i) timing analysis of software; (ii) synthesis of loop-free programs; and (iii) controller synthesis for hybrid systems. Some future applications are also discussed.

arXiv.org e-Print Archive

CiteSeerX

Framework to Enhance Teaching and Learning in System Analysis and Unified Modelling Language

Author: Birt James R.
Cowling Michael A.
Munoz Juan Carlos
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

Cowling, MA ORCiD: 0000-0003-1444-1563; Munoz Carpio, JC ORCiD: 0000-0003-0251-5510

Systems Analysis modelling is considered foundational for Information and Communication Technology (ICT) students, with introductory and advanced units included in nearly all ICT and computer science degrees. Yet despite this, novice systems analysts (learners) find modelling and systems thinking quite difficult to learn and master. This makes the process of teaching the fundamentals frustrating and time intensive. This paper will discuss the foundational problems that learners face when learning Systems Analysis modelling. Through a systematic literature review, a framework will be proposed based on the key problems that novice learners experience. In this proposed framework, a sequence of activities has been developed to facilitate understanding of the requirements, solutions and incremental modelling. An example is provided illustrating how the framework could be used to incorporate visualization and gaming elements into a Systems Analysis classroom, thereby improving motivation and learning. Through this work, a greater understanding of the approach to teaching modelling within the computer science classroom will be provided, as well as a framework to guide future teaching activities.

Bond University Research Portal

aCQUIRe