13,962 research outputs found
Certified Reinforcement Learning with Logic Guidance
This paper proposes the first model-free Reinforcement Learning (RL)
framework to synthesise policies for unknown, and continuous-state Markov
Decision Processes (MDPs), such that a given linear temporal property is
satisfied. We convert the given property into a Limit Deterministic Buchi
Automaton (LDBA), namely a finite-state machine expressing the property.
Exploiting the structure of the LDBA, we shape a synchronous reward function
on-the-fly, so that an RL algorithm can synthesise a policy resulting in traces
that probabilistically satisfy the linear temporal property. This probability
(certificate) is also calculated in parallel with policy learning when the
state space of the MDP is finite: as such, the RL algorithm produces a policy
that is certified with respect to the property. Under the assumption of finite
state space, theoretical guarantees are provided on the convergence of the RL
algorithm to an optimal policy, maximising the above probability. We also show
that our method produces ''best available'' control policies when the logical
property cannot be satisfied. In the general case of a continuous state space,
we propose a neural network architecture for RL and we empirically show that
the algorithm finds satisfying policies, if there exist such policies. The
performance of the proposed framework is evaluated via a set of numerical
examples and benchmarks, where we observe an improvement of one order of
magnitude in the number of iterations required for the policy synthesis,
compared to existing approaches whenever available.Comment: This article draws from arXiv:1801.08099, arXiv:1809.0782
A Boxology of Design Patterns for Hybrid Learning and Reasoning Systems
We propose a set of compositional design patterns to describe a large variety
of systems that combine statistical techniques from machine learning with
symbolic techniques from knowledge representation. As in other areas of
computer science (knowledge engineering, software engineering, ontology
engineering, process mining and others), such design patterns help to
systematize the literature, clarify which combinations of techniques serve
which purposes, and encourage re-use of software components. We have validated
our set of compositional design patterns against a large body of recent
literature.Comment: 12 pages,55 reference
A Tale of Two DRAGGNs: A Hybrid Approach for Interpreting Action-Oriented and Goal-Oriented Instructions
Robots operating alongside humans in diverse, stochastic environments must be
able to accurately interpret natural language commands. These instructions
often fall into one of two categories: those that specify a goal condition or
target state, and those that specify explicit actions, or how to perform a
given task. Recent approaches have used reward functions as a semantic
representation of goal-based commands, which allows for the use of a
state-of-the-art planner to find a policy for the given task. However, these
reward functions cannot be directly used to represent action-oriented commands.
We introduce a new hybrid approach, the Deep Recurrent Action-Goal Grounding
Network (DRAGGN), for task grounding and execution that handles natural
language from either category as input, and generalizes to unseen environments.
Our robot-simulation results demonstrate that a system successfully
interpreting both goal-oriented and action-oriented task specifications brings
us closer to robust natural language understanding for human-robot interaction.Comment: Accepted at the 1st Workshop on Language Grounding for Robotics at
ACL 201
A Tale of Two DRAGGNs: A Hybrid Approach for Interpreting Action-Oriented and Goal-Oriented Instructions
Robots operating alongside humans in diverse, stochastic environments must be
able to accurately interpret natural language commands. These instructions
often fall into one of two categories: those that specify a goal condition or
target state, and those that specify explicit actions, or how to perform a
given task. Recent approaches have used reward functions as a semantic
representation of goal-based commands, which allows for the use of a
state-of-the-art planner to find a policy for the given task. However, these
reward functions cannot be directly used to represent action-oriented commands.
We introduce a new hybrid approach, the Deep Recurrent Action-Goal Grounding
Network (DRAGGN), for task grounding and execution that handles natural
language from either category as input, and generalizes to unseen environments.
Our robot-simulation results demonstrate that a system successfully
interpreting both goal-oriented and action-oriented task specifications brings
us closer to robust natural language understanding for human-robot interaction.Comment: Accepted at the 1st Workshop on Language Grounding for Robotics at
ACL 201
Recommended from our members
Neurons and symbols: a manifesto
We discuss the purpose of neural-symbolic integration including its principles, mechanisms and applications. We outline a cognitive computational model for neural-symbolic integration, position the model in the broader context of multi-agent systems, machine learning and automated reasoning, and list some of the challenges for the area of
neural-symbolic computation to achieve the promise of effective integration of robust learning and expressive reasoning under uncertainty
- …