
    Certified Reinforcement Learning with Logic Guidance

    This paper proposes the first model-free Reinforcement Learning (RL) framework to synthesise policies for unknown continuous-state Markov Decision Processes (MDPs) such that a given linear temporal property is satisfied. We convert the given property into a Limit Deterministic Büchi Automaton (LDBA), namely a finite-state machine expressing the property. Exploiting the structure of the LDBA, we shape a synchronous reward function on the fly, so that an RL algorithm can synthesise a policy whose traces probabilistically satisfy the linear temporal property. When the state space of the MDP is finite, this probability (the certificate) is calculated in parallel with policy learning; as such, the RL algorithm produces a policy that is certified with respect to the property. Under the assumption of a finite state space, theoretical guarantees are provided on the convergence of the RL algorithm to an optimal policy maximising the above probability. We also show that our method produces "best available" control policies when the logical property cannot be satisfied. In the general case of a continuous state space, we propose a neural network architecture for RL and empirically show that the algorithm finds satisfying policies, if such policies exist. The performance of the proposed framework is evaluated on a set of numerical examples and benchmarks, where we observe an improvement of one order of magnitude in the number of iterations required for policy synthesis, compared to existing approaches where available.
    Comment: This article draws from arXiv:1801.08099, arXiv:1809.0782
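    Below is a minimal sketch of the reward-shaping idea: tabular Q-learning runs on the synchronous product of the MDP and the LDBA, and reward is paid out whenever the automaton reaches an accepting state. The environment interface (reset/step/label), the LDBA transition table, and the accepting set are hypothetical placeholders, not the paper's actual implementation.

```python
# A sketch of LDBA-guided reward shaping for tabular Q-learning. The MDP
# environment interface, the LDBA transition table, and the accepting set
# are illustrative assumptions, not the paper's code.
import random
from collections import defaultdict

class ProductEnv:
    """Synchronous product of an MDP environment and an LDBA."""

    def __init__(self, env, delta, accepting, q0):
        self.env = env              # assumed to expose reset(), step(a), label(s)
        self.delta = delta          # dict: (automaton state, label) -> automaton state
        self.accepting = accepting  # set of accepting automaton states
        self.q0 = q0                # initial automaton state

    def reset(self):
        self.q = self.q0
        return (self.env.reset(), self.q)

    def step(self, action):
        s, done = self.env.step(action)
        self.q = self.delta[(self.q, self.env.label(s))]
        # On-the-fly reward: pay out whenever the LDBA visits an accepting
        # state, so maximising return aligns with satisfying the property.
        r = 1.0 if self.q in self.accepting else 0.0
        return (s, self.q), r, done

def q_learning(env, actions, episodes=5000, alpha=0.1, gamma=0.99, eps=0.1):
    """Standard epsilon-greedy Q-learning over the product state space."""
    Q = defaultdict(float)
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            a = (random.choice(actions) if random.random() < eps
                 else max(actions, key=lambda b: Q[(state, b)]))
            nxt, r, done = env.step(a)
            target = r + gamma * max(Q[(nxt, b)] for b in actions)
            Q[(state, a)] += alpha * (target - Q[(state, a)])
            state = nxt
    return Q
```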

    Dynamics of Internal Models in Game Players

    A new approach to the study of social games and communications is proposed. Games are simulated between cognitive players who each build an internal model of the opponent and decide their next strategy from predictions based on that model. In this paper, internal models are constructed by a recurrent neural network (RNN), and the iterated prisoner's dilemma game is played. The RNN allows us to express the internal model as a geometrical shape. Complicated transients of actions are observed before the stable mutually defecting equilibrium is reached. During the transients, the model shape also becomes complicated and often undergoes chaotic changes. These chaotic dynamics of internal models reflect the dynamic, high-dimensional rugged landscape of the internal-model space.
    Comment: 19 pages, 6 figures
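    As a rough illustration of the setup, the toy loop below gives a player whose internal model of the opponent is a small recurrent network (only the readout is trained online, for brevity) and who best-responds myopically to the model's prediction against a tit-for-tat opponent. The payoffs, learning rule, and opponent strategy are assumptions for illustration, not the paper's construction.

```python
# Toy internal-model player for the iterated prisoner's dilemma. A small
# recurrent network predicts the opponent's next move; only the readout is
# trained, and the opponent plays tit-for-tat. All details are illustrative.
import numpy as np

rng = np.random.default_rng(0)
H = 8                              # hidden units of the internal model
Wxh = rng.normal(0, 0.5, (H, 2))   # input (my move, opponent's move) -> hidden
Whh = rng.normal(0, 0.5, (H, H))   # fixed recurrent weights (reservoir-style)
w = np.zeros(H)                    # trained readout: P(opponent cooperates)

h = np.zeros(H)
my_move, opp_move = 1, 1           # 1 = cooperate, 0 = defect
for t in range(200):
    h = np.tanh(Wxh @ np.array([my_move, opp_move]) + Whh @ h)
    p = 1.0 / (1.0 + np.exp(-w @ h))   # predicted P(opponent cooperates)
    # Myopic best response under standard PD payoffs (T=5, R=3, P=1, S=0).
    # Defection dominates at a one-step horizon, so play settles into the
    # mutually defecting equilibrium described in the abstract.
    ev_coop = 3 * p + 0 * (1 - p)
    ev_defect = 5 * p + 1 * (1 - p)
    my_next = 1 if ev_coop > ev_defect else 0
    opp_next = my_move                 # opponent: tit-for-tat
    w += 0.1 * (opp_next - p) * h      # online logistic update of the readout
    my_move, opp_move = my_next, opp_next
```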

    Number Sequence Prediction Problems for Evaluating Computational Powers of Neural Networks

    Inspired by the number-series tests used to measure human intelligence, we propose number sequence prediction tasks to assess neural network models' computational power for solving algorithmic problems. We define the complexity and difficulty of a number sequence prediction task by the structure of the smallest automaton that can generate the sequence. We propose two types of number sequence prediction problems: number-level and digit-level. Number-level problems format sequences as 2-dimensional grids of digits, while digit-level problems provide a single digit of input per time step. The complexity of a number-level sequence prediction can be defined by the depth of an equivalent combinatorial logic circuit, and the complexity of a digit-level sequence prediction by an equivalent state automaton for the generation rule. Experiments with number-level sequences suggest that CNN models can learn the compound operations of sequence generation rules, but only up to limited depths. For the digit-level problems, simple GRU and LSTM models can solve some problems with the complexity of finite state automata. Memory-augmented models such as Stack-RNN, Attention, and Neural Turing Machines can solve the reverse-order task, which has the complexity of a simple pushdown automaton. However, none of the above can solve general Fibonacci, arithmetic, or geometric sequence generation problems, which have the complexity of queue automata or Turing machines. These results show that our number sequence prediction problems effectively evaluate machine learning models' computational capabilities.
    Comment: Accepted to the 2019 AAAI Conference on Artificial Intelligence
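    The two task formats can be pictured with a short generator; the encodings below (fixed-width digit rows for the number-level format, a separator token between numbers for the digit-level format) are assumptions for illustration, not the paper's exact specification.

```python
# Sketch of the two task formats described above. Padding width and the
# separator token are illustrative choices, not the paper's exact encoding.

def fibonacci(n):
    """Generate the first n Fibonacci numbers."""
    a, b, out = 0, 1, []
    for _ in range(n):
        out.append(a)
        a, b = b, a + b
    return out

def number_level(seq, width):
    """Each number as a fixed-width row of digits -> a 2-D grid of digits."""
    return [[int(c) for c in str(x).zfill(width)] for x in seq]

def digit_level(seq, sep=10):
    """One digit per time step, with a separator token between numbers."""
    stream = []
    for x in seq:
        stream += [int(c) for c in str(x)] + [sep]
    return stream

seq = fibonacci(8)                 # [0, 1, 1, 2, 3, 5, 8, 13]
grid = number_level(seq, width=3)  # [[0,0,0], [0,0,1], [0,0,1], [0,0,2], ...]
stream = digit_level(seq)          # [0, 10, 1, 10, 1, 10, 2, 10, ...]
# Prediction target: the next row (number-level) or the next digit (digit-level).
```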