Number Sequence Prediction Problems for Evaluating Computational Powers of Neural Networks
Inspired by number series tests to measure human intelligence, we suggest
number sequence prediction tasks to assess neural network models' computational
powers for solving algorithmic problems. We define the complexity and
difficulty of a number sequence prediction task with the structure of the
smallest automaton that can generate the sequence. We suggest two types of
number sequence prediction problems: the number-level and the digit-level
problems. The number-level problems format sequences as 2-dimensional grids of
digits, and the digit-level problems provide a single digit input per time
step. The complexity of a number-level sequence prediction can be defined with
the depth of an equivalent combinatorial logic, and the complexity of a
digit-level sequence prediction can be defined with an equivalent state
automaton for the generation rule. Experiments with number-level sequences
suggest that CNN models are capable of learning the compound operations of
sequence generation rules, but the depths of the compound operations are
limited. For the digit-level problems, simple GRU and LSTM models can solve
some problems with the complexity of finite state automata. Memory augmented
models such as Stack-RNN, Attention, and Neural Turing Machines can solve the
reverse-order task, which has the complexity of a simple pushdown automaton.
However, none of the above can solve general Fibonacci, arithmetic, or geometric
sequence generation problems, which represent the complexity of queue automata or
Turing machines. The results show that our number sequence prediction problems
effectively evaluate machine learning models' computational capabilities.
Comment: Accepted to the 2019 AAAI Conference on Artificial Intelligence
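To make the digit-level format concrete, the following Python sketch (our own illustration, not the authors' released code; the delimiter token and digit ordering are assumptions) builds one-step-ahead prediction pairs for an arithmetic sequence:

    # Sketch: build digit-level input/target streams for an arithmetic sequence.
    # The delimiter token and digit ordering are illustrative assumptions.

    def arithmetic_sequence(start, step, length):
        """Generate an arithmetic progression of integers."""
        return [start + i * step for i in range(length)]

    def to_digit_stream(numbers, delimiter=","):
        """Flatten numbers into a per-time-step stream of digit tokens."""
        tokens = []
        for n in numbers:
            tokens.extend(str(n))   # one digit per time step
            tokens.append(delimiter)
        return tokens

    seq = arithmetic_sequence(start=3, step=7, length=5)   # [3, 10, 17, 24, 31]
    stream = to_digit_stream(seq)
    # Input is the stream shifted by one; the model predicts the next token.
    inputs, targets = stream[:-1], stream[1:]
    print(inputs)
    print(targets)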
Transfer of Temporal Logic Formulas in Reinforcement Learning
Transferring high-level knowledge from a source task to a target task is an
effective way to expedite reinforcement learning (RL). For example,
propositional logic and first-order logic have been used as representations of
such knowledge. We study the transfer of knowledge between tasks in which the
timing of the events matters. We call such tasks temporal tasks. We concretize
similarity between temporal tasks through a notion of logical transferability,
and develop a transfer learning approach between different yet similar temporal
tasks. We first propose an inference technique to extract metric interval
temporal logic (MITL) formulas in sequential disjunctive normal form from
labeled trajectories collected in RL of the two tasks. If logical
transferability is identified through this inference, we construct a timed
automaton for each sequential conjunctive subformula of the inferred MITL
formulas from both tasks. We perform RL on the extended state which includes
the locations and clock valuations of the timed automata for the source task.
We then establish mappings between the corresponding components (clocks,
locations, etc.) of the timed automata from the two tasks, and transfer the
extended Q-functions based on the established mappings. Finally, we perform RL
on the extended state for the target task, starting with the transferred
extended Q-functions. Our results in two case studies show, depending on how
similar the source task and the target task are, that the sampling efficiency
for the target task can be improved by up to one order of magnitude by
performing RL in the extended state space, and further improved by up to
another order of magnitude using the transferred extended Q-functions.Comment: IJCAI'1
Recent Advances in Physical Reservoir Computing: A Review
Reservoir computing is a computational framework suited for
temporal/sequential data processing. It is derived from several recurrent
neural network models, including echo state networks and liquid state machines.
A reservoir computing system consists of a reservoir for mapping inputs into a
high-dimensional space and a readout for pattern analysis from the
high-dimensional states in the reservoir. The reservoir is fixed and only the
readout is trained with a simple method such as linear regression or
classification. Thus, the major advantage of reservoir computing compared to
other recurrent neural networks is fast learning, resulting in low training
cost. Another advantage is that the reservoir without adaptive updating is
amenable to hardware implementation using a variety of physical systems,
substrates, and devices. In fact, such physical reservoir computing has
attracted increasing attention in diverse fields of research. The purpose of
this review is to provide an overview of recent advances in physical reservoir
computing by classifying them according to the type of the reservoir. We
discuss the current issues and perspectives related to physical reservoir
computing, in order to further expand its practical applications and develop
next-generation machine learning systems.
Comment: 62 pages, 13 figures
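As one concrete instance of the framework, independent of any physical substrate, the following minimal echo state network sketch shows the fixed random reservoir plus a linear readout trained by ridge regression; the sizes and scaling constants are illustrative assumptions.

    import numpy as np

    # Sketch: minimal echo state network. The reservoir weights stay fixed;
    # only the linear readout is trained (ridge regression).
    rng = np.random.default_rng(0)
    n_in, n_res = 1, 100
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))
    W *= 0.9 / np.abs(np.linalg.eigvals(W)).max()   # spectral radius below 1

    def run_reservoir(u):
        """Map an input sequence into the high-dimensional reservoir states."""
        x, states = np.zeros(n_res), []
        for u_t in u:
            x = np.tanh(W_in @ np.atleast_1d(u_t) + W @ x)
            states.append(x)
        return np.array(states)

    # Toy task: one-step-ahead prediction of a sine wave.
    u = np.sin(np.linspace(0, 20 * np.pi, 1000))
    X, y = run_reservoir(u[:-1]), u[1:]
    ridge = 1e-6
    W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ y)
    print("train MSE:", np.mean((X @ W_out - y) ** 2))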
Learning Simple Algorithms from Examples
We present an approach for learning simple algorithms such as copying,
multi-digit addition and single digit multiplication directly from examples.
Our framework consists of a set of interfaces, accessed by a controller.
Typical interfaces are 1-D tapes or 2-D grids that hold the input and output
data. For the controller, we explore a range of neural network-based models
which vary in their ability to abstract the underlying algorithm from training
instances and generalize to test examples with many thousands of digits. The
controller is trained using -learning with several enhancements and we show
that the bottleneck is in the capabilities of the controller rather than in the
search incurred by -learning
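The interface/controller split can be illustrated with a deliberately simplified sketch for the copy task (plain tabular Q-learning here, not the enhanced variant described above): the controller observes the symbol under a read head on a 1-D tape and learns which symbol to write.

    import random

    # Sketch: the controller observes the symbol under a read head on a 1-D
    # input tape and chooses a write action; tabular Q-learning rewards
    # correct output symbols. One-step episodes, so no bootstrapping.
    random.seed(0)
    ACTIONS = [0, 1]                      # symbol to write; head moves right
    Q = {(s, a): 0.0 for s in ACTIONS for a in ACTIONS}
    alpha, eps = 0.5, 0.1

    for episode in range(200):
        tape = [random.randint(0, 1) for _ in range(8)]
        for s in tape:                    # state = symbol under the read head
            if random.random() < eps:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: Q[(s, act)])
            r = 1.0 if a == s else -1.0   # reward for writing the right symbol
            Q[(s, a)] += alpha * (r - Q[(s, a)])

    greedy = {s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in ACTIONS}
    print(greedy)   # expected: {0: 0, 1: 1}, i.e., copy the observed symbol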
An Interdisciplinary Comparison of Sequence Modeling Methods for Next-Element Prediction
Data of a sequential nature arise in many application domains in the form of, e.g.,
textual data, DNA sequences, and software execution traces. Different research
disciplines have developed methods to learn sequence models from such datasets:
(i) in the machine learning field methods such as (hidden) Markov models and
recurrent neural networks have been developed and successfully applied to a
wide range of tasks, (ii) in process mining, process discovery techniques aim to
generate human-interpretable descriptive models, and (iii) in the grammar
inference field the focus is on finding descriptive models in the form of
formal grammars. Despite their different focuses, these fields share a common
goal - learning a model that accurately describes the behavior in the
underlying data. Those sequence models are generative, i.e., they can predict
what elements are likely to occur after a given unfinished sequence. So far,
these fields have developed mainly in isolation from each other and no
comparison exists. This paper presents an interdisciplinary experimental
evaluation that compares sequence modeling techniques on the task of
next-element prediction on four real-life sequence datasets. The results
indicate that machine learning techniques, which generally do not aim at
interpretability, outperform in terms of accuracy the techniques from the
process mining and grammar inference fields that aim to yield interpretable
models.
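As a minimal representative of the compared model families (a sketch, not the paper's evaluation pipeline), a first-order Markov model already yields a generative next-element predictor:

    from collections import Counter, defaultdict

    # Sketch: first-order Markov model for next-element prediction, one of the
    # simplest representatives of the sequence-model families compared.
    def fit_markov(sequences):
        counts = defaultdict(Counter)
        for seq in sequences:
            for cur, nxt in zip(seq, seq[1:]):
                counts[cur][nxt] += 1
        return counts

    def predict_next(counts, prefix):
        """Most likely element to follow an unfinished sequence."""
        history = counts.get(prefix[-1])
        return history.most_common(1)[0][0] if history else None

    train = [list("abcabd"), list("abcabc")]
    model = fit_markov(train)
    print(predict_next(model, list("ab")))   # 'c'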
Explaining Black Boxes on Sequential Data using Weighted Automata
Understanding how a learned black box works is of crucial interest for the
future of Machine Learning. In this paper, we pioneer the question of the
global interpretability of learned black box models that assign numerical
values to symbolic sequential data. To tackle that task, we propose a spectral
algorithm for the extraction of weighted automata (WA) from such black boxes.
This algorithm does not require access to a dataset or to the inner
representation of the black box: the inferred model can be obtained solely by
querying the black box, feeding it with inputs and analyzing its outputs.
Experiments using Recurrent Neural Networks (RNN) trained on a wide collection
of 48 synthetic datasets and 2 real datasets show that the obtained
approximation is of great quality.
Comment: Published in the Proceedings of the International Conference on Grammatical Inference, September 2018
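The spectral idea can be sketched as follows (a simplified illustration with a toy stand-in for the black box; full recovery of the WA transition maps is omitted): query the black box on prefix-suffix concatenations to fill a Hankel matrix, whose numerical rank indicates the number of WA states.

    import numpy as np
    from itertools import product

    # Sketch: fill a Hankel matrix by querying the black box on prefix+suffix
    # concatenations; its numerical rank gives the number of WA states.
    def black_box(word):
        """Toy stand-in for the learned model: a length-discounted 'a' count."""
        return 0.5 ** len(word) * word.count("a")

    alphabet = ("a", "b")
    basis = [""] + ["".join(w) for n in (1, 2)
                    for w in product(alphabet, repeat=n)]
    H = np.array([[black_box(p + s) for s in basis] for p in basis])

    u, svals, vt = np.linalg.svd(H)
    rank = int(np.sum(svals > 1e-10))
    print("estimated number of WA states:", rank)   # 2 for this toy series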
Verification Artifacts in Cooperative Verification: Survey and Unifying Component Framework
The goal of cooperative verification is to combine verification approaches in
such a way that they work together to verify a system model. In particular,
cooperative verifiers provide exchangeable information (verification artifacts)
to other verifiers or consume such information from other verifiers with the
goal of increasing the overall effectiveness and efficiency of the verification
process. This paper first gives an overview of approaches for leveraging
strengths of different techniques, algorithms, and tools in order to increase
the power and abilities of the state of the art in software verification.
Second, we specifically outline cooperative verification approaches and discuss
their employed verification artifacts. We formalize all artifacts in a uniform
way, thereby fixing their semantics and providing verifiers with a precise
meaning of the exchanged information.
Comment: 22 pages, 12 figures
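A minimal sketch of the uniform-artifact idea (the field names and artifact kinds below are hypothetical, not the paper's formalization): a common typed wrapper lets one verifier's output be consumed by another.

    from dataclasses import dataclass

    # Sketch: a uniform wrapper for exchangeable verification artifacts, so one
    # verifier's output can be consumed by another. The fields and kinds below
    # are hypothetical illustrations, not the paper's formal definitions.
    @dataclass(frozen=True)
    class Artifact:
        kind: str        # e.g. "witness", "invariant", "condition"
        producer: str    # tool that generated the artifact
        payload: str     # serialized content with a fixed, documented semantics

    def cooperate(verifiers, task):
        """Run verifiers in sequence, feeding each the artifacts so far."""
        artifacts = []
        for verify in verifiers:
            verdict, new_artifacts = verify(task, artifacts)
            artifacts.extend(new_artifacts)
            if verdict is not None:    # a verifier reached a conclusive result
                return verdict, artifacts
        return None, artifacts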
Understanding Recurrent Neural Architectures by Analyzing and Synthesizing Long Distance Dependencies in Benchmark Sequential Datasets
In order to build efficient deep recurrent neural architectures, it is
essential to analyze the complexity of long distance dependencies (LDDs) of
the dataset being modeled. In this context, in this paper, we present a
detailed analysis of the complexity and the degree of LDDs (or LDD
characteristics) exhibited by various sequential benchmark datasets. We
observe that datasets sampled from a similar process or task (e.g., natural
language, or sequential MNIST, etc.) display similar LDD characteristics.
Upon analysing the LDD characteristics, we were able to analyze the factors
influencing them, such as (i) the number of unique symbols in a dataset,
(ii) the size of the dataset, (iii) the number of interacting symbols within
a given LDD, and (iv) the distance between the interacting symbols. We
demonstrate that analysing LDD characteristics can inform the selection of
optimal hyper-parameters for SOTA deep recurrent neural architectures. This
analysis can directly contribute to the development of more accurate and
efficient sequential models. We also introduce the use of Strictly
k-Piecewise languages as a process to generate synthesized datasets for
language modelling. The advantage of these synthesized datasets is that they
enable targeted testing of deep recurrent neural architectures in terms of
their ability to model LDDs with different characteristics. Moreover, using
a variety of Strictly k-Piecewise languages we generate a number of new
benchmarking datasets, and analyse the performance of a number of SOTA
recurrent architectures on these new benchmarks.
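The Strictly k-Piecewise construction can be sketched as follows (an illustrative Strictly 2-Piecewise example, not the paper's exact generation procedure): forbidding a scattered subsequence creates dependencies between symbols at arbitrary distances, which is what makes the LDD profile controllable.

    import random
    from itertools import combinations

    # Sketch: rejection-sample strings from a Strictly 2-Piecewise language
    # that forbids the scattered subsequence ('a', 'b'): no 'b' may follow an
    # 'a' anywhere in the string. Forbidding longer subsequences over longer
    # strings varies the distance between interacting symbols.
    def violates(s, forbidden=("a", "b")):
        return any(pair == forbidden for pair in combinations(s, 2))

    def sample_sp2(length, alphabet="abc", seed=0):
        rng = random.Random(seed)
        while True:
            s = [rng.choice(alphabet) for _ in range(length)]
            if not violates(s):
                return "".join(s)

    print(sample_sp2(12))   # contains no 'b' after any 'a'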
Compositional planning in Markov decision processes: Temporal abstraction meets generalized logic composition
In hierarchical planning for Markov decision processes (MDPs), temporal
abstraction allows planning with macro-actions that take place at different
time scales in the form of sequential composition. In this paper, we propose a novel
approach to compositional reasoning and hierarchical planning for MDPs under
temporal logic constraints. In addition to sequential composition, we introduce
a composition of policies based on generalized logic composition: Given
sub-policies for sub-tasks and a new task expressed as logic compositions of
subtasks, a semi-optimal policy, which is optimal in planning with only
sub-policies, can be obtained by simply composing sub-policies. Thus, a
synthesis algorithm is developed to compute optimal policies efficiently by
planning with primitive actions, policies for sub-tasks, and the compositions
of sub-policies, for maximizing the probability of satisfying temporal logic
specifications. We demonstrate the correctness and efficiency of the proposed
method in stochastic planning examples with a single agent and multiple task
specifications.
Comment: 8 pages, 4 figures, 2 tables, accepted as a conference paper for presentation at the American Control Conference 2019
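The disjunctive case can be sketched in a few lines (toy values and hypothetical sub-tasks; taking the elementwise maximum is one natural instantiation of disjunctive composition, assumed here for illustration): for a task expressed as the logical OR of sub-tasks, acting greedily on the composed Q-values uses only the sub-policies' knowledge.

    import numpy as np

    # Sketch: given Q-functions for two sub-tasks, a task expressed as their
    # disjunction can be served by acting greedily on the elementwise maximum,
    # giving a policy that is semi-optimal relative to the sub-policies.
    # States, actions, and values here are toy placeholders.
    Q_reach_A = np.array([[0.9, 0.1], [0.6, 0.2], [0.1, 0.0], [0.0, 0.0]])
    Q_reach_B = np.array([[0.0, 0.2], [0.1, 0.7], [0.3, 0.8], [0.9, 0.1]])

    Q_or = np.maximum(Q_reach_A, Q_reach_B)   # task: reach A OR reach B
    policy = Q_or.argmax(axis=1)              # greedy composed policy
    print(policy)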
A Survey on Two Dimensional Cellular Automata and Its Application in Image Processing
Parallel algorithms for solving image processing tasks are in high demand in
the modern world. Cellular Automata (CA) are among the simplest and most
common models of parallel computation, and they have accordingly been used
successfully in the domain of image processing for the last couple of years.
This paper provides a survey of the available literature on methodologies
employed by different researchers to utilize cellular automata for solving
important problems in image processing. The survey covers image processing
tasks such as rotation, zooming, translation, segmentation, edge detection,
compression, and noise reduction of images. Finally, the experimental results
of some methodologies are presented.
Comment: 10 pages, 10 figures, 4 tables
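One of the surveyed task families, edge detection, admits a very small CA sketch (the specific rule below is an illustrative choice, not one of the surveyed methods): a cell fires if it is foreground and at least one of its von Neumann neighbors is background.

    import numpy as np

    # Sketch: a 2-D cellular automaton rule for binary edge detection. A cell
    # is marked as an edge if it is foreground (1) and at least one of its von
    # Neumann neighbors is background (0).
    def ca_edges(img):
        padded = np.pad(img, 1)
        up, down = padded[:-2, 1:-1], padded[2:, 1:-1]
        left, right = padded[1:-1, :-2], padded[1:-1, 2:]
        has_bg_neighbor = (up & down & left & right) == 0
        return (img == 1) & has_bg_neighbor

    img = np.zeros((7, 7), dtype=int)
    img[2:5, 2:5] = 1                     # a filled 3x3 square
    print(ca_edges(img).astype(int))      # only the square's border cells fire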