
    Number Sequence Prediction Problems for Evaluating Computational Powers of Neural Networks

    Inspired by number series tests used to measure human intelligence, we propose number sequence prediction tasks to assess neural network models' computational power for solving algorithmic problems. We define the complexity and difficulty of a number sequence prediction task by the structure of the smallest automaton that can generate the sequence. We propose two types of number sequence prediction problems: number-level and digit-level problems. The number-level problems format sequences as 2-dimensional grids of digits, and the digit-level problems provide a single digit input per time step. The complexity of a number-level sequence prediction can be defined by the depth of an equivalent combinatorial logic, and the complexity of a digit-level sequence prediction by an equivalent state automaton for the generation rule. Experiments with number-level sequences suggest that CNN models are capable of learning the compound operations of sequence generation rules, but the depths of the compound operations are limited. For the digit-level problems, simple GRU and LSTM models can solve some problems with the complexity of finite state automata. Memory-augmented models such as Stack-RNN, Attention, and Neural Turing Machines can solve the reverse-order task, which has the complexity of a simple pushdown automaton. However, none of the above can solve general Fibonacci, arithmetic, or geometric sequence generation problems, which represent the complexity of queue automata or Turing machines. These results show that our number sequence prediction problems effectively evaluate machine learning models' computational capabilities. Comment: Accepted to the 2019 AAAI Conference on Artificial Intelligence.
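
    As a concrete illustration of the two task formats, here is a minimal sketch (our own illustration, not the authors' released code) that builds both a number-level digit grid and a digit-level stream from a Fibonacci generation rule; the padding width and separator symbol are illustrative choices.

    ```python
    import numpy as np

    def fibonacci(n_terms, a=0, b=1):
        """Generate the first n_terms of a Fibonacci-style sequence."""
        seq = []
        for _ in range(n_terms):
            seq.append(a)
            a, b = b, a + b
        return seq

    def number_level_grid(seq, width):
        """Format a sequence as a 2-D grid: one zero-padded number per row."""
        return np.array([[int(d) for d in str(x).zfill(width)] for x in seq])

    def digit_level_stream(seq, sep=10):
        """Flatten a sequence into one digit per time step, with a separator
        symbol (here 10) marking number boundaries."""
        stream = []
        for x in seq:
            stream.extend(int(d) for d in str(x))
            stream.append(sep)
        return stream

    seq = fibonacci(8)                      # [0, 1, 1, 2, 3, 5, 8, 13]
    print(number_level_grid(seq, width=2))  # 8x2 grid of digits
    print(digit_level_stream(seq))          # [0, 10, 1, 10, 1, 10, 2, ...]
    # The prediction task: given a prefix of the grid/stream, predict the next
    # row/digit; difficulty is tied to the automaton generating the rule.
    ```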

    Transfer of Temporal Logic Formulas in Reinforcement Learning

    Transferring high-level knowledge from a source task to a target task is an effective way to expedite reinforcement learning (RL). For example, propositional logic and first-order logic have been used as representations of such knowledge. We study the transfer of knowledge between tasks in which the timing of events matters; we call such tasks temporal tasks. We concretize similarity between temporal tasks through a notion of logical transferability, and develop a transfer learning approach between different yet similar temporal tasks. We first propose an inference technique to extract metric interval temporal logic (MITL) formulas in sequential disjunctive normal form from labeled trajectories collected during RL of the two tasks. If logical transferability is identified through this inference, we construct a timed automaton for each sequential conjunctive subformula of the inferred MITL formulas from both tasks. We perform RL on the extended state, which includes the locations and clock valuations of the timed automata, for the source task. We then establish mappings between the corresponding components (clocks, locations, etc.) of the timed automata from the two tasks, and transfer the extended Q-functions based on the established mappings. Finally, we perform RL on the extended state for the target task, starting with the transferred extended Q-functions. Our results in two case studies show that, depending on how similar the source and target tasks are, the sampling efficiency for the target task can be improved by up to one order of magnitude by performing RL in the extended state space, and by up to another order of magnitude using the transferred extended Q-functions. Comment: IJCAI'19.
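
    The two transfer ingredients above, an extended state that tracks automaton locations and clock valuations, and a Q-function carried over through a component mapping, can be sketched as follows; the names, key layout, and clock discretization are illustrative assumptions, not the paper's API.

    ```python
    from collections import defaultdict

    def extended_state(env_state, ta_location, clock, clock_res=1.0):
        # Discretize clock valuations so the extended state space stays finite.
        return (env_state, ta_location, int(clock / clock_res))

    def transfer_q(q_source, location_map, clock_scale=1.0):
        """Warm-start the target task's Q-function from the source task's,
        via a mapping between corresponding timed-automaton components.
        Keys are ((env_state, location, clock_bucket), action)."""
        q_target = defaultdict(float)
        for ((s, loc, c), a), value in q_source.items():
            if loc in location_map:
                q_target[((s, location_map[loc], int(c * clock_scale)), a)] = value
        return q_target

    # Usage: q_target then seeds ordinary tabular Q-learning on the target
    # task, instead of starting from an all-zero table.
    ```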

    Recent Advances in Physical Reservoir Computing: A Review

    Reservoir computing is a computational framework suited for temporal/sequential data processing. It is derived from several recurrent neural network models, including echo state networks and liquid state machines. A reservoir computing system consists of a reservoir for mapping inputs into a high-dimensional space and a readout for pattern analysis of the high-dimensional states in the reservoir. The reservoir is fixed, and only the readout is trained with a simple method such as linear regression or classification. Thus, the major advantage of reservoir computing over other recurrent neural networks is fast learning, resulting in low training cost. Another advantage is that the reservoir, requiring no adaptive updating, is amenable to hardware implementation using a variety of physical systems, substrates, and devices. Indeed, such physical reservoir computing has attracted increasing attention in diverse fields of research. The purpose of this review is to provide an overview of recent advances in physical reservoir computing by classifying them according to the type of reservoir. We discuss current issues and perspectives related to physical reservoir computing, in order to further expand its practical applications and develop next-generation machine learning systems. Comment: 62 pages, 13 figures.
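
    The fixed-reservoir/trained-readout split is easy to see in a software analogue of an echo state network; this is a minimal sketch with a toy next-step prediction task, and the sizes, scalings, and ridge parameter are illustrative choices.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_res = 1, 200

    # The reservoir weights are random and stay fixed; only W_out is trained.
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))
    W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # spectral radius < 1

    def run_reservoir(u_seq):
        """Map an input sequence into the reservoir's high-dimensional states."""
        x = np.zeros(n_res)
        states = []
        for u in u_seq:
            x = np.tanh(W_in @ np.atleast_1d(u) + W @ x)
            states.append(x.copy())
        return np.array(states)

    # Train the linear readout by ridge regression to predict the next value
    # of a sine wave (toy target).
    u = np.sin(np.linspace(0, 20 * np.pi, 2000))
    X, y = run_reservoir(u[:-1]), u[1:]
    ridge = 1e-6
    W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ y)
    print("train MSE:", np.mean((X @ W_out - y) ** 2))
    ```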

    Learning Simple Algorithms from Examples

    We present an approach for learning simple algorithms such as copying, multi-digit addition, and single-digit multiplication directly from examples. Our framework consists of a set of interfaces accessed by a controller. Typical interfaces are 1-D tapes or 2-D grids that hold the input and output data. For the controller, we explore a range of neural network-based models which vary in their ability to abstract the underlying algorithm from training instances and generalize to test examples with many thousands of digits. The controller is trained using Q-learning with several enhancements, and we show that the bottleneck is in the capabilities of the controller rather than in the search incurred by Q-learning.
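
    A minimal sketch of the interface/controller split described above, with a 1-D input tape and an output interface; the action encoding is an illustrative choice, and the hand-coded controller stands in for the neural controller that the paper trains with Q-learning.

    ```python
    class Tape:
        """1-D read-only input tape with a movable head."""
        def __init__(self, symbols):
            self.symbols, self.pos = list(symbols), 0

        def read(self):
            if 0 <= self.pos < len(self.symbols):
                return self.symbols[self.pos]
            return None                      # head has run off the tape

        def move(self, direction):           # direction in {-1, +1}
            self.pos += direction

    def run_episode(tape, controller, max_steps=100):
        """Roll out a controller that, at each step, reads the tape, optionally
        writes a symbol to the output interface, and moves the head."""
        output = []
        for _ in range(max_steps):
            obs = tape.read()
            if obs is None:
                break
            write_symbol, direction = controller(obs)
            if write_symbol is not None:
                output.append(write_symbol)
            tape.move(direction)
        return output

    # A hand-coded controller that solves the copy task; a learned controller
    # would replace this lambda while using the same interfaces.
    print(run_episode(Tape("31415"), lambda s: (s, +1)))  # ['3','1','4','1','5']
    ```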

    An Interdisciplinary Comparison of Sequence Modeling Methods for Next-Element Prediction

    Data of a sequential nature arise in many application domains in the form of, e.g., textual data, DNA sequences, and software execution traces. Different research disciplines have developed methods to learn sequence models from such datasets: (i) in the machine learning field, methods such as (hidden) Markov models and recurrent neural networks have been developed and successfully applied to a wide range of tasks; (ii) in process mining, process discovery techniques aim to generate human-interpretable descriptive models; and (iii) in the grammar inference field, the focus is on finding descriptive models in the form of formal grammars. Despite their different focuses, these fields share a common goal: learning a model that accurately describes the behavior in the underlying data. Those sequence models are generative, i.e., they can predict what elements are likely to occur after a given unfinished sequence. So far, these fields have developed mainly in isolation from each other, and no comparison exists. This paper presents an interdisciplinary experimental evaluation that compares sequence modeling techniques on the task of next-element prediction on four real-life sequence datasets. The results indicate that the machine learning techniques, which generally do not aim at interpretability, outperform in terms of accuracy the techniques from the process mining and grammar inference fields that aim to yield interpretable models.
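
    For concreteness, one of the simplest model families in such a comparison is a first-order Markov model for next-element prediction; this sketch (our own illustration, not the paper's benchmark code) fits bigram counts and predicts the most likely successor.

    ```python
    from collections import Counter, defaultdict

    def fit_markov(sequences):
        """Count symbol-to-symbol transitions across all training sequences."""
        counts = defaultdict(Counter)
        for seq in sequences:
            for a, b in zip(seq, seq[1:]):
                counts[a][b] += 1
        return counts

    def predict_next(counts, element):
        """Return the most likely successor of `element`, or None if unseen."""
        successors = counts.get(element)
        return successors.most_common(1)[0][0] if successors else None

    model = fit_markov(["abcabc", "abcb"])
    print(predict_next(model, "a"))   # 'b'
    ```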

    Explaining Black Boxes on Sequential Data using Weighted Automata

    Understanding how a learned black box works is of crucial interest for the future of Machine Learning. In this paper, we pioneer the question of the global interpretability of learned black box models that assign numerical values to symbolic sequential data. To tackle that task, we propose a spectral algorithm for the extraction of weighted automata (WA) from such black boxes. This algorithm does not require access to a dataset or to the inner representation of the black box: the inferred model can be obtained solely by querying the black box, feeding it with inputs and analyzing its outputs. Experiments using Recurrent Neural Networks (RNN) trained on a wide collection of 48 synthetic datasets and 2 real datasets show that the obtained approximation is of high quality. Comment: Published in the Proceedings of the International Conference on Grammatical Inference, September 2018.
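
    The query-only flavor of spectral WA extraction can be sketched as follows: fill a Hankel matrix purely by querying the black box, take a truncated SVD, and read off the WA parameters. The prefix/suffix basis and rank are illustrative assumptions, and a simple closed-form function stands in for a trained RNN.

    ```python
    import numpy as np
    from itertools import product

    def extract_wa(f, alphabet, max_len=3, rank=2):
        """Spectral extraction of a weighted automaton from a black box f that
        maps strings to numbers, using only queries to f (no dataset, no
        access to the model's internals)."""
        basis = [""] + ["".join(w) for L in range(1, max_len + 1)
                        for w in product(alphabet, repeat=L)]
        H = np.array([[f(p + s) for s in basis] for p in basis])  # Hankel
        _, _, Vt = np.linalg.svd(H)
        V = Vt[:rank].T                          # top right singular vectors
        P = np.linalg.pinv(H @ V)                # pseudo-inverse of forward map
        A = {a: P @ np.array([[f(p + a + s) for s in basis]
                              for p in basis]) @ V
             for a in alphabet}                  # one transition matrix per symbol
        alpha = H[0] @ V                         # initial weights (empty prefix)
        omega = P @ H[:, 0]                      # final weights (empty suffix)
        return alpha, A, omega

    def wa_value(alpha, A, omega, word):
        v = alpha
        for a in word:
            v = v @ A[a]
        return float(v @ omega)

    black_box = lambda w: 0.5 ** len(w)          # stand-in for a trained RNN
    a0, A, om = extract_wa(black_box, "ab", rank=1)
    print(wa_value(a0, A, om, "ab"), black_box("ab"))   # both ~0.25
    ```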

    Verification Artifacts in Cooperative Verification: Survey and Unifying Component Framework

    The goal of cooperative verification is to combine verification approaches in such a way that they work together to verify a system model. In particular, cooperative verifiers provide exchangeable information (verification artifacts) to other verifiers, or consume such information from other verifiers, with the goal of increasing the overall effectiveness and efficiency of the verification process. This paper first gives an overview of approaches for leveraging the strengths of different techniques, algorithms, and tools in order to increase the power and abilities of the state of the art in software verification. Second, we specifically outline cooperative verification approaches and discuss their employed verification artifacts. We formalize all artifacts in a uniform way, thereby fixing their semantics and providing verifiers with a precise meaning of the exchanged information. Comment: 22 pages, 12 figures.
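
    To make the artifact-exchange idea concrete, here is a hedged sketch of a uniform artifact wrapper and a sequential cooperation loop; the field names, artifact kinds, and verifier signature are illustrative assumptions, not the paper's formalization.

    ```python
    from dataclasses import dataclass, field

    @dataclass
    class Artifact:
        kind: str        # e.g. "violation_witness", "invariant", "condition"
        producer: str    # which verifier emitted it
        content: dict    # the payload, e.g. an automaton or a formula
        meta: dict = field(default_factory=dict)

    def cooperate(program, verifiers):
        """Sequential cooperation: each verifier consumes the artifacts
        produced so far and may add its own, refining the overall verdict."""
        artifacts, verdict = [], "unknown"
        for verify in verifiers:
            verdict, new = verify(program, artifacts)
            artifacts.extend(new)
            if verdict in ("true", "false"):
                break                    # a definitive result ends cooperation
        return verdict, artifacts
    ```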

    Understanding Recurrent Neural Architectures by Analyzing and Synthesizing Long Distance Dependencies in Benchmark Sequential Datasets

    In order to build efficient deep recurrent neural architectures, it is essential to analyze the complexity of the long distance dependencies (LDDs) of the dataset being modeled. In this context, in this paper, we present a detailed analysis of the complexity and the degree of LDDs (or LDD characteristics) exhibited by various sequential benchmark datasets. We observe that datasets sampled from a similar process or task (e.g., natural language, sequential MNIST, etc.) display similar LDD characteristics. Upon analysing the LDD characteristics, we were able to analyze the factors influencing them, such as (i) the number of unique symbols in a dataset, (ii) the size of the dataset, (iii) the number of interacting symbols within a given LDD, and (iv) the distance between the interacting symbols. We demonstrate that analysing LDD characteristics can inform the selection of optimal hyper-parameters for SOTA deep recurrent neural architectures. This analysis can directly contribute to the development of more accurate and efficient sequential models. We also introduce the use of Strictly k-Piecewise languages as a process to generate synthesized datasets for language modelling. The advantage of these synthesized datasets is that they enable targeted testing of deep recurrent neural architectures in terms of their ability to model LDDs with different characteristics. Moreover, using a variety of Strictly k-Piecewise languages we generate a number of new benchmarking datasets, and analyse the performance of a number of SOTA recurrent architectures on these new benchmarks.
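
    Strictly k-Piecewise (SP-k) languages are defined by forbidding a set of length-k subsequences, which makes them a natural probe for long distance dependencies: the interacting symbols can be arbitrarily far apart. Below is a minimal sketch of an SP-language membership test and dataset sampler; the specific alphabet and forbidden set are illustrative.

    ```python
    import random

    def has_subsequence(string, sub):
        """True iff sub occurs in string as a (not necessarily contiguous)
        subsequence."""
        it = iter(string)
        return all(ch in it for ch in sub)

    def in_sp_language(string, forbidden):
        """A string is in the SP language iff it contains none of the
        forbidden subsequences (which may span arbitrary distances)."""
        return not any(has_subsequence(string, f) for f in forbidden)

    def sample_dataset(alphabet, forbidden, n, length, seed=0):
        """Sample n positive and n negative fixed-length strings."""
        rng = random.Random(seed)
        pos, neg = [], []
        while len(pos) < n or len(neg) < n:
            s = "".join(rng.choice(alphabet) for _ in range(length))
            (pos if in_sp_language(s, forbidden) else neg).append(s)
        return pos[:n], neg[:n]

    # SP-2 example: 'a' may never be followed, at any distance, by 'b'.
    pos, neg = sample_dataset("abcd", forbidden={"ab"}, n=5, length=12)
    print(pos[0], neg[0])
    ```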

    Compositional planning in Markov decision processes: Temporal abstraction meets generalized logic composition

    In hierarchical planning for Markov decision processes (MDPs), temporal abstraction allows planning with macro-actions that take place at different time scales in the form of sequential composition. In this paper, we propose a novel approach to compositional reasoning and hierarchical planning for MDPs under temporal logic constraints. In addition to sequential composition, we introduce a composition of policies based on generalized logic composition: given sub-policies for sub-tasks and a new task expressed as a logic composition of the sub-tasks, a semi-optimal policy, which is optimal among plans using only the sub-policies, can be obtained by simply composing the sub-policies. Thus, a synthesis algorithm is developed to compute optimal policies efficiently by planning with primitive actions, policies for sub-tasks, and the compositions of sub-policies, for maximizing the probability of satisfying temporal logic specifications. We demonstrate the correctness and efficiency of the proposed method in stochastic planning examples with a single agent and multiple task specifications. Comment: 8 pages, 4 figures, 2 tables, accepted as a conference paper for presentation at the American Control Conference 2019.
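
    As a toy instance of composing sub-policies under a logic composition of tasks, the sketch below uses max-composition of sub-task Q-functions for a disjunctive specification ("reach A or reach B"); the tables and the choice of max as the composition operator are illustrative assumptions, not the paper's full synthesis algorithm.

    ```python
    import numpy as np

    def compose_or(q_subtasks):
        """Compose sub-task Q-functions for a disjunctive task by taking the
        elementwise max; acting greedily on the result is optimal among
        policies built only from the sub-policies ("semi-optimal")."""
        return np.max(np.stack(q_subtasks), axis=0)

    def greedy_policy(q):
        return np.argmax(q, axis=1)        # one action per state

    # Toy 2-state, 2-action Q-tables for two sub-tasks.
    q_reach_A = np.array([[0.9, 0.1], [0.5, 0.2]])
    q_reach_B = np.array([[0.2, 0.8], [0.1, 0.7]])
    q_or = compose_or([q_reach_A, q_reach_B])
    print(greedy_policy(q_or))             # [0 1]: best sub-goal per state
    ```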

    A Survey on Two Dimensional Cellular Automata and Its Application in Image Processing

    Parallel algorithms for image processing tasks are in high demand in the modern world. Cellular automata (CA) are among the most common and simple models of parallel computation, and they have been successfully used in the domain of image processing for a number of years. This paper provides a survey of the available literature on methodologies employed by different researchers to utilize cellular automata for solving important image processing problems. The survey covers image processing tasks such as rotation, zooming, translation, segmentation, edge detection, compression, and noise reduction. Finally, the experimental results of some of these methodologies are presented. Comment: 10 pages, 10 figures, 4 tables.
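
    One of the surveyed task families, noise reduction, has a particularly simple CA formulation; the sketch below (our own toy illustration) applies one synchronous majority-rule update over 3x3 Moore neighborhoods to remove isolated noise pixels from a binary image.

    ```python
    import numpy as np

    def ca_denoise_step(img):
        """One synchronous CA update with the majority rule on a binary (0/1)
        image: each cell adopts the majority value of its 3x3 neighborhood,
        which suppresses isolated salt-and-pepper noise."""
        padded = np.pad(img, 1, mode="edge")
        out = np.zeros_like(img)
        for i in range(img.shape[0]):
            for j in range(img.shape[1]):
                neighborhood = padded[i:i + 3, j:j + 3]
                out[i, j] = 1 if neighborhood.sum() > 4 else 0  # majority of 9
        return out

    noisy = np.zeros((6, 6), dtype=int)
    noisy[2, 3] = 1                        # isolated noise pixel
    print(ca_denoise_step(noisy).sum())    # 0: the speck is removed
    ```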