
    Deep Learning Architectures for Novel Problems

    With convolutional neural networks revolutionizing the computer vision field, it is important to extend the capabilities of neural-based systems to dynamic and unrestricted data such as graphs. Doing so not only expands the applications of such systems but also provides more insight into how neural-based systems can be improved. Currently, most implementations of graph neural networks are based on vertex filtering on fixed adjacency matrices. Although important for many applications, vertex filtering restricts these systems to vertex-focused graphs and cannot be efficiently extended to edge-focused graphs such as social networks; applications of current systems are mostly limited to images and document references. Beyond the graph applications, this work also explored the use of convolutional neural networks for Intelligent Character Recognition in a novel way. Most systems define Intelligent Character Recognition either as a recurrent classification problem or as image classification. This achieves great performance in a limited environment but does not generalize well to real-world applications. This work instead defines Intelligent Character Recognition as a segmentation problem, which we show provides many benefits. The goal of this work was to explore alternatives to current graph neural network implementations, as well as new applications of such systems. It also focused on improving Intelligent Character Recognition on isolated words using deep learning techniques. Due to the contrast between these two contributions, this document is divided into Part I, focusing on the graph work, and Part II, focusing on the Intelligent Character Recognition work.
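    As context for the vertex-filtering formulation the abstract critiques, the following is a minimal sketch of a graph convolution layer operating on a fixed adjacency matrix. It assumes the widely used normalized propagation rule of Kipf and Welling, which is one common instance of vertex filtering, not necessarily the exact variant discussed in this work:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One vertex-filtering (graph convolution) layer on a fixed adjacency matrix.

    A: (n, n) adjacency, H: (n, d_in) node features, W: (d_in, d_out) weights.
    Uses the common normalized rule H' = ReLU(D^-1/2 (A + I) D^-1/2 H W).
    """
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))  # D^-1/2 as a vector
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)         # fixed graph filter + ReLU

# Toy example: a 3-node path graph with 2-dim node features.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H = np.random.randn(3, 2)
W = np.random.randn(2, 4)
print(gcn_layer(A, H, W).shape)  # (3, 4)
```

    Note that the filter is tied to this particular adjacency matrix A, which illustrates why such layers do not transfer directly to graphs whose structure, rather than whose vertices, carries the signal.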

    Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents

    Evolution strategies (ES) are a family of black-box optimization algorithms able to train deep neural networks roughly as well as Q-learning and policy gradient methods on challenging deep reinforcement learning (RL) problems, but are much faster (e.g. hours vs. days) because they parallelize better. However, many RL problems require directed exploration because they have reward functions that are sparse or deceptive (i.e. contain local optima), and it is unknown how to encourage such exploration with ES. Here we show that algorithms invented to promote directed exploration in small-scale evolved neural networks via populations of exploring agents, specifically novelty search (NS) and quality diversity (QD) algorithms, can be hybridized with ES to improve its performance on sparse or deceptive deep RL tasks, while retaining scalability. Our experiments confirm that the resultant new algorithms, NS-ES and two QD algorithms, NSR-ES and NSRA-ES, avoid local optima encountered by ES to achieve higher performance on Atari and on simulated robots learning to walk around a deceptive trap. This paper thus introduces a family of fast, scalable algorithms for reinforcement learning that are capable of directed exploration. It also adds this new family of exploration algorithms to the RL toolbox, and raises the interesting possibility that analogous algorithms with multiple simultaneous paths of exploration might also combine well with existing RL algorithms outside ES.
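    To make the hybridization concrete, here is a minimal single-agent sketch of an NS-ES-style update: the standard ES gradient estimate, but computed from novelty scores (mean distance to the k nearest behaviors in an archive) rather than from reward. The toy behavior function and all hyperparameters are illustrative assumptions; the paper's full algorithm additionally maintains a meta-population of agents and, in the NSR-ES/NSRA-ES variants, mixes reward into the ranking.

```python
import numpy as np

def novelty(bc, archive, k=5):
    """Novelty of a behavior characterization: mean distance to its
    k nearest neighbors in the archive of previously seen behaviors."""
    dists = np.sort([np.linalg.norm(bc - a) for a in archive])
    return dists[:k].mean()

def ns_es_step(theta, behavior_fn, archive, npop=50, sigma=0.1, alpha=0.01):
    """One NS-ES update: the usual ES gradient estimate over Gaussian
    parameter perturbations, scored by novelty instead of episode reward."""
    eps = np.random.randn(npop, theta.size)
    scores = np.array([novelty(behavior_fn(theta + sigma * e), archive)
                       for e in eps])
    ranks = (scores - scores.mean()) / (scores.std() + 1e-8)  # normalize scores
    grad = ranks @ eps / (npop * sigma)                       # ES gradient estimate
    return theta + alpha * grad

# Toy task: the "behavior" is just the first two parameters, so the agent
# is pushed (by novelty alone) toward unexplored points in the plane.
behavior_fn = lambda th: th[:2]
theta = np.zeros(4)
archive = [behavior_fn(theta)]
for _ in range(50):
    theta = ns_es_step(theta, behavior_fn, archive)
    archive.append(behavior_fn(theta))
```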

    Generalized Hidden Parameter MDPs: Transferable Model-Based RL in a Handful of Trials

    There is broad interest in creating RL agents that can solve many (related) tasks and adapt to new tasks and environments after initial training. Model-based RL leverages learned surrogate models that describe the dynamics and rewards of individual tasks, such that planning in a good surrogate can lead to good control of the true system. Rather than solving each task individually from scratch, hierarchical models can exploit the fact that tasks are often related by (unobserved) causal factors of variation in order to achieve efficient generalization; for instance, learning how the mass of an item affects the force required to lift it can generalize to previously unobserved masses. We propose Generalized Hidden Parameter MDPs (GHP-MDPs), which describe a family of MDPs where both dynamics and reward can change as a function of hidden parameters that vary across tasks. The GHP-MDP augments model-based RL with latent variables that capture these hidden parameters, facilitating transfer across tasks. We also explore a variant of the model that incorporates explicit latent structure mirroring the causal factors of variation across tasks (for instance: agent properties, environmental factors, and goals). We experimentally demonstrate state-of-the-art performance and sample efficiency on a new, challenging MuJoCo task using reward and dynamics latent spaces, while beating a previous state-of-the-art baseline with >10× less data. Using test-time inference of the latent variables, our approach generalizes in a single episode to novel combinations of dynamics and reward, and to novel rewards.
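    A minimal sketch of the core idea, under simplifying assumptions: a learned dynamics model conditioned on a per-task latent z, with test-time adaptation done by fitting a point estimate of z to a few transitions from the new task. The class and function names here are hypothetical, and the paper's method performs probabilistic inference over separate dynamics and reward latents rather than the point-estimate optimization shown.

```python
import torch
import torch.nn as nn

class LatentDynamicsModel(nn.Module):
    """Next-state model conditioned on a per-task latent z that stands in
    for the hidden parameters (masses, goals, ...) of a GHP-MDP-style task."""
    def __init__(self, state_dim, action_dim, latent_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim + latent_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, s, a, z):
        return self.net(torch.cat([s, a, z], dim=-1))

def infer_latent(model, s, a, s_next, latent_dim, steps=200, lr=0.05):
    """Test-time adaptation: freeze the model and fit z to a handful of
    transitions from the new task by minimizing prediction error."""
    z = torch.zeros(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        pred = model(s, a, z.expand(s.shape[0], -1))  # share one z across the batch
        loss = ((pred - s_next) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return z.detach()
```

    Because only the low-dimensional z is fit at test time while the shared model stays fixed, a handful of transitions can suffice to adapt, which is the sense in which transfer happens "in a handful of trials".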