32,757 research outputs found

    Batch Reinforcement Learning on the Industrial Benchmark: First Experiences

    Full text link
    The Particle Swarm Optimization Policy (PSO-P) has been recently introduced and proven to produce remarkable results on interacting with academic reinforcement learning benchmarks in an off-policy, batch-based setting. To further investigate the properties and feasibility on real-world applications, this paper investigates PSO-P on the so-called Industrial Benchmark (IB), a novel reinforcement learning (RL) benchmark that aims at being realistic by including a variety of aspects found in industrial applications, like continuous state and action spaces, a high dimensional, partially observable state space, delayed effects, and complex stochasticity. The experimental results of PSO-P on IB are compared to results of closed-form control policies derived from the model-based Recurrent Control Neural Network (RCNN) and the model-free Neural Fitted Q-Iteration (NFQ). Experiments show that PSO-P is not only of interest for academic benchmarks, but also for real-world industrial applications, since it also yielded the best performing policy in our IB setting. Compared to other well established RL techniques, PSO-P produced outstanding results in performance and robustness, requiring only a relatively low amount of effort in finding adequate parameters or making complex design decisions

    Robust Multi-Cellular Developmental Design

    Get PDF
    This paper introduces a continuous model for Multi-cellular Developmental Design. The cells are fixed on a 2D grid and exchange "chemicals" with their neighbors during the growth process. The quantity of chemicals that a cell produces, as well as the differentiation value of the cell in the phenotype, are controlled by a Neural Network (the genotype) that takes as inputs the chemicals produced by the neighboring cells at the previous time step. In the proposed model, the number of iterations of the growth process is not pre-determined, but emerges during evolution: only organisms for which the growth process stabilizes give a phenotype (the stable state), others are declared nonviable. The optimization of the controller is done using the NEAT algorithm, that optimizes both the topology and the weights of the Neural Networks. Though each cell only receives local information from its neighbors, the experimental results of the proposed approach on the 'flags' problems (the phenotype must match a given 2D pattern) are almost as good as those of a direct regression approach using the same model with global information. Moreover, the resulting multi-cellular organisms exhibit almost perfect self-healing characteristics

    A Multi Hidden Recurrent Neural Network with a Modified Grey Wolf Optimizer

    Full text link
    Identifying university students' weaknesses results in better learning and can function as an early warning system to enable students to improve. However, the satisfaction level of existing systems is not promising. New and dynamic hybrid systems are needed to imitate this mechanism. A hybrid system (a modified Recurrent Neural Network with an adapted Grey Wolf Optimizer) is used to forecast students' outcomes. This proposed system would improve instruction by the faculty and enhance the students' learning experiences. The results show that a modified recurrent neural network with an adapted Grey Wolf Optimizer has the best accuracy when compared with other models.Comment: 34 pages, published in PLoS ON
    • …
    corecore