32,757 research outputs found
Batch Reinforcement Learning on the Industrial Benchmark: First Experiences
The Particle Swarm Optimization Policy (PSO-P) has been recently introduced
and proven to produce remarkable results on interacting with academic
reinforcement learning benchmarks in an off-policy, batch-based setting. To
further investigate the properties and feasibility on real-world applications,
this paper investigates PSO-P on the so-called Industrial Benchmark (IB), a
novel reinforcement learning (RL) benchmark that aims at being realistic by
including a variety of aspects found in industrial applications, like
continuous state and action spaces, a high dimensional, partially observable
state space, delayed effects, and complex stochasticity. The experimental
results of PSO-P on IB are compared to results of closed-form control policies
derived from the model-based Recurrent Control Neural Network (RCNN) and the
model-free Neural Fitted Q-Iteration (NFQ). Experiments show that PSO-P is not
only of interest for academic benchmarks, but also for real-world industrial
applications, since it also yielded the best performing policy in our IB
setting. Compared to other well established RL techniques, PSO-P produced
outstanding results in performance and robustness, requiring only a relatively
low amount of effort in finding adequate parameters or making complex design
decisions
Robust Multi-Cellular Developmental Design
This paper introduces a continuous model for Multi-cellular Developmental
Design. The cells are fixed on a 2D grid and exchange "chemicals" with their
neighbors during the growth process. The quantity of chemicals that a cell
produces, as well as the differentiation value of the cell in the phenotype,
are controlled by a Neural Network (the genotype) that takes as inputs the
chemicals produced by the neighboring cells at the previous time step. In the
proposed model, the number of iterations of the growth process is not
pre-determined, but emerges during evolution: only organisms for which the
growth process stabilizes give a phenotype (the stable state), others are
declared nonviable. The optimization of the controller is done using the NEAT
algorithm, that optimizes both the topology and the weights of the Neural
Networks. Though each cell only receives local information from its neighbors,
the experimental results of the proposed approach on the 'flags' problems (the
phenotype must match a given 2D pattern) are almost as good as those of a
direct regression approach using the same model with global information.
Moreover, the resulting multi-cellular organisms exhibit almost perfect
self-healing characteristics
A Multi Hidden Recurrent Neural Network with a Modified Grey Wolf Optimizer
Identifying university students' weaknesses results in better learning and
can function as an early warning system to enable students to improve. However,
the satisfaction level of existing systems is not promising. New and dynamic
hybrid systems are needed to imitate this mechanism. A hybrid system (a
modified Recurrent Neural Network with an adapted Grey Wolf Optimizer) is used
to forecast students' outcomes. This proposed system would improve instruction
by the faculty and enhance the students' learning experiences. The results show
that a modified recurrent neural network with an adapted Grey Wolf Optimizer
has the best accuracy when compared with other models.Comment: 34 pages, published in PLoS ON
- …