Sequential Learning for Adaptive Critic Design: An Industrial Control Application

Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2005

This paper investigates the feasibility of applying reinforcement learning (RL) concepts to industrial process optimisation. A model-free action-dependent adaptive critic design (ADAC), coupled with sequential learning neural network training, is proposed as an online RL strategy suitable for both modelling and controller optimisation. The proposed strategy is evaluated on data from an industrial grinding process used in the manufacture of disk drives. Comparison with a proprietary control system shows that the proposed RL technique achieves comparable performance without any manual intervention.

MURAL - Maynooth University Research Archive Library

Reinforcement Learning for Hybrid and Plug-In Hybrid Electric Vehicle Energy Management: Recent Advances and Prospects

Authors: Barth Matthew; Hu Xiaosong; Liu Teng; Qi Xuewei
Publication venue: eScholarship, University of California
Publication date: 01/09/2019

eScholarship - University of California

Reinforcement Learning for Automatic Test Case Prioritization and Selection in Continuous Integration

Authors: Gotlieb Arnaud; Marijan Dusica; Mossige Morten; Spieker Helge
Publication venue: Association for Computing Machinery (ACM)
Publication date: 09/11/2018

Testing in Continuous Integration (CI) involves test case prioritization, selection, and execution at each cycle. Selecting the most promising test cases to detect bugs is hard if there are uncertainties about the impact of committed code changes or if traceability links between code and tests are not available. This paper introduces Retecs, a new method for automatically learning test case selection and prioritization in CI with the goal of minimizing the round-trip time between code commits and developer feedback on failed test cases. The Retecs method uses reinforcement learning to select and prioritize test cases according to their duration, previous last execution and failure history. In a constantly changing environment, where new test cases are created and obsolete test cases are deleted, the Retecs method learns to prioritize error-prone test cases higher under the guidance of a reward function and by observing previous CI cycles. By applying Retecs to data extracted from three industrial case studies, we show for the first time that reinforcement learning enables fruitful automatic adaptive test case selection and prioritization in CI and regression testing.

Comment: Spieker, H., Gotlieb, A., Marijan, D., & Mossige, M. (2017). Reinforcement Learning for Automatic Test Case Prioritization and Selection in Continuous Integration. In Proceedings of 26th International Symposium on Software Testing and Analysis (ISSTA'17) (pp. 12--22). AC

arXiv.org e-Print Archive

Crossref

Batch Reinforcement Learning on the Industrial Benchmark: First Experiences

Authors: Hein Daniel; Hentschel Alexander; Runkler Thomas A.; Sterzing Volkmar; Tokic Michel; Udluft Steffen
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 27/07/2017

The Particle Swarm Optimization Policy (PSO-P) was recently introduced and shown to produce remarkable results on academic reinforcement learning benchmarks in an off-policy, batch-based setting. To further investigate its properties and its feasibility for real-world applications, this paper evaluates PSO-P on the so-called Industrial Benchmark (IB), a novel reinforcement learning (RL) benchmark that aims at being realistic by including a variety of aspects found in industrial applications, such as continuous state and action spaces, a high-dimensional, partially observable state space, delayed effects, and complex stochasticity. The experimental results of PSO-P on IB are compared to results of closed-form control policies derived from the model-based Recurrent Control Neural Network (RCNN) and the model-free Neural Fitted Q-Iteration (NFQ). Experiments show that PSO-P is of interest not only for academic benchmarks but also for real-world industrial applications, since it yielded the best-performing policy in our IB setting. Compared to other well-established RL techniques, PSO-P produced outstanding results in performance and robustness, requiring only a relatively low amount of effort in finding adequate parameters or making complex design decisions.

arXiv.org e-Print Archive

Crossref