
    Generating Interpretable Fuzzy Controllers using Particle Swarm Optimization and Genetic Programming

    Autonomously training interpretable control strategies, called policies, using pre-existing plant trajectory data is of great interest in industrial applications. Fuzzy controllers have been used in industry for decades as interpretable and efficient system controllers. In this study, we introduce a fuzzy genetic programming (GP) approach called fuzzy GP reinforcement learning (FGPRL) that can select the relevant state features, determine the size of the required fuzzy rule set, and automatically adjust all the controller parameters simultaneously. Each GP individual's fitness is computed using model-based batch reinforcement learning (RL), which first trains a model using available system samples and subsequently performs Monte Carlo rollouts to predict each policy candidate's performance. We compare FGPRL to an extended version of a related method called fuzzy particle swarm reinforcement learning (FPSRL), which uses swarm intelligence to tune the fuzzy policy parameters. Experiments using an industrial benchmark show that FGPRL is able to autonomously learn interpretable fuzzy policies with high control performance. Comment: Accepted at the Genetic and Evolutionary Computation Conference 2018 (GECCO '18).
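The fitness evaluation described above (fit a model to batch data, then use Monte Carlo rollouts to score each policy candidate) can be sketched as follows; the toy dynamics model, reward, horizon, and proportional policy are illustrative placeholders, not the paper's industrial setup:

```python
def rollout_return(model, policy, state, horizon):
    """Simulate one trajectory with a learned dynamics model and
    accumulate the reward along the way."""
    total = 0.0
    for _ in range(horizon):
        action = policy(state)
        state, reward = model(state, action)  # model returns (next_state, reward)
        total += reward
    return total

def fitness(model, policy, start_states, horizon=20):
    """Monte Carlo estimate of a policy candidate's performance:
    the average return over rollouts from sampled start states."""
    returns = [rollout_return(model, policy, s, horizon) for s in start_states]
    return sum(returns) / len(returns)

# Toy 1-D plant: the state drifts toward zero; reward penalizes distance from zero.
def toy_model(state, action):
    next_state = 0.9 * state + 0.1 * action
    return next_state, -abs(next_state)

proportional_policy = lambda s: -s
score = fitness(toy_model, proportional_policy, [1.0, -2.0, 0.5])
```

In FGPRL this scoring function would be called once per GP individual, so a cheap learned model (rather than the real plant) keeps evolution tractable.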

    Q-Learning, Policy Iteration and Actor-Critic Reinforcement Learning Combined with Metaheuristic Algorithms in Servo System Control

    This paper carries out a performance analysis of three control system structures and approaches, which combine Reinforcement Learning (RL) and Metaheuristic Algorithms (MAs) as representative optimization algorithms. In the first approach, the Gravitational Search Algorithm (GSA) is employed to initialize the parameters (weights and biases) of the Neural Networks (NNs) involved in Deep Q-Learning, replacing the traditional initialization of the NNs with randomly generated values. In the second approach, the Grey Wolf Optimizer (GWO) algorithm is employed to train the policy NN in Policy Iteration RL-based control. In the third approach, the GWO algorithm is employed as a critic in an Actor-Critic framework and used to evaluate the performance of the actor NN. The goal of this paper is to analyze all three RL-based control approaches, aiming to determine which one represents the best fit for solving the proposed control optimization problem. The performance analysis is based on non-parametric statistical tests conducted on data obtained from real-time experiments on nonlinear servo system position control.
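The first approach's key idea, replacing purely random NN initialization with a metaheuristic search over candidate parameter vectors, can be sketched as below. The sketch keeps only the select-the-fittest-candidate idea; a real GSA would additionally move candidates iteratively under gravity-like attraction, and the one-neuron model and squared-error loss are illustrative assumptions:

```python
import random

def init_loss(weights, data):
    """Loss of a one-neuron model y = w*x + b on a small batch."""
    w, b = weights
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def search_based_init(data, n_candidates=50, seed=0):
    """Sample candidate parameter vectors and keep the fittest one,
    instead of accepting a single random draw. (A real GSA would also
    iterate, attracting candidates toward high-fitness 'masses'.)"""
    rng = random.Random(seed)
    candidates = [(rng.uniform(-1, 1), rng.uniform(-1, 1))
                  for _ in range(n_candidates)]
    return min(candidates, key=lambda wb: init_loss(wb, data))

data = [(x, 2 * x + 1) for x in range(-3, 4)]  # target line: w=2, b=1
w0, b0 = search_based_init(data)
```

Even this selection-only version starts gradient training from a lower initial loss than a fixed random draw would, which is the motivation given for using GSA in the paper.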

    Policy optimization for industrial benchmark using deep reinforcement learning

    Summer 2020. Includes bibliographical references. Significant advancements have been made in the field of Reinforcement Learning (RL) in recent decades, and numerous novel RL environments and algorithms that master these problems have been studied, evaluated, and published. The most popular RL benchmark environments, produced by OpenAI Gym and DeepMind Lab, are modeled after single- or multi-player board games, video games, or single-purpose robots, and the RL algorithms learning optimal policies for those games have outperformed humans in almost all of them. However, real-world application of RL remains very limited, as the academic community has little access to real industrial data and applications. The Industrial Benchmark (IB) is a novel RL benchmark motivated by industrial control problems, with properties such as continuous state and action spaces, high dimensionality, a partially observable state space, and delayed effects combined with complex heteroscedastic stochastic behavior. We have used Deep Reinforcement Learning (DRL) algorithms such as Deep Q-Networks (DQN) and Double DQN (DDQN) to study and model optimal policies on the IB. Our empirical results show various DRL models outperforming previously published models on the same benchmark.
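The DQN/DDQN distinction used above comes down to how the bootstrap target is formed: DQN lets the target network both select and evaluate the next action, while Double DQN selects with the online network and evaluates with the target network. A minimal sketch, with toy Q-values standing in for network outputs:

```python
def dqn_target(reward, next_q_target, gamma=0.99):
    """Vanilla DQN target: the target network both selects and
    evaluates the next action, which tends to overestimate values."""
    return reward + gamma * max(next_q_target)

def ddqn_target(reward, next_q_online, next_q_target, gamma=0.99):
    """Double DQN target: the online network selects the action,
    the target network evaluates it, reducing overestimation bias."""
    best = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best]

# Toy Q-values for three actions in the next state.
q_online = [1.0, 3.0, 2.0]   # the online net prefers action 1
q_target = [2.5, 1.0, 2.0]   # the target net disagrees
t_dqn = dqn_target(0.0, q_target)              # bootstraps from max(q_target)
t_ddqn = ddqn_target(0.0, q_online, q_target)  # bootstraps from q_target[1]
```

When the two networks disagree, the DDQN target is the smaller, less optimistic of the two, which is exactly the bias-reduction effect the benchmark comparison exercises.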

    Learning a Swarm Foraging Behavior with Microscopic Fuzzy Controllers Using Deep Reinforcement Learning

    This article presents a macroscopic swarm foraging behavior obtained using deep reinforcement learning. The selected behavior is a complex task in which a group of simple agents must be directed towards an object to move it to a target position without the use of special gripping mechanisms, using only their own bodies. Our system has been designed to use and combine basic fuzzy behaviors to control obstacle avoidance and the low-level rendezvous processes needed for the foraging task. We use a realistically modeled swarm based on differential-drive robots equipped with light detection and ranging (LiDAR) sensors. It is important to highlight that the obtained macroscopic behavior, in contrast to that of end-to-end systems, combines existing microscopic tasks, which allows us to apply these learning techniques even with the dimensionality and complexity of a realistic robotic swarm system. The presented behavior is capable of correctly performing the macroscopic foraging task in a robust and scalable way, even in situations not seen in the training phase. An exhaustive analysis of the obtained behavior is carried out, covering both the movement of the swarm while performing the task and the swarm's scalability. This work was supported by the Ministerio de Ciencia, Innovación y Universidades (Spain), project RTI2018-096219-B-I00, co-financed with FEDER funds.
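Combining basic fuzzy behaviors into one steering command can be illustrated roughly as follows; the membership shape, sensing radius, and behavior names are assumptions for illustration, not the article's actual controller:

```python
def near(dist, radius=1.0):
    """Fuzzy membership: degree to which an obstacle is 'near'
    (1.0 at contact, falling linearly to 0.0 at the sensing radius)."""
    return max(0.0, 1.0 - dist / radius)

def blend(obstacle_dist, avoid_turn, rendezvous_turn):
    """Combine two microscopic behaviors by their fuzzy activations:
    obstacle avoidance dominates when the LiDAR reading is small,
    the rendezvous behavior dominates in open space."""
    w_avoid = near(obstacle_dist)
    return w_avoid * avoid_turn + (1.0 - w_avoid) * rendezvous_turn

turn = blend(0.25, avoid_turn=1.0, rendezvous_turn=-0.5)
```

The learned policy then operates on top of such blended microscopic behaviors rather than on raw sensor data, which is what keeps the dimensionality manageable.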

    Artificial intelligence for photovoltaic systems

    Photovoltaic systems have gained extraordinary popularity in the energy generation industry. Despite the benefits, photovoltaic systems still suffer from four main drawbacks: low conversion efficiency, intermittent power supply, high fabrication costs, and the nonlinearity of the PV system's output power. To overcome these issues, various optimization and control techniques have been proposed. However, many authors have relied on classical techniques based on intuitive, numerical or analytical methods. More efficient optimization strategies would enhance the performance of PV systems and decrease the cost of the energy generated. In this chapter, we provide an overview of how Artificial Intelligence (AI) techniques can provide value to photovoltaic systems. Particular attention is devoted to three main areas: (1) forecasting and modelling of meteorological data, (2) basic modelling of solar cells, and (3) sizing of photovoltaic systems. The chapter aims to provide a comparison between conventional techniques and the added benefits of machine learning methods.

    Secure Energy Aware Optimal Routing using Reinforcement Learning-based Decision-Making with a Hybrid Optimization Algorithm in MANET

    Mobile ad hoc networks (MANETs) are wireless networks well suited to applications such as special outdoor events, communications in areas without wireless infrastructure, crises and natural disasters, and military activities, because they do not require any preexisting network infrastructure and can be deployed quickly. Mobile ad hoc networks can be made to last longer through the use of clustering, which is one of the most effective uses of energy. Security is a key issue in the development of ad hoc networks, and many studies have been conducted on how to reduce the energy expenditure of the nodes in such networks; the majority of these approaches can conserve energy and extend the life of the nodes. The major goal of this research is to develop an energy-aware, secure mechanism for MANETs. Secure Energy Aware Reinforcement Learning based Decision Making with a Hybrid Optimization Algorithm (RL-DMHOA) is proposed for detecting malicious nodes in the network. With the assistance of the optimization algorithm, data can be transferred more efficiently by choosing aggregation points that allow individual nodes to conserve power. The optimum path is chosen by combining Particle Swarm Optimization (PSO) and the Bat Algorithm (BA) to create a fitness function that balances across-cluster distance, delay, and node energy. Three state-of-the-art methods are compared to the suggested method on a variety of metrics. A throughput of 94.8 percent, average latency of 28.1 percent, malicious detection rate of 91.4 percent, packet delivery ratio of 92.4 percent, and network lifetime of 85.2 percent are attained with the suggested RL-DMHOA approach.
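A route fitness function of the kind described, trading off node energy, hop distance, and delay for the hybrid PSO/BA search, might look like the following sketch; the weights and the exact combination of terms are assumptions, since the abstract describes the fitness only loosely:

```python
import math

def route_fitness(path, positions, energy, delay, weights=(0.5, 0.3, 0.2)):
    """Score a candidate route for a PSO/BA-style search: reward the
    residual energy of the weakest node on the route, penalize total
    hop distance and end-to-end delay. Weights are illustrative."""
    hop_dist = sum(math.dist(positions[a], positions[b])
                   for a, b in zip(path, path[1:]))
    min_energy = min(energy[n] for n in path)  # the weakest node limits lifetime
    w_e, w_d, w_t = weights
    return w_e * min_energy - w_d * hop_dist - w_t * delay

# Three nodes on a line; the direct route skips the low-energy relay node 1.
positions = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0)}
energy = {0: 1.0, 1: 0.8, 2: 0.9}
direct = route_fitness([0, 2], positions, energy, delay=0.5)
relayed = route_fitness([0, 1, 2], positions, energy, delay=0.5)
```

Both PSO particles and BA bats would evaluate candidate paths with the same scalar fitness, so the hybrid only changes how the search space is explored, not how routes are scored.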

    Military and Security Applications: Cybersecurity (Encyclopedia of Optimization, Third Edition)

    The domain of cybersecurity is growing as part of broader military and security applications, and the capabilities and processes in this realm have qualities and characteristics that warrant using solution methods in mathematical optimization. Problems of interest may involve continuous or discrete variables, a convex or non-convex decision space, differing levels of uncertainty, and constrained or unconstrained frameworks. Cyberattacks, for example, can be modeled using hierarchical threat structures and may involve decision strategies from both an organization or individual and the adversary. Network traffic flow, intrusion detection and prevention systems, interconnected human-machine interfaces, and automated systems – these all require higher levels of complexity in mathematical optimization modeling and analysis. Attributes such as cyber resiliency, network adaptability, security capability, and information technology flexibility – these require the measurement of multiple characteristics, many of which may involve both quantitative and qualitative interpretations. And for nearly every organization that is invested in some cybersecurity practice, decisions must be made that involve the competing objectives of cost, risk, and performance. As such, mathematical optimization has been widely used and accepted to model important and complex decision problems, providing analytical evidence for helping drive decision outcomes in cybersecurity applications. In the paragraphs that follow, this chapter highlights some of the recent mathematical optimization research in the body of knowledge applied to the cybersecurity space. The subsequent literature discussed fits within a broader cybersecurity domain taxonomy considering the categories of analyze, collect and operate, investigate, operate and maintain, oversee and govern, protect and defend, and securely provision. 
Further, the paragraphs are structured around generalized mathematical optimization categories to provide a lens for summarizing the existing literature, including uncertainty (stochastic programming, robust optimization, etc.), discrete (integer programming, multiobjective, etc.), continuous-unconstrained (nonlinear least squares, etc.), and continuous-constrained (global optimization, nonlinear programming, network optimization, linear programming, etc.). At the conclusion of this chapter, research implications and extensions are offered to the reader who desires to pursue further mathematical optimization research for cybersecurity within a broader military and security applications context.