
    Batch Reinforcement Learning on the Industrial Benchmark: First Experiences

    The Particle Swarm Optimization Policy (PSO-P) was recently introduced and shown to produce remarkable results when interacting with academic reinforcement learning benchmarks in an off-policy, batch-based setting. To further investigate its properties and feasibility for real-world applications, this paper evaluates PSO-P on the so-called Industrial Benchmark (IB), a novel reinforcement learning (RL) benchmark that aims at being realistic by including a variety of aspects found in industrial applications, such as continuous state and action spaces, a high-dimensional, partially observable state space, delayed effects, and complex stochasticity. The experimental results of PSO-P on the IB are compared to results of closed-form control policies derived from the model-based Recurrent Control Neural Network (RCNN) and the model-free Neural Fitted Q-Iteration (NFQ). The experiments show that PSO-P is of interest not only for academic benchmarks but also for real-world industrial applications, since it also yielded the best-performing policy in our IB setting. Compared to other well-established RL techniques, PSO-P produced outstanding results in performance and robustness, while requiring relatively little effort in finding adequate parameters or making complex design decisions.
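The core idea behind PSO-P can be sketched as model-based planning: a particle swarm searches over finite action sequences, each scored by rolling it out on a learned forward model. The toy linear dynamics, quadratic cost, and all parameter values below are illustrative assumptions, not the Industrial Benchmark setup.

```python
# Illustrative PSO-P-style planner: optimize an action sequence against a
# (here, hypothetical) surrogate model of the plant dynamics.
import numpy as np

rng = np.random.default_rng(0)

def model_return(actions, state=1.0):
    # Assumed surrogate model: linear dynamics, negative quadratic cost
    # for deviating from a setpoint of 0.
    total = 0.0
    for a in actions:
        state = 0.9 * state + a
        total += -(state ** 2)
    return total

def pso_plan(horizon=5, n_particles=30, iters=50, w=0.7, c1=1.5, c2=1.5):
    pos = rng.uniform(-1, 1, (n_particles, horizon))   # candidate plans
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([model_return(p) for p in pos])
    gbest = pbest[pbest_val.argmax()].copy()
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, -1, 1)                # respect action bounds
        vals = np.array([model_return(p) for p in pos])
        better = vals > pbest_val
        pbest[better], pbest_val[better] = pos[better], vals[better]
        gbest = pbest[pbest_val.argmax()].copy()
    return gbest   # best action sequence found; only its first action is executed

plan = pso_plan()
```

In the receding-horizon use typical of PSO-P, only the first action of the returned plan would be applied before re-planning.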

    Supervisory Control Optimization for a Series Hybrid Electric Vehicle with Consideration of Battery Thermal Management and Aging

    This dissertation integrates battery thermal management and aging into the supervisory control optimization for a heavy-duty series hybrid electric vehicle (HEV). The framework for multi-objective optimization relies on a novel implementation of the dynamic programming algorithm and on predictive models of critical phenomena. An electrochemistry-based battery aging model is integrated into the framework to assess the battery aging rate by considering the instantaneous lithium-ion (Li+) surface concentration rather than the average concentration. This creates a large state-action space, so the computational effort required to solve a deterministic or stochastic dynamic program becomes prohibitively intense, and a neuro-dynamic programming approach is proposed to remove the 'curse of dimensionality' of classical dynamic programming. First, a unified simulation framework is developed for in-depth studies of the series HEV system. The integration of a refrigerant system model enables prediction of the energy used for cooling the battery pack. A side reaction, electrolyte decomposition, is considered the main aging mechanism of the LiFePO4/graphite battery, and an electrochemical model is integrated to predict the side reaction rate and the resulting fade of capacity and power. An approximate analytical solution is used to solve the partial differential equations (PDEs) for Li+ diffusion. Compared with the finite difference method, it greatly reduces the number of states with only a slight penalty on prediction accuracy. This improves computational efficiency and enables inclusion of the electrochemistry-based aging model in the power management optimization framework. Next, a stochastic dynamic programming (SDP) approach is applied to the optimization of supervisory control. Auxiliary cooling power is included in addition to vehicle propulsion. Two objectives, fuel economy and battery life, are optimized by the weighted-sum method.
To reduce the computational load, a simplified battery aging model coupled with an equivalent circuit model is used in the SDP optimization; Li+ diffusion dynamics are disregarded, and the surface concentration is represented by the average concentration. This reduces the number of system states to four, with two control inputs. A real-time implementable strategy is generated and embedded into the supervisory controller. The results show that the SDP strategy can improve fuel economy and battery life simultaneously compared with a thermostatic SOC strategy. Further, the tradeoff between fuel consumption and active Li+ loss is studied at different battery temperatures. Finally, the accuracy of the battery aging model used for optimization is improved by adding Li+ diffusion dynamics. This increases the number of states and challenges classical dynamic programming algorithms. Hence, a neuro-dynamic programming (NDP) approach is proposed for the problem with a large state-action space. It combines the ideas of function approximation and temporal-difference learning with dynamic programming; the computational load then increases linearly with the number of parameters in the approximating function rather than exponentially with the state space. The results show the ability of NDP to solve the complex control optimization problem reliably and efficiently. The battery-aging-conscious strategy generated by the NDP optimization framework further improves battery life by 3.8% without penalty on fuel economy, compared to the SDP strategy. Improvements in battery life compared to the heuristic strategy are much larger, on the order of 65%. This leads to progressively larger fuel economy gains over time.
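The scaling argument above — cost linear in the number of parameters rather than exponential in the state space — is the defining property of NDP-style methods. A minimal sketch, assuming a toy chain problem and invented step sizes (nothing here is the dissertation's HEV model), is temporal-difference learning with a linear value approximator:

```python
# TD(0) with a 2-parameter linear value function on a 100-state chain:
# updates touch 2 weights, not 100 table entries.
import numpy as np

n_states = 100
gamma = 0.95

def features(s):
    # Two features per state: normalized position and a bias term.
    return np.array([s / (n_states - 1), 1.0])

w = np.zeros(2)                      # parameter count independent of n_states
rng = np.random.default_rng(1)
for _ in range(5000):
    s = int(rng.integers(0, n_states))
    s_next = min(s + 1, n_states - 1)            # deterministic right-moving chain
    r = 1.0 if s_next == n_states - 1 else 0.0   # reward near the terminal end
    td_error = r + gamma * features(s_next) @ w - features(s) @ w
    w += 0.05 * td_error * features(s)           # semi-gradient TD update

v0, v_end = features(0) @ w, features(n_states - 1) @ w
```

The learned value estimate rises toward the rewarding end of the chain, which is the qualitative behavior a tabular solution would also produce, at a fraction of the parameter count.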

    Embedding Multi-Task Address-Event-Representation Computation

    Address-Event-Representation (AER) is a communication protocol intended to transfer neuronal spikes between bio-inspired chips. Several AER tools exist to help develop and test AER-based systems, which may consist of a hierarchical structure with several chips that transmit spikes among them in real time while performing some processing. Although these tools reach very high bandwidth at the AER communication level, they require a personal computer for the higher-level processing of the event information. We propose the use of an embedded platform based on a multi-task operating system to allow both AER communication and processing without requiring a laptop or a computer. In this paper, we present and study the performance of an embedded multi-task AER tool, connecting and programming it for processing address-event information from a spiking generator. (Funded by the Ministerio de Ciencia e Innovación, TEC2006-11730-C03-0.)
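The protocol's central idea is that each spike travels as the address of the neuron that fired, with timing implicit in arrival order. As a hedged sketch: the 16-bit packing of an (x, y) neuron position below is an assumed layout for illustration; real AER boards define their own address formats.

```python
# Assumed AER address layout: low byte = x coordinate, high byte = y.
def encode_event(x, y):
    # Pack a neuron's (x, y) grid position into one 16-bit address word.
    return (y << 8) | x

def decode_event(word):
    # Recover the (x, y) position from an address word.
    return word & 0xFF, (word >> 8) & 0xFF

spikes = [(3, 7), (120, 0), (255, 255)]          # neurons that fired, in order
words = [encode_event(x, y) for x, y in spikes]  # what goes on the AER bus
decoded = [decode_event(w) for w in words]       # what the receiver reconstructs
```

An embedded receiver like the one described would loop over incoming words, decode them, and hand the events to a processing task instead of forwarding them to a PC.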

    Evolving Large-Scale Data Stream Analytics based on Scalable PANFIS

    Many distributed machine learning frameworks have recently been built to speed up large-scale data learning. However, most distributed machine learning in these frameworks still uses offline algorithms, which cannot cope with data streams. In fact, large-scale data is mostly generated by non-stationary data streams whose patterns evolve over time. To address this problem, we propose a novel evolving large-scale data stream analytics framework based on a Scalable Parsimonious Network based on Fuzzy Inference System (Scalable PANFIS), where the PANFIS evolving algorithm is distributed over the worker nodes in the cloud to learn large-scale data streams. The Scalable PANFIS framework incorporates an active learning (AL) strategy and two model fusion methods. The AL strategy accelerates the distributed learning process to generate an initial evolving large-scale data stream model (initial model), whereas the two model fusion methods aggregate the initial model into the final model. The final model represents the updated large-scale data knowledge, which can be used to infer future data. Extensive experiments on this framework are validated by measuring the accuracy and running time of four combinations of Scalable PANFIS and other Spark-based built-in algorithms. The results indicate that Scalable PANFIS with AL trains almost two times faster than Scalable PANFIS without AL. The results also show that both the rule merging and the voting mechanisms yield similar accuracy in general among the Scalable PANFIS algorithms, and that they are generally better than the Spark-based algorithms. In terms of running time, Scalable PANFIS outperforms all Spark-based algorithms when classifying numerous benchmark datasets.
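Of the two fusion methods mentioned, voting is the simpler to illustrate: each worker's model predicts independently and the majority label wins. The tiny threshold "models" below are placeholders standing in for per-partition learners — they are not PANFIS evolving fuzzy systems, just an assumed stand-in to show the aggregation step.

```python
# Majority-vote fusion of per-partition models (placeholder learners).
from collections import Counter

def majority_vote(models, x):
    # Collect one vote per worker model and return the most common label.
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]

# Hypothetical partition models that disagree near a decision boundary.
models = [
    lambda x: 1 if x > 0.4 else 0,
    lambda x: 1 if x > 0.5 else 0,
    lambda x: 1 if x > 0.6 else 0,
]
label = majority_vote(models, 0.55)   # two of three workers vote 1
```

Rule merging, the other fusion method, would instead combine the workers' rule bases into a single model rather than keep them as an ensemble.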

    Maintenance models applied to wind turbines. A comprehensive overview

    Wind power generation has been the fastest-growing energy alternative in recent years; however, it still has to compete with cheaper fossil energy sources. This is one of the motivations to constantly improve the efficiency of wind turbines and to develop new Operation and Maintenance (O&M) methodologies. Decisions regarding O&M are based on different types of models, which cover a wide range of scenarios and variables and share the same goal: to minimize the Cost of Energy (COE) and maximize the profitability of a wind farm (WF). In this context, this review aims to identify and classify, from a comprehensive perspective, the different types of models used at the strategic, tactical, and operational decision levels of wind turbine maintenance, with emphasis on mathematical models (MatMs). The investigation leads to the conclusion that, even though the evolution of the models and methodologies is ongoing, decision making in all areas of the wind industry is currently based on artificial intelligence and machine learning models.

    Asymmetric HMMs for online ball-bearing health assessments

    The degradation of critical components inside large industrial assets, such as ball bearings, has a negative impact on production facilities, reducing the availability of assets due to an unexpectedly high failure rate. Machine learning-based monitoring systems can estimate the remaining useful life (RUL) of ball bearings, reducing downtime through early failure detection. However, traditional approaches for predictive systems require run-to-failure (RTF) data as training data, which in real scenarios can be scarce and expensive to obtain, as the expected useful life can be measured in years. Therefore, to overcome the need for RTF data, we propose a new methodology based on online novelty detection and asymmetric hidden Markov models (As-HMM) to perform the health assessment. This new methodology does not require previous RTF data and can adapt to the natural degradation of mechanical components over time in data-stream and online environments. As the system is designed to work online within the electrical cabinet of machines, it has to be deployed on embedded electronics. Therefore, a performance analysis of As-HMM is presented to detect the strengths and critical points of the algorithm. To validate our approach, we use real-life ball-bearing data sets and compare our methodology with other methodologies that need no RTF data, checking the advantages in RUL prediction and health monitoring. As a result, we showcase a complete end-to-end solution from the sensor to actionable insights regarding RUL estimation towards maintenance application in real industrial environments. This study was supported partially by the Spanish Ministry of Economy and Competitiveness through the PID2019-109247GB-I00 project, by the Spanish Ministry of Science and Innovation through the RTC2019-006871-7 (DSTREAMS) project, and by the H2020 IoTwins project (Distributed Digital Twins for industrial SMEs: a big-data platform) funded by the EU under call ICT-11-2018-2019, Grant Agreement No. 857191.
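The HMM-based health assessment above can be illustrated with ordinary forward filtering on a left-to-right degradation model (healthy → degraded → faulty): as observations arrive, the filter tracks a posterior over hidden health states. All probabilities below are assumed values for a generic HMM, not learned As-HMM parameters, and the binary "high vibration" feature is a simplification.

```python
# Forward filtering over a 3-state left-to-right degradation HMM.
import numpy as np

A = np.array([[0.95, 0.05, 0.00],    # healthy:  mostly stays healthy
              [0.00, 0.90, 0.10],    # degraded: may become faulty
              [0.00, 0.00, 1.00]])   # faulty:   absorbing
# Assumed likelihood of observing "high vibration" in each state.
p_high = np.array([0.05, 0.40, 0.90])

def filter_step(belief, obs_high):
    # Predict through the transition model, then weight by the observation
    # likelihood and renormalize (standard HMM forward recursion).
    like = p_high if obs_high else 1.0 - p_high
    b = like * (belief @ A)
    return b / b.sum()

belief = np.array([1.0, 0.0, 0.0])   # start certain the bearing is healthy
for obs in [0, 0, 1, 1, 1, 1]:       # vibration level rising over time
    belief = filter_step(belief, obs)
```

A RUL estimate would then follow from the expected time for the filtered state to reach the absorbing faulty state; the asymmetric variant in the paper additionally models dependencies among observed features.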

    A Survey of Prediction and Classification Techniques in Multicore Processor Systems

    In multicore processor systems, being able to accurately predict the future provides new optimization opportunities that otherwise could not be exploited. For example, an oracle able to predict a certain application's behavior running on a smartphone could direct the power manager to switch to appropriate dynamic voltage and frequency scaling (DVFS) modes that would guarantee minimum levels of desired performance while saving energy and thereby prolonging battery life. Using predictions enables systems to become proactive rather than continuing to operate in a reactive manner. This prediction-based proactive approach has become increasingly popular in the design and optimization of integrated circuits and of multicore processor systems. Prediction has evolved from simple forecasting to sophisticated machine learning-based prediction and classification that learns from existing data, employs data mining, and predicts future behavior. This can be exploited by novel optimization techniques that span all layers of the computing stack. In this survey paper, we present a discussion of the most popular techniques for prediction and classification in the general context of computing systems, with emphasis on multicore processors. The paper is far from comprehensive, but it will help the reader interested in employing prediction in the optimization of multicore processor systems.
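At the "simple forecasting" end of the spectrum the survey describes, a minimal concrete instance is an exponentially weighted moving average (EWMA) predictor of the next interval's CPU utilization, which a DVFS governor could consult before choosing a frequency mode. The smoothing factor and the 0.5 threshold are assumptions for illustration.

```python
# EWMA history-based predictor: a minimal prediction-driven DVFS sketch.
def ewma_predictor(history, alpha=0.5):
    # Fold the observation history into a single smoothed estimate;
    # larger alpha weights recent intervals more heavily.
    pred = history[0]
    for u in history[1:]:
        pred = alpha * u + (1 - alpha) * pred
    return pred

utilization = [0.2, 0.3, 0.8, 0.9]        # observed per-interval load
next_load = ewma_predictor(utilization)   # forecast for the next interval
# Proactive decision: pick the frequency mode before the load arrives.
mode = "high-frequency" if next_load > 0.5 else "low-frequency"
```

The machine-learning predictors the survey covers replace this single smoothed feature with learned models over richer histories, but the proactive control loop is the same.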

    Scheduling Allocation and Inventory Replenishment Problems Under Uncertainty: Applications in Managing Electric Vehicle and Drone Battery Swap Stations

    In this dissertation, motivated by the growth of electric vehicle (EV) and drone applications, we propose novel optimization problems and solution techniques for managing the operations at EV and drone battery swap stations. In Chapter 2, we introduce a novel class of stochastic scheduling allocation and inventory replenishment problems (SAIRPs), which determine the recharging, discharging, and replacement decisions at a swap station over time to maximize the expected total profit. We use a Markov decision process (MDP) to model SAIRPs facing uncertain demands, varying costs, and battery degradation. Considering battery degradation is crucial, as it relaxes the assumption that charging/discharging batteries does not deteriorate their quality (capacity). Moreover, it ensures that customers receive high-quality batteries, as we prevent recharging/discharging and swapping when the average capacity of the batteries is lower than a predefined threshold. Our MDP has high complexity and dimensionality with respect to the state space, action space, and transition probabilities; therefore, we cannot provide the optimal decision rules (exact solutions) for SAIRPs of increasing size. Thus, we propose high-quality approximate solution methods, heuristics and reinforcement learning (RL), for stochastic SAIRPs that provide near-optimal policies for the stations. In Chapter 3, we explore the structure of and theoretical findings related to the optimal solution of SAIRPs. Notably, we prove monotonicity properties in order to develop fast and intelligent algorithms that provide approximate solutions and overcome the curses of dimensionality. We show the existence of monotone optimal decision rules when there is an upper bound on the number of batteries replaced in each period. We demonstrate the monotone structure of the MDP value function when considering the first, the second, and both dimensions of the state.
We utilize data analytics and regression techniques to provide an intelligent initialization for our monotone approximate dynamic programming (ADP) algorithm. Finally, we provide insights from solving realistic-sized SAIRPs. In Chapter 4, we consider the problem of optimizing the distribution operations of a hub that uses drones to deliver medical supplies to different geographic regions. Drones are an innovative delivery method with many benefits, including low-contact delivery, thereby reducing the spread of pandemic and vaccine-preventable diseases. While we focus on medical supply delivery in this work, the approach is applicable to drone delivery for many other applications, including food, postal items, and e-commerce. In this chapter, our goal is to address drone delivery challenges by optimizing the distribution operations at a drone hub that dispatches drones to different geographic locations that generate stochastic demands for medical supplies. By considering different geographic locations, we obtain different classes of demand that require different flight ranges, which is directly related to the amount of charge held in a drone battery. We classify the stochastic demands based on their distance from the drone hub, use a Markov decision process to model the problem, and perform computational tests using realistic data representing a prominent drone delivery company. We solve the problem using a reinforcement learning method and show its high performance compared with the exact solution found using dynamic programming. Finally, we analyze the results and provide insights for managing the drone hub operations.
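The MDP formulation and monotone value structure described above can be illustrated on a deliberately tiny swap-station model: the state is the number of charged batteries in stock, the action is how many depleted batteries to recharge this period, and demand is random. The capacity, costs, and demand distribution below are invented for illustration; the exact solution via value iteration is the "dynamic programming" baseline the larger instances make intractable.

```python
# Value iteration on a toy battery-swap inventory MDP (assumed parameters).
import numpy as np

C = 5                          # station capacity in batteries (assumed)
gamma = 0.95                   # discount factor
demand_p = [0.3, 0.4, 0.3]     # P(demand = 0, 1, 2) per period (assumed)
price, charge_cost, lost_sale = 4.0, 1.0, 6.0

V = np.zeros(C + 1)            # V[s]: value of holding s charged batteries
for _ in range(500):                       # iterate the Bellman operator
    V_new = np.empty_like(V)
    for s in range(C + 1):
        best = -np.inf
        for a in range(C - s + 1):         # recharge at most the free slots
            val = -charge_cost * a
            for d, p in enumerate(demand_p):
                sold = min(d, s)                       # swaps we can serve
                s_next = min(s - sold + a, C)          # stock next period
                val += p * (price * sold
                            - lost_sale * (d - sold)   # penalty for unmet demand
                            + gamma * V[s_next])
            best = max(best, val)
        V_new[s] = best
    V = V_new
```

In this toy instance the converged value function is nondecreasing in the stock level, mirroring the kind of monotone structure Chapter 3 proves and exploits to initialize the ADP algorithm.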