
2,787 research outputs found

Optimizing memory management for optimistic simulation with reinforcement learning

Author: Pellegrini Alessandro
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

Simulation is a powerful technique to explore complex scenarios and analyze systems related to a wide range of disciplines. To allow for an efficient exploitation of the available computing power, speculative Time Warp-based Parallel Discrete Event Simulation is universally recognized as a viable solution. In this context, the rollback operation is a fundamental building block to support a correct execution even when causality inconsistencies materialize a posteriori. If this operation is supported via checkpoint/restore strategies, memory management plays a fundamental role in ensuring high performance of the simulation run. With few exceptions, adaptive protocols targeting memory management for Time Warp-based simulations have mostly been based on pre-defined analytic models of the system, expressed as closed-form functions that map the system's state to control parameters. The underlying assumption is that the model itself is optimal. In this paper, we present an approach that exploits reinforcement learning techniques. Rather than assuming an optimal control strategy, we seek to find the optimal strategy through parameter exploration. A value function that captures the history of system feedback is used, and no a priori knowledge of the system is required. An experimental assessment of the viability of our proposal is also provided for a mobile cellular system simulation.

ART

Archivio della ricerca- Università di Roma La Sapienza
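
The value-function idea in the abstract above can be illustrated with a minimal sketch. It assumes an epsilon-greedy learner choosing among candidate checkpoint intervals; the class `IntervalLearner`, the function `toy_reward`, and the cost constants are hypothetical and are not taken from the paper.

```python
import random


class IntervalLearner:
    """Epsilon-greedy value estimates over candidate checkpoint intervals.

    Hypothetical illustration: the paper's actual state, action, and
    reward definitions are not given in the abstract.
    """

    def __init__(self, intervals, epsilon=0.1, step=0.2, seed=0):
        self.intervals = list(intervals)
        self.epsilon = epsilon
        self.step = step
        # One value estimate per interval; it summarizes past feedback.
        self.values = {i: 0.0 for i in self.intervals}
        self.rng = random.Random(seed)

    def choose(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.intervals)
        return max(self.intervals, key=lambda i: self.values[i])

    def update(self, interval, reward):
        # Move the estimate a fraction `step` toward the observed reward.
        self.values[interval] += self.step * (reward - self.values[interval])

    def best(self):
        return max(self.intervals, key=lambda i: self.values[i])


def toy_reward(interval):
    # Assumed cost model: checkpoint overhead shrinks with the interval,
    # while rollback (state reconstruction) cost grows with it.
    return -(8.0 / interval + 0.5 * interval)


def train(steps=500):
    learner = IntervalLearner([1, 2, 4, 8, 16])
    for _ in range(steps):
        interval = learner.choose()
        learner.update(interval, toy_reward(interval))
    return learner
```

No model of the system is supplied to the learner; it only observes the reward returned after each choice, which mirrors the "no a priori knowledge" point in the abstract.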

On improving the performance of optimistic distributed simulations

Author: Dimakis Nikolaos
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/01/2010

This report investigates means of improving the performance of optimistic distributed simulations without affecting simulation accuracy. We argue that existing clustering algorithms are not adequate for application in distributed simulations, and outline some characteristics of an ideal algorithm that could be applied in this field. This report is structured as follows. We start by introducing the area of distributed simulation. Following a comparison of the dominant protocols used in distributed simulation, we elaborate on current approaches to improving simulation performance: computationally efficient techniques, exploitation of the hardware configuration of processors, optimizations that can be derived from the simulation scenario, etc. We introduce the core characteristics of clustering approaches and argue that these cannot be applied to real-life distributed simulation problems. We present a typical distributed simulation setting and elaborate on the reasons why existing clustering approaches are not expected to improve the performance of a distributed simulation. We introduce a prototype distributed simulation platform developed in the scope of this research, focusing on the area of emergency response and specifically building evacuation. We continue by outlining our current work on this issue and end the report by outlining next actions that could be taken in this field.

Spiral - Imperial College Digital Repository

Self-Learning Neural controller for Hybrid Power Management using Neuro-Dynamic Programming

Author: Filipi Zoran
Johri Rajit
Publication venue: 'SAE International'
Publication date: 01/01/2011

A supervisory controller strategy for a hybrid vehicle coordinates the operation of the two power sources onboard the vehicle to maximize objectives such as fuel economy. In the past, various control strategies have been developed using heuristics as well as optimal control theory. Stochastic Dynamic Programming (SDP) has previously been applied to determine implementable optimal control policies for discrete-time dynamic systems whose states evolve according to given transition probabilities. However, the approach is constrained by the curse of dimensionality faced by dynamic programming based algorithms, i.e. an exponential increase in computational effort as the system state space grows. This paper proposes a novel approach capable of overcoming the curse of dimensionality and solving policy optimization for a system with a very large design state space. We propose developing a supervisory controller for hybrid vehicles based on the principles of reinforcement learning and neuro-dynamic programming, whereby the cost-to-go function is approximated using a neural network. The controller learns and improves its performance over time. The simulation results obtained for a series hydraulic hybrid vehicle over a driving schedule demonstrate the effectiveness of the proposed technique. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/89874/1/draft_01.pd

Crossref

Deep Blue Documents at the University of Michigan


Towards Informed Exploration for Deep Reinforcement Learning

Author: Tang Haoran
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

In this thesis, we discuss various techniques for improving exploration in deep reinforcement learning. We begin with a brief review of reinforcement learning (RL) and the fundamental exploration vs. exploitation trade-off. Then we review how deep RL has improved upon classical RL and summarize six categories of the latest exploration methods for deep RL, in order of increasing usage of prior information. We then explore representative works in three categories and discuss their strengths and weaknesses. The first category, represented by Soft Q-learning, uses regularization to encourage exploration. The second category, represented by count-based exploration via hashing, maps states to hash codes for counting and assigns higher exploration to less-encountered states. The third category utilizes hierarchy and is represented by a modular architecture for RL agents to play StarCraft II. Finally, we conclude that exploration by prior knowledge is a promising research direction and suggest topics of potential impact.

eScholarship - University of California
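
The count-based category described in the abstract above (hash a state, count the hash code, reward rarely seen codes) can be sketched as follows. The hash here is a sign-of-random-projection code and the bonus is `beta / sqrt(count)`; both are assumptions chosen for illustration, since the abstract does not specify the hash function or the bonus formula. The class name `HashCounter` is hypothetical.

```python
import math
import random
from collections import Counter


class HashCounter:
    """Counts visits per hash code and returns an exploration bonus.

    Assumed design: states are numeric vectors, hashed to `n_bits` sign
    bits of random Gaussian projections.
    """

    def __init__(self, state_dim, n_bits=8, beta=1.0, seed=0):
        rng = random.Random(seed)
        # One random hyperplane per output bit.
        self.planes = [
            [rng.gauss(0.0, 1.0) for _ in range(state_dim)]
            for _ in range(n_bits)
        ]
        self.beta = beta
        self.counts = Counter()

    def code(self, state):
        # Each bit records which side of a hyperplane the state lies on.
        return tuple(
            sum(w * x for w, x in zip(plane, state)) >= 0.0
            for plane in self.planes
        )

    def bonus(self, state):
        # Record the visit, then return a bonus that decays with the count.
        key = self.code(state)
        self.counts[key] += 1
        return self.beta / math.sqrt(self.counts[key])
```

In use, the bonus would be added to the environment reward before the learning update, so states whose hash code has been seen less often look more attractive to the agent.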

A Survey of Monte Carlo Tree Search Methods

Author: Browne Cameron B
Colton Simon
Cowling Peter I
Lucas Simon M
Perez Diego
Powley Edward
Rohlfshagen Philipp
Samothrakis Spyridon
Tavener Stephen
Whitehouse Daniel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.

University of Essex Research Repository

CiteSeerX

Maastricht University Research Portal
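
As a small illustration of how MCTS balances tree search against sampling, the sketch below shows UCB1 child selection, the rule commonly used in the UCT variant of MCTS. The abstract above does not name a specific selection rule, so the choice of UCB1 is an assumption; the function names and the `children` layout are hypothetical.

```python
import math


def ucb1(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: mean reward plus an exploration term."""
    if visits == 0:
        # Unvisited children are always tried first.
        return math.inf
    mean = total_reward / visits
    return mean + c * math.sqrt(math.log(parent_visits) / visits)


def select_child(children):
    """Pick the child with the highest UCB1 score.

    `children` maps a child name to a (total_reward, visits) pair; the
    parent's visit count is taken to be the sum of the children's visits.
    """
    parent_visits = sum(visits for _, visits in children.values())
    return max(
        children,
        key=lambda name: ucb1(*children[name], parent_visits),
    )
```

A full MCTS loop would repeat selection down the tree, expand a leaf, run a random playout, and back up the result into `total_reward` and `visits`; only the selection step is sketched here.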

Optimal treatment allocations in space and time for on-line control of an emerging infectious disease

Author: Agarwal A.
Anderson R. M.
Bertsekas D. P.
Borth D. M.
Chapelle O.
Chesterton G. K.
Choi A. L.
Cox D. R.
Deardon R.
Estrada E.
Field K.
Gelman A.
Ghavamzadeh M.
Huang C.‐Y.
Kushner H. J.
Law A. M.
Little R. J.
Lusher D.
Mahadevan S.
May B. C.
Murphy S. A.
Nahum‐Shani I.
Newton M. A.
Orellana L.
Osband I.
Palmer J. M.
Poupart P.
Ross S.
Russo D.
Sen A.
Spall J. C.
Subcommittee on Fisheries Wildlife, and Oceans
Sutton R.
Sutton R. S.
West M.
Yin G.
Publication venue: eScholarship, University of California
Publication date: 01/01/2018

A key component in controlling the spread of an epidemic is deciding where, when and to whom to apply an intervention. We develop a framework for using data to inform these decisions in real time. We formalize a treatment allocation strategy as a sequence of functions, one per treatment period, that map up-to-date information on the spread of an infectious disease to a subset of locations where treatment should be allocated. An optimal allocation strategy optimizes some cumulative outcome, e.g. the number of uninfected locations, the geographic footprint of the disease or the cost of the epidemic. Estimation of an optimal allocation strategy for an emerging infectious disease is challenging because spatial proximity induces interference between locations, the number of possible allocations is exponential in the number of locations, and because disease dynamics and intervention effectiveness are unknown at outbreak. We derive a Bayesian on-line estimator of the optimal allocation strategy that combines simulation–optimization with Thompson sampling. The estimator proposed performs favourably in simulation experiments. This work is motivated by and illustrated using data on the spread of white nose syndrome, which is a highly fatal infectious disease devastating bat populations in North America.

Crossref

eScholarship - University of California
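
Thompson sampling, which the abstract above combines with simulation–optimization, can be illustrated in its simplest form: a Beta-Bernoulli bandit. This is a generic sketch and not the paper's spatial allocation estimator; the function names, the success rates, and the uniform Beta(1, 1) prior are assumptions made for the example.

```python
import random


def thompson_step(successes, failures, rng):
    """Sample a success rate per option from its posterior; pick the best."""
    samples = [
        rng.betavariate(s + 1, f + 1)  # Beta(1, 1) prior assumed
        for s, f in zip(successes, failures)
    ]
    return max(range(len(samples)), key=samples.__getitem__)


def run(true_rates, steps=2000, seed=0):
    """Simulate repeated choices and return how often each option was used."""
    rng = random.Random(seed)
    k = len(true_rates)
    successes, failures = [0] * k, [0] * k
    for _ in range(steps):
        arm = thompson_step(successes, failures, rng)
        # Hypothetical environment: Bernoulli outcome with a hidden rate.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]
```

Because each decision is driven by a posterior draw rather than a point estimate, options with uncertain effectiveness are still tried occasionally, which suits a setting where intervention effectiveness is unknown at outbreak.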