Multi-advisor deep reinforcement learning for smart home energy control
Effective automated smart home control is essential for smart-grid-enabled approaches to demand response, known in the literature as automated demand response. At its heart, this is a multi-objective
adaptive control problem, because it requires balancing an appliance's primary objective with demand-response-motivated objectives. This control problem is difficult due to the scale and heterogeneity of
appliances, as well as the time-varying nature of both dynamics and consumer preferences. Computational considerations further limit the types of algorithms that can reasonably be applied to the problem. We
propose approaching the problem under the multi-objective reinforcement learning framework. We suggest a multi-agent, multi-advisor reinforcement learning system to handle the consumer's time-varying
preferences across objectives. We design simulations to produce preliminary results on the nature of user preferences and the feasibility of multi-advisor reinforcement learning. Further smart-home
simulations are designed to demonstrate the linear scalability of the algorithm with respect to both
the number of agents and the number of objectives. We demonstrate the algorithm's performance in simulation against comparable centralized and decentralized controllers. Finally, we identify the need for
stronger performance measures for a system of this type by considering the effect on agents of newly
selected preferences.
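The abstract describes balancing an appliance's primary objective against demand-response objectives under time-varying consumer preferences. A minimal sketch of one common way to do this, assuming linear scalarization of per-objective Q-values (the paper's actual aggregation scheme is not stated in the abstract):

```python
import numpy as np

def select_action(q_per_objective, preferences):
    """Pick the action maximizing the preference-weighted sum of
    per-objective Q-values (linear scalarization)."""
    q = np.asarray(q_per_objective)  # shape: (n_objectives, n_actions)
    w = np.asarray(preferences)      # shape: (n_objectives,)
    scalarized = w @ q               # shape: (n_actions,)
    return int(np.argmax(scalarized))

# Hypothetical example: two objectives (comfort vs. demand response),
# three appliance actions. Shifting preference weight changes the choice.
q = [[1.0, 0.2, 0.5],   # comfort Q-values per action
     [0.1, 0.9, 0.4]]   # demand-response Q-values per action
print(select_action(q, [0.5, 0.5]))  # ties broken toward action 0
print(select_action(q, [0.0, 1.0]))  # pure demand response -> action 1
```

When preferences change over time, only the weight vector is updated; the per-objective value estimates are reused, which is one motivation for keeping objectives separate rather than learning a single blended reward.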
Basal-Bolus Advisor for Type 1 Diabetes (T1D) Patients Using Multi-Agent Reinforcement Learning (RL) Methodology
This paper presents a novel multi-agent reinforcement learning (RL) approach
for personalized glucose control in individuals with type 1 diabetes (T1D). The
method employs a closed-loop system consisting of a blood glucose (BG)
metabolic model and a multi-agent soft actor-critic RL model acting as the
basal-bolus advisor. Performance evaluation is conducted in three scenarios,
comparing the RL agents to conventional therapy. Evaluation metrics include
glucose levels (minimum, maximum, and mean), time spent in different BG ranges,
and average daily bolus and basal insulin dosages. Results demonstrate that the
RL-based basal-bolus advisor significantly improves glucose control, reducing
glycemic variability and increasing time spent within the target range (70-180
mg/dL). Hypoglycemia events are effectively prevented, and severe hyperglycemia
events are reduced. The RL approach also leads to a statistically significant
reduction in average daily basal insulin dosage compared to conventional
therapy. These findings highlight the effectiveness of the multi-agent RL
approach in achieving better glucose control and mitigating the risk of severe
hyperglycemia in individuals with T1D.
Comment: 8 pages, 2 figures, 1 table
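The evaluation metrics named above (min/max/mean glucose and time spent in BG ranges) can be computed directly from a glucose trace. A small sketch, assuming per-sample fractions and a 250 mg/dL severe-hyperglycemia cutoff (the 70-180 mg/dL target range is from the abstract; the severe cutoff is an assumption, as papers vary):

```python
def glucose_metrics(bg_trace):
    """Summarize a blood-glucose trace (mg/dL): min, max, mean, and the
    fraction of samples in the 70-180 mg/dL target range, below it
    (hypoglycemia), and above 250 mg/dL (assumed severe-hyperglycemia
    threshold)."""
    n = len(bg_trace)
    return {
        "min": min(bg_trace),
        "max": max(bg_trace),
        "mean": sum(bg_trace) / n,
        "time_in_range": sum(70 <= g <= 180 for g in bg_trace) / n,
        "time_hypo": sum(g < 70 for g in bg_trace) / n,
        "time_severe_hyper": sum(g > 250 for g in bg_trace) / n,
    }

# Hypothetical six-sample trace: one hypo, three in range, two high.
m = glucose_metrics([65, 110, 150, 180, 220, 260])
print(m["time_in_range"])  # 3 of 6 samples -> 0.5
```

In practice these fractions are reported as "time in range" percentages over multi-day simulations, which is how the abstract's comparison between the RL advisor and conventional therapy would be scored.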
Department of Computer Science and Engineering
Recently, deep reinforcement learning (DRL) algorithms have shown superhuman performance in simulated game domains. In practical terms, sample efficiency is also one of the most important measures of a model's performance. Especially in environments with large search spaces (e.g., continuous action spaces), it is a critical condition for achieving state-of-the-art performance.
In this thesis, we design a model applicable to multi-end games in continuous space with high sample efficiency. A multi-end game consists of several sub-games that are independent of each other but affect the result of the game through domain-specific rules. We verify the algorithm in a simulated curling environment.
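The abstract defines a multi-end game as independent sub-games whose outcomes are combined by domain rules. A minimal structural sketch, with a hypothetical scoring rule (summing per-end point differentials, as in curling; the thesis's actual rules are not given in the abstract):

```python
def play_match(play_end, score_rule, n_ends):
    """Play a multi-end game: each end is a sub-game played
    independently, and a domain scoring rule combines the per-end
    outcomes into the match result."""
    end_results = [play_end(i) for i in range(n_ends)]
    return score_rule(end_results)

# Hypothetical example: each end yields a point differential and the
# match score is their sum.
total = play_match(lambda i: [2, -1, 0, 3][i], sum, 4)
print(total)  # 2 - 1 + 0 + 3 = 4
```

Because ends are independent, each end's trajectory can serve as a separate training sample for the policy, which is one way such a structure helps sample efficiency.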
Action selection in modular reinforcement learning
Modular reinforcement learning is an approach to resolving the curse-of-dimensionality problem in traditional reinforcement learning. We design and implement a modular reinforcement learning algorithm based on three major components: Markov decision process decomposition, module training, and global action selection. We define and formalize module class and module instance concepts in the decomposition step. Under our decomposition framework, we train each module efficiently using the SARSA(λ) algorithm. We then design, implement, test, and compare three action selection algorithms based on different heuristics: Module Combination, Module Selection, and Module Voting. For the last two algorithms, we propose a method to calculate module weights efficiently, using the standard deviation of each module's Q-values. We show that the Module Combination and Module Voting algorithms produce satisfactory performance in our test domain.
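The abstract names two of its action-selection heuristics concretely: summing Q-values across modules (Module Combination) and weighted voting with weights from the standard deviation of each module's Q-values. A minimal sketch of both, assuming weights are the per-module std normalized to sum to one (the thesis's exact weighting formula is an assumption):

```python
import numpy as np

def module_weights(q_modules):
    """Weight each module by the spread (std) of its Q-values: a module
    whose Q-values differ sharply cares more about which action is
    chosen than one that is nearly indifferent."""
    stds = np.asarray(q_modules).std(axis=1)
    return stds / stds.sum()

def module_combination(q_modules):
    """Module Combination: sum Q-values across modules, then argmax."""
    return int(np.asarray(q_modules).sum(axis=0).argmax())

def module_voting(q_modules):
    """Module Voting: each module casts a weighted vote for its own
    greedy action; the action with the most vote mass wins."""
    q = np.asarray(q_modules)
    votes = np.zeros(q.shape[1])
    for weight, row in zip(module_weights(q), q):
        votes[row.argmax()] += weight
    return int(votes.argmax())

# Hypothetical example: module 1 strongly prefers action 0, module 2
# mildly prefers action 1; the opinionated module dominates both rules.
q = [[0.9, 0.1, 0.0],
     [0.4, 0.5, 0.45]]
print(module_combination(q))  # sums [1.3, 0.6, 0.45] -> action 0
print(module_voting(q))       # module 1's larger std outweighs module 2
```

The std-based weighting captures the intuition stated in the abstract: indifferent modules should not veto modules for which the decision matters.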