23 research outputs found

    Constrained Model-Free Reinforcement Learning for Process Optimization

    Reinforcement learning (RL) is a control approach that can handle nonlinear stochastic optimal control problems. However, despite its promise, RL has yet to see marked translation to industrial practice, primarily due to its inability to satisfy state constraints. In this work we aim to address this challenge. We propose an 'oracle'-assisted constrained Q-learning algorithm that guarantees the satisfaction of joint chance constraints with a high probability, which is crucial for safety-critical tasks. To achieve this, constraint tightenings (backoffs) are introduced and adjusted using Broyden's method, hence making them self-tuned. This results in a general methodology that can be embedded in approximate dynamic programming-based algorithms to ensure constraint satisfaction with high probability. Finally, we present case studies that analyze the performance of the proposed approach and compare this algorithm with model predictive control (MPC). The favorable performance of this algorithm signifies a step toward the incorporation of RL into real-world optimization and control of engineering systems, where constraints are essential in ensuring safety.

    An Analysis of Multi-Agent Reinforcement Learning for Decentralized Inventory Control Systems

    Most solutions to the inventory management problem assume a centralization of information that is incompatible with organisational constraints in real supply chain networks. The inventory management problem is a well-known planning problem in operations research, concerned with finding the optimal re-order policy for nodes in a supply chain. While many centralized solutions to the problem exist, they are not applicable to real-world supply chains made up of independent entities. The problem can, however, be naturally decomposed into sub-problems, each associated with an independent entity, turning it into a multi-agent system. Therefore, a decentralized data-driven solution to inventory management problems using multi-agent reinforcement learning is proposed, where each entity is controlled by an agent. Three multi-agent variations of the proximal policy optimization algorithm are investigated through simulations of different supply chain networks and levels of uncertainty. The centralized-training, decentralized-execution framework is deployed, which relies on offline centralization during simulation-based policy identification but enables decentralization when the policies are deployed online to the real system. Results show that using multi-agent proximal policy optimization with a centralized critic leads to performance very close to that of a centralized data-driven solution and outperforms a distributed model-based solution in most cases, while respecting the information constraints of the system.

    Using process data to generate an optimal control policy via apprenticeship and reinforcement learning

    From Wiley via Jisc Publications Router. History: received 2020-10-04, rev-recd 2021-04-23, accepted 2021-05-03, pub-electronic 2021-05-15. Article version: VoR. Publication status: Published. Abstract: Reinforcement learning (RL) is a data‐driven approach to synthesizing an optimal control policy. A barrier to wide implementation of RL‐based controllers is their data‐hungry nature during online training and their inability to extract useful information from human operator and historical process operation data. Here, we present a two‐step framework to resolve this challenge. First, we employ apprenticeship learning via inverse RL to analyze historical process data for synchronous identification of a reward function and parameterization of the control policy. This is conducted offline. Second, the parameterization is improved online efficiently under the ongoing process via RL within only a few iterations. Significant advantages of this framework include allowing the hot‐start of RL algorithms for process optimal control, and robust abstraction of existing controllers and control knowledge from data. The framework is demonstrated on three case studies, showing its potential for chemical process control.

    Planning and optimising a digital intervention to protect older adults' cognitive health.

    Background: By 2050, worldwide dementia prevalence is expected to triple. Affordable, scalable interventions are required to support protective behaviours such as physical activity, cognitive training and healthy eating. This paper outlines the theory-, evidence- and person-based development of 'Active Brains': a multi-domain digital behaviour change intervention to reduce cognitive decline amongst older adults. Methods: During the initial planning phase, scoping reviews, consultation with PPI contributors and expert co-investigators and behavioural analysis collated and recorded evidence that was triangulated to inform provisional 'guiding principles' and an intervention logic model. The following optimisation phase involved qualitative think-aloud and semi-structured interviews with 52 older adults with higher and lower cognitive performance scores. Data were analysed thematically and informed changes and additions to guiding principles, the behavioural analysis and the logic model which, in turn, informed changes to intervention content. Results: Scoping reviews and qualitative interviews suggested that the same intervention content may be suitable for individuals with higher and lower cognitive performance. Qualitative findings revealed that maintaining independence and enjoyment motivated engagement in intervention-targeted behaviours, whereas managing ill health was a potential barrier. Social support for engaging in such activities could provide motivation, but was not desirable for all. These findings informed development of intervention content and functionality that appeared highly acceptable amongst a sample of target users. Conclusions: A digitally delivered intervention with minimal support appears acceptable and potentially engaging to older adults with higher and lower levels of cognitive performance. As well as informing our own intervention development, insights obtained through this process may be useful for others working with, and developing interventions for, older adults and/or those with cognitive impairment.

    The Active Brains Digital Intervention to Reduce Cognitive Decline in Older Adults: Protocol for a Feasibility Randomized Controlled Trial.

    BACKGROUND: Increasing physical activity, improving diet, and performing brain training exercises are associated with reduced cognitive decline in older adults. OBJECTIVE: In this paper, we describe a feasibility trial of the Active Brains intervention, a web-based digital intervention developed to support older adults to make these 3 healthy behavior changes associated with improved cognitive health. The Active Brains trial is a randomized feasibility trial that will test how accessible, acceptable, and feasible the Active Brains intervention is and the effectiveness of the study procedures that we intend to use in the larger, main trial. METHODS: In the randomized controlled trial (RCT), we use a parallel design. We will be conducting the intervention with 2 populations recruited through GP practices (family practices) in England from 2018 to 2019: older adults with signs of cognitive decline and older adults without any cognitive decline. Trial participants were randomly allocated to 1 of 3 study groups: usual care, the Active Brains intervention, or the Active Brains website plus brief support from a trained coach (over the phone or by email). The main outcomes are performance on cognitive tasks, quality of life (using EuroQol-5D 5 level), Instrumental Activities of Daily Living, and diagnoses of dementia. Secondary outcomes (including depression, enablement, and health care costs) and process measures (including qualitative interviews with participants and supporters) will also be collected. The trial has been approved by the National Health Service Research Ethics Committee (reference 17/SC/0463). RESULTS: Results will be published in peer-reviewed journals, presented at conferences, and shared at public engagement events. Data collection was completed in May 2020, and the results will be reported in 2021. CONCLUSIONS: The findings of this study will help us to identify and make important changes to the website, the support received, or the study procedures before we progress to our main randomized phase III trial. TRIAL REGISTRATION: International Standard Randomized Controlled Trial Number 23758980; http://www.isrctn.com/ISRCTN23758980. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/18929

    Development and Characterization of a Probe Device toward Intracranial Spectroscopy of Traumatic Brain Injury.

    Traumatic brain injury is a leading cause of mortality worldwide, often affecting individuals at their most economically active, yet no primary disease-modifying interventions exist for their treatment. Real-time direct spectroscopic examination of the brain tissue within the context of traumatic brain injury has the potential to improve the understanding of injury heterogeneity and subtypes, better target management strategies and organ penetrance of pharmacological agents, identify novel targets for intervention, and allow a clearer understanding of fundamental biochemistry evolution. Here, a novel device is designed and engineered, delivering Raman spectroscopy-based measurements from the brain through clinically established cranial access techniques. Device prototyping is undertaken within the constraints imposed by the acquisition and site dimensions (standard intracranial access holes, probe's dimensions), and an artificial skull anatomical model with cortical impact is developed. The device shows good agreement with the data acquired with a standard commercial Raman instrument, and the spectra measured are comparable in terms of quality and detectable bands to the established traumatic brain injury model. The developed proof-of-concept device demonstrates the feasibility of a real-time optical brain spectroscopic interface while removing the noise of extracranial tissue. With further optimization and validation, such technology will be directly translatable for integration into currently available standards of neurological care.