27 research outputs found

    Real-Time Reinforcement Learning

    Markov Decision Processes (MDPs), the mathematical framework underlying most algorithms in Reinforcement Learning (RL), are often used in a way that wrongly assumes that the state of an agent's environment does not change during action selection. As RL systems based on MDPs begin to find application in real-world safety-critical situations, this mismatch between the assumptions underlying classical MDPs and the reality of real-time computation may lead to undesirable outcomes. In this thesis, we introduce a new framework in which states and actions evolve simultaneously, and we show how it is related to the classical MDP formulation. We analyze existing algorithms under the new real-time formulation and show why they are suboptimal when used in real time. We then use those insights to create a new algorithm, Real-Time Actor Critic (RTAC), that outperforms the existing state-of-the-art continuous control algorithm, Soft Actor Critic, in both real-time and non-real-time settings.

    Real-Time Reinforcement Learning

    Markov Decision Processes (MDPs), the mathematical framework underlying most algorithms in Reinforcement Learning (RL), are often used in a way that wrongly assumes that the state of an agent's environment does not change during action selection. As RL systems based on MDPs begin to find application in real-world safety-critical situations, this mismatch between the assumptions underlying classical MDPs and the reality of real-time computation may lead to undesirable outcomes. In this paper, we introduce a new framework in which states and actions evolve simultaneously, and show how it is related to the classical MDP formulation. We analyze existing algorithms under the new real-time formulation and show why they are suboptimal when used in real time. We then use those insights to create a new algorithm, Real-Time Actor-Critic (RTAC), that outperforms the existing state-of-the-art continuous control algorithm, Soft Actor-Critic, in both real-time and non-real-time settings. Code and videos can be found at https://github.com/rmst/rtrl. Comment: Neural Information Processing Systems (2019)

    Reinforcement Learning with Random Delays

    Action and observation delays commonly occur in many Reinforcement Learning applications, such as remote control scenarios. We study the anatomy of randomly delayed environments and show that partially resampling trajectory fragments in hindsight allows for off-policy multi-step value estimation. We apply this principle to derive Delay-Correcting Actor-Critic (DCAC), an algorithm based on Soft Actor-Critic that performs significantly better in environments with delays. This is shown theoretically and also demonstrated empirically on a delay-augmented version of the MuJoCo continuous control benchmark.

    Suicidal Behavior and Alcohol Abuse

    Suicide is an escalating public health problem, and alcohol use has consistently been implicated in the precipitation of suicidal behavior. Alcohol abuse may lead to suicidality through disinhibition, impulsiveness and impaired judgment, but it may also be used as a means to ease the distress associated with committing an act of suicide. We reviewed evidence of the relationship between alcohol use and suicide through a search of MedLine and PsychInfo electronic databases. Multiple genetically related intermediate phenotypes might influence the relationship between alcohol and suicide. Psychiatric disorders, including psychosis, mood disorders and anxiety disorders, as well as susceptibility to stress, might increase the risk of suicidal behavior, but may also have reciprocal influences with alcohol drinking patterns. Increased suicide risk may be heralded by social withdrawal, breakdown of social bonds, and social marginalization, which are common outcomes of untreated alcohol abuse and dependence. People with alcohol dependence or depression should be screened for other psychiatric symptoms and for suicidality. Programs for suicide prevention must take into account drinking habits and should reinforce healthy behavioral patterns.

    matthiasplappert/keras-rl: v0.2.0rc1

    Deep Reinforcement Learning for Keras

    Alcohol‐attributed disease burden in four Nordic countries between 2000 and 2017: Are the gender gaps narrowing? A comparison using the Global Burden of Disease, Injury and Risk Factor 2017 study

    Introduction and Aims: The gender difference in alcohol use seems to have narrowed in the Nordic countries, but it is not clear to what extent this may have affected differences in levels of harm. We compared gender differences in all‐cause and cause‐specific alcohol‐attributed disease burden, as measured by disability‐adjusted life‐years (DALY), in four Nordic countries in 2000–2017, to find out if gender gaps in DALYs had narrowed. Design and Methods: Alcohol‐attributed disease burden, expressed as DALYs per 100 000 population with 95% uncertainty intervals, was extracted from the Global Burden of Disease database. Results: In 2017, all‐cause DALYs in males varied between 2531 in Finland and 976 in Norway, and in females between 620 in Denmark and 270 in Norway. Finland had the largest gender differences and Norway the smallest, closely followed by Sweden. During 2000–2017, absolute gender differences in all‐cause DALYs declined by 31% in Denmark, 26% in Finland, 19% in Sweden and 18% in Norway. In Finland, this was driven by a larger relative decline in males than females; in Norway, it was due to increased burden in females. In Denmark, the burden in females declined slightly more than in males, in relative terms, while in Sweden the relative decline was similar in males and females. Discussion and Conclusions: The gender gaps in harm narrowed to a different extent in the Nordic countries, with the differences driven by different conditions. Findings are informative about how inequality, policy and sociocultural differences affect levels of harm by gender.