46 research outputs found

    Control Regularization for Reduced Variance Reinforcement Learning

    Dealing with high variance is a significant challenge in model-free reinforcement learning (RL). Existing methods are unreliable, exhibiting high variance in performance from run to run using different initializations/seeds. Focusing on problems arising in continuous control, we propose a functional regularization approach to augmenting model-free RL. In particular, we regularize the behavior of the deep policy to be similar to a policy prior, i.e., we regularize in function space. We show that functional regularization yields a bias-variance trade-off, and propose an adaptive tuning strategy to optimize this trade-off. When the policy prior has control-theoretic stability guarantees, we further show that this regularization approximately preserves those stability guarantees throughout learning. We validate our approach empirically on a range of settings, and demonstrate significantly reduced variance, guaranteed dynamic stability, and more efficient learning than deep RL alone. Comment: Appearing in ICML 201
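One way to picture the bias-variance trade-off described in this abstract is a weighted mix of the learned action and the prior's action. The sketch below is illustrative only: the mixing rule, the weight `lam`, the constant prior action, and the Gaussian stand-in for a seed-dependent deep policy are all assumptions made here, not details given in the abstract.

```python
import random
import statistics


def mixed_action(u_learned, u_prior, lam):
    """Blend a learned action with a prior action (assumed mixing rule).

    lam = 0 returns the learned action unchanged; a large lam pulls the
    result toward the prior, trading variance for bias.
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    return (u_learned + lam * u_prior) / (1.0 + lam)


def spread_across_seeds(lam, n_seeds=200, u_prior=-1.0):
    """Variance of the mixed action over hypothetical training seeds."""
    actions = []
    for seed in range(n_seeds):
        rng = random.Random(seed)
        # Stand-in for a deep policy whose output depends on the seed.
        u_learned = rng.gauss(0.0, 1.0)
        actions.append(mixed_action(u_learned, u_prior, lam))
    return statistics.pvariance(actions)


if __name__ == "__main__":
    for lam in (0.0, 1.0, 4.0):
        print(lam, spread_across_seeds(lam))
```

Under this mixing rule the seed-to-seed variance shrinks by a factor of `(1 + lam) ** 2`, while the mean action shifts toward the prior, which is the bias side of the trade-off.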

    Too many swipes for today: The Development of the Problematic Tinder Use Scale (PTUS)

    Background and aims: Tinder is a very popular smartphone-based geolocated dating application. The goal of the present study was to create a short Problematic Tinder Use Scale (PTUS). Methods: Griffiths’ (2005) six-component model was used to cover all components of problematic Tinder use. Confirmatory factor analyses were carried out on a sample of Tinder users (N = 430). Results: Both the 12-item and the 6-item versions were tested. The 6-item unidimensional version has appropriate reliability and factor structure. No salient demography-related differences were found. Users have similar PTUS scores irrespective of their relationship status. Discussion: Tinder users deserve scientific attention given their large proportion among smartphone users, especially in light of the emerging trend of geolocated online dating applications. Conclusions: Before the PTUS, no scale had been created to measure problematic Tinder use. The PTUS is a suitable and reliable measure for assessing problematic Tinder use.

    Safety-Critical Control of Compartmental Epidemiological Models with Measurement Delays

    We introduce a methodology to guarantee safety against the spread of infectious diseases by viewing epidemiological models as control systems and by considering human interventions (such as quarantining or social distancing) as control inputs. We consider a generalized compartmental model that captures the form of the most popular epidemiological models, and we design safety-critical controllers that formally guarantee safe evolution, in the sense of keeping certain populations of interest under prescribed safe limits. Furthermore, we discuss how measurement delays originating from the incubation period and testing delays affect safety, and how these delays can be compensated via predictor feedback. We demonstrate our results by synthesizing active intervention policies that bound the number of infections, hospitalizations and deaths for epidemiological models capturing the spread of COVID-19 in the USA. Comment: Submitted to the IEEE Control System Letters (L-CSS) and the 2021 American Control Conference (ACC). 6 pages, 3 figure
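The idea of treating interventions as a control input that keeps a compartment under a safe limit can be sketched on a much simpler model than the one in the abstract. The example below assumes a plain SIR model with no measurement delay, a single intervention `u` in [0, 1] that scales down transmission, and a barrier-style condition on `h = I_max - I`. All parameter values, function names, and the forward-Euler discretization are hypothetical choices for illustration, not the paper's model or controller.

```python
def safe_intervention(S, I, beta, gamma, I_max, alpha):
    """Smallest u in [0, 1] such that dh/dt >= -alpha * h, with h = I_max - I.

    Assumed dynamics: dI/dt = beta * (1 - u) * S * I - gamma * I.
    """
    if S <= 0.0 or I <= 0.0:
        return 0.0
    h = I_max - I
    # Rearranged from: beta*(1-u)*S*I - gamma*I <= alpha*h
    u_min = 1.0 - (alpha * h + gamma * I) / (beta * S * I)
    return min(1.0, max(0.0, u_min))


def simulate(beta=0.3, gamma=0.1, I_max=0.1, alpha=0.5,
             dt=0.1, days=300, controlled=True):
    """Forward-Euler SIR simulation; returns the peak infected fraction."""
    S, I = 0.99, 0.01
    peak = I
    for _ in range(int(days / dt)):
        u = safe_intervention(S, I, beta, gamma, I_max, alpha) if controlled else 0.0
        new_infections = beta * (1.0 - u) * S * I
        S, I = S - dt * new_infections, I + dt * (new_infections - gamma * I)
        peak = max(peak, I)
    return peak


if __name__ == "__main__":
    print("peak infected, no intervention:", simulate(controlled=False))
    print("peak infected, with intervention:", simulate(controlled=True))
```

The controller intervenes only as much as the barrier condition requires, so `u` stays at zero while infections are far below `I_max`. Delay compensation via predictor feedback, which the abstract highlights, is not modeled in this sketch.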