11 research outputs found

    Projected Off-Policy Q-Learning (POP-QL) for Stabilizing Offline Reinforcement Learning

    A key problem in off-policy Reinforcement Learning (RL) is the mismatch, or distribution shift, between the dataset and the distribution over states and actions visited by the learned policy. This problem is exacerbated in the fully offline setting. The main approach to correct this shift has been through importance sampling, which leads to high-variance gradients. Other approaches, such as conservatism or behavior regularization, regularize the policy at the cost of performance. In this paper, we propose a new approach for stable off-policy Q-Learning. Our method, Projected Off-Policy Q-Learning (POP-QL), is a novel actor-critic algorithm that simultaneously reweights off-policy samples and constrains the policy to prevent divergence and reduce value-approximation error. In our experiments, POP-QL not only shows competitive performance on standard benchmarks, but also outperforms competing methods in tasks where the data-collection policy is significantly sub-optimal. Comment: 10 page
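
The abstract's remark that importance sampling corrects distribution shift at the cost of high variance can be illustrated with a toy example. The sketch below is not POP-QL and is not taken from the paper; it is a minimal two-action bandit, with hypothetical names (`importance_weighted_estimate`, `collect`, the `left`/`right` actions, and all probabilities), showing how samples drawn under a behavior policy are reweighted to estimate the value of a different target policy.

```python
import random


def importance_weighted_estimate(samples, target_probs, behavior_probs):
    """Estimate the target policy's expected reward from behavior-policy samples.

    Each sample is an (action, reward) pair. The weight pi(a) / mu(a)
    corrects for the action having been drawn from the behavior policy mu
    rather than the target policy pi.
    """
    total = 0.0
    for action, reward in samples:
        weight = target_probs[action] / behavior_probs[action]
        total += weight * reward
    return total / len(samples)


def collect(behavior_probs, rewards, n, rng):
    """Draw n (action, reward) pairs by sampling actions from the behavior policy."""
    actions = list(behavior_probs)
    probs = [behavior_probs[a] for a in actions]
    drawn = rng.choices(actions, weights=probs, k=n)
    return [(a, rewards[a]) for a in drawn]


# Hypothetical setup: the behavior policy rarely takes the action
# that the target policy prefers, so the rare samples get large weights.
behavior = {"left": 0.95, "right": 0.05}
target = {"left": 0.05, "right": 0.95}
rewards = {"left": 0.0, "right": 1.0}

# Exact value of the target policy, for comparison.
true_value = sum(target[a] * rewards[a] for a in target)

rng = random.Random(0)
small_batch = collect(behavior, rewards, 20, rng)
large_batch = collect(behavior, rewards, 200_000, rng)

print("true value:", true_value)
print("estimate from 20 samples:", importance_weighted_estimate(small_batch, target, behavior))
print("estimate from 200000 samples:", importance_weighted_estimate(large_batch, target, behavior))
```

In this assumed setup every `right` sample carries a weight of about 19 and every `left` sample contributes zero, so a small batch swings widely depending on how many `right` actions it happens to contain. The estimate settles near the true value only with many samples, which is the variance problem the abstract refers to.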

    Proposed Pathogenesis, Characteristics, and Management of COVID-19 mRNA Vaccine-Related Myopericarditis

    Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the novel coronavirus causing coronavirus disease 2019 (COVID-19), has affected human lives across the globe. On 11 December 2020, the US FDA granted an emergency use authorization for the first COVID-19 vaccine, and vaccines are now widely available. Undoubtedly, the emergence of these vaccines has led to substantial relief, helping alleviate the fear and anxiety around the COVID-19 illness for both the general public and clinicians. However, recent cases of vaccine complications, including myopericarditis, have been reported after administration of COVID-19 vaccines. This article discusses the cases, possible pathogenesis of myopericarditis, and treatment of the condition. Most cases were mild and should not yet change vaccine policies, although prospective studies are needed to better assess the risk-benefit ratios in different groups.