Robust Losses for Learning Value Functions

Author: Liao Victor
Patterson Andrew
White Martha
Publication date: 17/04/2023

Most value function learning algorithms in reinforcement learning are based on the mean squared (projected) Bellman error. However, squared errors are known to be sensitive to outliers, both skewing the solution of the objective and resulting in high-magnitude and high-variance gradients. To control these high-magnitude updates, typical strategies in RL involve clipping gradients, clipping rewards, rescaling rewards, or clipping errors. While these strategies appear to be related to robust losses -- like the Huber loss -- they are built on semi-gradient update rules which do not minimize a known loss. In this work, we build on recent insights reformulating squared Bellman errors as a saddlepoint optimization problem and propose a saddlepoint reformulation for a Huber Bellman error and Absolute Bellman error. We start from a formalization of robust losses, then derive sound gradient-based approaches to minimize these losses in both the online off-policy prediction and control settings. We characterize the solutions of the robust losses, providing insight into the problem settings where the robust losses define notably better solutions than the mean squared Bellman error. Finally, we show that the resulting gradient-based algorithms are more stable, for both prediction and control, with less sensitivity to meta-parameters.
Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence (2022)

arXiv.org e-Print Archive

The Natural Gas Policy Act of 1978: Four Years of Practice and Two Years to Make Perfect

Author: Morgan Richard Greer
Patterson Martha Priddy
Publication venue: UKnowledge
Publication date: 01/01/1982

University of Kentucky

When is Offline Policy Selection Sample Efficient for Reinforcement Learning?

Author: Liu Vincent
Nagarajan Prabhat
Patterson Andrew
White Martha
Publication date: 04/12/2023

Offline reinforcement learning algorithms often require careful hyperparameter tuning. Consequently, before deployment, we need to select amongst a set of candidate policies. As yet, however, there is little understanding about the fundamental limits of this offline policy selection (OPS) problem. In this work we aim to provide clarity on when sample efficient OPS is possible, primarily by connecting OPS to off-policy policy evaluation (OPE) and Bellman error (BE) estimation. We first show a hardness result, that in the worst case, OPS is just as hard as OPE, by proving a reduction of OPE to OPS. As a result, no OPS method can be more sample efficient than OPE in the worst case. We then propose a BE method for OPS, called Identifiable BE Selection (IBES), that has a straightforward method for selecting its own hyperparameters. We highlight that using IBES for OPS generally has more requirements than OPE methods, but if satisfied, can be more sample efficient. We conclude with an empirical study comparing OPE and IBES, and by showing the difficulty of OPS on an offline Atari benchmark dataset.

arXiv.org e-Print Archive

What You Need to Know about Bar-Code Medication Administration

Author: Kuhlmann Martha, DNP, MSN, RN, FNP, PMHCNS-BC, APRN
McBee Marie E, DNP, MSN
Patterson Pam, DNP, MSN, NE-BC
Publication venue: DigitalCommons@TMC
Publication date: 16/05/2019

Medication errors are the most common type of preventable error. Bar-code medication administration (BCMA) technology was designed to reduce medication administration errors. Poor system design, implementation and workarounds remain a cause of errors. This paper reviews the literature on BCMA, identifies a gap in the findings and identifies three evidence-based practices that could be used to improve system implementation and reduce error. The literature review identified that bar-code medication administration and system workarounds are well documented and affect patient safety. Based on the critical analysis of 10 studies, we identified gaps in the standardization of BCMA planning, implementation, and sustainability. The themes that emerged from the literature were poor BCMA design and implementation that resulted in workarounds. The three evidence-based strategies proposed to address this gap are evidence-based standardization in planning and implementation, the identification and elimination of workarounds, and hard wiring. An evidence-based checklist evaluates compliance with standard procedures. The LEAN model of Jidoka is used to assure adaptation of the machine to human workflow. Direct observation provides valuable workflow assessment. An effective BCMA implementation involves careful system design, identification of workflow issues which cause workarounds, and adapting the machine to nursing needs.

DigitalCommons@The Texas Medical Center

SGPS 9105A: Value of residential care: Review and strategies to advocate for better wages for PHSS

Author: Currie-Patterson Natalie
Le Thi Hoai Anh
Palma Paolo Aldrin
Sinclair Vanessa Martha
Publication venue: Scholarship@Western
Publication date: 01/12/2018

Scholarship@Western

Reviews

Author: Christopher Joe R.
Colvin George
Krieg Laurence
Krieg Martha
Patterson Nancy-Lou
Publication venue: SWOSU Digital Commons
Publication date: 15/12/1976

1977 Tolkien Calendar. Greg and Tim Hildebrandt. Reviewed by Nancy-Lou Patterson. The Lord of the Rings 1977 Calendar. Illustrations by J. R. R. Tolkien, notes by Christopher Tolkien. Reviewed by Nancy-Lou Patterson. Adventure, Mystery, and Romance: Formula Stories as Art and Popular Culture. John G. Cawelti. Reviewed by Joe R. Christopher. Encyclopedia of Mystery and Detection. Chris Steinbrunner and Otto Penzler (eds.). Reviewed by Joe R. Christopher. The Father Christmas Letters. John Ronald Reuel Tolkien. Reviewed by Martha and Laurence Krieg. The Middle-earth Songbook. Ruth Berman and Ken Nahigian (eds.). Reviewed by George Colvin. From Elfland to Poughkeepsie. Ursula K. Le Guin. Reviewed by George Colvin. Camber of Culdi. Katherine Kurtz. Reviewed by George Colvin.

SWOSU Digital Commons (Southwestern Oklahoma State University)