22 research outputs found

On Reward Structures of Markov Decision Processes

Author: Dai Falcon Z.
Publication date: 31/08/2023

A Markov decision process can be parameterized by a transition kernel and a reward function. Both play essential roles in the study of reinforcement learning, as evidenced by their presence in the Bellman equations. In our inquiry into various kinds of "costs" associated with reinforcement learning, inspired by the demands of robotic applications, rewards are central to understanding the structure of a Markov decision process, and reward-centric notions can elucidate important concepts in reinforcement learning. Specifically, we study the sample complexity of policy evaluation and develop a novel estimator with an instance-specific error bound of $\tilde{O}(\sqrt{\frac{\tau_s}{n}})$ for estimating a single state value. Under the online regret minimization setting, we refine the transition-based MDP constant, diameter, into a reward-based constant, maximum expected hitting cost, and with it, provide a theoretical explanation for how a well-known technique, potential-based reward shaping, could accelerate learning with expert knowledge. In an attempt to study safe reinforcement learning, we model hazardous environments with irrecoverability and propose a quantitative notion of safe learning via reset efficiency. In this setting, we modify a classic algorithm to account for resets, achieving promising preliminary numerical results. Lastly, for MDPs with multiple reward functions, we develop a planning algorithm that finds Pareto-optimal stochastic policies in a computationally efficient manner.

Comment: This PhD thesis draws heavily from arXiv:1907.02114 and arXiv:2002.06299; minor edit

arXiv.org e-Print Archive

Loop Estimator for Discounted Values in Markov Reward Processes

Author: Dai Falcon Z.
Walter Matthew R.
Publication date: 03/03/2021

At the working heart of policy iteration algorithms commonly used and studied in the discounted setting of reinforcement learning, the policy evaluation step estimates the value of states with samples from a Markov reward process induced by following a Markov policy in a Markov decision process. We propose a simple and efficient estimator called loop estimator that exploits the regenerative structure of Markov reward processes without explicitly estimating a full model. Our method enjoys a space complexity of $O(1)$ when estimating the value of a single positive recurrent state $s$, unlike TD with $O(S)$ or model-based methods with $O\left(S^2\right)$. Moreover, the regenerative structure enables us to show, without relying on the generative model approach, that the estimator has an instance-dependent convergence rate of $\widetilde{O}\left(\sqrt{\tau_s/T}\right)$ over steps $T$ on a single sample path, where $\tau_s$ is the maximal expected hitting time to state $s$. In preliminary numerical experiments, the loop estimator outperforms model-free methods, such as TD(k), and is competitive with the model-based estimator.

Comment: accepted to AAAI 202
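The regenerative idea in the abstract above can be illustrated with a small sketch. It is not claimed to be the paper's exact estimator. It assumes the reward is received on the state being visited, and it relies on the identity v(s) = E[G] + E[gamma^T] v(s), where G is the discounted reward collected over one loop from s back to s and T is the loop length. The function name `loop_estimate` and its arguments are hypothetical.

```python
from typing import Hashable, Iterable, Tuple


def loop_estimate(
    transitions: Iterable[Tuple[Hashable, float]],
    s: Hashable,
    gamma: float,
) -> float:
    """Estimate the discounted value of state `s` from one sample path.

    `transitions` yields (state, reward) pairs in time order. Only a few
    running totals are kept, so the memory used does not grow with the
    number of states or the length of the path.
    """
    loops = 0            # number of completed loops s -> ... -> s
    sum_reward = 0.0     # sum over loops of discounted in-loop reward
    sum_discount = 0.0   # sum over loops of gamma ** loop_length
    in_loop = False      # becomes True after the first visit to s
    g = 0.0              # discounted reward in the current loop
    d = 1.0              # gamma ** (steps since the current loop began)

    for state, reward in transitions:
        if state == s:
            if in_loop:
                # The path has returned to s: close the current loop.
                loops += 1
                sum_reward += g
                sum_discount += d
            in_loop = True
            g, d = 0.0, 1.0
        if in_loop:
            g += d * reward
            d *= gamma

    if loops == 0:
        raise ValueError("state s was not revisited on this path")

    alpha = sum_reward / loops     # estimate of E[G]
    beta = sum_discount / loops    # estimate of E[gamma ** T]
    return alpha / (1.0 - beta)    # solves v = alpha + beta * v


# Deterministic two-state cycle 0 -> 1 -> 0 with reward 1 in state 0 only.
path = [(0, 1.0), (1, 0.0), (0, 1.0), (1, 0.0), (0, 1.0)]
print(loop_estimate(path, s=0, gamma=0.5))
```

For this cycle the exact value satisfies v = 1 + 0.25 v, so v = 4/3, which the sketch recovers up to floating-point rounding because every loop is identical.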

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Avoiding the pitfalls of gene set enrichment analysis with SetRank

Author: A Alexa
A Subramanian
AL Tarca
B Efron
Cedric Simillion
D Croft
D Nam
D Szklarczyk
D Wu
DA Barbie
E Eden
E Lee
F Bastian
Heidi E.L. Lischer
J Michaud
J Tomfohr
JJ Goeman
K-H Pan
L Tian
M Ashburner
M Dai
M Kanehisa
M Krupp
ME Smoot
N Raghavan
PD Karp
RA Irizarry
Robin Liechti
Rémy Bruggmann
S Bauer
S Falcon
S Holm
S Hänzelmann
S Maere
T Barrett
T Kelder
Vassilios Ioannidis
W Luo
WT Barry
Y Lu
Z Jiang
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Association of time-serial changes in ambient particulate matters (PMs) with respiratory emergency cases in Taipei's Wenshan District

Author: A Roy
A Winquist
BJ Tunno
Chin-Wang Hsu
CI Falcon-Rodriguez
CL Lee
CP Iii
D Meszaros
DR Riva
H Zhou
J Heo
Jer-Hwa Chang
JI Halonen
JT Bates
Koustubh Panda
Kuan-Jen Bai
L Dai
M Lundborg
M-H Cheng
MM Nakhlé
N a H Janssen
N Mushtaq
P Li
Q Song
RW Atkinson
S Devos
S Huang
S Zhang
S-L Hwang
S-S Tsai
SD Adar
Shau-Ku Huang
Shih-Chang Hsu
SL Zeger
T Goehl
Y-F Xing
Z Meng
Publication venue: 'Public Library of Science (PLoS)'

Crossref

Network analysis identifies a putative role for the PPAR and type 1 interferon pathways in glucocorticoid actions in asthmatics

Abstract

Background: Asthma is a chronic inflammatory airway disease influenced by genetic and environmental factors that affects ~300 million people worldwide, leading to ~250,000 deaths annually. Glucocorticoids (GCs) are well-known therapeutics that are used extensively to suppress airway inflammation in asthmatics. The airway epithelium plays an important role in the initiation and modulation of the inflammatory response. While the role of GCs in disease management is well understood, few studies have examined the holistic effects on the airway epithelium.

Methods: Gene expression data were used to generate a co-transcriptional network, which was interrogated to identify modules of functionally related genes. In parallel, expression data were mapped to the human protein-protein interaction (PPI) network in order to identify modules with differentially expressed genes. A common pathways approach was applied to highlight genes and pathways functionally relevant and significantly altered following GC treatment.

Results: Co-transcriptional network analysis identified pathways involved in inflammatory processes in the epithelium of asthmatics, including the Toll-like receptor (TLR) and PPAR signaling pathways. Analysis of the PPI network identified RXRA, PPARGC1A, STAT1 and IRF9, among other genes, as differentially expressed. Common pathways analysis highlighted TLR and PPAR signaling pathways, providing a link between general inflammatory processes and the actions of GCs. Promoter analysis identified genes regulated by the glucocorticoid receptor (GCR) and PPAR pathways and highlighted the interferon pathway as a target of GCs.

Conclusions: Network analyses identified known genes and pathways associated with inflammatory processes in the airway epithelium of asthmatics. This workflow illustrated a hypothesis-generating experimental design that integrated multiple analysis methods to produce a weight-of-evidence-based approach upon which future focused studies can be designed. In this case, results suggested a mechanism whereby GCs repress TLR-mediated interferon production via upregulation of the PPAR signaling pathway. These results highlight the role of interferons in asthma and their potential as targets of future therapeutic efforts.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California