13 research outputs found

Extreme State Aggregation Beyond MDPs

Author: A.L. Strehl
I. Fazekas
M. Hutter
M.L. Puterman
O.-A. Maillard
P. Nguyen
P. Sunehag
R. Givan
R.S. Sutton
S.J. Russell
T. Jaksch
T. Lattimore
V. Vovk
Publication date: 01/01/2014

We consider a Reinforcement Learning setup where an agent interacts with an environment in observation-reward-action cycles without any (esp. MDP) assumptions on the environment. State aggregation and, more generally, feature reinforcement learning are concerned with mapping histories/raw-states to reduced/aggregated states. The idea behind both is that the resulting reduced process (approximately) forms a small stationary finite-state MDP, which can then be efficiently solved or learnt. We considerably generalize existing aggregation results by showing that even if the reduced process is not an MDP, the (q-)value functions and (optimal) policies of an associated MDP with the same state-space size solve the original problem, as long as the solution can approximately be represented as a function of the reduced states. This implies an upper bound on the required state space size that holds uniformly for all RL problems. It may also explain why RL algorithms designed for MDPs sometimes perform well beyond MDPs.

Comment: 28 LaTeX pages, 8 theorems
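The mapping from raw states to aggregated states described in this abstract can be illustrated with a minimal sketch of classical state aggregation on a small, fully known MDP. This is not the paper's construction (which makes no MDP assumption on the environment); the toy transition table, the map `phi`, the uniform `weights`, and the function names are all assumptions chosen for illustration.

```python
from collections import defaultdict

def aggregate_mdp(P, R, phi, weights):
    """Build an aggregated MDP from raw transitions P[s][a][s2] and rewards R[s][a].
    phi[s] is the aggregated state of raw state s; weights[s] is an assumed
    weighting of raw states inside each aggregated state."""
    n_actions = len(P[0])
    blocks = sorted(set(phi))
    mass = defaultdict(float)
    for s, w in enumerate(weights):
        mass[phi[s]] += w
    P_agg = {b: [defaultdict(float) for _ in range(n_actions)] for b in blocks}
    R_agg = {b: [0.0] * n_actions for b in blocks}
    for s in range(len(P)):
        w = weights[s] / mass[phi[s]]  # normalized weight of s within its block
        for a in range(n_actions):
            R_agg[phi[s]][a] += w * R[s][a]
            for s2, p in enumerate(P[s][a]):
                P_agg[phi[s]][a][phi[s2]] += w * p
    return P_agg, R_agg

def value_iteration(P_agg, R_agg, gamma=0.9, iters=500):
    """Discounted q-value iteration on the aggregated MDP."""
    q = {b: [0.0] * len(R_agg[b]) for b in P_agg}
    for _ in range(iters):
        v = {b: max(q[b]) for b in q}
        q = {b: [R_agg[b][a] + gamma * sum(p * v[b2] for b2, p in P_agg[b][a].items())
                 for a in range(len(R_agg[b]))]
             for b in P_agg}
    return q

# Toy raw MDP (hypothetical): 4 raw states, 2 actions.
# Action 0 stays inside the current block, action 1 moves to the other block.
phi = [0, 0, 1, 1]
P = [
    [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]],
    [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]],
    [[0.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.0, 0.0]],
]
R = [[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [1.0, 0.0]]

P_agg, R_agg = aggregate_mdp(P, R, phi, weights=[1, 1, 1, 1])
q = value_iteration(P_agg, R_agg)

# Lift the aggregated greedy policy back to the raw states through phi.
policy = [max(range(2), key=lambda a: q[phi[s]][a]) for s in range(4)]
print(policy)  # [1, 1, 0, 0]
```

In this toy case the aggregation is exact (raw states in the same block behave identically at the block level), so the two-state MDP yields a policy that is also optimal for the four raw states. The abstract's claim concerns the harder case where the reduced process is not an MDP at all.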

arXiv.org e-Print Archive

Crossref

The Australian National University

Bounded parameter Markov decision processes with average reward criterion

Author: A. Nilim
A.L. Strehl
A.N. Burnetas
E. Even-Dar
P. Auer
R. Givan
R.I. Brafman
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2007

Bounded parameter Markov Decision Processes (BMDPs) address the issue of dealing with uncertainty in the parameters of a Markov Decision Process (MDP). Unlike the case of an MDP, the notion of an optimal policy for a BMDP is not entirely straightforward. We consider two notions of optimality based on optimistic and pessimistic criteria. These have been analyzed for discounted BMDPs. Here we provide results for average reward BMDPs. We establish a fundamental relationship between the discounted and the average reward problems, prove the existence of Blackwell optimal policies and, for both notions of optimality, derive algorithms that converge to the optimal value function.
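The optimistic and pessimistic criteria mentioned in this abstract can be illustrated for the discounted case with a minimal sketch of interval value iteration. It is not the average-reward algorithm the paper derives; the interval bounds `lo` and `hi`, the toy rewards, and the function names are assumptions made for illustration.

```python
def extreme_expectation(lo, hi, values, optimistic):
    """Expected value under the best (optimistic) or worst (pessimistic)
    distribution p with lo[s] <= p[s] <= hi[s] and sum(p) == 1."""
    p = list(lo)
    slack = 1.0 - sum(lo)  # probability mass still to be distributed
    # Hand the remaining mass to high-value states first (optimistic)
    # or to low-value states first (pessimistic).
    order = sorted(range(len(values)), key=lambda s: values[s], reverse=optimistic)
    for s in order:
        add = min(hi[s] - lo[s], slack)
        p[s] += add
        slack -= add
    return sum(pi * v for pi, v in zip(p, values))

def interval_value_iteration(lo, hi, R, gamma=0.9, optimistic=True, iters=500):
    """Discounted value iteration where each transition row is only known
    to lie between lo[s][a] and hi[s][a]."""
    n = len(R)
    v = [0.0] * n
    for _ in range(iters):
        v = [max(R[s][a] + gamma * extreme_expectation(lo[s][a], hi[s][a], v, optimistic)
                 for a in range(len(R[s])))
             for s in range(n)]
    return v

# Toy BMDP (hypothetical): 2 states, 1 action, reward 1 only in state 1.
# Each transition probability is only known to lie in [0.2, 0.8].
lo = [[[0.2, 0.2]], [[0.2, 0.2]]]
hi = [[[0.8, 0.8]], [[0.8, 0.8]]]
R = [[0.0], [1.0]]

v_opt = interval_value_iteration(lo, hi, R, optimistic=True)
v_pes = interval_value_iteration(lo, hi, R, optimistic=False)
print(v_opt, v_pes)
```

With these bounds the optimistic values approach 7.2 and 8.2 and the pessimistic values approach 1.8 and 2.8, so the gap between the two criteria reflects how wide the parameter intervals are.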

CiteSeerX

Crossref

Queensland University of Technology ePrints Archive

Ribulose bisphosphate carboxylase from a mutant strain of Chlamydomonas reinhardii deficient in chloroplast ribosomes

Author: A.C. Vasconcelos
A.L. Givan
A.R. Cashmore
B. Bowien
B. Dobberstein
B. Parthier
B.J. Davis
C.A. Codd
D.M. Baumgartel
F.R. Tabita
G.E. Blair
G.F. Wildner
J.D. Gregory
J.E. Boynton
J.J. Armstrong
J.K. Hoober
K. Weber
O. Ciferri
O.M. Lowry
P.E. Highfield
R.J. Ellis
R.K. Togasaki
R.M. Smillie
R.S. Criddle
T. Börner
T. Takabe
U.W. Goodenough
V. Iwanij
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/1979

Crossref

Knowledge of the histological distribution of leucocytes and adhesion molecules in the human genital tract is scarce although local immunity in this region is important. Using immunohistochemical methods, we here describe the organization of CD3+, CD8+ and CD4+ T cells, CD19+ B cells, CD38+ plasma cells, major histocompatibility complex (MHC) class II+ antigen-presenting cells and CD14+ monocytes, as well as the expression of endothelial addressins in normal human ecto-cervical and vaginal mucosa. T cells were clustered in a distinct band beneath the epithelium and were also dispersed in the epithelium and the lamina propria, whereas CD38+ plasma cells were present only in the lamina propria. MHC class II+ cells were numerous in the lamina propria and in the epithelium, where they morphologically resembled dendritic cells. Lymphoid aggregates containing CD19+ and CD20+ B cells as well as CD3+, CD4+ and CD8+ cells were also found in the cervix. The mucosal addressin cell adhesion molecule-1 (MAdCAM-1) was not expressed on the vascular endothelium in the cervical or vaginal mucosa. In contrast, intercellular adhesion molecule-1 (ICAM-1), vascular adhesion protein-1 (VAP-1) and P-selectin were expressed in all tissue samples, and vascular cell adhesion molecule-1 (VCAM-1) and E-selectin were found in four of seven samples. We conclude that the distribution of leucocytes and adhesion molecules is very similar in the ecto-cervical and the vaginal mucosa and that the regulation of lymphocyte homing to the genital tract is different from that seen in the intestine. Our results also clearly suggest that the leucocytes are not randomly scattered in the tissue but organized in a distinct pattern.

Crossref

PubMed Central