13 research outputs found

    Extreme State Aggregation Beyond MDPs

    We consider a Reinforcement Learning setup where an agent interacts with an environment in observation-reward-action cycles without any (esp. MDP) assumptions on the environment. State aggregation and more generally feature reinforcement learning is concerned with mapping histories/raw-states to reduced/aggregated states. The idea behind both is that the resulting reduced process (approximately) forms a small stationary finite-state MDP, which can then be efficiently solved or learnt. We considerably generalize existing aggregation results by showing that even if the reduced process is not an MDP, the (q-)value functions and (optimal) policies of an associated MDP with same state-space size solve the original problem, as long as the solution can approximately be represented as a function of the reduced states. This implies an upper bound on the required state space size that holds uniformly for all RL problems. It may also explain why RL algorithms designed for MDPs sometimes perform well beyond MDPs. Comment: 28 LaTeX pages. 8 Theorem

    Bounded parameter Markov decision processes with average reward criterion

    Bounded parameter Markov Decision Processes (BMDPs) address the issue of dealing with uncertainty in the parameters of a Markov Decision Process (MDP). Unlike the case of an MDP, the notion of an optimal policy for a BMDP is not entirely straightforward. We consider two notions of optimality based on optimistic and pessimistic criteria. These have been analyzed for discounted BMDPs. Here we provide results for average reward BMDPs. We establish a fundamental relationship between the discounted and the average reward problems, prove the existence of Blackwell optimal policies and, for both notions of optimality, derive algorithms that converge to the optimal value function.

    Distribution of lymphocytes and adhesion molecules in human cervix and vagina

    Knowledge of the histological distribution of leucocytes and adhesion molecules in the human genital tract is scarce although local immunity in this region is important. Using immunohistochemical methods, we here describe the organization of CD3+, CD8+ and CD4+ T cells, CD19+ B cells, CD38+ plasma cells, major histocompatibility complex (MHC) class II+ antigen-presenting cells and CD14+ monocytes, as well as the expression of endothelial addressins in normal human ecto-cervical and vaginal mucosa. T cells were clustered in a distinct band beneath the epithelium and were also dispersed in the epithelium and the lamina propria, whereas CD38+ plasma cells were present only in the lamina propria. MHC class II+ cells were numerous in the lamina propria and in the epithelium, where they morphologically resembled dendritic cells. Lymphoid aggregates containing CD19+ and CD20+ B cells as well as CD3+, CD4+ and CD8+ cells were also found in the cervix. The mucosal addressin cell adhesion molecule-1 (MAdCAM-1) was not expressed on the vascular endothelium in the cervical or vaginal mucosa. In contrast, intercellular adhesion molecule-1 (ICAM-1), vascular adhesion protein-1 (VAP-1) and P-selectin were expressed in all tissue samples, and vascular cell adhesion molecule-1 (VCAM-1) and E-selectin were found in four of seven samples. We conclude that the distribution of leucocytes and adhesion molecules is very similar in the ecto-cervical and the vaginal mucosa and that the regulation of lymphocyte homing to the genital tract is different from that seen in the intestine. Our results also clearly suggest that the leucocytes are not randomly scattered in the tissue but organized in a distinct pattern.