30 research outputs found

    A New Policy Iteration Algorithm For Reinforcement Learning in Zero-Sum Markov Games

    Optimal policies in standard MDPs can be obtained using either value iteration or policy iteration. However, in the case of zero-sum Markov games, there is no efficient policy iteration algorithm; e.g., it has been shown that one has to solve Omega(1/(1-alpha)) MDPs, where alpha is the discount factor, to implement the only known convergent version of policy iteration. Another algorithm, called naive policy iteration, is easy to implement but is only provably convergent under very restrictive assumptions. Prior attempts to fix the naive policy iteration algorithm have several limitations. Here, we show that a simple variant of naive policy iteration for games converges exponentially fast. The only addition we propose to naive policy iteration is the use of lookahead policies, which are used in practical algorithms anyway. We further show that lookahead can be implemented efficiently in the function approximation setting of linear Markov games, which are the counterpart of the much-studied linear MDPs. We illustrate the application of our algorithm by providing bounds for policy-based RL (reinforcement learning) algorithms. We extend the results to the function approximation setting. Comment: 41 pages
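
For orientation, the standard objects behind this abstract can be written as follows. This is a sketch in assumed notation (state space S, action sets A and B, reward r, transition kernel P, lookahead depth H, iterate V_k), not the paper's own statement. T is the usual minimax (Shapley) Bellman operator of a discounted zero-sum Markov game, and an H-step lookahead policy pair is read here as one that solves the per-state matrix game built from T^{H-1} V_k instead of V_k.

```latex
% Minimax (Shapley) operator for a zero-sum Markov game with discount factor \alpha
(TV)(s) = \max_{\mu \in \Delta(\mathcal{A})} \; \min_{\nu \in \Delta(\mathcal{B})}
  \sum_{a \in \mathcal{A}} \sum_{b \in \mathcal{B}} \mu(a)\, \nu(b)
  \Big[ r(s,a,b) + \alpha \sum_{s' \in \mathcal{S}} P(s' \mid s,a,b)\, V(s') \Big]

% Assumed reading of an H-step lookahead policy pair at iteration k:
% a saddle point of the matrix game whose continuation values are T^{H-1} V_k
\big(\mu_{k+1}(s), \nu_{k+1}(s)\big) \in \arg\max_{\mu} \; \arg\min_{\nu}
  \sum_{a,b} \mu(a)\, \nu(b)
  \Big[ r(s,a,b) + \alpha \sum_{s'} P(s' \mid s,a,b)\, \big(T^{H-1} V_k\big)(s') \Big]
```

With H = 1 the second expression reduces to the ordinary greedy step of naive policy iteration.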

    The Role of Lookahead and Approximate Policy Evaluation in Reinforcement Learning with Linear Value Function Approximation

    Function approximation is widely used in reinforcement learning to handle the computational difficulties associated with very large state spaces. However, function approximation introduces errors which may lead to instabilities when using approximate dynamic programming techniques to obtain the optimal policy. Therefore, techniques such as lookahead for policy improvement and m-step rollout for policy evaluation are used in practice to improve the performance of approximate dynamic programming with function approximation. We quantitatively characterize, for the first time, the impact of lookahead and m-step rollout on the performance of approximate dynamic programming (DP) with function approximation: (i) without a sufficient combination of lookahead and m-step rollout, approximate DP may not converge, (ii) both lookahead and m-step rollout improve the convergence rate of approximate DP, and (iii) lookahead helps mitigate the effect of function approximation and the discount factor on the asymptotic performance of the algorithm. Our results are presented for two approximate DP methods: one which uses least-squares regression to perform function approximation and another which performs several steps of gradient descent of the least-squares objective in each iteration. Comment: 36 pages, 4 figures
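
The interplay of lookahead and m-step rollout can be illustrated with a minimal tabular sketch. It is an assumption-laden illustration, not the paper's algorithm: there is no function approximation here (so the instability discussed in the abstract cannot appear), the MDP encoding and all function names are hypothetical, and the value estimate starts at zero with non-negative rewards. Each iteration applies the Bellman optimality operator H-1 times (lookahead), picks the greedy policy, and then applies that policy's Bellman operator m times (rollout).

```python
from typing import List, Tuple

# Hypothetical encoding: P[s][a] is a list of (probability, next_state) pairs,
# R[s][a] is the immediate reward for taking action a in state s.
Transitions = List[List[List[Tuple[float, int]]]]
Rewards = List[List[float]]


def q_value(P: Transitions, R: Rewards, gamma: float,
            V: List[float], s: int, a: int) -> float:
    """One-step backup of V for the state-action pair (s, a)."""
    return R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])


def bellman_opt(P, R, gamma, V):
    """Bellman optimality operator T applied to V."""
    return [max(q_value(P, R, gamma, V, s, a) for a in range(len(P[s])))
            for s in range(len(P))]


def greedy(P, R, gamma, V):
    """Deterministic policy that is greedy with respect to V."""
    return [max(range(len(P[s])), key=lambda a: q_value(P, R, gamma, V, s, a))
            for s in range(len(P))]


def bellman_policy(P, R, gamma, V, mu):
    """Bellman operator T_mu of the fixed policy mu applied to V."""
    return [q_value(P, R, gamma, V, s, mu[s]) for s in range(len(P))]


def lookahead_rollout_pi(P, R, gamma, H, m, iters):
    """Tabular sketch: H-step lookahead improvement, m-step rollout evaluation."""
    V = [0.0] * len(P)          # assumed initialization
    mu = [0] * len(P)
    for _ in range(iters):
        W = V
        for _ in range(H - 1):  # lookahead: W = T^{H-1} V
            W = bellman_opt(P, R, gamma, W)
        mu = greedy(P, R, gamma, W)
        for _ in range(m):      # rollout: V = T_mu^m T^{H-1} V
            W = bellman_policy(P, R, gamma, W, mu)
        V = W
    return V, mu


if __name__ == "__main__":
    # Two-state toy MDP (made up for illustration).
    P = [[[(1.0, 0)], [(1.0, 1)]],
         [[(1.0, 1)], [(1.0, 0)]]]
    R = [[0.0, 1.0],
         [2.0, 0.0]]
    V, mu = lookahead_rollout_pi(P, R, gamma=0.9, H=2, m=3, iters=200)
    print(V, mu)
```

With H = 1 and m = 1 the loop is plain value iteration, and increasing m moves the evaluation step toward full policy evaluation, which is one way to see why the two parameters trade off against each other.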

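    How to improve adherence to therapeutic recommendations and the quality of physician-patient cooperation [Jak poprawić stopień przestrzegania zaleceń terapeutycznych i jakość współpracy lekarz - pacjent]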

    Good cooperation between patient and physician is a very important part of treatment, especially in chronic diseases. Studies conducted by the World Health Organization show that, on average, every second patient does not follow therapeutic recommendations correctly. In Poland this percentage is even higher, and for some diseases it exceeds 70%. Importantly, these results are based primarily on patients' self-reports obtained with questionnaire-based research tools, so in practice the percentage of patients who do not cooperate properly may be even larger. The reasons for this phenomenon lie with both patients and health care professionals. Patients' health behavior is influenced most by psychological and socio-economic factors. The first group includes primarily cognitive functioning, life satisfaction, personality, sense of control and mental state. The second group is associated mainly with financial status, although, as cyclic surveys of the Polish population show, the impact of income on treatment adherence is becoming smaller from year to year. Causes related to health care are primarily poor communication between doctor and patient and the lack of patient involvement in setting the therapy plan. Previous studies indicate how important the quality of the physician-patient relationship is for adherence to therapeutic recommendations. Healthcare professionals should get to know the patient and fit the treatment process to the patient's needs and capabilities. Better cooperation can be achieved by conducting a motivational dialogue and engaging the patient in setting the therapy plan.

    Genome-wide association study identifies human genetic variants associated with fatal outcome from Lassa fever

    Infection with Lassa virus (LASV) can cause Lassa fever, a haemorrhagic illness with an estimated fatality rate of 29.7%, but causes no or mild symptoms in many individuals. Here, to investigate whether human genetic variation underlies the heterogeneity of LASV infection, we carried out genome-wide association studies (GWAS) as well as seroprevalence surveys, human leukocyte antigen typing and high-throughput variant functional characterization assays. We analysed Lassa fever susceptibility and fatal outcomes in 533 cases of Lassa fever and 1,986 population controls recruited over a 7-year period in Nigeria and Sierra Leone. We detected genome-wide significant variant associations with Lassa fever fatal outcomes near GRM7 and LIF in the Nigerian cohort. We also show that a haplotype bearing signatures of positive selection and overlapping LARGE1, a required LASV entry factor, is associated with decreased risk of Lassa fever in the Nigerian cohort but not in the Sierra Leone cohort. Overall, we identified variants and genes that may impact the risk of severe Lassa fever, demonstrating how GWAS can provide insight into viral pathogenesis.