74,936 research outputs found

    Non-parametric policy search with limited information loss

    Learning complex control policies from non-linear and redundant sensory input is an important challenge for reinforcement learning algorithms. Non-parametric methods that approximate value functions or transition models can address this problem by adapting to the complexity of the dataset. Yet many current non-parametric approaches rely on unstable greedy maximization of approximate value functions, which might lead to poor convergence or oscillations in the policy update. A more robust policy update can be obtained by limiting the information loss between successive state-action distributions. In this paper, we develop a policy search algorithm with policy updates that are both robust and non-parametric. Our method can learn non-parametric control policies for infinite-horizon continuous Markov decision processes with non-linear and redundant sensory representations. We investigate how approximations of the kernel function can reduce the time requirements of the demanding non-parametric computations. In our experiments, we show the strong performance of the proposed method and how it can be approximated efficiently. Finally, we show that our algorithm can learn a real-robot underpowered swing-up task directly from image data.
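
The idea of limiting the information loss of a policy update can be illustrated with a small sketch. It is not the paper's algorithm: it assumes a finite set of sampled state-action pairs with hypothetical advantage estimates, reweights them by `exp(advantage / eta)`, and picks the temperature `eta` by bisection so that the KL divergence of the new weights from the uniform sampling distribution stays within a bound `epsilon`. All function names, the bracket `[lo, hi]`, and the example values are assumptions made for illustration.

```python
import math

def softmax_weights(advantages, eta):
    """Weights proportional to exp(advantage / eta), normalized to sum to 1."""
    top = max(advantages)  # subtract the max for numerical stability
    unnorm = [math.exp((a - top) / eta) for a in advantages]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def kl_from_uniform(weights):
    """KL(weights || uniform) over the sampled state-action pairs."""
    n = len(weights)
    return sum(w * math.log(w * n) for w in weights if w > 0.0)

def bounded_update_weights(advantages, epsilon, lo=1e-3, hi=1e3, iters=100):
    """Bisect on eta so the reweighted distribution stays within epsilon of uniform.

    Assumes the initial bracket [lo, hi] contains the solution; a small eta
    gives a greedy update, a large eta leaves the distribution almost unchanged.
    """
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # geometric midpoint, since eta spans decades
        if kl_from_uniform(softmax_weights(advantages, mid)) > epsilon:
            lo = mid  # update too greedy: raise the temperature
        else:
            hi = mid  # bound satisfied: try a greedier update
    return softmax_weights(advantages, hi), hi

# Hypothetical advantage estimates for four sampled state-action pairs.
weights, eta = bounded_update_weights([1.0, 0.5, -0.2, 0.0], epsilon=0.1)
```

A greedy update would put all weight on the first sample; the bound keeps the new distribution close to the one that generated the data, which is the robustness argument made in the abstract.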

    An event history analysis on German long-term unemployment

    This paper investigates the determinants of German long-term unemployment. In particular, a microeconometric event history analysis is carried out to examine what impact personal characteristics such as age, gender, and education, or factors such as receiving unemployment benefits, have on the length of unemployment. The paper further discusses the advantages and disadvantages of semi-parametric and parametric estimation. The use of the Cox model on the one hand and a Weibull-specified model on the other has failed to offer any corroboration for applying the semi-parametric approach favoured in the theoretical literature. One can also see that not all groups are equally affected by long-term unemployment. This is an important finding for economic policy because it sheds light on appropriate measures to reduce the length of time certain groups spend in unemployment. Keywords: Unemployment Models, Duration Analysis
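
The contrast between the two model classes can be sketched briefly. In a proportional hazards model the exit rate from unemployment is a baseline hazard multiplied by `exp(x . beta)`; the Cox model leaves the baseline unspecified, while a Weibull specification fixes its functional form. The sketch below assumes the standard Weibull hazard, and the parameter names, covariate coding, and numbers are hypothetical and not taken from the paper.

```python
import math

def weibull_hazard(t, shape, scale):
    """Weibull baseline hazard h0(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def proportional_hazard(t, covariates, coefficients, shape, scale):
    """Individual hazard: Weibull baseline times exp(x . beta)."""
    linear = sum(x * b for x, b in zip(covariates, coefficients))
    return weibull_hazard(t, shape, scale) * math.exp(linear)

# Hypothetical example: one binary covariate (e.g. receives benefits = 1).
beta = [-0.4]
ratio_early = (proportional_hazard(3.0, [1], beta, 0.8, 10.0)
               / proportional_hazard(3.0, [0], beta, 0.8, 10.0))
ratio_late = (proportional_hazard(12.0, [1], beta, 0.8, 10.0)
              / proportional_hazard(12.0, [0], beta, 0.8, 10.0))
```

The hazard ratio between the two groups is the same at every duration, which is the proportionality assumption both models share. A shape parameter below 1 makes the baseline hazard fall with time spent unemployed, and a shape of 1 reduces it to a constant.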