
    Sample Efficient Policy Search for Optimal Stopping Domains

    Optimal stopping problems consider the question of deciding when to stop an observation-generating process in order to maximize a return. We examine the problem of simultaneously learning and planning in such domains, when data is collected directly from the environment. We propose GFSE, a simple and flexible model-free policy search method that reuses data for sample efficiency by leveraging problem structure. We bound the sample complexity of our approach to guarantee uniform convergence of policy value estimates, tightening existing PAC bounds to achieve logarithmic dependence on horizon length for our setting. We also examine the benefit of our method against prevalent model-based and model-free approaches on three domains taken from diverse fields. Comment: To appear in IJCAI-201
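
The abstract does not describe how GFSE works internally. As a minimal sketch of the kind of data reuse that optimal stopping structure can permit, assuming the observation process does not depend on the stop/continue decision, the Python below scores several threshold policies on one shared batch of recorded trajectories. The random-walk process, the threshold policy class, and every function name are illustrative assumptions, not the paper's method.

```python
import random

def simulate_trajectories(n, horizon, seed=0):
    """Generate n random-walk paths (a hypothetical toy observation process)."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n):
        x, path = 0.0, []
        for _ in range(horizon):
            x += rng.gauss(0.0, 1.0)
            path.append(x)
        paths.append(path)
    return paths

def stopped_return(path, threshold):
    """Return collected by stopping at the first observation >= threshold.

    If the threshold is never reached, the policy is forced to stop at the
    final step and collects the last observation.
    """
    for x in path:
        if x >= threshold:
            return x
    return path[-1]

def evaluate_policies(paths, thresholds):
    """Estimate the value of every threshold policy from the same paths."""
    return {
        th: sum(stopped_return(p, th) for p in paths) / len(paths)
        for th in thresholds
    }

# One batch of data is collected once and reused for all candidate policies.
paths = simulate_trajectories(n=500, horizon=20)
values = evaluate_policies(paths, thresholds=[0.5, 1.0, 2.0, 4.0])
best = max(values, key=values.get)
```

Because stopping never alters the observations in this toy setting, each full trajectory yields a return for every candidate policy, so no additional environment interaction is needed to compare them.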

    Robustness of optimal inter-city railway network structure in Japan against alternative population distributions

    Constructing an inter-city railway network takes a long time and a huge amount of money, so careful demand forecasting and rational service planning are required. However, long-range demand forecasting always faces unintended changes in regional population or in the service level of competing transportation modes such as airlines and inter-city express buses. Such changes have sometimes caused a severe decrease in demand for constructed railway lines and led to discussion of abolishing train service. To avoid such an outcome, we want to build a robust network plan that is not vulnerable to changes in forecasting conditions. This paper discusses the robustness of the optimal inter-city railway network structure in Japan against alternative population distributions. A genetic algorithm (GA) is applied to find the best mixture of maximum operation speed category and number of daily train services for each link, which maximizes the total consumer surplus of inter-city railway passengers. Consumer surplus is assessed by a gravity demand model that considers the service level along several routes for each OD pair. The travel time calculated from the allocated link speed category, the allocated train frequency, and the fare estimated by regression on travel speed are summarized as the route service level via the parameters of an ML route choice model. In the GA, a chromosome consists of two parts: the speed category of 275 links and the relative operation distance of trains on those links. Besides the real distribution of population across 197 Japanese local areas in 1995, we set four other hypothetical population distributions: two of them concentrate population in megalopolises such as Tokyo, and the others disperse it across geographically remote areas. We first obtain network structures optimized by the GA for each population setting and compare the speed category allocation across the five network plans. Secondly, we calculate the total consumer surplus of each network plan under the different population settings and discuss the vulnerability of those plans. Thirdly, we optimize train operation plans for the different population settings under the given speed category arrangements. The results show that the spatial arrangement of high-speed railway service in 1995 remains optimal over a wide range of population settings, provided the number of trains is adjusted to the alternative population distribution.
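
The two-part chromosome mentioned in the abstract could be represented roughly as follows. This is only a sketch: the link count of 275 comes from the abstract, while the number of speed categories, the one-value-per-link layout of the relative operation distance, the one-point crossover, and the mutation rate are all assumptions made for illustration. The consumer-surplus fitness function is not reproduced here.

```python
import random

NUM_LINKS = 275            # from the abstract
NUM_SPEED_CATEGORIES = 4   # assumption: the abstract gives no count

def random_chromosome(rng):
    """Build a two-part chromosome: (speed categories, relative distances)."""
    speed = [rng.randrange(NUM_SPEED_CATEGORIES) for _ in range(NUM_LINKS)]
    # Assumption: one relative operation distance in [0, 1) per link.
    rel_distance = [rng.random() for _ in range(NUM_LINKS)]
    return speed, rel_distance

def crossover(a, b, rng):
    """One-point crossover applied at the same cut point in both parts."""
    cut = rng.randrange(1, NUM_LINKS)
    speed = a[0][:cut] + b[0][cut:]
    rel_distance = a[1][:cut] + b[1][cut:]
    return speed, rel_distance

def mutate(chrom, rng, rate=0.01):
    """Resample each gene independently with probability `rate`."""
    speed, rel_distance = list(chrom[0]), list(chrom[1])
    for i in range(NUM_LINKS):
        if rng.random() < rate:
            speed[i] = rng.randrange(NUM_SPEED_CATEGORIES)
        if rng.random() < rate:
            rel_distance[i] = rng.random()
    return speed, rel_distance

rng = random.Random(0)
parent_a = random_chromosome(rng)
parent_b = random_chromosome(rng)
child = mutate(crossover(parent_a, parent_b, rng), rng)
```

Keeping the two parts as separate lists lets each use its own gene type (integer category versus real-valued distance) while sharing link indices.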