70,107 research outputs found

    Better Optimism By Bayes: Adaptive Planning with Rich Models

    The computational costs of inference and planning have confined Bayesian model-based reinforcement learning to one of two dismal fates: powerful Bayes-adaptive planning but only for simplistic models, or powerful, Bayesian non-parametric models but using simple, myopic planning strategies such as Thompson sampling. We ask whether it is feasible and truly beneficial to combine rich probabilistic models with a closer approximation to fully Bayesian planning. First, we use a collection of counterexamples to show formal problems with the over-optimism inherent in Thompson sampling. Then we leverage state-of-the-art techniques in efficient Bayes-adaptive planning and non-parametric Bayesian methods to perform qualitatively better than both existing conventional algorithms and Thompson sampling on two contextual bandit-like problems. Comment: 11 pages, 11 figures
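
For readers unfamiliar with the myopic baseline named in this abstract, the following is a minimal sketch of Thompson sampling. It assumes a Beta-Bernoulli multi-armed bandit with uniform Beta(1, 1) priors, which is an illustrative setting and not the contextual bandit-like problems or the models used in the paper. The function name `thompson_sampling` and its parameters are hypothetical.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Run Thompson sampling on a Bernoulli bandit (illustrative sketch).

    true_probs: success probability of each arm (hidden from the agent).
    Returns per-arm success counts, per-arm failure counts, total reward.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    total_reward = 0

    for _ in range(n_rounds):
        # Draw one plausible success rate per arm from its Beta posterior
        # (assumed prior: Beta(1, 1), i.e. uniform).
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        # Act greedily with respect to that single posterior sample.
        # This is the myopic step: the choice ignores how the next
        # observation will change the belief.
        arm = max(range(n_arms), key=lambda a: samples[a])

        reward = 1 if rng.random() < true_probs[arm] else 0
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
        total_reward += reward

    return successes, failures, total_reward


if __name__ == "__main__":
    s, f, total = thompson_sampling([0.2, 0.5, 0.8], n_rounds=1000)
    print("pulls per arm:", [si + fi for si, fi in zip(s, f)])
    print("total reward:", total)
```

Bayes-adaptive planning, by contrast, would evaluate each action by also accounting for the information it yields about the unknown model; the sketch above does not attempt that.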

    Unbalanced Multiclass Classification with an Adaptive Synthetic Multinomial Naive Bayes Approach

    Public opinion on rising fuel prices needs to be observed and analysed, because it is closely tied to future public policy in Indonesia. Twitter is one of the media people use to express their opinions. This study applies sentiment analysis to this phenomenon, dividing sentiment into three categories: positive, neutral, and negative. The methods used are Adaptive Synthetic Multinomial Naive Bayes, Adaptive Synthetic k-nearest neighbours, and Adaptive Synthetic Random Forest; the Adaptive Synthetic method is used to handle unbalanced data. The data are public arguments per province in Indonesia. The results show that negative sentiment dominates in all provinces of Indonesia, and that negative sentiment is related to the level of education, internet use, and the human development index. Adaptive Synthetic Multinomial Naive Bayes performed better than the other methods, with an accuracy of 0.882. Its highest accuracy is 0.990, in Papua Barat Province.

    Adaptive posterior contraction rates for the horseshoe

    We investigate the frequentist properties of Bayesian procedures for estimation based on the horseshoe prior in the sparse multivariate normal means model. Previous theoretical results assumed that the sparsity level, that is, the number of signals, was known. We drop this assumption and characterize the behavior of the maximum marginal likelihood estimator (MMLE) of a key parameter of the horseshoe prior. We prove that the MMLE is an effective estimator of the sparsity level, in the sense that it leads to (near) minimax optimal estimation of the underlying mean vector generating the data. Besides this empirical Bayes procedure, we consider the hierarchical Bayes method of putting a prior on the unknown sparsity level as well. We show that both Bayesian techniques lead to rate-adaptive optimal posterior contraction, which implies that the horseshoe posterior is a good candidate for generating rate-adaptive credible sets. Comment: arXiv admin note: substantial text overlap with arXiv:1607.0189

    Bayesian Reinforcement Learning via Deep, Sparse Sampling

    We address the problem of Bayesian reinforcement learning using efficient model-based online planning. We propose an optimism-free Bayes-adaptive algorithm to induce deeper and sparser exploration with a theoretical bound on its performance relative to the Bayes optimal policy, with a lower computational complexity. The main novelty is the use of a candidate policy generator, to generate long-term options in the planning tree (over beliefs), which allows us to create much sparser and deeper trees. Experimental results on different environments show that in comparison to the state-of-the-art, our algorithm is both computationally more efficient, and obtains significantly higher reward in discrete environments. Comment: Published in AISTATS 202

    Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search

    Bayesian model-based reinforcement learning is a formally elegant approach to learning optimal behaviour under model uncertainty, trading off exploration and exploitation in an ideal way. Unfortunately, finding the resulting Bayes-optimal policies is notoriously taxing, since the search space becomes enormous. In this paper we introduce a tractable, sample-based method for approximate Bayes-optimal planning which exploits Monte-Carlo tree search. Our approach outperformed prior Bayesian model-based RL algorithms by a significant margin on several well-known benchmark problems -- because it avoids expensive applications of Bayes rule within the search tree by lazily sampling models from the current beliefs. We illustrate the advantages of our approach by showing it working in an infinite state space domain which is qualitatively out of reach of almost all previous work in Bayesian exploration. Comment: 14 pages, 7 figures, includes supplementary material. Advances in Neural Information Processing Systems (NIPS) 201