Monte Carlo Bayesian Reinforcement Learning

Authors: Hsu David; Lee Wee Sun; Wang Yi; Won Kok Sung
Publication date: 01/01/2012

Bayesian reinforcement learning (BRL) encodes prior knowledge of the world in a model and represents uncertainty in model parameters by maintaining a probability distribution over them. This paper presents Monte Carlo BRL (MC-BRL), a simple and general approach to BRL. MC-BRL samples a priori a finite set of hypotheses for the model parameter values and forms a discrete partially observable Markov decision process (POMDP) whose state space is a cross product of the state space for the reinforcement learning task and the sampled model parameter space. The POMDP does not require conjugate distributions for belief representation, as earlier works do, and can be solved relatively easily with point-based approximation algorithms. MC-BRL naturally handles both fully and partially observable worlds. Theoretical and experimental results show that the discrete POMDP approximates the underlying BRL task well with guaranteed performance.

Comment: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)
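The construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two-state task, the uniform prior, and the names `sample_hypotheses`, `transition_prob`, `hybrid_states`, and `update_belief` are all assumptions made for illustration. It shows the two ingredients the abstract names: a hybrid state space formed as a cross product of task states and sampled parameter hypotheses, and a discrete belief over those hypotheses that needs no conjugate prior.

```python
import random
from itertools import product


def sample_hypotheses(k, seed=0):
    """Draw k candidate parameter values a priori (a uniform prior is assumed here)."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(k)]


def transition_prob(s, a, s_next, theta):
    """Hypothetical two-state task: action 1 reaches state 1 with probability
    theta, and action 0 always resets to state 0."""
    p_state1 = theta if a == 1 else 0.0
    return p_state1 if s_next == 1 else 1.0 - p_state1


def hybrid_states(task_states, thetas):
    """Cross product of task states and hypothesis indices: the POMDP state space."""
    return list(product(task_states, range(len(thetas))))


def update_belief(belief, thetas, s, a, s_next):
    """Bayes update of the discrete belief over hypotheses after observing
    the transition (s, a, s_next) in a fully observable task."""
    weights = [b * transition_prob(s, a, s_next, th) for b, th in zip(belief, thetas)]
    total = sum(weights)
    if total == 0.0:
        # The observation is impossible under every hypothesis; keep the belief.
        return list(belief)
    return [w / total for w in weights]
```

In this sketch the hypothesis index is the hidden part of the hybrid state and never changes during an episode; only the belief over it is updated. Solving the resulting POMDP with a point-based algorithm, as the abstract describes, is outside the scope of the sketch.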

arXiv.org e-Print Archive

CiteSeerX

ScholarBank@NUS

Invariant measures concentrated on countable structures

Authors: Cameron; Cherlin; Diaconis; Erdős; Janson; Kallenberg; Keisler; Kolaitis; Marker; Steinhorn; Vershik
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/2016

Let L be a countable language. We say that a countable infinite L-structure M admits an invariant measure when there is a probability measure on the space of L-structures with the same underlying set as M that is invariant under permutations of that set, and that assigns measure one to the isomorphism class of M. We show that M admits an invariant measure if and only if it has trivial definable closure, i.e., the pointwise stabilizer in Aut(M) of an arbitrary finite tuple of M fixes no additional points. When M is a Fraïssé limit in a relational language, this amounts to requiring that the age of M have strong amalgamation. Our results give rise to new instances of structures that admit invariant measures and structures that do not.

Comment: 46 pages, 2 figures. Small changes following referee suggestion
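The criterion in the abstract can be written out symbolically. The notation below is an assumption introduced for illustration and is not taken from the paper: $\mathrm{dcl}_M$ denotes the group-theoretic definable closure, phrased through the pointwise stabilizer exactly as the abstract does.

```latex
% Group-theoretic definable closure of a finite tuple \bar a = (a_1, \dots, a_n) in M:
\[
  \mathrm{dcl}_M(\bar a)
  = \bigl\{\, b \in M : g(b) = b \ \text{for all } g \in \mathrm{Aut}(M)
      \ \text{with } g(a_i) = a_i \ \text{for } i = 1, \dots, n \,\bigr\}.
\]
% Trivial definable closure: the stabilizer fixes no points beyond the tuple itself.
\[
  M \ \text{admits an invariant measure}
  \iff
  \mathrm{dcl}_M(\bar a) = \{a_1, \dots, a_n\}
  \ \text{for every finite tuple } \bar a \ \text{from } M.
\]
```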

arXiv.org e-Print Archive

Crossref

On Minimum Violations Ranking in Paired Comparisons

Authors: Callagan T; Domb C; Dunnavant K; Erdős P; Ferreira F F; Jaynes E T; Juyong Park; Kendall M G; Lott D F; Park J; Peskin M E
Publication venue: 'IOP Publishing'
Publication date: 01/11/2005

Ranking a set of objects from the most dominant to the least, based on the results of paired comparisons, is useful in many contexts. One example is using the rankings of teams or individual players in sports to seed tournaments. The quality of a ranking is often evaluated by the number of violations it contains, that is, cases in which an object is ranked lower than another that it has dominated in a comparison. A minimum violations ranking (MVR) method, as its name suggests, searches specifically for rankings that have the minimum possible number of violations, which may or may not be zero. In this paper, we present a method based on statistical physics that overcomes conceptual and practical difficulties faced by earlier studies of the problem.

Comment: 10 pages, 10 figures; typos corrected (v2)
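The notion of a violation defined in the abstract can be made concrete with a short Python sketch. It uses exhaustive search over all permutations, which is only feasible for a handful of objects and is not the statistical-physics method the paper presents. The function names `count_violations` and `minimum_violations_rankings` and the `(winner, loser)` input format are assumptions made for illustration.

```python
from itertools import permutations


def count_violations(ranking, results):
    """Count comparisons in which the winner is ranked below the loser.

    ranking: sequence of objects, most dominant first.
    results: iterable of (winner, loser) pairs from paired comparisons.
    """
    position = {obj: i for i, obj in enumerate(ranking)}
    return sum(1 for winner, loser in results if position[winner] > position[loser])


def minimum_violations_rankings(objects, results):
    """Brute-force search: return the minimum violation count and every
    ranking that attains it."""
    best_count, best_rankings = None, []
    for candidate in permutations(objects):
        violations = count_violations(candidate, results)
        if best_count is None or violations < best_count:
            best_count, best_rankings = violations, [candidate]
        elif violations == best_count:
            best_rankings.append(candidate)
    return best_count, best_rankings
```

For a cyclic set of results such as A beats B, B beats C, and C beats A, no ranking is violation-free, so the minimum is one, which illustrates the abstract's remark that the minimum may or may not be zero.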

arXiv.org e-Print Archive

Crossref