
    Monte Carlo Bayesian Reinforcement Learning

    Bayesian reinforcement learning (BRL) encodes prior knowledge of the world in a model and represents uncertainty in model parameters by maintaining a probability distribution over them. This paper presents Monte Carlo BRL (MC-BRL), a simple and general approach to BRL. MC-BRL samples a priori a finite set of hypotheses for the model parameter values and forms a discrete partially observable Markov decision process (POMDP) whose state space is a cross product of the state space for the reinforcement learning task and the sampled model parameter space. The POMDP does not require conjugate distributions for belief representation, as earlier works do, and can be solved relatively easily with point-based approximation algorithms. MC-BRL naturally handles both fully and partially observable worlds. Theoretical and experimental results show that the discrete POMDP approximates the underlying BRL task well with guaranteed performance. Comment: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012).
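The belief representation described in this abstract can be illustrated with a toy sketch: once a finite set of parameter hypotheses has been sampled from the prior, the belief over the hidden parameter is just a weight vector that is updated by Bayes' rule, so no conjugate prior is needed. The function names, the Bernoulli outcome model, and the uniform prior below are all assumptions made for illustration; this is not the paper's implementation, and it does not include the point-based POMDP solver.

```python
import random

def sample_hypotheses(prior_sampler, k, seed=0):
    """Draw k parameter hypotheses from the prior (hypothetical helper)."""
    rng = random.Random(seed)
    return [prior_sampler(rng) for _ in range(k)]

def update_belief(belief, hypotheses, likelihood, observation):
    """Bayes update of a discrete belief over the sampled hypotheses."""
    unnormalized = [b * likelihood(theta, observation)
                    for b, theta in zip(belief, hypotheses)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero likelihood under every hypothesis")
    return [w / total for w in unnormalized]

def bernoulli_likelihood(theta, outcome):
    """Assumed toy model: theta is the success probability of one action."""
    return theta if outcome == 1 else 1.0 - theta

# Sample K hypotheses from an assumed uniform prior and start from uniform weights.
hypotheses = sample_hypotheses(lambda rng: rng.random(), k=20)
belief = [1.0 / len(hypotheses)] * len(hypotheses)

# Fold in a short, made-up sequence of observed outcomes.
for outcome in [1, 1, 0, 1]:
    belief = update_belief(belief, hypotheses, bernoulli_likelihood, outcome)

posterior_mean = sum(b * th for b, th in zip(belief, hypotheses))
```

In the full construction, each POMDP state would pair a task state with one of these hypotheses; the sketch tracks only the hypothesis component of the belief.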

    Invariant measures concentrated on countable structures

    Let L be a countable language. We say that a countable infinite L-structure M admits an invariant measure when there is a probability measure on the space of L-structures with the same underlying set as M that is invariant under permutations of that set, and that assigns measure one to the isomorphism class of M. We show that M admits an invariant measure if and only if it has trivial definable closure, i.e., the pointwise stabilizer in Aut(M) of an arbitrary finite tuple of M fixes no additional points. When M is a Fraïssé limit in a relational language, this amounts to requiring that the age of M have strong amalgamation. Our results give rise to new instances of structures that admit invariant measures and structures that do not. Comment: 46 pages, 2 figures. Small changes following referee suggestion

    On Minimum Violations Ranking in Paired Comparisons

    Ranking a set of objects from the most dominant one to the least, based on the results of paired comparisons, proves to be useful in many contexts. Using the rankings of teams or individual players in sports to seed tournaments is an example. The quality of a ranking is often evaluated by the number of violations it contains, that is, cases in which an object is ranked lower than another that it has dominated in a comparison. A minimum violations ranking (MVR) method, as its name suggests, searches specifically for rankings that have the minimum possible number of violations, which may or may not be zero. In this paper, we present a method based on statistical physics that overcomes conceptual and practical difficulties faced by earlier studies of the problem. Comment: 10 pages, 10 figures; typos corrected (v2).
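The definition of a violation given in this abstract can be made concrete with a short sketch. The code below counts violations for a given ranking and then finds minimum violations rankings by exhaustive search over all permutations. The exhaustive search is a stand-in chosen for clarity and is feasible only for a handful of objects; it is not the statistical-physics method the paper presents. The function names and the example results are hypothetical.

```python
from itertools import permutations

def count_violations(ranking, results):
    """Count comparisons whose winner is ranked below its loser.

    ranking: sequence of objects, most dominant first.
    results: iterable of (winner, loser) pairs from paired comparisons.
    """
    position = {obj: i for i, obj in enumerate(ranking)}
    return sum(1 for winner, loser in results
               if position[winner] > position[loser])

def brute_force_mvr(objects, results):
    """Return (minimum violation count, all rankings achieving it).

    Exhaustive search over n! permutations; illustration only.
    """
    best_count = None
    best_rankings = []
    for perm in permutations(objects):
        v = count_violations(perm, results)
        if best_count is None or v < best_count:
            best_count, best_rankings = v, [perm]
        elif v == best_count:
            best_rankings.append(perm)
    return best_count, best_rankings

# Made-up cyclic results: A beat B, B beat C, C beat A.
cyclic_results = [("A", "B"), ("B", "C"), ("C", "A")]
min_violations, mvr_rankings = brute_force_mvr(["A", "B", "C"], cyclic_results)
```

With the cyclic results above, no ranking is violation-free, which illustrates why the minimum number of violations may be nonzero and why several rankings can tie for it.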