376 research outputs found

    Bandit Algorithms for Tree Search

    Bandit-based methods for tree search have recently gained popularity when applied to huge trees, e.g. in the game of Go (Gelly et al., 2006). The UCT algorithm (Kocsis and Szepesvari, 2006), a tree search method based on Upper Confidence Bounds (UCB) (Auer et al., 2002), is believed to adapt locally to the effective smoothness of the tree. However, we show that UCT is too "optimistic" in some cases, leading to a regret O(exp(exp(D))) where D is the depth of the tree. We propose alternative bandit algorithms for tree search. First, a modification of UCT using a confidence sequence that scales exponentially with the horizon depth is proven to have a regret O(2^D \sqrt{n}), but does not adapt to possible smoothness in the tree. We then analyze Flat-UCB performed on the leaves and provide a finite regret bound with high probability. Then, we introduce a UCB-based Bandit Algorithm for Smooth Trees which takes into account the actual smoothness of the rewards to perform efficient "cuts" of sub-optimal branches with high confidence. Finally, we present an incremental tree search version which applies when the full tree is too big (possibly infinite) to be entirely represented, and show that, with high probability, essentially only the optimal branch is indefinitely developed. We illustrate these methods on a global optimization problem of a Lipschitz function, given noisy data.
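As a rough illustration of the Upper Confidence Bound idea that the abstract above builds on, here is a minimal sketch of a flat UCB1-style rule on a handful of Bernoulli arms. It is not the paper's tree algorithm: the function name `ucb1`, the Bernoulli reward model, and the exploration term `sqrt(2 ln t / n_i)` are assumptions chosen for illustration.

```python
import math
import random


def ucb1(arm_means, n_rounds, seed=0):
    """Run a UCB1-style rule on Bernoulli arms; return pull counts per arm.

    arm_means are hypothetical success probabilities, unknown to the learner.
    """
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k   # number of pulls of each arm
    sums = [0.0] * k   # cumulative reward of each arm

    for t in range(1, n_rounds + 1):
        if t <= k:
            # Pull every arm once so each empirical mean is defined.
            arm = t - 1
        else:
            # Pick the arm with the largest optimistic index:
            # empirical mean plus a confidence width that shrinks with pulls.
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


if __name__ == "__main__":
    print(ucb1([0.2, 0.5, 0.8], 5000))
```

Applying such a rule at the leaves of a tree corresponds loosely to the Flat-UCB setting mentioned in the abstract; UCT instead applies a rule of this kind at every internal node.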

    Sensitivity analysis in HMMs with application to likelihood maximization

    This paper considers a sensitivity analysis in Hidden Markov Models with continuous state and observation spaces. We propose an Infinitesimal Perturbation Analysis (IPA) on the filtering distribution with respect to some parameters of the model. We describe a methodology for using any algorithm that estimates the filtering density, such as Sequential Monte Carlo methods, to design an algorithm that estimates its gradient. The resulting IPA estimator is proven to be asymptotically unbiased and consistent, and has computational complexity linear in the number of particles. We consider an application of this analysis to the problem of identifying unknown parameters of the model given a sequence of observations. We derive an IPA estimator for the gradient of the log-likelihood, which may be used in a gradient method for the purpose of likelihood maximization. We illustrate the method with several numerical experiments.

    Matthias Eifler, Die Bibliothek des Erfurter Petersklosters im späten Mittelalter

    Based on a thesis defended in Jena in 2014, the book aims at a meticulous reconstruction and study of the library of the Benedictine monastery of Erfurt during a period of internal reform within the Benedictine order, when St. Peter's of Erfurt joined the Bursfelde Congregation, and also of broader change in forms of piety and the first stirrings of the Reformation; the study extends to 1525. The library of St. Peter's is a remarkable vantage point from which to test the thesis of Felix Heinzer (2008..

    Particle filter-based policy gradient for POMDPs

    Our setting is a Partially Observable Markov Decision Process with continuous state, observation and action spaces. Decisions are based on a Particle Filter for estimating the belief state given past observations. We consider a policy gradient approach for parameterized policy optimization. For that purpose, we investigate sensitivity analysis of the performance measure with respect to the parameters of the policy, focusing on Finite Difference (FD) techniques. We show that the naive FD is subject to variance explosion because of the non-smoothness of the resampling procedure. We propose a more sophisticated FD method which overcomes this problem and establish its consistency.

    Optimal Policies Search for Sensor Management : Application to the AESA Radar

    This report introduces a new approach to solving sensor management problems. Classically, sensor management problems are formalized as Partially Observable Markov Decision Processes (POMDPs). Our approach consists in deriving the optimal parameterized policy based on stochastic gradient estimation. Two different techniques, named Infinitesimal Perturbation Analysis (IPA) and Likelihood Ratio (LR), can be used to address such a problem. This report discusses how these methods can be used for gradient estimation in the context of sensor management. The effectiveness of this general framework is illustrated by the management of an Active Electronically Scanned Array (AESA) radar.
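To make the distinction between the two gradient estimators named in the abstract above more concrete, here is a minimal sketch on a toy problem, not the report's sensor management model. It assumes X ~ N(theta, 1) and the performance measure J(theta) = E[X^2], whose exact gradient is 2 * theta; the function name `gradient_estimates` and the whole setup are hypothetical.

```python
import random


def gradient_estimates(theta, n_samples, seed=0):
    """Estimate dJ/dtheta for J(theta) = E[X^2], X ~ N(theta, 1).

    Returns (ipa_estimate, lr_estimate). Toy setup, assumed for illustration.
    """
    rng = random.Random(seed)
    ipa_total = 0.0
    lr_total = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)
        x = theta + eps  # sample path written as a function of theta

        # IPA: differentiate the sample path, d(x^2)/dtheta = 2 * x.
        ipa_total += 2.0 * x

        # LR: weight the cost by the score d/dtheta log p(x; theta) = x - theta.
        lr_total += (x * x) * (x - theta)
    return ipa_total / n_samples, lr_total / n_samples


if __name__ == "__main__":
    ipa, lr = gradient_estimates(1.5, 200_000)
    print("IPA:", ipa, "LR:", lr, "exact:", 3.0)
```

In this toy case both estimators are unbiased, but the LR estimate typically has noticeably higher variance; IPA in turn requires the sample path to be differentiable in the parameter.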

    William O’Brien

    New entry devoted to William O’Brien (1881-1968), a major figure of the Irish labour movement, written as part of the project to update the online dictionary Le Maitron (Dictionnaire biographique du mouvement ouvrier international - Grande-Bretagne et Irlande). http://maitron-en-ligne.univ-paris1.fr/spip.php?article139953&id_mot=125

    The Great Community: Culture and Nationalism in Ireland

    This book sets out to reassess cultural nationalism in Ireland by studying, on the one hand, its two principal representatives, which according to the author are the Young Ireland movement of the 1840s and the writer William Butler Yeats (Parts I and II), and, on the other hand, its relationship with one of its main vehicles of propaganda, the press (Part III). The analysis in the first two parts has the merit of bringing to light many often little-known aspects of the i..

    The erythrocytic schizogony of two synchronized strains of Plasmodium berghei, NK65 and ANKA, in normocytes and reticulocytes

    By a modified Percoll-glucose centrifugation technique the rings and young trophozoites of two strains of Plasmodium berghei, NK65 and ANKA, were separated from the other erythrocytic stages and inoculated into mice. The subsequent infection was followed for ANKA in normal mice and for NK65 in normal mice and in mice with high-grade reticulocytosis induced by injections of phenylhydrazine. The duration of the erythrocytic schizogony of the NK65 strain was shown to be independent of the age of the host cell, and the hour of inoculation did not influence the cycle of the ANKA strain

    La prudence et l'amitié. Politique et imaginaire urbains au miroir de la correspondance erfortoise.

    Version submitted to the editors, before corrections. The final article appeared in I. Draelants and Chr. Balouzat-Loubet, La formule au Moyen Âge II / Formulas in Medieval Culture II, Turnhout, Brepols, 2015, pp. 35-60. The letters sent by the city of Erfurt to other city councils, at the end of the 15th century and the very beginning of the 16th century, used codified forms of address which showed off the ideal qualities that a city was meant to have and expressed the common friendship that existed between German cities. Forming a city’s essence, this friendship implied qualities such as wisdom and caution, qualities which ensured good governance. However, this urban world was heterogeneous: variations in the forms of address made explicit the organization of the network as a hierarchical system and positioned Erfurt within it. These forms of address distinguished several groups of partners to which Erfurt presented itself and which, in turn, recognized Erfurt as one of the major cities in the Holy Roman Empire. Language was a way of building an efficient political network, and of compensating for the distance between cities. The process of writing the letters, as well as the letters themselves, helped to promote Erfurt’s political position, a position constructed as much through written representations as through actual power.