41 research outputs found

    Evaluating Risk to People and Property for Aircraft Emergency Landing Planning

    Dependency-aware Attention Control for Unconstrained Face Recognition with Image Sets

    This paper targets the problem of image set-based face verification and identification. Unlike the traditional single-medium setting (an image or a video), we encounter a set of heterogeneous content containing orderless images and videos. The importance of each image is usually considered either equal or based on an independent quality assessment. How to model the relationships among orderless images within a set remains a challenge. We address this problem by formulating it as a Markov Decision Process (MDP) in the latent space. Specifically, we first present a dependency-aware attention control (DAC) network, which uses actor-critic reinforcement learning to make a sequential attention decision for each image embedding and so fully exploit the rich correlation cues among the unordered images. Moreover, we introduce a sample-efficient variant with off-policy experience replay to speed up the learning process. The pose-guided representation scheme can further boost performance at the extremes of pose variation. Comment: Fixed the unreadable code in CVF version. arXiv admin note: text overlap with arXiv:1707.00130 by other author

    Planning in hybrid relational MDPs

    We study planning in relational Markov decision processes involving discrete and continuous states and actions, and an unknown number of objects. Such hybrid relational domains have so far received little attention. While relational and hybrid approaches have each been studied separately, planning in domains that combine them is still challenging and often requires restrictive assumptions and approximations. We propose HYPE, a sample-based planner for hybrid relational domains that combines model-based approaches with state abstraction. HYPE samples episodes and uses the previous episodes as well as the model to approximate the Q-function. In addition, abstraction is performed for each sampled episode; this avoids the complexity of symbolic approaches for hybrid relational domains. In our empirical evaluations, we show that HYPE is a general and widely applicable planner in domains ranging from strictly discrete to strictly continuous to hybrid ones, and that it handles intricacies such as unknown objects and relational models. Moreover, the empirical results show that abstraction provides significant improvements.

    AMPLE: an anytime planning and execution framework for dynamic and uncertain problems in robotics

    Acting in robotics is driven by reactive and deliberative reasoning, which arise from the competition between execution and planning processes. Properly balancing reactivity and deliberation is still an open question for the harmonious execution of deliberative plans in complex robotic applications. We propose a flexible algorithmic framework that allows continuous real-time planning of complex tasks in parallel with their execution. Our framework, named AMPLE, is oriented towards modular robotic architectures in the sense that it turns planning algorithms into services that must be generic, reactive, and valuable. Services are optimized actions that are delivered at precise time points following requests from other modules; these requests include the states and dates at which actions are needed. To this end, our framework is divided into two concurrent processes: a planning thread, which receives planning requests and delegates action selection to embedded planning software in compliance with the queue of internal requests, and an execution thread, which orchestrates these planning requests as well as action execution and state monitoring. We show how the behavior of the execution thread can be parametrized to achieve various strategies. These strategies can differ, for instance, in how internal planning requests are distributed over possible future execution states in anticipation of the uncertain evolution of the system, or over different underlying planners to take several levels into account. We demonstrate the flexibility and relevance of our framework on various robotic benchmarks and real experiments that involve complex planning problems of different natures, which could not be properly tackled by existing dedicated planning approaches that rely on the standard plan-then-execute loop.

    Le dilemme entre exploration et exploitation dans l'apprentissage par renforcement : optimisation adaptative des modèles de décision multi-états.

    This thesis addresses the dilemma between exploration and exploitation as it is faced by reinforcement learning algorithms, i.e. the problem of choosing the action during the adaptive optimisation of multi-state decision models, and particularly of Markov decision processes. Reinforcement learning is characterised by the use of approximate solutions; this research aims to improve these solutions. To this end, we take inspiration from the work of other communities, such as decision theory and adaptive optimal control. Three groups of difficulties are stressed: the impossibility of reaching certainty about the unknown parameters before an infinite number of experiments, the inadequacy of reasoning at the local scale, and the sensitivity of the algorithms to the representation of the problem used. The contribution of this thesis is therefore the synthesis of the different approaches to the problem, the study of the limits of distributed architectures for reinforcement learning, the proposal of algorithms using back-propagation of uncertainty, and the results of numerical simulations.

    The artificial evolution of cooperation

    We propose a new approach to studying co-evolution and apply it to the well-known iterated prisoner's dilemma. The originality of our work is that it uses a simplified version of the game and thus restricts the search space of the evolutionary dynamics. This makes it possible to observe the entire search space at all times and so to gain a complete understanding of the co-evolution phenomena in progress. The paper includes a brief game-theoretic introduction to the iterated prisoner's dilemma, a survey of previous work on evolution in this game, and a presentation of the questions that remained open to us. We then describe our particular approach to the problem, which uses populations larger than the search space, or even infinite ones. The experimental results that we present add to the current knowledge of the iterated prisoner's dilemma.

    L'évolution artificielle de la coopération

    We propose a new approach to studying co-evolution and apply it to the well-known iterated prisoner's dilemma. The originality of our work is that it uses a simplified version of the game and thus restricts the search space of the evolutionary dynamics. This makes it possible to observe the entire search space at all times and so to gain a complete understanding of the co-evolution phenomena in progress. The paper includes a brief game-theoretic introduction to the iterated prisoner's dilemma, a survey of previous work on evolution in this game, and a presentation of the questions that remained open to us. We then describe our particular approach to the problem, which uses populations larger than the search space, or even infinite ones. The experimental results that we present add to the current knowledge of the iterated prisoner's dilemma.

    Emergence de la coopération dans un modèle darwinien

    This article addresses the emergence of organization and cooperation within a population of autonomous individuals. The authors present the synthetic theory of evolution, derived from Darwinism and modern genetics, and argue that its mechanisms help explain the evolution of organization that can be observed in living beings. On this basis, they propose a computational model of evolution that makes it possible to test the validity of these hypotheses.