
    Essays on Self-Referential Games

    This dissertation studies self-referential games in which agents can learn (perfectly or imperfectly) about an opponent's intentions from a private signal. In the first chapter, my main focus is on the interaction of two sources of information about opponents' play: direct observation of an opponent's code of conduct and indirect observation of the same opponent's play in a repeated setting. Using both sources of information, I prove a folk theorem for repeated self-referential games with private monitoring. In the second chapter, I investigate the impact of self-referentiality on bad reputation games, in which the long-run player must choose specific actions to make short-run players participate in the game. Since these particular actions could be interpreted as evidence of perverse behavior, the long-run agent attempts to separate himself from other types, and this results in efficiency losses. When players identify intentions perfectly, I show that the inefficiencies and reputational concerns due to a bad reputation disappear. In the case of imperfect observation, I find that self-referentiality and stochastic renewal of the long-run player together overcome the inefficiencies caused by bad reputation. In the third chapter, I address the timing of signals in self-referential games. These models typically suppose that intentions are divined in a pre-play phase; however, in many applications this may not be the case. For games with perfect information in which players observe signals in advance, I show that any subgame perfect equilibrium of an infinite-horizon game coincides with a Nash equilibrium of the self-referential finite-horizon approximation of the original game. I then focus on two specific classes of games. First, in finitely repeated games with discounting, I show that a version of the folk theorem holds regardless of the time at which signals are observed. Second, I examine exit games in which players can terminate the game at any stage. In contrast to repeated games, I find that the equilibrium outcome of the self-referential exit game is unique if signals arrive after the first stage, whereas a folk theorem results only if they arrive before the first stage. Finally, I explore asynchronous monitoring of intentions, where players may not receive signals simultaneously. With asynchronicity, a folk theorem continues to apply for repeated games; for exit games, however, there is a unique equilibrium outcome independent of signal timing, or indeed of having a signal at all.
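    As a toy illustration of the mechanism (invented for this summary, not taken from the dissertation; the payoffs, names and signal structure are arbitrary), the Python sketch below plays a one-shot prisoner's dilemma in which each player privately observes the opponent's code of conduct with probability p:

```python
import random

PAYOFFS = {  # (my action, their action) -> my payoff
    ("C", "C"): 2, ("C", "D"): 0,
    ("D", "C"): 3, ("D", "D"): 1,
}

def defector(signal):
    """Code of conduct: defect unconditionally (deviation benchmark)."""
    return "D"

def reciprocator(signal):
    """Code of conduct: cooperate only on evidence that the opponent
    follows this same reciprocating code; otherwise defect."""
    return "C" if signal is reciprocator else "D"

def play(code1, code2, p, rng):
    """One play: each side privately sees the other's code with prob. p."""
    s1 = code2 if rng.random() < p else None  # player 1's private signal
    s2 = code1 if rng.random() < p else None  # player 2's private signal
    a1, a2 = code1(s1), code2(s2)
    return PAYOFFS[(a1, a2)], PAYOFFS[(a2, a1)]

rng = random.Random(0)
for p in (0.0, 0.5, 1.0):
    u_rr = sum(play(reciprocator, reciprocator, p, rng)[0]
               for _ in range(10_000)) / 10_000
    u_dr = sum(play(defector, reciprocator, p, rng)[0]
               for _ in range(10_000)) / 10_000
    print(f"p={p}: reciprocator vs itself {u_rr:.2f}, "
          f"deviating defector {u_dr:.2f}")
```

    At p = 1 the reciprocating code earns 2 against itself while a deviation to unconditional defection earns only 1, so cooperation is self-enforcing; as p falls the advantage shrinks, which is the kind of trade-off the imperfect-observation results address.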

    How do life, economy and other complex systems escape the heat death?

    The primordial confrontation underlying the existence of our universe can be conceived as the battle between entropy and complexity. The law of ever-increasing entropy (Boltzmann's H-theorem) evokes an irreversible, one-directional evolution (or rather involution) going uniformly and monotonically from birth to death. Since the 19th century, this concept has been one of the cornerstones, and at the same time one of the puzzles, of statistical mechanics. On the other hand, there is the empirical experience in which one witnesses the emergence, growth and diversification of new self-organized objects of ever-increasing complexity. When such systems are modeled in terms of simple discrete elements, one finds that the emergence of collective complex adaptive objects is a rather generic phenomenon governed by a new type of law. These 'emergence' laws, not connected directly with the fundamental laws of physical reality, nor acting 'in addition' to them, but acting through them, were called 'More is Different' by Phil Anderson and 'das Maass' by Hegel. Even though the emergence laws act through the intermediary of the fundamental laws that govern the individual elementary agents, it turns out that different systems, apparently governed by very different fundamental laws (gravity, chemistry, biology, economics, social psychology), often end up with similar emergence laws and outcomes. In particular, the emergence of adaptive collective objects endows the system with a granular structure, which in turn causes specific macroscopic cycles of intermittent fluctuations.
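    As a hedged illustration of the 'simple discrete elements' viewpoint (an editor's sketch of a generalized Lotka-Volterra-type process, not the paper's model; all parameters are arbitrary), the snippet below shows how multiplicative random growth plus a weak coupling to the mean yields a granular, heavy-tailed population whose aggregate fluctuates intermittently:

```python
import random
import statistics

random.seed(0)
N, STEPS, COUPLING = 1000, 3000, 0.05   # arbitrary illustrative parameters
w = [1.0] * N                           # agent sizes (wealth, population, ...)

totals = []
for _ in range(STEPS):
    m = statistics.fmean(w)
    # multiplicative random growth plus a weak redistribution toward the mean
    w = [random.lognormvariate(0.0, 0.3) * x * (1 - COUPLING) + COUPLING * m
         for x in w]
    totals.append(sum(w))

w_sorted = sorted(w, reverse=True)
top_share = sum(w_sorted[:10]) / sum(w)  # granularity: share of the top 10
rel_moves = [abs(totals[i] / totals[i - 1] - 1) for i in range(1, STEPS)]
print(f"top 10 of {N} agents hold {top_share:.0%} of the total")
print(f"median one-step move of the aggregate: {statistics.median(rel_moves):.2%}")
print(f"largest one-step move of the aggregate: {max(rel_moves):.2%}")
```

    Because a handful of large agents carries most of the total, the aggregate does not self-average: its fluctuations remain macroscopic, echoing the 'granular structure' and 'intermittent fluctuations' described above.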

    Artificial general intelligence: Proceedings of the Second Conference on Artificial General Intelligence, AGI 2009, Arlington, Virginia, USA, March 6-9, 2009

    Artificial General Intelligence (AGI) research focuses on the original and ultimate goal of AI – to create broad human-like and transhuman intelligence – by exploring all available paths, including theoretical and experimental computer science, cognitive science, neuroscience, and innovative interdisciplinary methodologies. Due to the difficulty of this task, for the last few decades the majority of AI researchers have focused on what has been called narrow AI – the production of AI systems displaying intelligence regarding specific, highly constrained tasks. In recent years, however, more and more researchers have recognized the necessity – and feasibility – of returning to the original goals of the field. Increasingly, there is a call for a transition back to confronting the more difficult issues of human-level intelligence and, more broadly, artificial general intelligence.

    Using reinforcement learning for optimizing the reproduction of tasks in robot programming by demonstration

    As robots start pervading human environments, the need for new interfaces that would simplify human-robot interaction has become more pressing. Robot Programming by Demonstration (RbD) develops intuitive ways of programming robots, taking inspiration from the strategies used by humans to transmit knowledge to apprentices. The user-friendliness of RbD is meant to allow lay users with no prior knowledge of computer science, electronics or mechanics to train robots to accomplish tasks the same way they would train a co-worker. When a trainer teaches a task to a robot, he/she shows a particular way of fulfilling the task. For a robot to be able to learn from observing the trainer, it must be able to learn what the task entails (i.e. answer the so-called "What-to-imitate?" question) by inferring the user's intentions. But most importantly, the robot must be able to adapt its own controller to best fit the demonstration (the so-called "How-to-imitate?" question) despite different setups and embodiments. The latter question is the one that interested us in this thesis; it relates to the problem of optimizing the reproduction of the task under environmental constraints. The "How-to-imitate?" question subdivides into two problems. The first, also known as the "correspondence problem", relates to resolving the discrepancies between the human demonstrator's body and the robot's body that prevent the robot from producing an identical reproduction of the task. Even though we simplified the problem by considering solely humanoid platforms, that is, platforms with a joint configuration similar to that of a human, discrepancies in the number of degrees of freedom and in the range of motion remained. We resolved these by exploiting the redundant information conveyed by the demonstrations, collecting data in several different frames of reference. By exploiting these redundancies in an algorithm comparable to the damped least-squares algorithm, we are able to reproduce a trajectory that minimizes the error between the desired trajectory and the reproduced trajectory in each frame of reference. The second problem consists in reproducing a trajectory in an unknown setup while respecting the task constraints learned during training. When the information learned from the demonstration no longer suffices to generalize the task constraints to a new setup, the robot must re-learn the task, this time through trial and error. Here we considered trial-and-error learning as a complement to RbD. By adding a trial-and-error module to the original imitation learning algorithm, the robot can find a solution that is better adapted to the context and to its embodiment than the solution found using RbD alone. Specifically, we compared Reinforcement Learning (RL) to other classical optimization techniques. We show that the system is advantageous in that: a) learning is more robust to unexpected events that have not been encountered during the demonstrations, and b) the robot is able to optimize its own model of the task according to its own embodiment.
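    To make the frame-blending idea concrete, here is a minimal sketch (an editor's simplification, not the thesis implementation; the function name, weights and damping value are illustrative assumptions) of reproducing a point by damped weighted least squares across frames of reference:

```python
import numpy as np

def blend_frames(proposals, weights, x_prev, damping=0.1):
    """Solve min_x sum_k (x - x_k)^T W_k (x - x_k) + damping*||x - x_prev||^2.

    Each frame k proposes a desired point x_k with a confidence matrix W_k
    (e.g. the inverse covariance observed across demonstrations); the damping
    term keeps the solution near the previous point, in the spirit of the
    damped least-squares approach mentioned in the abstract.
    """
    dim = x_prev.shape[0]
    A = damping * np.eye(dim)
    b = damping * x_prev
    for x_k, W_k in zip(proposals, weights):
        A += W_k
        b += W_k @ x_k
    return np.linalg.solve(A, b)

# Two frames disagree about the next 2-D point: frame 1 is confident along x,
# frame 2 along y, so the reproduction follows each frame where it is reliable.
x_prev = np.array([0.0, 0.0])
proposals = [np.array([1.0, 0.2]), np.array([0.4, 1.0])]
weights = [np.diag([10.0, 0.1]), np.diag([0.1, 10.0])]
print(blend_frames(proposals, weights, x_prev))  # close to [1.0, 1.0]
```

    Each frame dominates the solution along the directions where its demonstrations were consistent (large weight) and is effectively ignored where they were noisy, which is the intuition behind resolving embodiment discrepancies through redundant frames of reference.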

    Proceedings of the 1st Doctoral Consortium at the European Conference on Artificial Intelligence (DC-ECAI 2020)

    1st Doctoral Consortium at the European Conference on Artificial Intelligence (DC-ECAI 2020), 29-30 August 2020, Santiago de Compostela, Spain. The DC-ECAI 2020 provides a unique opportunity for PhD students who are close to finishing their doctoral research to interact with experienced researchers in the field. Senior members of the community are assigned as mentors to each group of students, based on the students' research topics or the similarity of their research interests. The DC-ECAI 2020, held virtually this year, allows students from all over the world to present and discuss their ongoing research and career plans with their mentors, to network with other participants, and to receive training and mentoring about career planning and career options.

    ISIPTA'07: Proceedings of the Fifth International Symposium on Imprecise Probability: Theories and Applications
