
    Automated Game Design Learning

    While general game playing is an active field of research, the learning of game design has tended to be either a secondary goal of such research or solely the domain of humans. We propose a field of research, Automated Game Design Learning (AGDL), with the purpose of learning game designs directly through interaction with games in the mode in which most people experience them: via play. We detail existing work that touches the edges of this field, describe current successful projects in AGDL and the theoretical foundations that enable them, point to promising applications enabled by AGDL, and discuss next steps for this exciting area of study. The key moves of AGDL are to use game programs as the ultimate source of truth about their own design, and to make these design properties available to other systems and avenues of inquiry. Comment: 8 pages, 2 figures. Accepted for CIG 201

    CLIC: Curriculum Learning and Imitation for object Control in non-rewarding environments

    In this paper we study a new reinforcement learning setting where the environment is non-rewarding, contains several possibly related objects of various controllability, and where an apt agent Bob acts independently, with non-observable intentions. We argue that this setting defines a realistic scenario and we present a generic discrete-state discrete-action model of such environments. To learn in this environment, we propose an unsupervised reinforcement learning agent called CLIC, for Curriculum Learning and Imitation for Control. CLIC learns to control individual objects in its environment, and imitates Bob's interactions with these objects. It selects objects to focus on when training and imitating by maximizing its learning progress. We show that CLIC is an effective baseline in our new setting. It can effectively observe Bob to gain control of objects faster, even if Bob is not explicitly teaching. It can also follow Bob when he acts as a mentor and provides ordered demonstrations. Finally, when Bob controls objects that the agent cannot, or in the presence of a hierarchy between objects in the environment, we show that CLIC ignores non-reproducible and already mastered interactions with objects, resulting in a greater benefit from imitation.
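
    The idea of choosing which object to practice by maximizing learning progress can be illustrated with a small sketch. It is not CLIC's actual procedure: defining progress as the difference between two windowed competence averages, the epsilon exploration step, and all names (`LearningProgressSelector`, `window`, `epsilon`) are assumptions made for illustration.

```python
import random
from collections import deque


class LearningProgressSelector:
    """Hypothetical sketch: focus on the object whose competence is changing most."""

    def __init__(self, objects, window=10, epsilon=0.1, seed=0):
        self.window = window
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # Keep only the last 2 * window competence scores for each object.
        self.history = {obj: deque(maxlen=2 * window) for obj in objects}

    def record(self, obj, competence):
        """Store one competence measurement (e.g. a success rate in [0, 1])."""
        self.history[obj].append(float(competence))

    def progress(self, obj):
        """Absolute change between the older and the more recent window average."""
        scores = list(self.history[obj])
        if len(scores) < 2 * self.window:
            return 0.0  # not enough data yet (assumption: treat as no progress)
        older = sum(scores[:self.window]) / self.window
        recent = sum(scores[self.window:]) / self.window
        return abs(recent - older)

    def select(self):
        """Mostly pick the object with the highest progress, sometimes explore."""
        objects = list(self.history)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(objects)
        return max(objects, key=self.progress)
```

    Under this definition, an object that is already mastered and an object that cannot be controlled at all both show flat competence, so both score zero progress and are passed over in favor of objects that are still being learned.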

    Preference-Based Monte Carlo Tree Search

    Monte Carlo tree search (MCTS) is a popular choice for solving sequential anytime problems. However, it depends on a numeric feedback signal, which can be difficult to define. Real-time MCTS is a variant which may only rarely encounter states with an explicit, extrinsic reward. To deal with such cases, the experimenter has to supply an additional numeric feedback signal in the form of a heuristic, which intrinsically guides the agent. Recent work has shown evidence that in different areas the underlying structure is ordinal and not numerical. Hence erroneous and biased heuristics are inevitable, especially in such domains. In this paper, we propose an MCTS variant which only depends on qualitative feedback, and therefore opens up new applications for MCTS. We also find indications that translating absolute into ordinal feedback may be beneficial. Using a puzzle domain, we show that our preference-based MCTS variant, which only receives qualitative feedback, is able to reach a performance level comparable to a regular MCTS baseline, which obtains quantitative feedback. Comment: To be published
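
    To make "translating absolute into ordinal feedback" concrete, the sketch below reduces numeric rollout results to pairwise preferences and scores each action by the fraction of comparisons it wins. This is only an illustration, not the paper's algorithm: the Borda-style win fraction and the names `to_preference` and `borda_scores` are assumptions.

```python
from itertools import combinations


def to_preference(value_a, value_b):
    """Reduce two numeric outcomes to an ordinal comparison: 1, -1, or 0 for a tie."""
    return (value_a > value_b) - (value_a < value_b)


def borda_scores(outcomes):
    """Score each action by the fraction of pairwise comparisons it wins.

    `outcomes` maps an action name to one rollout result. Only the order of
    the results is used, never their magnitude.
    """
    wins = {action: 0.0 for action in outcomes}
    for a, b in combinations(outcomes, 2):
        pref = to_preference(outcomes[a], outcomes[b])
        if pref > 0:
            wins[a] += 1.0
        elif pref < 0:
            wins[b] += 1.0
        else:  # a tie counts as half a win for each side
            wins[a] += 0.5
            wins[b] += 0.5
    comparisons = max(len(outcomes) - 1, 1)
    return {action: w / comparisons for action, w in wins.items()}
```

    Because only the ordering matters, rescaling a heuristic (for example, replacing a result of 1000.0 with 3.5 while keeping it the largest) leaves the scores unchanged, which is one way to see why ordinal feedback is insensitive to badly calibrated numeric heuristics.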

    On the Evolutionary Emergence of Optimism

    Successful individuals were frequently found to be overly optimistic. This is puzzling because it might be thought that optimistic individuals who consistently overestimate their eventual payoffs will not do as well as realists who see the situation as it truly is and hence will not survive evolutionary pressures. We show that contrary to this intuition, there is a large class of either competitive or cooperative strategic interactions between randomly matched pairs of individuals in the population, in which "cautiously" optimistic individuals not only survive but also prosper and take over the entire population. The reason for this result is that optimistic individuals who overestimate the impact of their actions on their payoffs behave more aggressively than realists and pessimists. When the interactions between individuals involve negative externalities (the payoff of one player decreases with the actions taken by another player) and the actions are strategic substitutes, being aggressive induces the opponent to be softer, so optimists gain a strategic advantage that, for moderate levels of optimism, outweighs the loss from having the wrong perception of the environment. Likewise, when the interactions between individuals involve positive externalities and the actions are strategic complements, being aggressive triggers a favorable aggressive behavior from the opponent. Hence, in both cases, cautiously optimistic types fare better on average than other types of individuals. We show that if the initial distribution of types is sufficiently wide, then over time it will converge in distribution to a mass point on some level of cautious optimism.

    Synthetic steganography: Methods for generating and detecting covert channels in generated media

    Issues of privacy in communication are becoming increasingly important. For many people and businesses, the use of strong cryptographic protocols is sufficient to protect their communications. However, the overt use of strong cryptography may be prohibited or individual entities may be prohibited from communicating directly. In these cases, a secure alternative to the overt use of strong cryptography is required. One promising alternative is to hide the use of cryptography by transforming ciphertext into innocuous-seeming messages to be transmitted in the clear. In this dissertation, we consider the problem of synthetic steganography: generating and detecting covert channels in generated media. We start by demonstrating how to generate synthetic time series data that not only mimic an authentic source of the data, but also hide data at any of several different locations in the reversible generation process. We then design a steganographic context-sensitive tiling system capable of hiding secret data in a variety of procedurally-generated multimedia objects. Next, we show how to securely hide data in the structure of a Huffman tree without affecting the length of the codes. Next, we present a method for hiding data in Sudoku puzzles, both in the solved board and the clue configuration. Finally, we present a general framework for exploiting steganographic capacity in structured interactions like online multiplayer games, network protocols, auctions, and negotiations. Recognizing that structured interactions represent a vast field of novel media for steganography, we also design and implement an open-source extensible software testbed for analyzing steganographic interactions and use it to measure the steganographic capacity of several classic games. We analyze the steganographic capacity and security of each method that we present and show that existing steganalysis techniques cannot accurately detect the usage of the covert channels. We develop targeted steganalysis techniques which improve detection accuracy and then use the insights gained from those methods to improve the security of the steganographic systems. We find that secure synthetic steganography, and accurate steganalysis thereof, depends on having access to an accurate model of the cover media.
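
    One simple way to see how a Huffman tree's structure can carry data without changing code lengths is to let each internal node encode one bit through the order of its two children, since swapping siblings changes codewords but not their depths. The sketch below illustrates that idea only. It is not the dissertation's scheme and makes no security claim; the canonical ordering by smallest symbol, the restriction to string symbols, and the names `build_tree`, `embed`, and `extract` are all assumptions.

```python
import heapq
from itertools import count


def build_tree(freqs):
    """Standard Huffman construction; leaves are str symbols, internal nodes are (left, right)."""
    tiebreak = count()  # unique counter so heap entries never compare nodes
    heap = [(f, next(tiebreak), sym) for sym, f in sorted(freqs.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (a, b)))
    return heap[0][2]


def min_symbol(node):
    """Smallest symbol in a subtree; unchanged by any sibling swap."""
    return node if isinstance(node, str) else min(min_symbol(node[0]), min_symbol(node[1]))


def embed(node, bits):
    """Return a copy of the tree whose child order encodes bits from the iterator (preorder)."""
    if isinstance(node, str):
        return node
    first, second = sorted(node, key=min_symbol)  # canonical order = bit 0
    bit = next(bits, 0)  # pad with zeros once the payload runs out
    first, second = embed(first, bits), embed(second, bits)
    return (second, first) if bit else (first, second)


def extract(node):
    """Read back one bit per internal node, visiting nodes in the same order as embed."""
    if isinstance(node, str):
        return []
    left, right = node
    swapped = min_symbol(left) > min_symbol(right)
    first, second = (right, left) if swapped else (left, right)
    return [int(swapped)] + extract(first) + extract(second)


def code_lengths(node, depth=0):
    """Map each symbol to its codeword length (its depth in the tree)."""
    if isinstance(node, str):
        return {node: depth}
    lengths = code_lengths(node[0], depth + 1)
    lengths.update(code_lengths(node[1], depth + 1))
    return lengths
```

    A tree over n symbols has n - 1 internal nodes, so this toy channel carries at most n - 1 bits per tree. For example, `embed(build_tree(freqs), iter(payload))` yields a tree from which `extract` recovers the payload while `code_lengths` stays the same as for the original tree.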