
    Symbolic Search in Planning and General Game Playing

    Search is an important topic in many areas of AI, and search problems often involve an immense number of states. This work addresses that challenge with a special data structure, the binary decision diagram (BDD), which can represent large sets of states compactly, often saving space compared to explicit representations. The first part analyzes the complexity of BDDs for several search problems, resulting in lower or upper bounds on BDD sizes for them. The second part is concerned with action planning, an area where the programmer does not know in advance what the search problem will look like. It presents symbolic algorithms for finding optimal solutions in two different settings, classical and net-benefit planning, as well as several improvements to these algorithms. The resulting planner won the International Planning Competition (IPC) 2008. The third part is concerned with general game playing, which is similar to planning in that the programmer does not know in advance which game will be played. This work proposes algorithms for instantiating the input and solving games symbolically. For playing, a hybrid player based on UCT and the solver is presented.
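    To make the core idea concrete, below is a minimal sketch of the symbolic breadth-first search loop that BDD-based planners are built around. For readability, Python frozensets stand in for BDDs, and the toy transition relation is an assumption for illustration; in a real planner each set would be a BDD so that huge state sets are stored and manipulated in compressed form.

```python
# Minimal sketch of symbolic breadth-first search. Frozensets stand in
# for BDDs here; with BDDs, the set operations below become Boolean
# operations on compressed representations of the state sets.

def image(states, transitions):
    """Successor set of `states` under a relation {state: {successors}}."""
    return frozenset(s2 for s in states for s2 in transitions.get(s, ()))

def symbolic_bfs(init, goal, transitions):
    """Return the number of steps needed to reach `goal`, or None."""
    reached = frozenset(init)
    frontier = frozenset(init)
    depth = 0
    while frontier:
        if frontier & goal:           # with BDDs: a conjunction test
            return depth
        frontier = image(frontier, transitions) - reached
        reached = reached | frontier  # with BDDs: a disjunction
        depth += 1
    return None

# Tiny example: four states on a line, 0 -> 1 -> 2 -> 3.
T = {0: {1}, 1: {2}, 2: {3}}
print(symbolic_bfs({0}, frozenset({3}), T))  # -> 3
```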

    Quantum-enhanced reinforcement learning

    Master's dissertation in Engenharia Física (Physics Engineering). The field of Artificial Intelligence has lately witnessed extraordinary results. The ability to design a system capable of beating the world champion of Go, an ancient Chinese game regarded as a holy grail of AI, caused a spark worldwide, making people believe that something revolutionary is about to happen. A different flavor of learning, called Reinforcement Learning, is at the core of this revolution. In parallel, we are witnessing the emergence of a new field, Quantum Machine Learning, which has already shown promising results in supervised and unsupervised learning. In this dissertation, we explore the interplay between Quantum Computing and Reinforcement Learning. Learning by interaction was made possible in the quantum setting using the concept of oraculization of task environments suggested by Dunjko in 2015. We extended the oracular instances previously suggested to work in more general stochastic environments. On top of this quantum agent-environment paradigm, we developed a novel quantum algorithm for near-optimal decision-making based on the Reinforcement Learning paradigm known as Sparse Sampling, obtaining a quantum speedup over the classical counterpart. The result is a quantum algorithm whose complexity is independent of the number of states of the environment; this independence makes it suitable for large state spaces where planning would otherwise be inapplicable. The most important open questions are whether the oracular instances of task environments can be improved to handle even more general environments, in particular the ability to represent negative rewards as a natural mechanism for negative feedback instead of a normalization of the reward, and whether the algorithm can be extended to perform an informed tree-based search instead of the uninformed search proposed. Improvements on this result would allow comparison between the algorithm and more recent classical Reinforcement Learning algorithms.
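    For context, here is a compact sketch of the classical Sparse Sampling algorithm (Kearns, Mansour, and Ng, 1999) that the quantum method builds on. Its running time depends on the per-node sample width and the horizon but not on the number of states, which is the property the quantum version inherits. The generative model `sample` and the parameter values are illustrative assumptions.

```python
import random

# Sketch of classical Sparse Sampling: estimate action values by drawing
# a fixed number of samples per action at each node of a depth-limited
# lookahead tree, independently of how many states the environment has.

GAMMA = 0.9   # discount factor (illustrative)
C = 8         # samples per action at each node (illustrative)

def q_estimate(sample, actions, state, action, horizon):
    """Monte Carlo estimate of Q(state, action) to the given horizon."""
    total = 0.0
    for _ in range(C):
        reward, nxt = sample(state, action)  # assumed generative model
        total += reward + GAMMA * value_estimate(sample, actions, nxt, horizon - 1)
    return total / C

def value_estimate(sample, actions, state, horizon):
    if horizon == 0:
        return 0.0
    return max(q_estimate(sample, actions, state, a, horizon) for a in actions)

def best_action(sample, actions, state, horizon=3):
    return max(actions, key=lambda a: q_estimate(sample, actions, state, a, horizon))

# Toy environment: action 'b' is better in expectation than 'a'.
def toy_sample(state, action):
    return (1.0 if action == 'b' else random.random(), state)

print(best_action(toy_sample, ['a', 'b'], state=0))  # usually 'b'
```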

    Intelligent Agents for Active Malware Analysis

    The main contribution of this thesis is a novel perspective on Active Malware Analysis, modeled as a decision-making process between intelligent agents. We propose solutions aimed at extracting the behaviors of malware agents with advanced Artificial Intelligence techniques. In particular, we devise novel action selection strategies that allow the analyzer agents to analyze malware by selecting sequences of triggering actions aimed at maximizing the information acquired. The goal is to create informative models representing the behaviors of the malware agents observed while interacting with them during the analysis process. Such models can then be used to effectively compare a malware sample against others and to correctly identify the malware family.
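    One plausible reading of an information-maximizing triggering strategy is sketched below: the analyzer keeps a belief over candidate malware families and greedily picks the action whose expected observation reduces entropy most. The belief representation and the likelihood table are assumptions for illustration, not the thesis's exact method.

```python
import math

def entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def posterior(belief, likelihood, action, obs):
    """Bayes update of the family belief after observing `obs` for `action`."""
    post = {f: belief[f] * likelihood[f][action].get(obs, 1e-9) for f in belief}
    z = sum(post.values())
    return {f: p / z for f, p in post.items()}

def expected_entropy(belief, likelihood, action, observations):
    """Belief entropy after `action`, averaged over predicted observations."""
    exp_h = 0.0
    for obs in observations:
        p_obs = sum(belief[f] * likelihood[f][action].get(obs, 0.0) for f in belief)
        if p_obs > 0:
            exp_h += p_obs * entropy(posterior(belief, likelihood, action, obs))
    return exp_h

def select_action(belief, likelihood, actions, observations):
    """Pick the triggering action with the largest expected information gain."""
    return min(actions, key=lambda a: expected_entropy(belief, likelihood, a, observations))

# Two candidate families, two triggering actions; only 't2' separates them.
belief = {'famA': 0.5, 'famB': 0.5}
like = {'famA': {'t1': {'x': 0.5, 'y': 0.5}, 't2': {'x': 0.9, 'y': 0.1}},
        'famB': {'t1': {'x': 0.5, 'y': 0.5}, 't2': {'x': 0.1, 'y': 0.9}}}
print(select_action(belief, like, ['t1', 't2'], ['x', 'y']))  # -> 't2'
```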

    Approximate inference in graphical models

    Probability theory provides a mathematically rigorous yet conceptually flexible calculus of uncertainty, allowing the construction of complex hierarchical models for real-world inference tasks. Unfortunately, exact inference in probabilistic models is often computationally expensive or even intractable. A close inspection of such situations often reveals that the computational bottlenecks are confined to certain aspects of the model, which can be circumvented by approximations without sacrificing the model's interesting aspects. The conceptual framework of graphical models provides an elegant means of representing probabilistic models and deriving both exact and approximate inference algorithms in terms of local computations. This makes graphical models an ideal aid in the development of generalizable approximations. This thesis contains a brief introduction to approximate inference in graphical models (Chapter 2), followed by three extensive case studies in which approximate inference algorithms are developed for challenging applied inference problems. Chapter 3 derives the first probabilistic game tree search algorithm. Chapter 4 provides a novel expressive model for inference in psychometric questionnaires. Chapter 5 develops a model for the topics of large corpora of text documents, conditional on document metadata, with a focus on computational speed. In each case, graphical models help in two important ways: they first provide important structural insight into the problem, and then suggest practical approximations to the exact probabilistic solution. This work was supported by a scholarship from Microsoft Research, Ltd.
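    As a minimal illustration of inference by local computations, the sketch below runs sum-product message passing on a chain-structured model, eliminating one variable at a time so every intermediate table stays small. The binary factors are illustrative assumptions.

```python
# Chain model: p(x1, x2, x3) proportional to f12(x1, x2) * f23(x2, x3).
# Summing out variables locally (sum-product) avoids ever building the
# full joint table, which is the structure exploitation described above.

VALS = (0, 1)

def f12(x1, x2):  # pairwise potential between x1 and x2 (illustrative)
    return 0.9 if x1 == x2 else 0.1

def f23(x2, x3):  # pairwise potential between x2 and x3 (illustrative)
    return 0.8 if x2 == x3 else 0.2

# Message from x3 to x2: sum out x3 locally.
m3 = {x2: sum(f23(x2, x3) for x3 in VALS) for x2 in VALS}
# Message from x2 to x1: sum out x2, reusing m3.
m2 = {x1: sum(f12(x1, x2) * m3[x2] for x2 in VALS) for x1 in VALS}

# Unnormalized marginal of x1, then normalize.
z = sum(m2.values())
marginal_x1 = {x1: m2[x1] / z for x1 in VALS}
print(marginal_x1)  # uniform here by symmetry: {0: 0.5, 1: 0.5}
```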

    Automated Machine Learning for Multi-Label Classification


    Hybrid optimizer for expeditious modeling of virtual urban environments

    Master's thesis in Engenharia Informática (Informatics Engineering). Faculdade de Engenharia, Universidade do Porto. 200

    Low-resource learning in complex games

    This project is concerned with learning to take decisions in complex domains, games in particular. Previous work assumes that massive data resources are available for training, but aside from a few very popular games this is generally not the case, and the state of the art in such circumstances is to rely extensively on hand-crafted heuristics. Human players, on the other hand, are able to learn quickly from only a handful of examples, exploiting specific characteristics of the learning problem to accelerate their learning. Designing algorithms that function in a similar way is an open area of research with many applications in today's complex decision problems. One solution presented in this work is to design learning algorithms that exploit the inherent structure of the game. Specifically, we take into account how the action space can be clustered into sets called types and exploit this characteristic to improve planning at decision time (see the sketch below). Action types can also be leveraged to extract high-level strategies from a sparse corpus of human play, which generates more realistic trajectories during planning and further improves performance. Another approach that proved successful is using an accurate model of the environment to reduce the complexity of the learning problem. Similar to how human players have an internal model of the world that allows them to focus on the relevant parts of the problem, we decouple learning to win from learning the rules of the game, thereby making supervised learning more data efficient. Finally, in order to handle the partial observability usually encountered in complex games, we propose an extension to Monte Carlo Tree Search that plans in the Belief Markov Decision Process. We found that this algorithm does not outperform the state-of-the-art models on our chosen domain; our error analysis indicates that the method struggles with the high uncertainty of the conditions required for the game to end, and that our relaxed belief model can make rollouts in the belief space inaccurate, especially in complex games. We assess the proposed methods in an agent playing the highly complex board game Settlers of Catan. Building on previous research, our strongest agent combines planning at decision time with prior knowledge extracted from an available corpus of general human play; but unlike this prior work, our human corpus consists of only 60 games, as opposed to many thousands. Our agent defeats the current state-of-the-art agent by a large margin, showing that the proposed modifications help exploit general human play in highly complex games.
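    The sketch below shows one plausible reading of type-aware action selection at a search node: UCB1 is applied first over action types and then over the actions within the chosen type, so statistics are shared across similar actions. The node bookkeeping is an illustrative assumption, not the agent's exact implementation.

```python
import math

def ucb1(value_sum, visits, parent_visits, c=1.4):
    """Standard UCB1 score; untried arms get priority."""
    if visits == 0:
        return float('inf')
    return value_sum / visits + c * math.sqrt(math.log(parent_visits) / visits)

def select_action(node):
    """Two-level selection: pick a type by UCB1, then an action within it.

    Assumes node.types is {type: [action, ...]} and node.stats maps each
    type or action to [value_sum, visits].
    """
    n = sum(node.stats[t][1] for t in node.types)   # total visits at this node
    best_type = max(node.types, key=lambda t: ucb1(*node.stats[t], max(n, 1)))
    n_type = node.stats[best_type][1]               # visits of the chosen type
    return max(node.types[best_type],
               key=lambda a: ucb1(*node.stats[a], max(n_type, 1)))
```

    Grouping arms by type this way lets a rarely tried but promising category of moves surface quickly even when each individual action has few visits, which is useful precisely in the low-data regime the abstract targets.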

    Optimizing quantum circuit layouts

    One of the challenges in quantum computing is optimizing quantum circuit compilation. The compilation process involves two main stages: synthesizing the circuit to be executed in terms of the quantum gates supported by the processor, and adapting the circuit to the connectivity limitations imposed by the processor. In this work, I have addressed the second of these problems, known as Quantum Circuit Layout (QCL). To tackle it, I have used Reinforcement Learning (RL) techniques, which require modeling the problem as a Markov Decision Process (MDP). Specifically, I describe two finite MDPs whose solutions provide a solution to part of the QCL problem. The main challenge is to design a method that effectively solves these MDPs, even approximately. The thesis discusses two approaches to the problem. The first uses a variant of the algorithm used in AlphaZero, originally designed to train a machine to play Chess, Shogi, and Go. The second uses a more standard approach known as Deep Q-Learning (DQL).
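    The toy sketch below illustrates how QCL can be cast as an MDP: a state is the mapping of logical to physical qubits, an action is a SWAP on a coupled edge, and the reward penalizes extra SWAPs while rewarding a layout that makes the target gate executable. Tabular Q-learning stands in here for the thesis's Deep Q-Learning; the 4-qubit line coupling map and the reward values are assumptions.

```python
import random
from collections import defaultdict

COUPLING = [(0, 1), (1, 2), (2, 3)]  # physical edges of a 4-qubit line (assumed)
GATE = (0, 3)                        # logical 2-qubit gate to make adjacent

def step(state, edge):
    """Apply a SWAP on a physical edge; reward 10 once GATE becomes adjacent."""
    mapping = list(state)            # mapping[p] = logical qubit at physical slot p
    i, j = edge
    mapping[i], mapping[j] = mapping[j], mapping[i]
    nxt = tuple(mapping)
    pos = {q: p for p, q in enumerate(nxt)}
    done = tuple(sorted((pos[GATE[0]], pos[GATE[1]]))) in COUPLING
    return nxt, (10.0 if done else -1.0), done

Q = defaultdict(float)
alpha, gamma, eps = 0.5, 0.95, 0.3
start = (0, 1, 2, 3)                 # GATE qubits sit at opposite ends
for _ in range(3000):                # epsilon-greedy Q-learning episodes
    state = start
    for _ in range(6):
        if random.random() < eps:
            edge = random.choice(COUPLING)
        else:
            edge = max(COUPLING, key=lambda e: Q[state, e])
        nxt, r, done = step(state, edge)
        target = r if done else r + gamma * max(Q[nxt, e] for e in COUPLING)
        Q[state, edge] += alpha * (target - Q[state, edge])
        state = nxt
        if done:
            break

# Learned first SWAP: an end swap, not the wasted middle swap (1, 2),
# which moves neither of the gate's qubits.
print(max(COUPLING, key=lambda e: Q[start, e]))
```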