Symbolic Search in Planning and General Game Playing
Search is an important topic in many areas of AI. Search problems often result in an immense number of states. This work addresses this by using a special data structure, BDDs, which can represent large sets of states efficiently, often saving space compared to explicit representations. The first part is concerned with an analysis of the complexity of BDDs for some search problems, resulting in lower or upper bounds on BDD sizes for them. The second part is concerned with action planning, an area where the programmer does not know in advance what the search problem will look like. This part presents symbolic algorithms for finding optimal solutions in two different settings, classical and net-benefit planning, as well as several improvements to these algorithms. The resulting planner won the International Planning Competition IPC 2008. The third part is concerned with general game playing, which is similar to planning in that the programmer does not know in advance what game will be played. This work proposes algorithms for instantiating the input and solving games symbolically. For playing, a hybrid player based on UCT and the solver is presented.
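The set-of-states idea behind BDDs can be sketched in a few lines: a reduced, ordered BDD with a hash-consing table, where the union of two state sets is the Boolean OR of their BDDs. This is a generic illustration of the data structure, not the thesis's implementation; all names are invented.

```python
# Minimal sketch of a reduced, ordered BDD (ROBDD) over Boolean state
# variables, showing how sets of states are stored compactly via node
# sharing. Illustrative only.

class BDD:
    def __init__(self, num_vars):
        self.num_vars = num_vars
        self.unique = {}           # hash-consing table: (var, lo, hi) -> node id
        self.nodes = [None, None]  # ids 0 and 1 are the false/true terminals

    def mk(self, var, lo, hi):
        if lo == hi:               # reduction rule: drop a redundant test
            return lo
        key = (var, lo, hi)
        if key not in self.unique:
            self.unique[key] = len(self.nodes)
            self.nodes.append(key)
        return self.unique[key]

    def from_state(self, bits):
        """Build the BDD representing a single state (tuple of 0/1)."""
        node = 1
        for var in reversed(range(self.num_vars)):
            node = self.mk(var, 0, node) if bits[var] else self.mk(var, node, 0)
        return node

    def union(self, a, b, memo=None):
        """Symbolic set union = Boolean OR of two BDDs."""
        memo = {} if memo is None else memo
        if a == 1 or b == 1: return 1
        if a == 0: return b
        if b == 0: return a
        if (a, b) in memo: return memo[(a, b)]
        va, la, ha = self.nodes[a]
        vb, lb, hb = self.nodes[b]
        v = min(va, vb)            # split on the topmost variable
        a_lo, a_hi = (la, ha) if va == v else (a, a)
        b_lo, b_hi = (lb, hb) if vb == v else (b, b)
        res = self.mk(v, self.union(a_lo, b_lo, memo),
                         self.union(a_hi, b_hi, memo))
        memo[(a, b)] = res
        return res

    def count(self, node, var=0):
        """Number of states (satisfying assignments) in the set."""
        if node in (0, 1):
            return node * (1 << (self.num_vars - var))
        v, lo, hi = self.nodes[node]
        skip = 1 << (v - var)      # untested variables each double the count
        return skip * (self.count(lo, v + 1) + self.count(hi, v + 1))
```

The union of all 2^n single-state BDDs reduces to the `true` terminal, which is the space saving in miniature: the full state set costs one node.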
Quantum-enhanced reinforcement learning
Master's dissertation in Engineering Physics
The field of Artificial Intelligence has lately witnessed extraordinary results. The ability to
design a system capable of beating the world champion of Go, an ancient Chinese game
known as the holy grail of AI, caused a spark worldwide, making people believe that something
revolutionary is about to happen. A different flavor of learning called Reinforcement
Learning is at the core of this revolution. In parallel, we are witnessing the emergence of a
new field, that of Quantum Machine Learning, which has already shown promising results in
supervised/unsupervised learning. In this dissertation, we reach for the interplay between
Quantum Computing and Reinforcement Learning.
This learning by interaction was made possible in the quantum setting using the concept
of oraculization of task environments suggested by Dunjko in 2015. In this dissertation,
we extended the oracular instances previously suggested to work in more general stochastic
environments. On top of this quantum agent-environment paradigm we developed a novel
quantum algorithm for near-optimal decision-making based on the Reinforcement Learning
paradigm known as Sparse Sampling, obtaining a quantum speedup compared to the
classical counterpart. The result is a quantum algorithm whose complexity is
independent of the number of states of the environment. This independence guarantees its
suitability for dealing with large state spaces where planning may otherwise be inapplicable.
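For orientation, classical Sparse Sampling (Kearns, Mansour & Ng), which the quantum algorithm builds on, already has a running time independent of the number of states: it estimates action values from C sampled successors per action, recursively, to horizon H. The sketch below is a generic illustration of that classical algorithm, not the quantum method; the toy chain environment and all parameter values are invented.

```python
# Sketch of classical Sparse Sampling: cost depends on the sampling
# width C and horizon H, never on the size of the state space.
import random

def sparse_sample_q(env, state, action, depth, C, H, gamma):
    """Estimate Q(state, action) from C sampled next states."""
    total = 0.0
    for _ in range(C):
        next_state, reward = env(state, action)
        total += reward + gamma * sparse_sample_v(env, next_state,
                                                  depth + 1, C, H, gamma)
    return total / C

def sparse_sample_v(env, state, depth, C, H, gamma, actions=(0, 1)):
    if depth == H:                       # horizon reached: stop sampling
        return 0.0
    return max(sparse_sample_q(env, state, a, depth, C, H, gamma)
               for a in actions)

def chain_env(state, action):
    """Toy stochastic chain: action 1 moves right w.p. 0.9; reward past state 3."""
    step = 1 if (action == 1 and random.random() < 0.9) else -1
    nxt = max(0, state + step)
    return nxt, (1.0 if nxt >= 3 else 0.0)
```

At decision time one computes `sparse_sample_q` for each action and acts greedily; the tree it samples has size (C x |A|)^H regardless of how many states the environment has.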
The most important open questions remain whether it is possible to improve the oracular
instances of task environments to deal with even more general environments, especially
the ability to represent negative rewards as a natural mechanism for negative feedback
instead of some normalization of the reward, and the extension of the algorithm to perform
an informed tree-based search instead of the uninformed search proposed. Improvements
on this result would allow comparison between the algorithm and more recent classical
Reinforcement Learning algorithms.
Intelligent Agents for Active Malware Analysis
The main contribution of this thesis is to give a novel perspective on Active Malware Analysis, modeled as a decision-making process between intelligent agents. We propose solutions aimed at extracting the behaviors of malware agents with advanced Artificial Intelligence techniques. In particular, we devise novel action selection strategies for the analyzer agents that analyze malware by selecting sequences of triggering actions aimed at maximizing the information acquired. The goal is to create informative models representing the behaviors of the malware agents observed while interacting with them during the analysis process. Such models can then be used to effectively compare a malware sample against others and to correctly identify the malware family.
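Selecting triggering actions to maximize information acquired can be sketched as greedy expected-information-gain action selection: keep a belief over candidate families and pick the action whose predicted observation most reduces entropy. This is a hedged, generic illustration of that idea, not the thesis's strategy; the families, actions, and observation model below are invented.

```python
# Sketch of information-gain-driven action selection: the analyzer
# chooses the triggering action with the largest expected entropy
# reduction of its belief over malware families. Illustrative only.
import math

def entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def expected_info_gain(belief, obs_model, action):
    """obs_model[family][action] -> dict of P(observation | family, action)."""
    prior_h = entropy(belief)
    # marginal P(obs | action) under the current belief
    p_obs = {}
    for fam, p_fam in belief.items():
        for obs, p in obs_model[fam][action].items():
            p_obs[obs] = p_obs.get(obs, 0.0) + p_fam * p
    gain = prior_h
    for obs, p in p_obs.items():
        # posterior belief after observing obs
        post = {f: belief[f] * obs_model[f][action].get(obs, 0.0) / p
                for f in belief}
        gain -= p * entropy(post)     # subtract expected posterior entropy
    return gain

def select_action(belief, obs_model, actions):
    return max(actions, key=lambda a: expected_info_gain(belief, obs_model, a))
```

An action whose observation is the same for every family has zero gain and is never chosen over a discriminating one.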
Approximate inference in graphical models
Probability theory provides a mathematically rigorous yet conceptually flexible calculus of uncertainty, allowing the construction of complex hierarchical models for real-world inference tasks. Unfortunately, exact inference in probabilistic models is often computationally expensive or even intractable. A close inspection of such situations often reveals that computational bottlenecks are confined to certain aspects of the model, which can be circumvented by approximations without having to sacrifice the model's interesting aspects. The conceptual framework of graphical models provides an elegant means of representing probabilistic models and deriving both exact and approximate inference algorithms in terms of local computations. This makes graphical models an ideal aid in the development of generalizable approximations. This thesis contains a brief introduction to approximate inference in graphical models (Chapter 2), followed by three extensive case studies in which approximate inference algorithms are developed for challenging applied inference problems. Chapter 3 derives the first probabilistic game tree search algorithm. Chapter 4 provides a novel expressive model for inference in psychometric questionnaires. Chapter 5 develops a model for the topics of large corpora of text documents, conditional on document metadata, with a focus on computational speed. In each case, graphical models help in two important ways: they first provide important structural insight into the problem, and then suggest practical approximations to the exact probabilistic solution. This work was supported by a scholarship from Microsoft Research, Ltd.
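The "local computations" idea can be made concrete with the smallest possible example: exact marginal inference on a chain-structured model by variable elimination, the primitive from which both exact and approximate message-passing algorithms are built. The factors below are invented for illustration.

```python
# Variable elimination on a chain x0 - x1 - ... - xn: each step sums
# out one variable locally, passing a "message" to its neighbour.

def eliminate_chain(unaries, pairwise):
    """Return the normalised marginal of the last variable in the chain."""
    msg = unaries[0]
    for unary, pair in zip(unaries[1:], pairwise):
        # sum out the previous variable using only local factors
        msg = [unary[j] * sum(msg[i] * pair[i][j] for i in range(len(msg)))
               for j in range(len(unary))]
    z = sum(msg)                       # normalisation constant
    return [m / z for m in msg]
```

The cost is linear in chain length instead of exponential in the number of variables, which is the point of exploiting graph structure.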
Principled control of approximate programs
In conventional computing, most programs are treated as implementations of mathematical functions for which there is an exact output that must be computed from a given input. However, in many problem domains, it is sufficient to produce some approximation of this output. For example, when rendering a scene in graphics, it is acceptable to take computational short-cuts if human beings cannot tell the difference in the rendered scene. In other problem domains like machine learning, programs are often implementations of heuristic approaches to solving problems and therefore already compute approximate solutions to the original problem.
This is the key insight for the new research area, approximate computing, which attempts to trade off such approximations against the cost of computational resources such as program execution time, energy consumption, and memory usage. We believe that approximate computing is an important step towards a more fundamental and comprehensive goal that we call information-efficiency. Current applications compute more information (bits) than is needed to produce their outputs, and since producing and transporting bits of information inside a computer requires energy, computation time, and memory, information-inefficient computing leads directly to resource inefficiency.
Although there is now a fairly large literature on approximate computing, system researchers have focused mostly on what we can call the forward problem; that is, they have explored different ways, in both hardware and software, to introduce approximations into a program and have demonstrated that these approximations can enable significant execution speedups and energy savings at the cost of some quality degradation of the result. However, these efforts do not provide any guarantee on the amount of quality degradation. Since the acceptable amount of degradation usually depends on the scenario in which the application is deployed, it is very important to be able to control the degree of approximation. In this dissertation, we refer to this problem as the inverse problem. Relatively little is known about how to solve the inverse problem in a disciplined way.
This dissertation makes two contributions towards solving the inverse problem. First, we investigate a large set of approximate algorithms from a variety of domains in order to understand how approximation is used in real-world applications. From this investigation, we determine that many approximate programs are tunable approximate programs. Tunable approximate programs have one or more parameters called knobs that can be changed to vary the quality of the output of the approximate computation as well as the corresponding cost. For example, an iterative linear equation solver can vary the number of iterations to trade quality of the solution versus the execution time, a Monte Carlo path tracer can change the number of sampling light paths to trade the quality of the resulting image against execution time, etc. Tunable approximate programs provide many opportunities for trading accuracy versus cost. By carefully analyzing these algorithms, we have found a set of patterns for how approximation is applied in tunable programs. Our classification can be used to identify new approximation opportunities in programs.
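The iterative-solver example above can be made concrete with a minimal tunable approximate program: a Jacobi solver whose single knob, the iteration count, trades residual quality against execution cost. This is a generic illustration, not code from the dissertation; the 2x2 system is invented.

```python
# A "tunable approximate program" in miniature: the `iterations` knob
# controls the quality/cost trade-off of an iterative linear solver.

def jacobi(A, b, iterations):
    """Approximately solve A x = b; `iterations` is the quality knob."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        # each sweep uses only the previous iterate (classic Jacobi)
        x = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
             for i in range(n)]
    return x

def residual(A, b, x):
    """Quality metric: max |A x - b| over all equations."""
    return max(abs(sum(A[i][j] * x[j] for j in range(len(x))) - b[i])
               for i in range(len(b)))
```

More iterations cost more time but shrink the residual, exactly the knob behavior the classification above describes.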
A second contribution of this dissertation is an approach to solving the inverse problem for tunable approximate programs. Concretely, the problem is to determine knob settings to minimize the cost while keeping the quality degradation within a given bound. There are four challenges: i) for real-world applications, the quality and cost are usually complex non-linear functions of the knobs and these functions are usually hard to express analytically; ii) the quality and the cost for an application vary greatly for different inputs; iii) when an acceptable quality degradation bound is presented, determining the knob setting has to be very efficient so that the extra overhead incurred by the identification will not exceed the cost saved by the approximation; and iv) the approach should be general so that it can be applied to many applications.
To meet these requirements, we formulate the inverse problem as a constrained optimization problem and solve it using a machine learning based approach. We build a system which uses machine learning techniques to learn cost and quality models for the program by profiling the program with a set of representative inputs. Then, when a quality degradation bound is presented, the system searches these cost and quality models to identify the knob settings which achieve the best cost savings while simultaneously guaranteeing the quality degradation bound statistically. We evaluate the system with a set of real-world applications, including a social network graph partitioner, an image search engine, a 2-D graph layout engine, a 3-D game physics engine, an SVM solver and a radar signal processing engine. The experiments showed substantial execution-time and energy savings for a variety of quality bounds.
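The offline-profile-then-online-search workflow can be sketched as follows. This is a deliberately simplified sketch of the general idea, not the system described above: the learned models are reduced to per-knob averages, and the toy program, its cost, and its error are all invented.

```python
# Two-phase sketch of the inverse problem: profile offline to learn
# cost/error per knob setting, then pick the cheapest setting whose
# predicted error meets the requested bound.

def profile(program, knob_values, inputs):
    """Offline phase: average cost and error for each knob setting."""
    models = {}
    for k in knob_values:
        costs, errors = zip(*(program(x, k) for x in inputs))
        models[k] = (sum(costs) / len(costs), sum(errors) / len(errors))
    return models

def cheapest_knob(models, error_bound):
    """Online phase: minimise predicted cost subject to the error bound."""
    feasible = [(cost, k) for k, (cost, err) in models.items()
                if err <= error_bound]
    if not feasible:
        raise ValueError("no knob setting meets the bound")
    return min(feasible)[1]
```

A real system would fit nonlinear, input-dependent models and give statistical rather than exact guarantees, but the constrained-optimization shape is the same.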
Hybrid optimizer for expeditious modeling of virtual urban environments
Master's thesis. Informatics Engineering. Faculdade de Engenharia, Universidade do Porto. 200
Low-resource learning in complex games
This project is concerned with learning to take decisions in complex domains, in games
in particular. Previous work assumes that massive data resources are available for
training, but aside from a few very popular games, this is generally not the case, and the
state of the art in such circumstances is to rely extensively on hand-crafted heuristics.
On the other hand, human players are able to quickly learn from only a handful of
examples, exploiting specific characteristics of the learning problem to accelerate their
learning process. Designing algorithms that function in a similar way is an open area
of research and has many applications in today’s complex decision problems.
One solution presented in this work is to design learning algorithms that exploit the
inherent structure of the game. Specifically, we take into account how the action space
can be clustered into sets called types and exploit this characteristic to improve planning
at decision time. Action types can also be leveraged to extract high-level strategies
from a sparse corpus of human play, and this generates more realistic trajectories
during planning, further improving performance.
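The type-based abstraction can be sketched as two-stage action sampling during rollouts: first draw an action type (with weights that could be estimated from a human corpus), then a concrete action of that type. This is an illustrative sketch of the general idea, not the thesis's algorithm; the action names, types, and weights are invented.

```python
# Two-stage rollout sampling over a typed action space: type first,
# then a concrete legal action within the chosen type.
import random

def sample_action(legal_actions, type_of, type_weights, rng=random):
    """Sample a type by weight, then a uniform action of that type."""
    by_type = {}
    for a in legal_actions:
        by_type.setdefault(type_of(a), []).append(a)
    types = list(by_type)
    weights = [type_weights.get(t, 1.0) for t in types]
    chosen_type = rng.choices(types, weights=weights)[0]
    return rng.choice(by_type[chosen_type])
```

Biasing the type distribution toward strategies observed in human play makes rollout trajectories more realistic than uniform action sampling, which is the effect described above.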
Another approach that proved successful is using an accurate model of the environment
to reduce the complexity of the learning problem. Similar to how human players
have an internal model of the world that allows them to focus on the relevant parts of
the problem, we decouple learning to win from learning the rules of the game, thereby
making supervised learning more data efficient.
Finally, in order to handle partial observability that is usually encountered in complex
games, we propose an extension to Monte Carlo Tree Search that plans in the
Belief Markov Decision Process. We found that this algorithm does not outperform
the state-of-the-art models on our chosen domain. Our error analysis indicates that the
method struggles to handle the high uncertainty of the conditions required for the game
to end. Furthermore, our relaxed belief model can cause rollouts in the belief space to
be inaccurate, especially in complex games.
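One common way to maintain the belief such an extension plans over is a particle filter: a set of sampled hidden states reweighted by each observation, from which rollout start states can be drawn. The sketch below illustrates that generic machinery only, not the thesis's belief model; the hidden-value domain is invented.

```python
# Particle-filter belief update: reweight particles by the observation
# likelihood, then resample to keep the particle count fixed.
import random

def update_belief(particles, obs_likelihood, rng=random):
    """Return a new particle set conditioned on the latest observation."""
    weights = [obs_likelihood(s) for s in particles]
    if sum(weights) == 0:
        return particles  # observation impossible under belief: keep prior
    return rng.choices(particles, weights=weights, k=len(particles))
```

If the belief model is too relaxed, particles drift away from states consistent with the true game, which is one source of the inaccurate belief-space rollouts noted above.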
We assess the proposed methods in an agent playing the highly complex board
game Settlers of Catan. Building on previous research, our strongest agent combines
planning at decision time with prior knowledge extracted from an available corpus of
general human play; but unlike this prior work, our human corpus consists of only
60 games, as opposed to many thousands. Our agent defeats the current state-of-the-art
agent by a large margin, showing that the proposed modifications aid in exploiting
general human play in highly complex games.
Optimizing quantum circuit layouts
One of the challenges in quantum computing is the problem of optimizing quantum circuit compilation. The compilation process involves two main stages: synthesizing the circuit to be executed in terms of the quantum gates supported by the processor, and adapting the circuit to the connectivity limitations imposed by the processor. In this work, I have addressed the second of these problems, known as Quantum Circuit Layout (QCL). To tackle this problem, I have attempted to use Reinforcement Learning (RL) techniques, which require modeling the problem as a Markov Decision Process (MDP). Specifically, I describe two finite MDPs whose solution provides a solution to a part of the QCL problem.
The main problem is to design a method that effectively solves these MDPs, even if only approximately. In this thesis, two approaches to the problem are discussed. The first uses a variant of the algorithm used in AlphaZero, designed to train a machine to learn how to play Chess, Shogi, and Go. The second uses a more standard approach known as Deep Q-Learning (DQL).
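The state/action structure of such a QCL MDP can be illustrated without any learning: the state is a logical-to-physical qubit mapping, the actions are SWAPs along the processor's coupling graph, and a step is good when it brings the next two-qubit gate's operands adjacent. The greedy router below stands in for the learned policy and is purely illustrative, as is the linear coupling map.

```python
# Toy QCL setting: mapping = {logical qubit: physical qubit}; actions
# are SWAPs on coupled physical pairs. A greedy stand-in policy picks
# the SWAP that most reduces the next gate's qubit distance.

def distance(mapping, coupling_dist, q1, q2):
    return coupling_dist[mapping[q1]][mapping[q2]]

def swapped_dist(mapping, coupling_dist, edge, q1, q2):
    """Distance between q1 and q2 after applying the SWAP `edge`."""
    m = dict(mapping)
    a, b = edge
    inv = {v: k for k, v in m.items()}     # physical -> logical
    m[inv[a]], m[inv[b]] = b, a
    return coupling_dist[m[q1]][m[q2]]

def greedy_route(gate, mapping, edges, coupling_dist):
    """Apply SWAPs (in place) until the gate's qubits are adjacent."""
    q1, q2 = gate
    swaps = []
    while distance(mapping, coupling_dist, q1, q2) > 1:
        best = min(edges, key=lambda e: swapped_dist(mapping, coupling_dist,
                                                     e, q1, q2))
        a, b = best
        inv = {v: k for k, v in mapping.items()}
        mapping[inv[a]], mapping[inv[b]] = b, a
        swaps.append(best)
    return swaps
```

An RL formulation replaces the greedy choice with a learned policy and rewards shorter SWAP sequences, but the underlying MDP transitions are exactly these mapping updates.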