4,034 research outputs found

    Transfer Reinforcement Learning Based Negotiating Agent Framework

    Despite tremendous success in the domain of automated negotiation, a major issue remains: it is inefficient for a negotiating agent to learn a strategy from scratch when faced with an unknown opponent. Transfer learning (TL) can alleviate this problem by using the knowledge of previously learned policies to accelerate learning of the current task. This work presents a novel Transfer Learning based Negotiating Agent (TLNAgent) framework that allows a negotiating agent to transfer previous knowledge from source strategies optimized by deep reinforcement learning, to boost its performance in new tasks. TLNAgent comprises three key components: the negotiation module, the adaptation module and the transfer module. Specifically, the negotiation module is responsible for interacting with the other agent during negotiation. The adaptation module measures the helpfulness of each source policy based on a fusion of two selection mechanisms. The transfer module is based on lateral connections between source and target networks and accelerates the agent’s training by transferring knowledge from the selected source strategy. Our comprehensive experiments clearly demonstrate that TL is effective in the context of automated negotiation, and that TLNAgent outperforms state-of-the-art Automated Negotiating Agents Competition (ANAC) negotiating agents in various domains.

    Classification of local energy trading negotiation profiles using artificial neural networks

    Electricity markets are evolving into a local trading setting, which makes it difficult for inexperienced players to achieve good agreements and obtain profits. One way to deal with this issue is to provide players with decision support solutions capable of identifying opponents' negotiation profiles, so that negotiation strategies can be adapted to those profiles in order to reach the best possible results from negotiations. This paper presents an approach that classifies opponents' proposals during a negotiation, to determine the typical negotiation profile to which the opponent most closely relates. The classification process is performed using an artificial neural network approach, and it is able to adapt at each new proposal during the negotiation process, by re-classifying the opponent's negotiation profile according to the most recent actions. In this way, effective decision support is provided to market players, enabling them to adapt the negotiation strategy throughout the negotiations. This work has received funding from National Funds through FCT (Fundaçao da Ciencia e Tecnologia) under the project SPET – 29165, call SAICT 2017.

    A scalability analysis of grid allocation mechanisms

    This article examines the broker's behavior with regard to a varying number of participating nodes and shows that incremental losses have to be accepted in central resource allocation when introducing new nodes. --Grid Computing

    Complex negotiations in multi-agent systems

    Multi-agent systems (MAS) are distributed systems in which autonomous entities called agents, whether human or software, pursue their own goals. The MAS paradigm has been proposed as the appropriate modeling approach for applications such as electronic commerce, multi-robot systems, security applications, etc. Within the MAS community, the vision of open multi-agent systems, where heterogeneous agents can enter and leave the system dynamically, has gained strength as a modeling paradigm because of its conceptual relationship with technologies such as the Web, grid computing, and virtual organizations. Because agents are heterogeneous and driven by their own goals, conflict is a phenomenon likely to arise in multi-agent systems. In recent years, the term agreement technologies has been used to refer to all those mechanisms that, directly or indirectly, promote conflict resolution in computational systems such as multi-agent systems. Among agreement technologies, automated negotiation has been proposed as one of the key mechanisms for conflict resolution because of its analogous use in conflict resolution among humans. Automated negotiation consists of the automatic exchange of proposals carried out by software agents on behalf of their users. The final goal is to reach an agreement with all the parties involved. Despite having been studied in Artificial Intelligence for years, several problems have not yet been solved by the scientific community. The main goal of this thesis is to propose negotiation models for complex scenarios where the complexity derives from (i) computational limitations or (ii) the need to represent the preferences of multiple individuals.
In the first part of this thesis we propose a bilateral negotiation model for the problem of [...]
Sánchez Anguix, V. (2013). Complex negotiations in multi-agent systems [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/21570

    Modeling Mutual Influence in Multi-Agent Reinforcement Learning

    In multi-agent systems (MAS), agents rarely act in isolation but tend to achieve their goals through interactions with other agents. To achieve their ultimate goals, individual agents should actively evaluate how other agents' behaviors affect them before deciding which actions to take. These impacts are reciprocal, and it is of great interest to model the mutual influence of agents' impacts on one another when they are observing the environment or taking actions in it. In this thesis, assuming that the agents are aware of each other's existence and their potential impact on themselves, I develop novel multi-agent reinforcement learning (MARL) methods that can measure the mutual influence between agents to shape learning. The first part of this thesis outlines the framework of recursive reasoning in deep multi-agent reinforcement learning. I hypothesize that it is beneficial for each agent to consider how other agents react to its behavior. I start from Probabilistic Recursive Reasoning (PR2) using level-1 reasoning and adopt variational Bayes methods to approximate the opponents' conditional policies. Each agent shapes its individual Q-value by marginalizing the conditional policies in the joint Q-value and finding the best response to improve its policy. I further extend PR2 to Generalized Recursive Reasoning (GR2) with different hierarchical levels of rationality. GR2 enables agents to possess various levels of thinking ability, thereby allowing higher-level agents to best respond to less sophisticated learners. The first part of the thesis shows that reducing the joint Q-value to an individual Q-value via explicit recursive reasoning benefits learning. In the second part of the thesis, in reverse, I measure the mutual influence by approximating the joint Q-value based on the individual Q-values.
I establish Q-DPP, an extension of the Determinantal Point Process (DPP) with partition constraints, and apply it to multi-agent learning as a function approximator for the centralized value function. An attractive property of Q-DPP is that when it reaches the optimum value, it offers a natural factorization of the centralized value function, representing both quality (maximizing reward) and diversity (different behaviors). In the third part of the thesis, I depart from action-level mutual influence and build a policy-space meta-game to analyze the relationships between agents' adaptive policies. I present a Multi-Agent Trust Region Learning (MATRL) algorithm that augments single-agent trust region policy optimization with a weak stable fixed point approximated by the policy-space meta-game. The algorithm aims to find a game-theoretic mechanism to adjust the policy optimization steps so that the learning of all agents is driven toward the stable point.

    Making Sense of Unexpected Preferences

    This dissertation includes three papers using quantitative models to sensibly describe what kinds of preferences political actors will or actually do hold when existing theory offers no insight. The first two papers use evolutionary game theory to predict ways in which politicians, artificially selected on the basis of good performance to remain in office, will in the long run diverge from instrumental rationality as ordinarily assumed in game theory. The first sets out a general principle for producing models of preference evolution in games as political models, namely, that the information about opponent preferences necessary for evolution of non-rational preferences comes from opponents' previous plays, and applies it to two simple games. The second uses the same principles in more detail on a bargaining game that models the plea negotiations between a prosecutor and a defense attorney, leading to the conclusion that failure to learn from setbacks during a trial is an evolutionarily favored trait among prosecutors. The third paper addresses the ideological preferences of Supreme Court justices, which existing statistical models do not effectively compare to those of elected officials since the two groups never vote on the same items, by identifying a set of political actors with whom both groups commonly interact: organized interest groups who vote on Supreme Court cases with amicus curiae briefs and on electoral candidates using campaign donations.

    Opponent Modelling in Multi-Agent Systems

    Reinforcement Learning (RL) formalises a problem where an intelligent agent needs to learn and achieve certain goals by maximising a long-term return in an environment. Multi-agent reinforcement learning (MARL) extends traditional RL to multiple agents. Many RL algorithms lose their convergence guarantees in non-stationary environments because of adaptive opponents. Partial observation caused by agents’ different private observations introduces high variance during training, which exacerbates data inefficiency. In MARL, training an agent to perform well against one set of opponents often leads to bad performance against another set of opponents. Non-stationarity, partial observation and unclear learning objectives are three critical problems in MARL that hinder agents’ learning, and they all share a cause: the lack of knowledge of the other agents. Therefore, in this thesis, we propose to solve these problems with opponent modelling methods. We tailor our solutions by combining opponent modelling with other techniques according to the characteristics of the problems we face. Specifically, we first propose ROMMEO, an algorithm inspired by Bayesian inference, as a solution to alleviate non-stationarity in cooperative games. Then we study the partial observation problem caused by agents’ private observations and design an implicit communication training method named PBL. Lastly, we investigate solutions to the non-stationarity and unclear learning objective problems in zero-sum games. We propose a solution named EPSOM, which aims to find safe exploitation strategies to play against non-stationary opponents. We verify our proposed methods with varied experiments and show that they can achieve the desired performance. Limitations and future work are discussed in the last chapter of this thesis.