1,559 research outputs found

    Deep reinforcement learning algorithms in multi agent changing environments using potential fields

    The development of deep reinforcement learning systems and algorithms for autonomous vehicle environments is proposed. To this end, the project first proposes a literature review on the use of deep reinforcement learning for future autonomous vehicle applications. Another basic element of the project will be the development of deep reinforcement learning tools to improve, as far as possible, the vehicle's learning capacity, its ability to adapt to a changing environment, and its final capacity to decide and carry out [...]
    The project explores the possibilities offered by reinforcement learning in robotics, with the aim of guiding robots through changing environments while avoiding collisions by means of potential fields. To this end, the DDPG, TD3, SAC and PPO reinforcement learning algorithms are implemented with the MATLAB Reinforcement Learning Toolbox in order to carry out a comparative study of which performs best under different environment configurations and parameters, supported by training graphs and statistical tables. Potential fields have also been developed in this project and prove to be a suitable tool for guiding robots in changing environments, and even for implementing multi-agent scenarios, avoiding collisions among agents and enhancing collaboration.
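To make the potential-field idea in the abstract above concrete, here is a minimal sketch of the classic attractive/repulsive formulation. The abstract does not state the project's actual equations, so the function name `potential_field_force`, the gains `k_att` and `k_rep`, and the influence radius `d0` are all assumptions chosen for illustration. In a multi-agent setting, other robots could be passed in as additional obstacles, which is again an assumption rather than the project's documented method.

```python
import math


def potential_field_force(pos, goal, obstacles, k_att=1.0, k_rep=1.0, d0=2.0):
    """Net 2-D force on a robot at `pos` (hypothetical helper, not from the source).

    pos, goal: (x, y) tuples; obstacles: iterable of (x, y) tuples.
    """
    # Attractive term: negative gradient of 0.5 * k_att * ||pos - goal||^2.
    fx = -k_att * (pos[0] - goal[0])
    fy = -k_att * (pos[1] - goal[1])
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        # Repulsion only acts inside the influence radius d0.
        if 0.0 < d < d0:
            # Negative gradient of 0.5 * k_rep * (1/d - 1/d0)^2,
            # directed away from the obstacle.
            mag = k_rep * (1.0 / d - 1.0 / d0) / d ** 2
            fx += mag * dx / d
            fy += mag * dy / d
    return fx, fy
```

A controller, or the action of a learned policy, could then follow or be shaped by this force; how the project combines the field with the reinforcement learning agents is not described in the abstract.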

    Training & acceleration of deep reinforcement learning agents

    National Technical University of Athens--Master's Thesis. Interdisciplinary-Interdepartmental Postgraduate Studies Programme (D.P.M.S.) "Data Science and Machine Learning"

    A novel multi-level and community-based agent ecosystem to support customers dynamic decision-making in smart grids

    Electrical systems have evolved at a fast pace over the past years, particularly in response to current environmental and climate challenges. Consequently, the European Union and the United Nations have encouraged the development of a more sustainable energy strategy. This strategy triggered a paradigm shift in energy consumption and production, which, becoming increasingly distributed, resulted in the development and emergence of smart energy grids. Multi-agent systems are one of the most widely used artificial intelligence concepts in smart grids. Both multi-agent systems and smart grids are distributed, so the technology used corresponds to the network's complex reality. Because the wide variety of multi-agent systems applied to smart grids typically have very specific goals, the ability to model the network as a whole may be compromised, as communication between systems is typically non-existent. This dissertation therefore proposes an agent-based ecosystem to model smart grids in which different agent-based systems can coexist. It aims to conceive, implement, test, and validate a new agent-based ecosystem, entitled A4SG (agent-based ecosystem for smart grids modelling), which combines the concepts of multi-agent systems and agent communities to enable the modelling and representation of smart grids and the entities that compose them. The proposed ecosystem employs an innovative methodology for managing the static or dynamic interactions present in smart grids. A solution that allows existing systems to be integrated into an ecosystem enables smart grids to be represented in a realistic and comprehensive manner. A4SG integrates several functionalities that support the ecosystem's management, also conceived, implemented, tested, and validated in this dissertation. 
Two mobility functionalities are proposed: one that allows agents to move between physical machines and another that allows "virtual" mobility, where agents move between agent communities to improve the context for achieving their objectives. To prevent an agent from becoming overloaded, a novel functionality is proposed that enables the creation of agents that function as extensions of the main agent (i.e., branch agents), allowing objectives to be distributed among the various extensions of the main agent. Several case studies, which test the proposed services and functionalities individually and the ecosystem as a whole, were used to test and validate the proposed solution. These case studies were conducted in realistic contexts using data from multiple sources, including energy communities. The results indicate that the methodologies used can increase participation in demand response events, raising the match between consumers and aggregators from 12 % to 69 %, and improve the strategies used in energy transaction markets, allowing an energy community of 50 customers to save 77.0 EUR per week.

    A REINFORCEMENT LEARNING APPROACH TO VEHICLE PATH OPTIMIZATION IN URBAN ENVIRONMENTS

    Road traffic management in metropolitan cities, and in urban areas in general, is an important component of Intelligent Transportation Systems (ITS). As the world population and the number of vehicles grow, a dramatic increase in road traffic is expected to put pressure on the transportation infrastructure. There is therefore a pressing need to devise new ways to optimize traffic flow in order to accommodate the growing needs of transportation systems. This work proposes an Artificial Intelligence (AI) method based on reinforcement learning techniques for computing near-optimal vehicle itineraries, applied to Vehicular Ad-hoc Networks (VANETs). These itineraries are optimized based on the vehicle's travel distance, travel time, and road traffic congestion. The problem of traffic density is formulated as a Markov Decision Process (MDP). In particular, this work introduces a new reward function that takes traffic congestion into account when learning the vehicle's best action (best turn) to take in different situations. To assess the effect of this approach, the work investigated different learning algorithms, such as Q-Learning and SARSA, in conjunction with two exploration strategies: (a) ε-greedy and (b) Softmax. A comparative performance study of these methods is presented to determine the most effective solution for enabling vehicles to find a fast and reliable path. Simulation experiments illustrate the effectiveness of the proposed methods in computing optimal itineraries that allow vehicles to avoid traffic congestion while maintaining reasonable travel times and distances
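The two learning rules and two exploration strategies named in the abstract above can be sketched in a few lines of tabular code. This is a generic textbook sketch, not the authors' implementation. The reward `congestion_reward` and its weights are hypothetical, since the abstract only says that the reward accounts for distance, time, and congestion, and the table `Q` is assumed to be a list of per-state action-value lists.

```python
import math
import random


def epsilon_greedy(q_row, epsilon, rng=random):
    """With probability epsilon pick a random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def softmax_action(q_row, temperature, rng=random):
    """Sample an action with probability proportional to exp(Q / temperature)."""
    m = max(q_row)  # subtract the max for numerical stability
    weights = [math.exp((q - m) / temperature) for q in q_row]
    return rng.choices(range(len(q_row)), weights=weights)[0]


def congestion_reward(distance, travel_time, congestion, w=(1.0, 1.0, 1.0)):
    """Hypothetical reward: penalise distance, time and congestion (assumed weights)."""
    return -(w[0] * distance + w[1] * travel_time + w[2] * congestion)


def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action in the next state."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken next."""
    target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])
```

The only difference between the two updates is the bootstrap term, which is why the choice of exploration strategy affects SARSA's learned values directly but affects Q-Learning only through which states get visited.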

    Exploring Evolution Strategies for Reinforcement Learning in the Obstacle Tower Environment

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics.
    In 2017 OpenAI demonstrated that it was possible to train an AI agent using Evolution Strategies (ES), and that the results rivaled standard Reinforcement Learning (RL) techniques on modern benchmarks. Their research effectively showed that Evolution Strategies is a viable alternative to traditional Reinforcement Learning techniques, and that it bypasses many of Reinforcement Learning's inconveniences, notably the use of backpropagation. The Obstacle Tower environment aims to set a new Reinforcement Learning benchmark by challenging Artificial Intelligence (AI) agents to traverse 3-dimensional, procedurally generated levels using a real-time 3-dimensional physics system. The environment tests an agent's ability to generalize by requiring it to optimize aspects that are common in many Reinforcement Learning environments but rarely combined in the same environment: vision, planning, and control. In this research, the original implementation of OpenAI's Evolution Strategies algorithm was applied for the first time to the Obstacle Tower environment to assess how well it performs in a more complex environment, where the agent's generalization ability is critical. Additionally, in the interest of exploring Evolution Strategies in this environment, common Genetic Algorithm selection and mutation techniques were developed and applied in an attempt to improve the performance of the original Evolution Strategies implementation. Crossover techniques were not explored during this research, as they are rarely applied in Evolution Strategies. 
The results show that although the basic implementation of Evolution Strategies does not perform well in the complex Obstacle Tower environment, its performance can be improved by applying different evolution methods borrowed from Genetic Algorithms (GAs), which belong to the same family of algorithms as Evolution Strategies
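The backpropagation-free update that the abstract above refers to can be sketched as follows. This is a simplified illustration, not the original OpenAI implementation and not the dissertation's code: it standardises the scores instead of applying a rank transformation, omits mirrored sampling and parallel workers, and the names `es_step` and `fitness`, along with all hyperparameter values, are assumptions.

```python
import random


def es_step(theta, fitness, sigma=0.1, alpha=0.01, population=50, rng=random):
    """One Evolution Strategies update of the parameter list `theta`.

    `fitness` maps a parameter list to a scalar score (higher is better).
    """
    # Sample one Gaussian perturbation per population member.
    noises = [[rng.gauss(0.0, 1.0) for _ in theta] for _ in range(population)]
    scores = [fitness([t + sigma * e for t, e in zip(theta, eps)]) for eps in noises]
    # Standardise the scores so the step size does not depend on reward scale.
    mean = sum(scores) / population
    std = (sum((s - mean) ** 2 for s in scores) / population) ** 0.5
    if std == 0.0:
        std = 1.0
    advantages = [(s - mean) / std for s in scores]
    # Move theta along the score-weighted average of the perturbations;
    # no gradient of `fitness` is ever computed.
    scale = alpha / (population * sigma)
    return [
        t + scale * sum(a * eps[i] for a, eps in zip(advantages, noises))
        for i, t in enumerate(theta)
    ]
```

For example, calling `es_step` repeatedly with `fitness=lambda p: -(p[0] - 3.0) ** 2` moves a one-element `theta` toward 3. Selection and mutation schemes borrowed from Genetic Algorithms, as explored in the dissertation, would replace or modify the weighting and sampling steps; their details are not given in the abstract.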

    Extrinsic Rewards and Intrinsic Motives: Standard and Behavioral Approaches to Agency and Labor Markets

    Employers structure pay and employment relationships to mitigate agency problems. A large literature in economics documents how the resolution of these problems shapes personnel policies and labor markets. For the most part, the study of agency in employment relationships relies on highly stylized assumptions regarding human motivation, e.g., that employees seek to earn as much money as possible with minimal effort. In this essay, we explore the consequences of introducing behavioral complexity and realism into models of agency within organizations. Specifically, we assess the insights gained by allowing employees to be guided by such motivations as the desire to compare favorably to others, the aspiration to contribute to intrinsically worthwhile goals, and the inclination to reciprocate generosity or exact retribution for perceived wrongs. More provocatively, from the standpoint of standard economics, we also consider the possibility that people are driven, in ways that may be opaque even to themselves, by the desire to earn social esteem or to shape and reinforce identity.
    Keywords: agency, motivation, employment relationships, behavioral economics

    Neural representation in active inference: using generative models to interact with -- and understand -- the lived world

    This paper considers neural representation through the lens of active inference, a normative framework for understanding brain function. It delves into how living organisms employ generative models to minimize the discrepancy between predictions and observations (as scored with variational free energy). The ensuing analysis suggests that the brain learns generative models to navigate the world adaptively, not (or not solely) to understand it. Different living organisms may possess an array of generative models, spanning from those that support action-perception cycles to those that underwrite planning and imagination; namely, from "explicit" models that entail variables for predicting concurrent sensations, like objects, faces, or people - to "action-oriented models" that predict action outcomes. It then elucidates how generative models and belief dynamics might link to neural representation and the implications of different types of generative models for understanding an agent's cognitive capabilities in relation to its ecological niche. The paper concludes with open questions regarding the evolution of generative models and the development of advanced cognitive abilities - and the gradual transition from "pragmatic" to "detached" neural representations. The analysis on offer foregrounds the diverse roles that generative models play in cognitive processes and the evolution of neural representation
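For readers unfamiliar with the quantity named in the abstract above, variational free energy is commonly written as below. The notation is the standard one and is an assumption here, since the abstract gives no symbols: $o$ denotes observations, $s$ hidden states, $p$ the generative model, and $q$ the approximate posterior (belief).

```latex
F[q] \;=\; \mathbb{E}_{q(s)}\!\left[\ln q(s) - \ln p(o, s)\right]
     \;=\; D_{\mathrm{KL}}\!\left[\,q(s)\,\|\,p(s \mid o)\,\right] \;-\; \ln p(o)
```

Because the KL term is non-negative, $F$ is an upper bound on surprise $-\ln p(o)$, so minimizing it both improves the belief $q$ and reduces the discrepancy between predictions and observations.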