    Fuzzy and tile coding approximation techniques for coevolution in reinforcement learning

    PhD. This thesis investigates reinforcement learning algorithms suitable for learning in large state space problems and coevolution. In order to learn in large state spaces, the state space must be collapsed to a computationally feasible size and then generalised over. This thesis presents two new implementations of the classic temporal difference (TD) reinforcement learning algorithm Sarsa that utilise fuzzy logic principles for approximation, FQ Sarsa and Fuzzy Sarsa. The effectiveness of these two fuzzy reinforcement learning algorithms is investigated in the context of an agent marketplace. The thesis presents a practical investigation into the design of fuzzy membership functions and tile coding schemas. A critical comparison of the fuzzy algorithms with a related function approximation technique, a coarse coding approach called tile coding, is given in the context of three different simulation environments: the mountain-car problem, a predator/prey gridworld and an agent marketplace. A further comparison between Fuzzy Sarsa and tile coding in the context of the nonstationary environments of the agent marketplace and predator/prey gridworld is presented. This thesis shows that the Fuzzy Sarsa algorithm achieves a significant reduction of state space over traditional Sarsa, without loss of the finer detail that the FQ Sarsa algorithm experiences. It also shows that Fuzzy Sarsa and gradient descent Sarsa(λ) with tile coding learn similar levels of distinction against a stationary strategy. Finally, this thesis demonstrates that Fuzzy Sarsa performs better in a competitive multiagent domain than the tile coding solution.

    Genetic Programming Techniques in Engineering Applications

    2012/2013. Machine learning is a suite of techniques that allow developing algorithms for performing tasks by generalizing from examples. Machine learning systems, thus, may automatically synthesize programs from data. This approach is often feasible and cost-effective where manual programming or manual algorithm design is not. In the last decade, techniques based on machine learning have spread across a broad range of application domains. In this thesis, we will present several novel applications of a specific machine learning technique, called Genetic Programming, to a wide set of engineering applications grounded in real-world problems. The problems treated in this work range from the automatic synthesis of regular expressions, to the generation of electricity price forecasts, to the synthesis of a model for the tracheal pressure in mechanical ventilation. The results demonstrate that Genetic Programming is indeed a suitable tool for solving complex problems of practical interest. Furthermore, several results constitute a significant improvement over the existing state-of-the-art. The main contribution of this thesis is the design and implementation of a framework for the automatic inference of regular expressions from examples based on Genetic Programming. First, we will show the ability of such a framework to cope with the generation of regular expressions for solving text-extraction tasks from examples. We will experimentally assess our proposal by comparing our results with previous proposals on a collection of real-world datasets. The results demonstrate a clear superiority of our approach. We have implemented the approach in a web application that has gained considerable interest and has reached peaks of more than 10000 daily accesses. Then, we will apply the framework to a popular "regex golf" challenge, a competition for human players who are required to generate the shortest regular expression solving a given set of problems. 
Our results rank in the top 10 list of human players worldwide and outperform those generated by the only existing algorithm specialized to this purpose. Next, we will perform an extensive experimental evaluation in order to compare our proposal to the state-of-the-art proposal in a very close and long-established research field: the generation of Deterministic Finite Automata (DFA) from a labelled set of examples. Our results demonstrate that the existing state-of-the-art in DFA learning is not suitable for text extraction tasks. We will also show a variant of our framework designed for solving text processing tasks of the search-and-replace form. A common way to automate search-and-replace is to describe the region to be modified and the desired changes through a regular expression and a replacement expression. We will propose a solution to automatically produce both those expressions based only on examples provided by the user. We will experimentally assess our proposal on real-world search-and-replace tasks. The results indicate that our proposal is indeed feasible. Finally, we will study the applicability of our framework to the generation of a schema based on a sample of eXtensible Markup Language documents. eXtensible Markup Language documents are widely used in machine-to-machine interactions, and such interactions often require that some constraints are applied to the contents of the documents. These constraints are usually specified in a separate document, which is often unavailable or missing. In order to generate a missing schema, we will apply our framework to this problem and evaluate it experimentally. In the final part of this thesis we will describe two significant applications from different domains. We will describe a forecasting system for producing estimates of the next-day electricity price. The system is based on a combination of a predictor based on Genetic Programming and a classifier based on Neural Networks. 
A key feature of this system is the ability to handle outliers, i.e., values rarely seen during the learning phase. We will compare our results with a challenging baseline representative of the state-of-the-art. We will show that our proposal exhibits smaller prediction error than the baseline. Finally, we will move to a biomedical problem: estimating tracheal pressure in a patient treated with high-frequency percussive ventilation. High-frequency percussive ventilation is a new and promising non-conventional mechanical ventilatory strategy. In order to avoid barotrauma and volutrauma in patients, the pressure of the insufflated air must be monitored carefully. Since measuring the tracheal pressure is difficult, a model for accurately estimating the tracheal pressure is required. We will propose a synthesis of such a model by means of Genetic Programming and we will compare our results with the state-of-the-art. XXVI Ciclo 198

    The influence of topology and information diffusion on networked game dynamics

    This thesis studies the influence of topology and information diffusion on the strategic interactions of agents in a population. It shows that there exists a reciprocal relationship between the topology, information diffusion and the strategic interactions of a population of players. In order to evaluate the influence of topology and information flow on networked game dynamics, strategic games are simulated on populations of players where the players are distributed in a non-homogeneous spatial arrangement. The initial component of this research consists of a study of evolution of the coordination of strategic players, where the topology or the structure of the population is shown to be critical in defining the coordination among the players. Next, the effect of network topology on the evolutionary stability of strategies is studied in detail. Based on the results obtained, it is shown that network topology plays a key role in determining the evolutionary stability of a particular strategy in a population of players. Then, the effect of network topology on the optimum placement of strategies is studied. Using genetic optimisation, it is shown that the placement of strategies in a spatially distributed population of players is crucial in maximising the collective payoff of the population. Exploring further the effect of network topology and information diffusion on networked games, the non-optimal or bounded rationality of players is modelled using topological and directed information flow of the network. Based on the topologically distributed bounded rationality model, it is shown that the scale-free and small-world networks emerge in randomly connected populations of sub-optimal players. Thus, the topological and information theoretic interpretations of bounded rationality suggest the topology, information diffusion and the strategic interactions of socio-economical structures are cyclically interdependent

    A practical guide to multi-objective reinforcement learning and planning

    Real-world sequential decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or that multiple objectives can be adequately handled via a simple linear combination. Such approaches may oversimplify the underlying problem and hence produce suboptimal results. This paper serves as a guide to the application of multi-objective methods to difficult problems, and is aimed at researchers who are already familiar with single-objective reinforcement learning and planning methods who wish to adopt a multi-objective perspective on their research, as well as practitioners who encounter multi-objective decision problems in practice. It identifies the factors that may influence the nature of the desired solution, and illustrates by example how these influence the design of multi-objective decision-making systems for complex problems. © 2022, The Author(s)

    Operational Research: Methods and Applications

    Throughout its history, Operational Research has evolved to include a variety of methods, models and algorithms that have been applied to a diverse and wide range of contexts. This encyclopedic article consists of two main sections: methods and applications. The first aims to summarise the up-to-date knowledge and provide an overview of the state-of-the-art methods and key developments in the various subdomains of the field. The second offers a wide-ranging list of areas where Operational Research has been applied. The article is meant to be read in a nonlinear fashion. It should be used as a point of reference or first-port-of-call for a diverse pool of readers: academics, researchers, students, and practitioners. The entries within the methods and applications sections are presented in alphabetical order. The authors dedicate this paper to the 2023 Turkey/Syria earthquake victims. We sincerely hope that advances in OR will play a role towards minimising the pain and suffering caused by this and future catastrophes

    Essays on modeling and analysis of dynamic sociotechnical systems

    A sociotechnical system is a collection of humans and algorithms that interact under the partial supervision of a decentralized controller. These systems often display intricate dynamics and can be characterized by their unique emergent behavior. In this work, we describe, analyze, and model aspects of three distinct classes of sociotechnical systems: financial markets, social media platforms, and elections. Though our work is diverse in subject matter content, it is unified through the study of evolution- and adaptation-driven change in social systems and the development of methods used to infer this change. We first analyze evolutionary financial market microstructure dynamics in the context of an agent-based model (ABM). The ABM’s matching engine implements a frequent batch auction, a recently-developed type of price-discovery mechanism. We subject simple agents to evolutionary pressure using a variety of selection mechanisms, demonstrating that quantile-based selection mechanisms are associated with lower market-wide volatility. We then evolve deep neural networks in the ABM and demonstrate that elite individuals are profitable in backtesting on real foreign exchange data, even though their fitness had never been evaluated on any real financial data during evolution. We then turn to the extraction of multi-timescale functional signals from large panels of timeseries generated by sociotechnical systems. We introduce the discrete shocklet transform (DST) and associated similarity search algorithm, the shocklet transform and ranking (STAR) algorithm, to accomplish this task. We empirically demonstrate the STAR algorithm’s invariance to quantitative functional parameterization and provide use case examples. The STAR algorithm compares favorably with Twitter’s anomaly detection algorithm on a feature extraction task. 
We close by using STAR to automatically construct a narrative timeline of societally-significant events using a panel of Twitter word usage timeseries. Finally, we model strategic interactions between the foreign intelligence service (Red team) of a country that is attempting to interfere with an election occurring in another country, and the domestic intelligence service of the country in which the election is taking place (Blue team). We derive subgame-perfect Nash equilibrium strategies for both Red and Blue and demonstrate the emergence of arms race interference dynamics when either player has “all-or-nothing” attitudes about the result of the interference episode. We then confront our model with data from the 2016 U.S. presidential election contest, in which Russian military intelligence interfered. We demonstrate that our model captures the qualitative dynamics of this interference for most of the time under study.

    Game theoretic modeling and analysis : A co-evolutionary, agent-based approach

    Ph.D. DOCTOR OF PHILOSOPHY