    Hedging of Financial Derivative Contracts via Monte Carlo Tree Search

    The construction of approximate replication strategies for derivative contracts in incomplete markets is a key problem of financial engineering. Recently, Reinforcement Learning algorithms for pricing and hedging under realistic market conditions have attracted significant interest. While financial research has mostly focused on variations of Q-learning, in Artificial Intelligence, Monte Carlo Tree Search is the recognized state-of-the-art method for various planning problems, such as the games of Hex, Chess, and Go. This article introduces Monte Carlo Tree Search as a method to solve the stochastic optimal control problem underlying the pricing and hedging of financial derivatives. Compared with Q-learning, it combines reinforcement learning with tree search techniques. As a consequence, Monte Carlo Tree Search has higher sample efficiency, is less prone to over-fitting to specific market models, and generally learns stronger policies faster. In our experiments we find that Monte Carlo Tree Search, being the world champion in games like Chess and Go, is easily capable of directly maximizing the utility of the investor's terminal wealth without an intermediate mathematical theory. Comment: Added figures. Added references. Corrected typos. 15 pages, 5 figures

    Edge Intelligence Simulator: a platform for simulating intelligent edge orchestration solutions

    Abstract. To support the stringent requirements of future intelligent and interactive applications, intelligence needs to become an essential part of resource management in the edge environment. Developing intelligent orchestration solutions is a challenging and arduous task, in which evaluating and comparing the proposed solutions is a focal point. Simulation is commonly used to evaluate and compare proposed solutions. However, there are currently no openly available simulators with a specific focus on supporting research on intelligent edge orchestration methods. This thesis presents a simulation platform called Edge Intelligence Simulator (EISim), the purpose of which is to facilitate research on intelligent edge orchestration solutions. In its current form, the platform supports simulating deep reinforcement learning based solutions and different orchestration control topologies in scenarios related to task offloading and resource pricing at the edge. The platform also includes additional tools for creating simulation environments, running simulations for agent training and evaluation, and plotting results. This thesis gives a comprehensive overview of the state of the art in edge and fog simulation, orchestration, offloading, and resource pricing, which provides a basis for the design of EISim. The methods and tools that form the foundation of the current EISim implementation are also presented, along with a detailed description of the EISim architecture, default implementations, use, and additional tools. Finally, EISim with its default implementations is validated and evaluated through a large-scale simulation study with 24 simulation scenarios. The results of the simulation study verify the end-to-end performance of EISim and show its capability to produce sensible results.
The results also illustrate how EISim can help the researcher in controlling and monitoring the training of intelligent agents, as well as in evaluating solutions against different control topologies. Reunaälysimulaattori: alusta älykkäiden reunalaskennan orkestrointiratkaisujen simulointiin (Edge Intelligence Simulator: a platform for simulating intelligent edge computing orchestration solutions). Abstract. Intelligent solutions must become an essential part of resource management in the edge environment so that the execution of future interactive and intelligent applications can be supported at a level that meets their strict performance requirements. Developing intelligent orchestration solutions is a demanding and laborious process, at the center of which is testing methods and comparing them against other methods. Simulation is typically used to evaluate and compare methods, but at present there are no openly available simulators that focus specifically on supporting the development of intelligent edge orchestration solutions. This thesis presents a simulation platform called Edge Intelligence Simulator (EISim; Reunaälysimulaattori), whose purpose is to facilitate research on intelligent edge orchestration solutions. In its current form it supports simulating reinforcement learning based solutions and different types of orchestration control topologies in scenarios related to computation offloading and resource pricing in the edge environment. The platform also comes with additional tools for creating simulation environments, running simulations for agent training and evaluation, and visualizing simulation results. The thesis includes a comprehensive review of the current state of the literature on edge environment simulation, edge orchestration, computation offloading, and resource pricing, which provides a solid starting point for the implementation of EISim.
The thesis presents the methods and tools on which the current implementation of EISim is based, and gives a detailed description of the EISim architecture, default implementations, use, and additional tools. To validate and evaluate EISim, a large simulation study is presented in which EISim's default implementations are simulated in 24 simulation scenarios. The results of the simulation study verify the overall functionality of EISim and show its ability to produce sensible results. The results also illustrate how EISim can help researchers in training intelligent agents and in evaluating solutions against different control topologies.

    Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks

    Future wireless networks have substantial potential to support a broad range of complex and compelling applications in both military and civilian fields, where users are able to enjoy high-rate, low-latency, low-cost and reliable information services. Achieving this ambitious goal requires new radio techniques for adaptive learning and intelligent decision making because of the complex heterogeneous nature of the network structures and wireless services. Machine learning (ML) algorithms have had great success in supporting big data analytics, efficient parameter estimation and interactive decision making. Hence, in this article, we review the thirty-year history of ML by elaborating on supervised learning, unsupervised learning, reinforcement learning and deep learning. Furthermore, we investigate their employment in the compelling applications of wireless networks, including heterogeneous networks (HetNets), cognitive radios (CR), Internet of things (IoT), machine to machine networks (M2M), and so on. This article aims to help readers understand the motivation and methodology of the various ML algorithms, so that they can apply them to hitherto unexplored services and scenarios of future wireless networks. Comment: 46 pages, 22 figures

    Deep Reinforcement Learning for Distribution Network Operation and Electricity Market

    Conventional distribution network and electricity market operation has become challenging under complicated network operating conditions, due to emerging distributed electricity generation, coupled energy networks, and new market behaviours. These challenges include increasing dynamics and stochasticity, and vast problem dimensions in terms of control points, measurements, and multiple objectives. Previously, the optimization models were often formulated as conventional programming problems and then solved mathematically, which can now become highly time-consuming or sometimes infeasible. On the other hand, with the recent advancement of artificial intelligence technologies, deep reinforcement learning (DRL) algorithms have demonstrated excellent performance in various control and optimization fields. This indicates a potential alternative for addressing these challenges. In this thesis, DRL-based solutions for distribution network operation and the electricity market are investigated and proposed. Firstly, a DRL-based methodology is proposed for Volt/Var Control (VVC) optimization in a large distribution network, to effectively control bus voltages and reduce network power losses. Further, this thesis proposes a multi-agent DRL (MADRL)-based methodology under a complex regional coordinated VVC framework, which can address spatial and temporal uncertainties. The DRL algorithm is also improved to adapt to the applications. Then, an integrated energy and heating systems (IEHS) optimization problem is solved by a MADRL-based methodology, where conventionally this could only be solved by simplifications or iterations. Beyond the applications in distribution network operation, a new electricity market service pricing method based on a DRL algorithm is also proposed.
This DRL-based method has demonstrated good performance in a virtual storage rental service pricing problem, a bi-level problem that could hardly be solved directly because its lower-level problem is non-convex and non-continuous. The proposed methods have performed well in comprehensive case studies, and numerical simulation results have validated their effectiveness and high efficiency under different sophisticated operating conditions, their solution robustness against temporal and spatial uncertainties, and their optimality under large problem dimensions.

    The Green Choice: Learning and Influencing Human Decisions on Shared Roads

    Autonomous vehicles have the potential to increase the capacity of roads via platooning, even when human drivers and autonomous vehicles share roads. However, when users of a road network choose their routes selfishly, the resulting traffic configuration may be very inefficient. Because of this, we consider how to influence human decisions so as to decrease congestion on these roads. We consider a network of parallel roads with two modes of transportation: (i) human drivers, who will choose the quickest route available to them, and (ii) a ride-hailing service, which provides users with an array of autonomous vehicle ride options, each with a different price. In this work, we seek to design these prices so that when autonomous service users choose from these options and human drivers selfishly choose their resulting routes, road usage is maximized and transit delay is minimized. To do so, we formalize a model of how autonomous service users make choices between routes with different price/delay values. Developing a preference-based algorithm to learn the preferences of the users, and using a vehicle flow model related to the Fundamental Diagram of Traffic, we formulate a planning optimization to maximize a social objective and demonstrate the benefit of the proposed routing and learning scheme. Comment: Submitted to CDC 201