1,268 research outputs found

Comparing policy gradient and value function based reinforcement learning methods in simulated electrical power trade

Authors: Burt Graeme; Galloway Stuart; Lincoln Richard; Stephen Bruce
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/02/2012

In electrical power engineering, reinforcement learning algorithms can be used to model the strategies of electricity market participants. However, traditional value-function-based reinforcement learning algorithms suffer from convergence issues when used with value function approximators. Function approximation is required in this domain to capture the characteristics of the complex, continuous, multivariate problem space. The contribution of this paper is a comparison of policy gradient reinforcement learning methods, using artificial neural networks for policy function approximation, with traditional value-function-based methods in simulations of electricity trade. The methods are compared using an AC optimal power flow based power exchange auction market model and a reference electric power system model.
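The distinction the abstract draws between policy gradient and value-function-based learning can be illustrated with a toy sketch. This is not the paper's method: the paper uses neural networks and a power market simulation, whereas the example below assumes a hypothetical two-action problem with fixed payoffs (`REWARDS`) and made-up step sizes. The policy gradient learner adjusts action preferences directly, while the value-based learner estimates a value per action and acts greedily on those estimates.

```python
import math
import random

# Hypothetical fixed payoff per action; illustrative only, not from the paper.
REWARDS = [0.2, 0.8]


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient(steps=500, lr=0.1, seed=0):
    """REINFORCE-style update of softmax preferences with a running-mean baseline."""
    rng = random.Random(seed)
    prefs = [0.0, 0.0]
    baseline = 0.0
    for t in range(1, steps + 1):
        probs = softmax(prefs)
        action = rng.choices([0, 1], weights=probs)[0]
        reward = REWARDS[action]
        advantage = reward - baseline
        # Gradient of log pi(action) w.r.t. preference k is (1[k == action] - pi(k)).
        for k in range(2):
            indicator = 1.0 if k == action else 0.0
            prefs[k] += lr * advantage * (indicator - probs[k])
        # Update the baseline to the mean reward observed so far.
        baseline += (reward - baseline) / t
    return softmax(prefs)


def value_based(steps=500, alpha=0.1, epsilon=0.1, seed=0):
    """Epsilon-greedy learner that tracks one value estimate per action."""
    rng = random.Random(seed)
    q = [0.0, 0.0]
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)  # explore
        else:
            action = 0 if q[0] >= q[1] else 1  # exploit current estimates
        q[action] += alpha * (REWARDS[action] - q[action])
    return q
```

Calling `policy_gradient()` returns the learned action probabilities, and `value_based()` returns the learned value estimates; in both cases the higher-paying action should end up preferred.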

Crossref

University of Strathclyde Institutional Repository

Graph Attention Multi-Agent Fleet Autonomy for Advanced Air Mobility

Authors: Choi Heeyoul; Fernando Malintha; Senanayake Ransalu; Swany Martin
Publication date: 16/02/2023

Autonomous mobility is emerging as a new mode of urban transportation for moving cargo and passengers. However, such fleet coordination schemes face significant challenges in scaling to accommodate fast-growing fleet sizes that vary in their operational range, capacity, and communication capabilities. We introduce the concept of partially observable advanced air mobility games to coordinate a fleet of aerial vehicle agents, accounting for the heterogeneity and self-interest inherent to commercial mobility fleets. We propose a novel heterogeneous graph attention-based encoder-decoder (HetGAT Enc-Dec) neural network to construct a generalizable stochastic policy stemming from the inter- and intra-agent relations within the mobility system. We train our policy by leveraging deep multi-agent reinforcement learning, allowing decentralized decision-making for the agents using their local observations. Through extensive experimentation, we show that the fleets operating under the HetGAT Enc-Dec policy outperform other state-of-the-art graph neural network-based policies by achieving the highest fleet reward and fulfillment ratios in an on-demand mobility network.
Comment: 12 pages, 12 figures, 3 tables

arXiv.org e-Print Archive

Deep reinforcement learning for investing: A quantamental approach for portfolio management

Author: Maltêz Fábio Alexandre Afonso
Publication date: 05/12/2022

The world of investments affects us all. The way surplus capital is allocated, by ourselves or by investment funds, can determine how we eat, innovate and even educate children. Portfolio management is an integral, albeit challenging, process in this task (Leković, 2021). It entails managing a basket of financial assets to maximize returns per unit of risk, taking into account the complex causal relations among micro- and macroeconomic, societal, political and environmental factors. This study aims to evaluate how a machine learning technique called deep reinforcement learning (DRL) can improve portfolio management. A second goal is to understand whether financial fundamental features (i.e., revenue, debt, assets, cash flow) improve model performance. After a literature review to establish the current state of the art, the CRISP-DM method was followed: 1) Business understanding; 2) Data understanding; 3) Data preparation: two datasets were prepared, one with market-only features (i.e., close price, daily volume traded) and another with market plus fundamental features; 4) Modeling: Advantage Actor-Critic (A2C), Deep Deterministic Policy Gradient (DDPG) and Twin-delayed DDPG (TD3) DRL models were optimized on both datasets; 5) Evaluation. On average, the models had the same Sharpe ratio on both datasets: an average Sharpe ratio of 0.35 versus 0.30 for the baseline, on the test set. DRL models outperformed traditional portfolio optimization techniques, and financial fundamental features improved model robustness and consistency. These findings support the use of both DRL models and quantamental investment strategies in portfolio management.
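The abstract above reports results as Sharpe ratios, that is, returns per unit of risk. As a minimal sketch of that metric, assuming the common definition (mean excess return divided by the sample standard deviation of excess returns), the function below computes it from a list of periodic returns. The function name, the `risk_free_rate` and `periods_per_year` parameters, and the sample returns in the test are illustrative assumptions; the thesis's exact calculation is not given in the abstract.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=1):
    """Mean excess return divided by its sample standard deviation.

    `risk_free_rate` is per period; `periods_per_year` scales the result
    by sqrt(periods) for annualization (1 leaves it unscaled).
    Requires at least two returns.
    """
    excess = [r - risk_free_rate for r in returns]
    volatility = statistics.stdev(excess)
    if volatility == 0:
        raise ValueError("Sharpe ratio is undefined for zero volatility")
    return statistics.mean(excess) / volatility * math.sqrt(periods_per_year)
```

For daily returns, passing `periods_per_year=252` is a common convention for annualizing, though the appropriate value depends on the data.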

Repositório Institucional do ISCTE-IUL

Machine Learning for Ad Publishers in Real Time Bidding

Author: Refaei Afshar Reza
Publication venue: Eindhoven University of Technology
Publication date: 17/03/2022

Pure OAI Repository