17,943 research outputs found
Deep reinforcement learning for portfolio management
In our paper, we apply a deep reinforcement learning approach to optimize
investment decisions in portfolio management. We introduce several innovations,
such as adding a shorting mechanism and designing an arbitrage mechanism, and
apply our model to optimize decisions for several randomly selected portfolios.
The experimental results show that our model is able to optimize investment
decisions and obtain excess returns in the stock market; the optimized agent
maintains the asset weights at fixed values throughout the trading periods and
trades at a very low transaction cost rate. In addition, we redesign the
formula for calculating portfolio asset weights in a continuous trading
process to allow leveraged trading, filling the theoretical gap in the
calculation of portfolio weights when shorting
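The paper's exact weight formula is not reproduced in the abstract, so the following is only a minimal sketch of one common convention for defining portfolio weights when short positions are allowed: dividing each signed position by the gross exposure, so weights can be negative while their absolute values sum to one.

```python
import numpy as np

def normalize_weights(raw_positions):
    """Map raw signed positions (negative = short) to portfolio weights.

    One common convention when shorting is allowed: divide each position
    by the gross exposure (sum of absolute positions), so that the
    weights' absolute values sum to 1 and short legs appear as negative
    entries.  This is an illustrative choice, not the paper's formula.
    """
    raw = np.asarray(raw_positions, dtype=float)
    gross = np.abs(raw).sum()
    if gross == 0.0:
        return np.zeros_like(raw)  # all-cash portfolio
    return raw / gross

# Example: long two assets, short one
w = normalize_weights([2.0, 1.0, -1.0])  # -> [0.5, 0.25, -0.25]
```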
A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem
Financial portfolio management is the process of constant redistribution of a
fund into different financial products. This paper presents a
financial-model-free Reinforcement Learning framework to provide a deep machine
learning solution to the portfolio management problem. The framework consists
of the Ensemble of Identical Independent Evaluators (EIIE) topology, a
Portfolio-Vector Memory (PVM), an Online Stochastic Batch Learning (OSBL)
scheme, and a fully exploiting and explicit reward function. This framework is
realized in three instances in this work with a Convolutional Neural Network
(CNN), a basic Recurrent Neural Network (RNN), and a Long Short-Term Memory
(LSTM). They are, along with a number of recently reviewed or published
portfolio-selection strategies, examined in three back-test experiments with a
trading period of 30 minutes in a cryptocurrency market. Cryptocurrencies are
electronic and decentralized alternatives to government-issued money, with
Bitcoin as the best-known example of a cryptocurrency. All three instances of
the framework monopolize the top three positions in all experiments,
outdistancing other compared trading algorithms. Even with a high commission
rate of 0.25% in the backtests, the framework is able to achieve at least
4-fold returns in 50 days.
Comment: 30 pages, 5 figures, submitting to JML
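The "fully exploiting and explicit reward function" can be illustrated with a one-period log-return reward, a standard choice in this line of work. The function below is a simplified sketch, not the paper's exact formulation: the true transaction-remainder factor depends on turnover, while here a flat commission is applied as a stand-in.

```python
import numpy as np

def log_return_reward(prev_weights, price_relatives, commission=0.0025):
    """One-period reward in the spirit of the framework: the log of the
    portfolio's growth over one trading period.

    price_relatives[i] = close_t / close_{t-1} for asset i.  The flat
    commission term is a simplification; the actual framework charges
    costs as a function of turnover between consecutive weight vectors.
    """
    w = np.asarray(prev_weights, dtype=float)
    y = np.asarray(price_relatives, dtype=float)
    growth = float(np.dot(w, y))          # portfolio growth factor
    return float(np.log(growth * (1.0 - commission)))
```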
A deep Q-learning portfolio management framework for the cryptocurrency market
Deep reinforcement learning is gaining popularity in many different fields. An interesting sector relates to the definition of dynamic decision-making systems. A possible example is dynamic portfolio optimization, where an agent has to continuously reallocate an amount of funds into a number of different financial assets with the final goal of maximizing return and minimizing risk. In this work, a novel deep Q-learning portfolio management framework is proposed. The framework is composed of two elements: a set of local agents that learn asset behaviours and a global agent that describes the global reward function. The framework is tested on a crypto portfolio composed of four cryptocurrencies. Based on our results, the deep reinforcement learning portfolio management framework has proven to be a promising approach for dynamic portfolio optimization
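The local-agents/global-agent split can be sketched in miniature. The abstract gives no implementation details, so everything below is a hypothetical toy: tabular per-asset agents standing in for the paper's deep Q-networks, and a simple return-minus-risk scalar standing in for its global reward.

```python
import numpy as np

# Hypothetical shapes: each local agent scores its own asset's actions
# (e.g. sell/hold/buy); the global agent turns the joint choice into a
# single portfolio-level reward.  All names are illustrative.

ACTIONS = ("sell", "hold", "buy")

def local_greedy_actions(local_q_tables, state):
    """Each local agent picks the highest-Q action for its asset."""
    return [int(np.argmax(q[state])) for q in local_q_tables]

def global_reward(asset_returns, actions, risk_penalty=0.1):
    """Toy global reward: realized return of the chosen positions minus a
    variance penalty -- one way to encode 'maximize return, minimize
    risk' as one scalar."""
    positions = np.array([a - 1 for a in actions])  # sell=-1, hold=0, buy=+1
    r = np.asarray(asset_returns, dtype=float)
    return float(positions @ r - risk_penalty * np.var(positions * r))

# Usage: two assets, one state
local_q_tables = [np.array([[0.0, 1.0, 2.0]]), np.array([[3.0, 1.0, 0.0]])]
acts = local_greedy_actions(local_q_tables, 0)          # -> [2, 0]
rew = global_reward([0.1, -0.05], acts, risk_penalty=0.0)
```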
G-Learner and GIRL: Goal Based Wealth Management with Reinforcement Learning
We present a reinforcement learning approach to goal based wealth management
problems such as optimization of retirement plans or target date funds. In
such problems, an investor seeks to achieve a financial goal by making periodic
investments in the portfolio while being employed, and periodically draws from
the account when in retirement, in addition to the ability to re-balance the
portfolio by selling and buying different assets (e.g. stocks). Instead of
relying on a utility of consumption, we present G-Learner: a reinforcement
learning algorithm that operates with explicitly defined one-step rewards, does
not assume a data generation process, and is suitable for noisy data. Our
approach is based on G-learning - a probabilistic extension of the Q-learning
method of reinforcement learning.
In this paper, we demonstrate how G-learning, when applied to a quadratic
reward and Gaussian reference policy, gives an entropy-regularized Linear
Quadratic Regulator (LQR). This critical insight provides a novel and
computationally tractable tool for wealth management tasks which scales to high
dimensional portfolios. In addition to the solution of the direct problem of
G-learning, we also present a new algorithm, GIRL, that extends our goal-based
G-learning approach to the setting of Inverse Reinforcement Learning (IRL)
where rewards collected by the agent are not observed, and should instead be
inferred. We demonstrate that GIRL can successfully learn the reward parameters
of a G-Learner agent and thus imitate its behavior. Finally, we discuss
potential applications of the G-Learner and GIRL algorithms for wealth
management and robo-advising
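The recursion underlying this approach can be written down compactly. Assuming the standard soft (entropy-regularized) Bellman form from the G-learning literature, with prior policy $\pi_0$, inverse temperature $\beta$, and discount $\gamma$:

```latex
G(x_t, a_t) = r(x_t, a_t) + \gamma \, \mathbb{E}_{x_{t+1}}\!\left[ F(x_{t+1}) \right],
\qquad
F(x) = \frac{1}{\beta} \log \sum_{a} \pi_0(a \mid x)\, e^{\beta G(x, a)}
```

As $\beta \to \infty$, $F(x) \to \max_a G(x,a)$ and standard Q-learning is recovered; with a quadratic reward and a Gaussian $\pi_0$ the log-sum-exp stays Gaussian-tractable, which is the LQR structure the abstract highlights.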
Model-Free Reinforcement Learning for Financial Portfolios: A Brief Survey
Financial portfolio management is one of the problems that are most
frequently encountered in the investment industry. Nevertheless, it is not
widely recognized that both the Kelly Criterion and Risk Parity collapse into
Mean-Variance under some conditions, which implies that a universal solution
to the
portfolio optimization problem could potentially exist. In fact, the process of
sequential computation of optimal component weights that maximize the
portfolio's expected return subject to a certain risk budget can be
reformulated as a discrete-time Markov Decision Process (MDP) and hence as a
stochastic optimal control problem, where the system being controlled is a
portfolio
consisting of multiple investment components, and the control is its component
weights. Consequently, the problem could be solved using model-free
Reinforcement Learning (RL) without knowing specific component dynamics. By
examining existing methods of both value-based and policy-based model-free RL
for the portfolio optimization problem, we identify some of the key unresolved
questions and difficulties that today's portfolio managers face in applying
model-free RL to their investment portfolios
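The MDP framing described above can be sketched concretely. The names and the simple cost model below are our own illustrative assumptions (the survey does not fix a particular reward): the state carries drifted weights, the action is the new weight vector, and the reward is the one-period return net of a turnover-proportional cost.

```python
import numpy as np

def step(prev_weights, new_weights, price_relatives, cost_rate=0.0025):
    """One MDP transition for a portfolio: returns (reward, drifted_weights).

    - action: new_weights (the component weights, as in the survey)
    - reward: simple one-period return minus a turnover-proportional
      transaction cost (an illustrative stand-in for a risk budget)
    - next-state ingredient: weights after prices drift them
    """
    w_old = np.asarray(prev_weights, dtype=float)
    w_new = np.asarray(new_weights, dtype=float)
    y = np.asarray(price_relatives, dtype=float)   # y[i] = p_t / p_{t-1}
    turnover = float(np.abs(w_new - w_old).sum())
    reward = float(w_new @ y - 1.0 - cost_rate * turnover)
    drifted = w_new * y
    return reward, drifted / drifted.sum()

# Usage: hold an equal-weight portfolio for one period
r, w_next = step([0.5, 0.5], [0.5, 0.5], [1.1, 0.9])
```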
Accelerated Method for Stochastic Composition Optimization with Nonsmooth Regularization
Stochastic composition optimization has drawn much attention recently and has
been successful in many emerging applications of machine learning, statistical
analysis, and reinforcement learning. In this paper, we focus on the
composition problem with a nonsmooth regularization penalty. Previous works
either have a slow convergence rate or do not provide a complete convergence
analysis for the general problem. In this paper, we tackle these two issues by
proposing a new stochastic composition optimization method for the composition
problem with a nonsmooth regularization penalty. In our method, we apply a
variance reduction technique to accelerate convergence. To the best of our
knowledge, our method admits the fastest convergence rate for stochastic
composition optimization: for the strongly convex composition problem, our
algorithm is proved to achieve linear convergence; for the general composition
problem, our algorithm significantly improves on the state-of-the-art
convergence rate. Finally, we apply our proposed algorithm to portfolio
management and policy evaluation in reinforcement learning. Experimental
results verify our theoretical analysis.
Comment: AAAI 201
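To make the problem class concrete: the objective has the nested form min_x f(E[g(x)]) plus a nonsmooth penalty. Below is a sketch of plain stochastic compositional gradient descent (SCGD), the basic baseline such accelerated, variance-reduced methods improve on; the variance reduction and the proximal handling of the nonsmooth term from the paper are deliberately omitted.

```python
import numpy as np

def scgd(grad_f, g, jac_g, x0, steps=200, alpha=0.05, beta=0.5):
    """Basic SCGD for min_x f(g(x)) with a nested (compositional) objective.

    Keeps a running estimate y of the inner value g(x) (in the stochastic
    setting, of E[g(x)]), then takes a chain-rule step on the outer
    function.  No variance reduction, no proximal step -- a baseline
    sketch only.
    """
    x = np.asarray(x0, dtype=float)
    y = g(x)                                   # initial inner estimate
    for _ in range(steps):
        y = (1.0 - beta) * y + beta * g(x)     # track the inner expectation
        x = x - alpha * jac_g(x).T @ grad_f(y) # chain-rule outer update
    return x

# Usage: f(u) = ||u||^2, g(x) = x, so the minimizer is x = 0
x = scgd(lambda u: 2.0 * u, lambda x: x, lambda x: np.eye(2), [1.0, -1.0])
```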
Deep reinforcement learning for investing: A quantamental approach for portfolio management
The world of investments affects us all. The way surplus capital is allocated, by ourselves or by investment funds, can determine how we eat, innovate and even educate our kids. Portfolio management is an integral albeit challenging process in this task (Leković, 2021). It entails managing a basket of financial assets to maximize the returns per unit of risk, considering all the complex causal relations among micro- and macroeconomic, societal, political and environmental factors.
This study aims to evaluate how a machine learning technique called deep reinforcement learning (DRL) can improve the activity of portfolio management. It also has a second goal of understanding whether financial fundamental features (i.e., revenue, debt, assets, cash flow) improve model performance. After conducting a literature review to establish the current state of the art, the CRISP-DM method was followed: 1) Business understanding; 2) Data understanding; 3) Data preparation – two datasets were prepared, one with market-only features (i.e., close price, daily volume traded) and another with market plus fundamental features; 4) Modeling – Advantage Actor-Critic (A2C), Deep Deterministic Policy Gradient (DDPG) and Twin-delayed DDPG (TD3) DRL models were optimized on both datasets; 5) Evaluation.
On average, the models had the same Sharpe ratio performance on both datasets: an average Sharpe ratio of 0.35 vs. 0.30 for the baseline on the test set. The DRL models outperformed traditional portfolio optimization techniques, and the financial fundamental features improved model robustness and consistency, supporting the use of both DRL models and quantamental investment strategies in portfolio management
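The evaluation metric above, returns per unit of risk, is the Sharpe ratio. A minimal implementation (annualization convention of 252 trading days assumed by us, not stated in the abstract):

```python
import numpy as np

def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio: mean excess return per unit of return
    volatility -- the metric used above to compare the DRL models
    (avg. 0.35) with the baseline (0.30)."""
    r = np.asarray(returns, dtype=float) - risk_free
    sd = r.std(ddof=1)
    if sd == 0.0:
        return 0.0  # undefined for constant returns; report 0 by convention
    return float(r.mean() / sd * np.sqrt(periods_per_year))
```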
MetaTrader: A Reinforcement Learning Approach Integrating Diverse Policies for Portfolio Optimization
Portfolio management is a fundamental problem in finance. It involves
periodic reallocations of assets to maximize the expected returns within an
appropriate level of risk exposure. Deep reinforcement learning (RL) has been
considered a promising approach to solving this problem owing to its strong
capability in sequential decision making. However, due to the non-stationary
nature of financial markets, applying RL techniques to portfolio optimization
remains a challenging problem. Extracting trading knowledge from various expert
strategies could be helpful for agents to accommodate the changing markets. In
this paper, we propose MetaTrader, a novel two-stage RL-based approach for
portfolio management, which learns to integrate diverse trading policies to
adapt to various market conditions. In the first stage, MetaTrader incorporates
an imitation learning objective into the reinforcement learning framework.
Through imitating different expert demonstrations, MetaTrader acquires a set of
trading policies with great diversity. In the second stage, MetaTrader learns a
meta-policy to recognize the market conditions and decide on the most
appropriate
learned policy to follow. We evaluate the proposed approach on three real-world
index datasets and compare it to state-of-the-art baselines. The empirical
results demonstrate that MetaTrader significantly outperforms those baselines
in balancing profits and risks. Furthermore, thorough ablation studies validate
the effectiveness of the components in the proposed approach
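The second stage can be sketched as policy selection: given market features, a meta-policy scores each learned policy and follows the best one. The function and the stub scorer below are our own illustrative stand-ins; the real meta-policy is itself trained with deep RL.

```python
import numpy as np

def meta_policy_choose(meta_q, market_features):
    """Pick the learned policy with the highest estimated value for the
    current market condition.  meta_q is any callable mapping features to
    a vector of per-policy scores (here a stub; in MetaTrader, a trained
    network)."""
    scores = np.asarray(meta_q(market_features), dtype=float)
    return int(np.argmax(scores))

# Usage with a hypothetical scorer: prefer policy 1 when volatility is high
choose = lambda f: [1.0, 2.0] if f["volatility"] > 0.3 else [2.0, 1.0]
idx = meta_policy_choose(choose, {"volatility": 0.5})  # -> 1
```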
Application of Deep Q-Network in Portfolio Management
Machine Learning algorithms and Neural Networks are widely applied to many
different areas such as stock market prediction, face recognition and
population analysis. This paper will introduce a strategy based on the classic
Deep Reinforcement Learning algorithm, Deep Q-Network, for portfolio management
in the stock market. The DQN is a deep neural network optimized by
Q-Learning. To make the DQN adapt to the financial market, we first
discretize the action space, which is defined as the portfolio weights in
different assets, so that portfolio management becomes a problem that a Deep
Q-Network can solve. Next, we combine a Convolutional Neural Network with a
dueling Q-net to enhance the recognition ability of the algorithm.
Experimentally, we chose five weakly correlated American stocks to test the
model. The results demonstrate that the DQN-based strategy outperforms the
ten other traditional strategies: the profit of the DQN algorithm is 30%
higher than the profit of the other strategies. Moreover, the Sharpe ratio
combined with the Maximum Drawdown demonstrates that the risk of the policy
made with DQN is the lowest
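The dueling Q-net mentioned above combines a state-value head and an advantage head into Q-values. The aggregation rule below is the standard one from the dueling-DQN literature; the network producing the two heads is omitted.

```python
import numpy as np

def dueling_q_values(value, advantages):
    """Combine the dueling network's two heads into Q-values:

        Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)

    Subtracting the mean advantage keeps the V/A decomposition
    identifiable, as in the standard dueling-DQN formulation."""
    a = np.asarray(advantages, dtype=float)
    return float(value) + a - a.mean()

# Usage: one state value, three discretized portfolio-weight actions
q = dueling_q_values(1.0, [0.0, 2.0, 4.0])  # -> [-1.0, 1.0, 3.0]
```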
Agent Inspired Trading Using Recurrent Reinforcement Learning and LSTM Neural Networks
With breakthroughs in computational power and deep neural networks, many
areas that we could not previously explore with the various techniques
rigorously researched in the past have become feasible. In this paper, we
walk through possible concepts for achieving robo-like trading or advising.
To accomplish a level of performance and generality similar to that of a
human trader, our agents learn for themselves to create successful strategies
that lead to human-level long-term rewards. The learning model is implemented
with Long Short-Term Memory (LSTM) recurrent structures, with Reinforcement
Learning or Evolution Strategies acting as agents. The robustness and
feasibility of the system are verified on GBPUSD trading