
    Practical Deep Reinforcement Learning Approach for Stock Trading

    Stock trading strategy plays a crucial role in investment companies. However, it is challenging to obtain an optimal strategy in the complex and dynamic stock market. We explore the potential of deep reinforcement learning to optimize stock trading strategy and thus maximize investment return. Thirty stocks are selected as our trading stocks, and their daily prices are used as the training and trading market environment. We train a deep reinforcement learning agent and obtain an adaptive trading strategy. The agent's performance is evaluated and compared with the Dow Jones Industrial Average and the traditional min-variance portfolio allocation strategy. The proposed deep reinforcement learning approach is shown to outperform the two baselines in terms of both the Sharpe ratio and cumulative returns.
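
    As a minimal sketch of the two evaluation criteria named above (the return series here are synthetic placeholders, not the paper's data):

```python
import numpy as np

def sharpe_ratio(daily_returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from a series of daily returns."""
    excess = np.asarray(daily_returns) - risk_free_rate / periods_per_year
    return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

def cumulative_return(daily_returns):
    """Total compounded return over the evaluation window."""
    return np.prod(1.0 + np.asarray(daily_returns)) - 1.0

# Hypothetical return series for the agent and one baseline.
rng = np.random.default_rng(0)
agent_returns = rng.normal(0.0008, 0.01, 1000)
djia_returns = rng.normal(0.0003, 0.01, 1000)

print(sharpe_ratio(agent_returns), sharpe_ratio(djia_returns))
print(cumulative_return(agent_returns), cumulative_return(djia_returns))
```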

    Taylor-based pseudo-metrics for random process fitting in dynamic programming

    Stochastic optimization is the study of optimizing over x the expectation E C(x,A) of a cost C(x,A), where A is a random variable. Typically C(x,a) is the cost incurred by a strategy x that faces the realization a of the random process. Many stochastic optimization problems involve multiple time steps, leading to computationally difficult problems; efficient solutions exist, for example through Bellman's optimality principle, but only provided that the random process is represented by a well-structured process, typically an inhomogeneous Markovian process (ideally with a finite number of states) or a scenario tree. The problem is that, in the general case, A is far from being Markovian. So, we look for a process A' "looking like A" but belonging to a given family of tractable processes which does not at all contain A. The problem is then the numerical evaluation of how much "A' looks like A". A classical method is the use of the Kantorovitch-Rubinstein distance or other transportation metrics \cite{Pflug}, justified by straightforward bounds on the deviation |E C(x,A) - E C(x,A')| through the Kantorovitch-Rubinstein distance and uniform Lipschitz conditions. These approaches might be better than the use of high-level statistics \cite{Keefer}. We propose other (pseudo-)distances, based upon refined inequalities, guaranteeing a good choice of A'. Moreover, as in many cases we in fact prefer optimization with risk management, e.g. optimization of E C(x, noise(A)) where noise(.) is a random noise modeling the lack of knowledge of the precise random variables, we propose distances which can deal with a user-defined noise. Tests on artificial data sets with realistic loss functions show the relevance of the method.
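
    For intuition, the classical bound reads |E C(x,A) - E C(x,A')| <= L * W1(A,A') for a cost that is L-Lipschitz in its random argument, where W1 is the Kantorovitch-Rubinstein (1-Wasserstein) distance. A minimal sketch of evaluating that bound on empirical samples (the distributions and Lipschitz constant are hypothetical, and real applications compare full processes rather than one-dimensional marginals):

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Samples from the true process A and a candidate approximation A'.
rng = np.random.default_rng(1)
samples_A = rng.lognormal(mean=0.0, sigma=0.5, size=5000)
samples_A_prime = rng.lognormal(mean=0.05, sigma=0.45, size=5000)

# Kantorovitch-Rubinstein (1-Wasserstein) distance between empirical laws.
w1 = wasserstein_distance(samples_A, samples_A_prime)

# For an L-Lipschitz cost a -> C(x, a), this bounds the expectation gap:
# |E C(x,A) - E C(x,A')| <= L * W1(A, A').
L = 2.0  # assumed Lipschitz constant of the cost in its random argument
print("W1 =", w1, " bound on expectation gap:", L * w1)
```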

    Improving Wealth Management Strategies Through the Use of Reinforcement Learning Based Algorithms. A Study on the Romanian Stock Market

    In the context of the growing pace of technological development and of the transition to the knowledge-based economy, wealth management strategies have become subject to the application of new ideas. One field of research with growing influence in the scientific community is that of reinforcement learning-based algorithms. This trend is also manifesting in economics, where such algorithms have found use in stock trading. These algorithms have been tested by researchers over the last decade because, by applying these new concepts, fund managers could gain an advantage over classic management techniques. The present paper tests the effects of applying these algorithms to the Romanian market, taking into account that it is a relatively new market, and compares the results to those obtained by applying classic optimization techniques based on passive wealth management concepts. We chose the Romanian stock market due to its recent evolution in the FTSE Russell ratings and the fact that the country is becoming an Eastern European development hub in the IT sector; these facts suggest that the Romanian stock market may become even more significant in the future at a local, and perhaps even regional, level.
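
    As a hedged sketch of the passive baseline such a comparison needs, an equal-weight buy-and-hold equity curve might look like this (the price data and parameters are hypothetical, not the paper's):

```python
import numpy as np

def buy_and_hold_equity(prices, weights=None):
    """Equity curve of a passive buy-and-hold portfolio.

    prices: (T, n_assets) array of daily closes; weights default to equal.
    The portfolio is bought once at the first close and never rebalanced.
    """
    prices = np.asarray(prices, dtype=float)
    n = prices.shape[1]
    weights = np.full(n, 1.0 / n) if weights is None else np.asarray(weights)
    shares = weights / prices[0]   # units bought at the first close
    return prices @ shares         # portfolio value, normalized to 1.0 at t=0

# Hypothetical closes for three Romanian-listed tickers over 500 days.
rng = np.random.default_rng(2)
prices = 100 * np.exp(np.cumsum(rng.normal(0.0003, 0.015, (500, 3)), axis=0))
print(buy_and_hold_equity(prices)[-1])  # terminal value of the passive strategy
```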

    Deep Reinforcement Learning for Gas Trading

    Deep Reinforcement Learning (Deep RL) has been explored for a number of applications in finance and stock trading. In this paper, we present a practical implementation of Deep RL for trading natural gas futures contracts. The Sharpe ratio obtained exceeds benchmarks given by trend-following and mean-reversion strategies, as well as results reported in the literature. Moreover, we propose a simple but effective ensemble learning scheme for trading, which significantly improves performance through enhanced model stability and robustness, as well as lower turnover and hence lower transaction costs. We discuss the resulting Deep RL strategy in terms of model explainability, trading frequency and risk measures.
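
    The abstract does not specify the ensemble scheme; one plausible minimal sketch averages position signals from several independently trained models and applies a deadband, which is one way to obtain the stability and turnover reduction described (the deadband value is an assumption):

```python
import numpy as np

def ensemble_positions(model_signals, deadband=0.2):
    """Average per-model position signals in [-1, 1]; zero out weak consensus.

    Averaging smooths single-model noise; the deadband suppresses frequent
    sign flips, lowering turnover and hence transaction costs.
    """
    avg = np.asarray(model_signals).mean(axis=0)
    avg[np.abs(avg) < deadband] = 0.0
    return np.clip(avg, -1.0, 1.0)

# Hypothetical daily signals from five independently trained Deep RL models.
rng = np.random.default_rng(3)
signals = np.tanh(rng.normal(0, 1, (5, 250)))
positions = ensemble_positions(signals)
turnover = np.abs(np.diff(positions)).sum()
print(positions[:5], "turnover:", turnover)
```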

    A Multi-agent Q-learning Framework for Optimizing Stock Trading Systems

    This paper presents a reinforcement learning framework for stock trading systems. Trading system parameters are optimized by a Q-learning algorithm, and neural networks are adopted for value approximation. In this framework, cooperative multiple agents are used to efficiently integrate global trend prediction and local trading strategy to obtain better trading performance. Agents communicate with each other, sharing training episodes and learned policies, while keeping the overall scheme of conventional Q-learning. Experimental results on KOSPI 200 show that a trading system based on the proposed framework outperforms the market average and makes appreciable profits. Furthermore, in view of risk management, the system is superior to a system trained by supervised learning.
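
    As a minimal sketch of the kind of Q-learning update with function approximation the framework relies on (a linear approximator stands in for the paper's neural networks; all names and constants are assumptions, not the authors' implementation):

```python
import numpy as np

class LinearQ:
    """Q-learning with linear value approximation (illustrative sketch)."""

    def __init__(self, n_features, n_actions, lr=0.01, gamma=0.99):
        self.w = np.zeros((n_actions, n_features))
        self.lr, self.gamma = lr, gamma

    def q(self, s):
        return self.w @ s                    # Q(s, .) for all actions

    def update(self, s, a, r, s_next, done):
        target = r if done else r + self.gamma * self.q(s_next).max()
        td_error = target - self.q(s)[a]
        self.w[a] += self.lr * td_error * s  # semi-gradient TD(0) step

# Cooperative agents can share experience by replaying each other's episodes:
# a global-trend agent and a local-trading agent both call update() on a
# common list of (s, a, r, s_next, done) transitions (hypothetical scheme).
agent = LinearQ(n_features=4, n_actions=3)
s = np.array([0.1, -0.2, 0.05, 1.0])
s_next = np.array([0.0, -0.1, 0.02, 1.0])
agent.update(s, a=1, r=0.5, s_next=s_next, done=False)
```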

    Deep reinforcement learning and signal processing applications for investment strategies

    In this project we examine the fundamentals of finance, Deep Reinforcement Learning and signal processing in order to develop an investment strategy for the stock market. The study builds on work from another university in which an ensemble technique is designed using three different DRL algorithms (A2C, DDPG and PPO). In our case, the objective is to improve the very promising results obtained by the original author. Three improvements are proposed: using the Differential Sharpe Ratio as the agents' reward function, expanding the database to one containing a broader universe of financial assets, and carrying out a combination strategy over the three algorithms using signal processing techniques. After exploring the technical difficulties and proposing formal solutions, it is demonstrated that in all three cases the performance of the original strategy is improved, and the results are compared with the previous ones throughout.
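
    The first proposed improvement replaces the reward with the Differential Sharpe Ratio. Below is a minimal sketch of that reward, following Moody and Saffell's recursive formulation; the adaptation rate eta and the numerical guard are assumptions, not values from the thesis:

```python
import numpy as np

class DifferentialSharpe:
    """Differential Sharpe ratio as a per-step RL reward (illustrative)."""

    def __init__(self, eta=0.01):
        self.eta = eta   # adaptation rate of the moving moments
        self.A = 0.0     # exponential moving average of returns
        self.B = 0.0     # exponential moving average of squared returns

    def step(self, r):
        dA, dB = r - self.A, r * r - self.B
        var = self.B - self.A ** 2
        # D_t = (B dA - 0.5 A dB) / (B - A^2)^{3/2}, guarded near zero variance.
        d = (self.B * dA - 0.5 * self.A * dB) / var ** 1.5 if var > 1e-12 else 0.0
        self.A += self.eta * dA
        self.B += self.eta * dB
        return d

dsr = DifferentialSharpe()
rewards = [dsr.step(r) for r in np.random.default_rng(5).normal(0.001, 0.01, 10)]
print(rewards)
```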

    Modelling crypto markets by multi-agent reinforcement learning

    Building on previous foundational work (Lussange et al. 2020), this study introduces a multi-agent reinforcement learning (MARL) model simulating crypto markets, calibrated to Binance daily closing prices of 153 cryptocurrencies that were continuously traded between 2018 and 2022. Unlike previous agent-based models (ABM) or multi-agent systems (MAS), which relied on zero-intelligence agents or single-autonomous-agent methodologies, our approach endows agents with reinforcement learning (RL) techniques in order to model crypto markets. This integration is designed to emulate, with a bottom-up approach to complexity inference, both individual and collective agents, ensuring robustness in the recently volatile conditions of such markets and during the COVID-19 era. A key feature of our model is that its autonomous agents perform asset price valuation based on two sources of information: the market prices themselves, and an approximation of the crypto assets' fundamental values beyond those market prices. Calibrating our MAS against real market data allows for an accurate emulation of crypto market microstructure and probing of key market behaviors, in both the bearish and bullish regimes of that particular time period.
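
    To illustrate the bottom-up price formation described above, here is a heavily simplified sketch in which agents mix a momentum signal (market prices) and a fundamentalist signal (approximated fundamental value); in the paper these weights are learned by RL, whereas here they are fixed at random, and all parameter values are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(4)
n_agents, T = 100, 500
price, fundamental, last_price = 1.0, 1.0, 1.0
prices = []

# Each agent blends chartist (momentum) and fundamentalist information.
chart_w = rng.uniform(0, 1, n_agents)
fund_w = 1.0 - chart_w

for t in range(T):
    fundamental *= np.exp(rng.normal(0, 0.005))  # latent fundamental value
    momentum = np.log(price / last_price)
    misprice = np.log(fundamental / price)
    # Per-agent demand from its two information sources, plus idiosyncratic noise.
    demand = chart_w * momentum + fund_w * misprice + rng.normal(0, 0.01, n_agents)
    last_price = price
    price *= np.exp(0.1 * demand.mean())         # price impact of net demand
    prices.append(price)

print("final price:", prices[-1], "final fundamental:", fundamental)
```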