11,064 research outputs found

    Reinforcement Learning Applied to Trading Systems: A Survey

    Full text link
    Financial tasks such as trading on market exchanges are challenging and have long attracted researchers. The recent achievements and consequent notoriety of Reinforcement Learning (RL) have also increased its adoption for trading tasks. RL rests on a framework of well-established formal concepts, which adds to its appeal for learning profitable trading strategies. However, careless use of RL in the financial domain can lead new researchers to ignore established standards or to overlook relevant conceptual guidelines. In this work, we draw on the seminal RL fundamentals, concepts, and recommendations to perform a unified, theoretically grounded examination and comparison of previous research that can serve as a structuring guide for the field of study. From a large volume of available studies, twenty-nine articles were selected and reviewed under a classification built around RL's most common formulations and design patterns. This classification allowed precise inspection of the most relevant aspects regarding data input, preprocessing, state and action composition, adopted RL techniques, evaluation setups, and overall results. Organizing the analysis around fundamental RL concepts allowed a clear identification of current system design best practices, gaps that require further investigation, and promising research opportunities. Finally, this review aims to promote the development of the field by encouraging adherence to standards and helping researchers stay on the firm ground of RL constructs.
    Comment: 38 pages
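
    The survey classifies prior work by how the trading problem is cast as an RL problem: data input and preprocessing, state and action composition, and reward design. As a minimal sketch of that common formulation (not taken from the survey itself; the class name, discrete short/flat/long action space, log-return reward, and proportional cost are illustrative assumptions), a toy single-asset environment might look like this:

```python
import numpy as np

class SimpleTradingEnv:
    """Toy single-asset trading environment illustrating the common
    state/action/reward decomposition surveyed above (illustrative only)."""

    def __init__(self, prices, window=10, cost=1e-4):
        self.prices = np.asarray(prices, dtype=float)
        self.window = window          # number of past log-returns in the state
        self.cost = cost              # proportional transaction cost
        self.reset()

    def reset(self):
        self.t = self.window
        self.position = 0             # -1 short, 0 flat, +1 long
        return self._state()

    def _state(self):
        # State: recent log-returns plus the current position.
        rets = np.diff(np.log(self.prices[self.t - self.window:self.t + 1]))
        return np.append(rets, self.position)

    def step(self, action):
        # Action in {-1, 0, +1}; reward = position * next log-return - costs.
        new_pos = int(action)
        log_ret = np.log(self.prices[self.t + 1] / self.prices[self.t])
        reward = new_pos * log_ret - self.cost * abs(new_pos - self.position)
        self.position = new_pos
        self.t += 1
        done = self.t >= len(self.prices) - 1
        return self._state(), reward, done

# Usage on a synthetic price path (placeholder data):
prices = 100 * np.exp(np.cumsum(np.random.default_rng(0).normal(0, 0.01, 500)))
env = SimpleTradingEnv(prices)
state, done, total = env.reset(), False, 0.0
while not done:
    state, reward, done = env.step(np.sign(state[-2]))  # naive momentum rule
    total += reward
print("cumulative reward:", round(total, 4))
```

    Reviewed systems differ mainly in how they enrich the state (technical indicators, order-book features), how they constrain the actions (discrete signals versus continuous position sizes), and how they shape the reward (raw profit versus risk-adjusted measures).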

    The Recurrent Reinforcement Learning Crypto Agent

    Get PDF
    We demonstrate a novel application of online transfer learning for a digital assets trading agent. The agent uses a powerful feature-space representation in the form of an echo state network, whose output is made available to a direct, recurrent reinforcement learning agent. The agent learns to trade the XBTUSD (Bitcoin versus US Dollar) perpetual swap derivatives contract on BitMEX on an intraday basis. By learning from the multiple sources of impact on the quadratic risk-adjusted utility that it seeks to maximise, the agent avoids excessive over-trading, captures a funding profit, and can predict the market's direction. Overall, our crypto agent realises a total return of 350%, net of transaction costs, over roughly five years, 71% of which is attributable to funding profit. The annualised information ratio that it achieves is 1.46.
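
    To make the architecture concrete, the sketch below shows the two pieces the abstract describes: a fixed echo state network reservoir that expands the input series into a rich feature space, and a direct recurrent RL position rule (in the Moody-Saffell style) trained against a risk-adjusted utility. All dimensions, the scalar return input, the spectral-radius value, and the mean-minus-variance surrogate utility are assumptions for illustration, not the paper's exact specification:

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Echo state network reservoir (fixed, untrained; illustrative sizes) ---
n_in, n_res = 1, 100
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W_res = rng.uniform(-0.5, 0.5, (n_res, n_res))
W_res *= 0.9 / max(abs(np.linalg.eigvals(W_res)))   # scale spectral radius below 1

def reservoir_states(returns):
    """Run a return series through the reservoir to obtain feature vectors."""
    h = np.zeros(n_res)
    states = []
    for r in returns:
        h = np.tanh(W_in @ np.array([r]) + W_res @ h)
        states.append(h.copy())
    return np.array(states)

# --- Direct recurrent RL position rule (illustrative) ---
def positions(states, w, u):
    """Recurrent position F_t = tanh(w . x_t + u * F_{t-1}) in [-1, 1]."""
    F, out = 0.0, []
    for x in states:
        F = np.tanh(w @ x + u * F)
        out.append(F)
    return np.array(out)

def utility(returns, F, cost=1e-4):
    """Trading returns net of proportional costs; the quadratic risk-adjusted
    utility is approximated here by mean minus half the variance."""
    F_prev = np.concatenate(([0.0], F[:-1]))
    r_trade = F_prev * returns - cost * np.abs(F - F_prev)
    return r_trade.mean() - 0.5 * r_trade.var()

# Example on synthetic returns (placeholder data, not XBTUSD):
rets = rng.normal(0, 0.001, 500)
X = reservoir_states(rets)
w, u = rng.normal(0, 0.1, n_res), 0.1
print("utility:", utility(rets, positions(X, w, u)))
```

    In the paper's transfer-learning setup the reservoir output is what gets handed to the recurrent agent; here the weights w and u would be updated online by gradient ascent on the utility, which the sketch leaves out.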

    Robust FOREX Trading with Deep Q Network (DQN)

    Get PDF
    Financial trading is one of the most attractive areas in finance. Developing trading systems is not an easy task: it requires extensive knowledge of quantitative analysis, finance, and computer programming, and a human expert inevitably brings their own biases into the system. A more effective alternative may be to develop the system using artificial intelligence. The aim of this study was to compare the performance of AI agents against the buy-and-hold strategy and against an expert trader. The test market consisted of 15 years of Forex data for two currency pairs (EURUSD, USDJPY) obtained from Dukascopy Bank SA, Switzerland. Both hypotheses were tested with a paired t-test at the 0.05 significance level. The findings showed that the AI can beat the buy-and-hold strategy with significant superiority in FOREX for both currency pairs (EURUSD, USDJPY), and that the AI can also significantly outperform the CTA (experienced trader) when trading EURUSD. However, the AI could not significantly outperform the CTA for USDJPY trading. Limitations, contributions, and directions for further research are discussed.
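
    The statistical comparison described here is a standard paired t-test over matched evaluation periods. A minimal sketch of that test follows; the return figures are placeholders rather than the study's data, and the variable names are illustrative:

```python
import numpy as np
from scipy import stats

# Hypothetical per-period returns (placeholders, not the study's data):
# paired observations of the AI agent and buy-and-hold over the same periods.
ai_returns = np.array([0.021, 0.015, -0.004, 0.030, 0.012, 0.018, 0.009, 0.025])
bh_returns = np.array([0.010, 0.008, -0.012, 0.015, 0.005, 0.007, 0.001, 0.011])

# Paired t-test at the 0.05 significance level, as described in the abstract.
# For a one-sided "AI > buy-and-hold" test, SciPy >= 1.6 accepts alternative="greater".
t_stat, p_value = stats.ttest_rel(ai_returns, bh_returns)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: the AI agent's mean return differs from buy-and-hold.")
else:
    print("Fail to reject H0 at the 0.05 level.")
```

    The same procedure, applied to agent-versus-CTA returns, yields the second hypothesis test reported above.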

    Deep Reinforcement Learning for Active High Frequency Trading

    Full text link
    We introduce the first end-to-end Deep Reinforcement Learning (DRL) framework for active high-frequency trading. We train DRL agents to trade one unit of Intel Corporation stock using the Proximal Policy Optimization algorithm. Training is performed on three contiguous months of high-frequency Limit Order Book (LOB) data, of which the last month constitutes the validation set. To maximise the signal-to-noise ratio in the training data, we compose it by selecting only the training samples with the largest price changes. Testing is then carried out on the following month of data. Hyperparameters are tuned using the Sequential Model-Based Optimization technique. We consider three different state characterizations, which differ in their LOB-based meta-features. Analysing the agents' performance on the test data, we argue that the agents are able to build a dynamic representation of the underlying environment. They identify occasional regularities present in the data and exploit them to create long-term profitable trading strategies. Indeed, the agents learn trading strategies that produce stable positive returns despite the highly stochastic and non-stationary environment.
    Comment: 9 pages, 4 figures
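
    The sample-selection step (keeping only the training samples with the largest price changes to raise the signal-to-noise ratio) can be sketched as follows. The function name, the forward-change horizon, and the fraction kept are assumptions for illustration; the paper's exact selection criterion may differ:

```python
import numpy as np

def select_high_signal_samples(mid_prices, horizon=10, keep_frac=0.25):
    """Keep only the timestamps whose forward price change over `horizon`
    ticks is among the largest in magnitude (an illustrative version of
    the sample-selection step described in the abstract)."""
    mid_prices = np.asarray(mid_prices, dtype=float)
    fwd_change = mid_prices[horizon:] - mid_prices[:-horizon]
    n_keep = max(1, int(keep_frac * fwd_change.size))
    # Indices of the largest absolute forward changes, in time order.
    keep_idx = np.argsort(np.abs(fwd_change))[-n_keep:]
    return np.sort(keep_idx)

# Usage on a synthetic mid-price path (placeholder data, not Intel LOB data):
prices = 100 + np.cumsum(np.random.default_rng(1).normal(0, 0.05, 5000))
idx = select_high_signal_samples(prices, horizon=10, keep_frac=0.25)
print(f"kept {idx.size} of {prices.size - 10} candidate samples")
```

    The retained timestamps would then be used to build the LOB-based state vectors on which the PPO agents are trained.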