
Machine learning in stock indices trading and pairs trading

Author: Zong Xiangyu
Publication date: 01/01/2021

This thesis addresses two applications of machine learning in quantitative trading. The first (Chapters 2 and 3) uses machine learning to forecast financial time series and then builds a simple trading strategy on the forecasts. The second (Chapter 4) applies machine learning to optimize decision-making in pairs trading.

In Chapter 2, a hybrid Support Vector Machine (SVM) model is proposed and applied to forecasting the daily returns of five major stock indices: the S&P500, NKY, CAC, FTSE100 and DAX. The trading application covers the 1997 Asian financial crisis and the 2007-2008 global financial crisis. The originality of this work lies in using the Binary Gravity Search Algorithm (BGSA) to optimize the parameters and inputs of the SVM. The results show that the forecasts made by this model are significantly better than those of the Random Walk (RW), the plain SVM, the best predictors and Buy-and-Hold. The average accuracy of BGSA-SVM across the five stock indices is 52.6%-53.1%. The performance of the BGSA-SVM model is not affected by the market crises, which indicates that the model is robust. Overall, this study shows that a profitable trading strategy based on BGSA-SVM predictions can be realized in a real stock market.

Chapter 3 focuses on the application of Artificial Neural Networks (ANNs) to forecasting stock indices. It applies the Multi-layer Perceptron (MLP), Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) neural network to forecasting and trading the FTSE100 and INDU indices. The forecasting accuracy and trading performance of MLP, CNN and LSTM are compared under a binary classification architecture and an eight-class classification architecture. Chapter 3 then combines the forecasts of the three ANNs (MLP, CNN and LSTM) using Simple Average, Granger-Ramanathan's Regression Approach (GRR) and the Least Absolute Shrinkage and Selection Operator (LASSO). Finally, the chapter varies the leverage ratio according to the daily forecast probability to improve trading performance. The statistical and trading performances are estimated over the period 2000-2018. LSTM slightly outperforms MLP and CNN in terms of average accuracy and average annualized returns. The combination methods show no empirical improvement. Trading with varying leverage ratios raises the average annualized return but also increases volatility.

Chapter 4 uses five pairs trading strategies to conduct in-sample training and backtesting on 35 commodities in the major commodity markets from 1980 to 2018. The Distance Method (DIM) and the Co-integration Approach (CA) are used for pairs formation. The Simple Thresholds (ST) strategy, Genetic Algorithm (GA) and Deep Reinforcement Learning (DRL) are employed to determine trading actions. The traditional DIM-ST, CA-ST and CA-DIM-ST strategies serve as benchmark models. In the CA-GA-ST strategy, the GA optimizes the trading thresholds of the ST strategy. Chapter 4 also proposes a novel DRL structure for determining trading actions, which replaces the ST decision method; combined with CA, it forms the CA-DRL trading strategy. The average annualized returns of the traditional DIM-ST, CA-ST and CA-DIM-ST methods are close to zero. In CA-GA-ST, the GA selects a smaller range of thresholds, which improves the in-sample performance. However, the average out-of-sample performance improves only slightly, with an average annual return of 1.84% and increased risk. The CA-DRL strategy uses CA to select pairs and then employs DRL to trade them, and delivers a satisfactory trading performance: the average annualized return reaches 12.49% and the Sharpe Ratio reaches 1.853. Thus, the CA-DRL trading strategy is significantly superior to the traditional methods and to CA-GA-ST.
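The abstract names the Distance Method and the Simple Thresholds rule without giving their details, so the following is only a minimal sketch of how such a benchmark is commonly set up, not the thesis's implementation. It assumes pairs are formed by the smallest sum of squared differences between normalised prices, and that a position opens when the spread's z-score exceeds a threshold and closes when it returns to the mean. The function names, the threshold values (`open_z=2.0`, `close_z=0.0`) and the toy prices are all hypothetical.

```python
from itertools import combinations


def normalise(prices):
    """Rescale a price series so that it starts at 1.0."""
    return [p / prices[0] for p in prices]


def ssd(a, b):
    """Sum of squared differences between two normalised series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest_pair(price_table):
    """Distance-style pair formation: the two names with the smallest SSD."""
    norm = {name: normalise(series) for name, series in price_table.items()}
    return min(
        combinations(sorted(norm), 2),
        key=lambda pair: ssd(norm[pair[0]], norm[pair[1]]),
    )


def threshold_positions(spread, mu, sigma, open_z=2.0, close_z=0.0):
    """Simple-threshold rule on the spread z-score (assumed thresholds).

    Returns one position per observation:
    +1 = long the spread, -1 = short the spread, 0 = flat.
    """
    positions, pos = [], 0
    for s in spread:
        z = (s - mu) / sigma
        if pos == 0:
            if z > open_z:        # spread unusually wide: short it
                pos = -1
            elif z < -open_z:     # spread unusually narrow: go long
                pos = 1
        elif pos == -1 and z <= close_z:
            pos = 0               # spread reverted: close the short
        elif pos == 1 and z >= -close_z:
            pos = 0               # spread reverted: close the long
        positions.append(pos)
    return positions


# Toy, made-up prices for three hypothetical commodities.
prices = {
    "A": [10, 11, 12, 11, 10],
    "B": [20, 22, 24, 22, 20],
    "C": [5, 4, 6, 3, 7],
}
pair = closest_pair(prices)

# Toy spread with mean 0 and standard deviation 1 assumed from a formation period.
signal = threshold_positions([0.0, 0.5, 2.5, 1.0, -0.1, -3.0, -1.0, 0.2], 0.0, 1.0)
```

In this framing, a GA-optimized variant would search over `open_z` and `close_z` instead of fixing them, and a DRL agent would replace `threshold_positions` altogether; both are left out here because the abstract does not describe them in enough detail to sketch.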

Glasgow Theses Service