
    The Stock Exchange Prediction using Machine Learning Techniques: A Comprehensive and Systematic Literature Review

    This literature review identifies and analyzes research topic trends, types of data sets, learning algorithms, method improvements, and frameworks used in stock exchange prediction. A total of 81 studies on stock prediction, published between January 2015 and June 2020 and meeting the inclusion and exclusion criteria, were investigated. The review methodology is carried out in three major phases (review planning, implementation, and report preparation) and nine steps, from defining the systematic review requirements to presenting the results. Estimation or regression, clustering, association, classification, and preprocessing analysis of data sets are the five main focuses revealed in the primary studies on stock prediction. The estimation method accounts for 56.79% of the related studies, classification for 35.80%, and data analytics for 4.94%; clustering and association account for the remaining 1.23% each. Technical indicator data sets are used in 74.07% of the studies; the rest use combinations of data sets. In total, 48 different methods have been applied to develop stock prediction models, of which the 9 most widely applied were identified. The best methods in terms of accuracy and low error rate are SVM, DNN, CNN, RNN, LSTM, bagging ensembles such as RF, boosting ensembles such as XGBoost, ensemble majority vote, and the meta-learner approach of ensemble stacking. Several techniques are proposed to improve prediction accuracy: combining several methods, using boosting algorithms, adding feature selection, and using parameter and hyper-parameter optimization.
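
The "ensemble majority vote" named in this abstract can be illustrated with a minimal sketch. The base-model names and their predictions below are hypothetical and not taken from the review; the tie-breaking rule (first label seen wins) is likewise an assumption of this sketch.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def ensemble_predict(per_model_predictions):
    """Combine per-model prediction lists (one list per model), sample by sample."""
    return [majority_vote(votes) for votes in zip(*per_model_predictions)]


# Hypothetical direction forecasts from three base models for four trading days.
svm_preds = ["up", "down", "up", "down"]
rf_preds = ["up", "up", "down", "down"]
lstm_preds = ["down", "up", "up", "down"]

combined = ensemble_predict([svm_preds, rf_preds, lstm_preds])
print(combined)
```

With an odd number of base models and two classes, no ties can occur, which is one reason such ensembles are often built from three or five models.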

    Market volatility : can machine learning methods enhance volatility forecasting?

    This dissertation tests whether machine learning (ML) techniques can improve volatility forecasting accuracy and, more specifically, whether they can beat the best econometric model, the Heterogeneous Autoregressive model of Realized Volatility (HAR-RV). Using S&P 500 Index data from May 2007 to August 2022, the superiority of the HAR-RV was tested and confirmed against the competing econometric models EWMA and GARCH(1,1). Next, the performance of two ML artificial neural network algorithms, Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), is compared with that of the econometric models. Five different variable sets are tested for the ML models. While both ML models beat the EWMA and GARCH(1,1) models by a significant margin, the HAR-RV model still outperforms LSTM and GRU. An analysis is also conducted of the models' predictions over the period corresponding to the Covid-19 crisis. The results show no evidence that ML methods have a particular advantage in predicting during high-volatility events. Finally, a plausible cause that could undermine the strengths of ML methods in volatility forecasting is discussed: the rigorous set of conditions required for the proper setup of ML models is very difficult to meet with financial data, which hinders the aptitude of ML for this purpose.
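
As a rough illustration of the HAR-RV benchmark discussed above: the model regresses next-period realized volatility on its most recent daily value and on its weekly and monthly averages. The sketch below only builds those three regressors. The 5-day and 22-day windows are the conventional choices and are an assumption here, since the abstract does not state them; the function name is hypothetical, and fitting the regression is left out.

```python
from statistics import fmean


def har_regressors(rv, t, week=5, month=22):
    """Return (daily, weekly, monthly) HAR regressors at index t.

    rv    : list of daily realized-volatility values
    week  : assumed length of the weekly window (5 trading days)
    month : assumed length of the monthly window (22 trading days)
    """
    if t + 1 < month:
        raise ValueError("not enough history for the monthly window")
    daily = rv[t]
    weekly = fmean(rv[t - week + 1 : t + 1])
    monthly = fmean(rv[t - month + 1 : t + 1])
    return daily, weekly, monthly


# Synthetic series, for illustration only.
rv_series = [float(i) for i in range(1, 31)]
print(har_regressors(rv_series, t=29))
```

Each row of regressors would be paired with the following day's realized volatility as the target, and the coefficients estimated by ordinary least squares.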

    Machine Learning and Natural Language Processing in Stock Prediction

    In this thesis, we first study two ill-posed natural language processing tasks related to stock prediction: stock movement prediction and financial document-level event extraction. While implementing stock prediction and event extraction, we encountered difficulties that could be resolved by out-of-distribution (OOD) detection. Consequently, we present a new approach to out-of-distribution detection, which is the third focus of this thesis. First, we systematically build a platform to study NLP-aided stock auto-trading algorithms. Our platform is characterized by three features: (1) we provide financial news for each specific stock; (2) we provide various stock factors for each stock; (3) we evaluate performance using more finance-relevant metrics. Such a design allows us to develop and evaluate NLP-aided stock auto-trading algorithms in a more realistic setting. We also propose a system that automatically learns a good feature representation from various input information. The key to our algorithm is a method called Semantic Role Labelling Pooling (SRLP), which leverages Semantic Role Labelling (SRL) to create a compact representation of each news paragraph. Based on SRLP, we further incorporate other stock factors to make the stock movement prediction. In addition, we propose a self-supervised learning strategy based on SRLP to enhance the out-of-distribution generalization performance of our system. Through our experimental study, we show that the proposed method outperforms all strong baselines in back-testing on both annualized rate of return and maximum drawdown. Second, we propose a generative solution for document-level event extraction that takes into account recent developments in generative event extraction, which have been successful at the sentence level but have not yet been explored for document-level extraction.
Our proposed solution includes an encoding scheme to capture entity-to-document level information and a decoding scheme that takes into account all relevant contexts. Extensive experimental results demonstrate that our generative solution can perform as well as state-of-the-art methods that use specialized structures for document event extraction. This allows our method to serve as an easy-to-use and strong baseline for future research in this area. Finally, we propose a new unsupervised OOD detection model that separates, extracts, and learns SRL-guided fine-grained local feature representations from different sentence arguments and from the full sentence, using a margin-based contrastive loss. We then demonstrate the benefit of applying a self-supervised approach to enhance such global-local feature learning by predicting the SRL-extracted role. We conduct our experiments and achieve state-of-the-art performance on out-of-distribution benchmarks. Thesis (Ph.D.) -- University of Adelaide, School of Computer and Mathematical Sciences, 202
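
The two back-testing metrics this abstract reports, annualized rate of return and maximum drawdown, can be sketched as follows. This is a generic illustration under common definitions, not the thesis's evaluation code; the function names, the 252-trading-day default, and the sample equity curve are assumptions.

```python
def annualized_return(equity, periods_per_year=252):
    """Geometric annualized return of an equity curve sampled once per period."""
    n_periods = len(equity) - 1
    if n_periods < 1:
        raise ValueError("need at least two equity values")
    total_growth = equity[-1] / equity[0]
    return total_growth ** (periods_per_year / n_periods) - 1.0


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


# Hypothetical portfolio values, for illustration only.
curve = [100.0, 120.0, 90.0, 130.0, 117.0]
print(max_drawdown(curve))
print(annualized_return(curve))
```

A strategy is usually judged on both numbers together, since a high annualized return can hide a drawdown that would be unacceptable in practice.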

    Stock Market Prediction via Deep Learning Techniques: A Survey

    Stock market prediction is a traditional yet complex problem, researched across diverse research areas and application domains because of its non-linear, highly volatile, and complex nature. Existing surveys on stock market prediction often focus on traditional machine learning methods rather than deep learning methods. Deep learning has dominated many domains and, in recent years, has gained much success and popularity in stock market prediction. This motivates us to provide a structured and comprehensive overview of the research on stock market prediction, focusing on deep learning techniques. We present four elaborated subtasks of stock market prediction and propose a novel taxonomy to summarize the state-of-the-art models based on deep neural networks from 2011 to 2022. In addition, we provide detailed statistics on the datasets and evaluation metrics commonly used in the stock market. Finally, we highlight some open issues and point out several future directions by sharing some new perspectives on stock market prediction.

    Adaptive Algorithms For Classification On High-Frequency Data Streams: Application To Finance

    International Mention in the doctoral degree. In recent years, the problem of concept drift has gained importance in the financial domain. The succession of manias, panics, and crashes has stressed the nonstationary nature of financial markets and the likelihood of drastic structural changes in them. The most recent literature suggests the use of conventional machine learning and statistical approaches for this. However, these techniques are unable or slow to adapt to non-stationarities and may require re-training over time, which is computationally expensive and brings financial risks. This thesis proposes a set of adaptive algorithms to deal with high-frequency data streams and applies them to the financial domain. We present approaches to handle different types of concept drift and perform predictions using up-to-date models. These mechanisms are designed to provide fast reaction times and are thus applicable to high-frequency data. The core experiments of this thesis are based on predicting the direction of price movement at different intraday resolutions in the SPDR S&P 500 exchange-traded fund. The proposed algorithms are benchmarked against other popular methods from the data stream mining literature and achieve competitive results. We believe that this thesis opens good research prospects for financial forecasting during market instability and structural breaks. Results have shown that our proposed methods can improve prediction accuracy in many of these scenarios. Indeed, the results obtained are compatible with ideas against the efficient market hypothesis. However, we cannot claim that we can consistently beat buy and hold; therefore, we cannot reject the hypothesis. Doctoral Programme in Computer Science and Technology, Universidad Carlos III de Madrid. President: Gustavo Recio Isasi.- Secretary: Pedro Isasi Viñuela.- Member: Sandra García Rodrígue
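
To make the notion of concept drift detection on a stream concrete, here is a minimal sketch of the Page-Hinkley test, a standard change detector from the data stream mining literature. It is not one of the thesis's proposed algorithms; the class name, the parameter values, and the synthetic error stream are assumptions chosen for illustration.

```python
class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm level (often called lambda)
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0              # cumulative deviation from the running mean
        self.min_cum = 0.0          # smallest cumulative deviation seen so far

    def update(self, x):
        """Feed one observation; return True if drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold


# Synthetic stream of prediction errors: 0 = correct, 1 = wrong.
# The model is always right for 50 steps, then always wrong.
stream = [0.0] * 50 + [1.0] * 50
detector = PageHinkley()
drift_at = next((i for i, x in enumerate(stream) if detector.update(x)), None)
print(drift_at)
```

In an adaptive setup, a signalled drift would typically trigger a model reset or a switch to a model trained on recent data only, rather than full re-training on the whole history.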

    The impact of macroeconomic leading indicators on inventory management

    Forecasting tactical sales is important for long-term decisions such as procurement and for informing lower-level inventory management decisions. Macroeconomic indicators have been shown to improve forecast accuracy at the tactical level: these indicators can provide early warnings of changing markets, while tactical sales are sufficiently aggregated to facilitate the identification of useful leading indicators. Past research has shown that significant gains can be achieved by incorporating such information. However, at the lower levels where inventory decisions are taken, this is often not feasible because of the level of noise in the data. To take advantage of macroeconomic leading indicators at this level, we need to translate the tactical forecasts into operational-level ones. In this research we investigate how best to combine top-level forecasts that incorporate such exogenous information with bottom-level (Stock Keeping Unit, SKU, level) extrapolative forecasts. The aim is to determine whether incorporating these variables has a positive impact on bottom-level planning and, eventually, on inventory levels. We construct appropriate hierarchies of sales and use that structure to reconcile the forecasts, and in turn the different available information, across levels. We are interested in both the point forecasts and the prediction intervals, as the latter inform safety stock decisions. The contribution of this research is therefore twofold: we investigate the usefulness of macroeconomic leading indicators for SKU-level forecasts, and we investigate alternative ways to estimate the variance of hierarchically reconciled forecasts. We provide evidence using a real case study.
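
One simple way to pass a top-level forecast down to SKU level is proportional top-down disaggregation: the aggregate forecast is split according to each SKU's share of the bottom-level base forecasts. The sketch below shows only this basic scheme; it is an assumption for illustration and not necessarily the reconciliation method used in the study. The function name and numbers are hypothetical.

```python
def top_down_disaggregate(top_forecast, bottom_base_forecasts):
    """Split a top-level forecast across SKUs in proportion to their base forecasts."""
    total = sum(bottom_base_forecasts)
    if total <= 0:
        raise ValueError("bottom-level base forecasts must sum to a positive value")
    return [top_forecast * f / total for f in bottom_base_forecasts]


# Hypothetical numbers: the top-level forecast (informed by leading indicators)
# is 1000 units, while the SKU-level extrapolative forecasts sum to only 500.
sku_base = [100.0, 300.0, 100.0]
reconciled = top_down_disaggregate(1000.0, sku_base)
print(reconciled)
```

After this step the SKU forecasts add up exactly to the top-level forecast, so the information carried by the macroeconomic indicators reaches the operational level while the relative SKU mix comes from the extrapolative forecasts.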

    Predicting Forex Currency Fluctuations Using a Novel Bio-inspired Modular Neural Network

    This thesis explores the interplay of rational choice theory (RCT), brain modularity, and artificial neural networks (ANNs) for modelling and forecasting hourly rate fluctuations in the foreign exchange (Forex) market. While RCT traditionally models human decision-making by emphasising self-interest and rational choices, this study extends its scope to encompass emotions, recognising their significant impact on investor decisions. Recent advances in neuroscience, particularly in understanding the cognitive and emotional processes associated with decision-making, have inspired computational methods to emulate these processes. ANNs, in particular, have shown promise in simulating neuroscience findings and translating them into effective models of financial market dynamics. However, the monolithic architectures of ANNs, characterised by fixed structures, pose challenges in adaptability and flexibility when faced with data perturbations, limiting overall performance. To address these limitations, this thesis proposes a Modular Convolutional orthogonal Recurrent Neural Network with Monte Carlo dropout-ANN (MCoRNNMCD-ANN) inspired by recent neuroscience findings. A comprehensive literature review contextualises the challenges associated with monolithic architectures, leading to the identification of neural network structures that could enhance predictions of Forex price fluctuations, such as in one of the most prominently traded currency pairs, EUR/GBP. The proposed MCoRNNMCD-ANN is thoroughly evaluated through a detailed comparative analysis against state-of-the-art techniques, such as BiCuDNNLSTM, CNN–LSTM, LSTM–GRU, CLSTM, ensemble modelling, and single monolithic CNN and RNN models. Results indicate that the MCoRNNMCD-ANN outperforms its competitors, with reported reductions in test-set prediction error ranging from 19.70% to 195.51%, as measured by objective evaluation metrics such as mean square error.
This neurobiologically inspired model not only capitalises on modularity but also integrates partial transfer learning to improve forecasting accuracy for Forex price fluctuations when less data is available for the EUR/USD currency pair. The proposed bio-inspired modular approach, incorporating transfer learning from a similar task, brings advantages such as robust forecasts and enhanced generalisation performance, which are especially valuable in domains where prior knowledge guides modular learning processes. By incorporating transfer learning principles, the proposed model presents a promising avenue for advancing predictive modelling of Forex markets.