Predicting stock market movements using network science: An information theoretic approach
A stock market is considered one of the most complex systems, consisting of many components whose prices move up and down without a clear pattern. The complex nature of a stock market makes reliable prediction of its future movements challenging. In this paper, we aim to build a new method to forecast the future movements of the Standard & Poor's 500 Index (S&P 500) by constructing time-series complex networks of the S&P 500's underlying companies, connecting them with links whose weights are given by the mutual information of 60-minute price movements of pairs of companies over 5,340 consecutive minutes of price records. We show that changes in the strength distributions of the networks provide important information on the index's future movements. We built several metrics using the strength distributions and network measurements such as centrality, and combined the best two predictors through a linear combination. We found that the combined predictor and the changes in the S&P 500 show a quadratic relationship, which allows us to predict the amplitude of the one-step-ahead change in the S&P 500. The results show significant fluctuations in the S&P 500 Index when the combined predictor is high. For making actual index predictions, we built ARIMA models and found that adding the network measurements to the ARIMA models improves their accuracy. These findings are useful both for financial market policy makers, as an indicator on the basis of which they can intervene in the markets before a drastic change, and for quantitative investors seeking to improve their forecasting models.
Comment: 13 pages, 7 figures, 3 tables
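As a hedged illustration of the network-construction step described above, the sketch below estimates mutual information between tercile-discretized return series for a handful of hypothetical tickers and derives each node's strength (the sum of its incident edge weights). The tickers, the synthetic returns and the three-bin discretization are illustrative assumptions, not the paper's exact settings.

```python
# Sketch: mutual-information network of price movements (illustrative only).
import numpy as np

def mutual_information(x, y):
    """Plug-in MI estimate (in nats) between two tercile-discretized series."""
    xd = np.digitize(x, np.quantile(x, [1 / 3, 2 / 3]))  # bins 0, 1, 2
    yd = np.digitize(y, np.quantile(y, [1 / 3, 2 / 3]))
    mi = 0.0
    for i in range(3):
        for j in range(3):
            pxy = np.mean((xd == i) & (yd == j))
            px, py = np.mean(xd == i), np.mean(yd == j)
            if pxy > 0:
                mi += pxy * np.log(pxy / (px * py))
    return mi

rng = np.random.default_rng(0)
common = rng.normal(size=89)  # 89 sixty-minute returns fit a 5,340-minute record
returns = {t: common + rng.normal(scale=0.5, size=89)
           for t in ["AAPL", "MSFT", "XOM"]}  # hypothetical tickers

tickers = list(returns)
weights = {}  # edge weight = pairwise mutual information
for a in range(len(tickers)):
    for b in range(a + 1, len(tickers)):
        pair = (tickers[a], tickers[b])
        weights[pair] = mutual_information(returns[pair[0]], returns[pair[1]])

# Node strength = sum of the weights of incident edges
strength = {t: sum(w for e, w in weights.items() if t in e) for t in tickers}
print({t: round(s, 3) for t, s in strength.items()})
```

Tracking how this strength distribution shifts over successive windows is the kind of signal the paper feeds into its predictors.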
Financial Trading Model with Stock Bar Chart Image Time Series with Deep Convolutional Neural Networks
Although computational intelligence techniques have been extensively utilized in financial trading systems, almost all developed models use time series data for price prediction or for identifying buy-sell points. In this study, however, we use 2-D stock bar chart images directly, without introducing any additional time series associated with the underlying stock. We propose a novel algorithmic trading model, CNN-BI (Convolutional Neural Network with Bar Images), based on a 2-D Convolutional Neural Network. We generated 2-D images of sliding windows of 30-day bar charts for Dow 30 stocks and trained a deep Convolutional Neural Network (CNN) for our algorithmic trading model. We tested the model separately on 2007-2012 and 2012-2017 to represent different market conditions. The results indicate that the model was able to outperform the Buy and Hold strategy, especially in trendless or bear markets. Since this is a preliminary study, and probably one of the first attempts at such an unconventional approach, there is ample room for improvement. Overall, the results are promising, and the model might be integrated into an ensemble trading model combining different strategies.
Comment: accepted for publication in the Intelligent Automation and Soft Computing journal
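To illustrate the kind of input such a model consumes, the sketch below rasterizes a 30-day window of (low, high) price bars into a small 2-D binary image. The image height, the synthetic random-walk prices and the one-pixel bar width are illustrative assumptions, not the CNN-BI preprocessing pipeline itself.

```python
# Sketch: turn a window of daily price bars into a 2-D image (illustrative).
import numpy as np

def bars_to_image(lows, highs, height=64):
    """Return a (height, n_days) array; column j is the day-j price bar."""
    lows, highs = np.asarray(lows, float), np.asarray(highs, float)
    lo, hi = lows.min(), highs.max()
    scale = (height - 1) / (hi - lo)
    img = np.zeros((height, len(lows)), dtype=np.uint8)
    for j, (l, h) in enumerate(zip(lows, highs)):
        bot = int(round((l - lo) * scale))
        top = int(round((h - lo) * scale))
        img[bot:top + 1, j] = 1       # fill the vertical bar for day j
    return img[::-1]                  # row 0 = highest price, as on a chart

rng = np.random.default_rng(42)
close = 100 + np.cumsum(rng.normal(size=30))  # hypothetical 30-day walk
lows, highs = close - 1.0, close + 1.0
image = bars_to_image(lows, highs)
print(image.shape)  # → (64, 30)
```

A CNN would then be trained on stacks of such images labelled with buy/hold/sell decisions.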
Recommended from our members
Financial predictions using intelligent systems
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. This thesis presents a collection of practical techniques for analysing various market properties in order to design advanced self-evolving trading systems based on neural networks combined with a genetic algorithm optimisation approach. Nonlinear multivariate statistical models have gained increasing importance in financial time series analysis, as it is very hard to find statistically significant market inefficiencies using standard linear models. Nonlinear models capture more of the underlying dynamics of these high-dimensional noisy systems than traditional models, while at the same time making fewer restrictive assumptions about them. These adaptive trading systems can extract information about associated time-varying processes that may not be readily captured by traditional models. In order to characterise financial time series in terms of their dynamic nature, this research employs various methods such as fractal analysis, chaos theory and dynamical recurrence analysis. These techniques are used to evaluate whether markets are stochastic and deterministic or nonlinear and chaotic, and to discover regularities that are completely hidden in these time series and not detectable using conventional analysis. Particular emphasis is placed on examining the feasibility of prediction in financial time series and on the analysis of extreme market events. The market's fractal structure and log-periodic oscillations, typical of periods before extreme events occur, are revealed through recurrence plots. Recurrence quantification analysis indicated a strong presence of structure, recurrence and determinism in the financial time series studied. Crucial financial time series transition periods were also detected. This research performs several tests on a large number of US and European stocks using methodologies inspired by both fundamental analysis and technical trading rules. Results from the tests show that profitable trading models utilising advanced nonlinear trading systems can be created after accounting for realistic transaction costs. The return achieved by applying the trading model to a portfolio of real price series differs significantly from that achieved by applying it to a randomly generated price series. In some cases, these models are compared against simpler alternative approaches to ensure that there is added value in the use of the more complex models. The superior performance of multivariate nonlinear models is also demonstrated. The long-short trading strategies performed well in bull, bear and sideways markets, showing a great degree of flexibility and adjustability to changing market conditions. Empirical evidence shows that information is not instantly incorporated into market prices and supports the claim that the financial time series studied, for the periods analysed, are not entirely random. This research clearly shows that equity markets are partially inefficient and do not behave along the lines dictated by the efficient market hypothesis.
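The recurrence-plot machinery mentioned above can be sketched minimally as follows. This version omits time-delay embedding and uses an arbitrary threshold, so it is a simplified illustration of recurrence analysis, not the thesis's actual methodology.

```python
# Sketch: recurrence matrix and recurrence rate of a 1-D series (illustrative).
import numpy as np

def recurrence_matrix(x, eps):
    """R[i, j] = 1 when |x_i - x_j| <= eps (no embedding, for simplicity)."""
    x = np.asarray(x, float)
    return (np.abs(x[:, None] - x[None, :]) <= eps).astype(np.uint8)

def recurrence_rate(R):
    """Fraction of recurrent point pairs, excluding the trivial diagonal."""
    n = len(R)
    return (int(R.sum()) - n) / (n * n - n)

# A perfectly periodic signal recurs every 50 samples, which appears as
# diagonal lines parallel to the main diagonal of R -- the structure that
# recurrence quantification analysis measures:
t = np.arange(200)
periodic = np.sin(2 * np.pi * t / 50)
R = recurrence_matrix(periodic, eps=0.05)
print(round(recurrence_rate(R), 3))
```

In a financial series, an unusually high density of such diagonal structures would suggest determinism rather than pure noise.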
Time series data mining: preprocessing, analysis, segmentation and prediction. Applications
Currently, the amount of data produced by any information system is increasing exponentially. This motivates the development of automatic techniques to process and mine these data correctly. Specifically, this Thesis tackles these problems for time series data, that is, temporal data collected chronologically. This kind of data can be found in many fields of science, such as palaeoclimatology, hydrology, financial problems, etc. Time series data mining (TSDM) consists of several tasks with different objectives, such as classification, segmentation, clustering, prediction and analysis. This Thesis, however, focuses on time series preprocessing, segmentation and prediction.

Time series preprocessing is a prerequisite for subsequent tasks: for example, the reconstruction of missing values in incomplete parts of time series can be essential for clustering them. In this Thesis, we tackled the problem of massive missing data reconstruction in significant wave height (SWH) time series from the Gulf of Alaska. It is very common for buoys to stop working for different periods, which is usually related to malfunctioning or bad weather conditions. The relation between the time series of the different buoys is analysed and exploited to reconstruct the missing parts. In this context, EANNs with PUs are trained, showing that the resulting models are simple and able to recover these values with high precision.

In the case of time series segmentation, the procedure consists in dividing the time series into different subsequences to achieve different purposes. This segmentation can be done with the aim of finding useful patterns in the time series. In this Thesis, we have developed novel bioinspired algorithms in this context. For instance, for palaeoclimate data, an initial genetic algorithm (GA) was proposed to discover early warning signals of TPs, whose detection was supported by expert opinions. However, given that the expert had to individually evaluate every solution produced by the algorithm, the evaluation of the results was very tedious. This led to an improvement in the body of the GA so that the procedure is evaluated automatically. For SWH time series, the objective was the detection of groups which contain extreme waves, i.e. those which are relatively large with respect to other waves close in time. The main motivation is to design alert systems. This was done using an HA, where an LS process was included by using a likelihood-based segmentation, assuming that the points follow a beta distribution. Finally, the analysis of similarities between different periods of European stock markets was also tackled, with the aim of evaluating the influence of the different markets in Europe.

When segmenting time series with the aim of reducing the number of points, different techniques have been proposed. However, it remains an open challenge, given the difficulty of operating with large amounts of data in different applications. In this work, we propose a novel statistically-driven CRO algorithm (SCRO), which automatically adapts its parameters during the evolution, taking into account the statistical distribution of the population fitness. This algorithm improves the state of the art with respect to accuracy and robustness. This problem has also been tackled using an improvement of the BBPSO algorithm, which includes a dynamical update of the cognitive and social components in the evolution, combined with mathematical tricks to obtain the fitness of the solutions, which significantly reduces the computational cost of the previously proposed coral reef methods. Also, the optimisation of both objectives (clustering quality and approximation quality), which are in conflict, is an interesting open challenge tackled in this Thesis: an MOEA for time series segmentation is developed, improving both the clustering quality of the solutions and their approximation quality.

Prediction in time series is the estimation of future values by observing and studying the previous ones. In this context, we solve this task by applying prediction over high-order representations of the elements of the time series, i.e. the segments obtained by time series segmentation. This is applied to two challenging problems: the prediction of extreme wave height and fog prediction. On the one hand, the number of extreme values in SWH time series is far smaller than the number of standard values, so the prediction of these values cannot be done using standard algorithms without taking the imbalance of the dataset into account. For that, an algorithm that automatically finds the set of segments and then applies EANNs is developed, showing the high ability of the algorithm to detect and predict these special events. On the other hand, fog prediction is affected by the same problem, that is, the number of fog events is much lower than that of non-fog events, also requiring special treatment. A preprocessing of data coming from sensors situated in different parts of the Valladolid airport is used to build a simple ANN model, which is physically corroborated and discussed.

The last challenge, which opens new horizons, is the estimation of the statistical distribution of time series to guide different methodologies. For this, the estimation of a mixed distribution for SWH time series is used to fix the threshold of POT approaches. Also, the determination of the best-fitting distribution for the time series is used to discretise it and to make a prediction that treats the problem as ordinal classification.
The work developed in this Thesis is supported by twelve papers in international journals, seven papers in international conferences, and four papers in national conferences.
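As a simplified stand-in for the segmentation task described above, the sketch below performs greedy top-down segmentation, repeatedly adding the cut that most reduces the squared approximation error. The piecewise-constant error measure and the segment budget are illustrative choices, not those of the bioinspired algorithms in the thesis.

```python
# Sketch: greedy top-down time series segmentation (illustrative only).
import numpy as np

def sse(x):
    """Squared error of approximating a segment by its mean."""
    return float(((x - x.mean()) ** 2).sum()) if len(x) else 0.0

def top_down_segment(x, max_segments=4):
    """Return sorted cut indices splitting x into at most max_segments pieces."""
    x = np.asarray(x, float)
    cuts = [0, len(x)]
    while len(cuts) - 1 < max_segments:
        best = None
        for s, e in zip(cuts[:-1], cuts[1:]):      # try every current segment
            for c in range(s + 1, e):              # and every interior cut
                gain = sse(x[s:e]) - sse(x[s:c]) - sse(x[c:e])
                if best is None or gain > best[0]:
                    best = (gain, c)
        if best is None or best[0] <= 0:           # no split helps any more
            break
        cuts = sorted(cuts + [best[1]])
    return cuts

# A step signal with three obvious levels is recovered exactly:
x = np.concatenate([np.zeros(20), np.full(20, 5.0), np.full(20, 1.0)])
print(top_down_segment(x, max_segments=3))  # → [0, 20, 40, 60]
```

The evolutionary segmenters in the thesis search the space of such cut sets globally instead of greedily, which is what makes them competitive on large, noisy series.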
Evolutionary data selection for enhancing models of intraday forex time series
The hypothesis in this paper is that a significant amount of intraday market data is either noise or redundant, and that if it is eliminated, then predictive models built using the remaining intraday data will be more accurate. To test this hypothesis, we use an evolutionary method (called Evolutionary Data Selection, EDS) to selectively remove portions of the training data made available to an intraday market predictor. After performing experiments in which data-selected and non-data-selected versions of the same predictive models are compared, we show that EDS is effective and does indeed boost predictor accuracy. We also show that building multiple models using EDS and placing them in an ensemble further increases performance. The datasets used for evaluation are large intraday forex time series, specifically from the EUR/USD, USD/JPY and EUR/JPY markets, and predictive models are built for two primary tasks per market: intraday return prediction and intraday volatility prediction.
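A minimal sketch of the evolutionary-data-selection idea follows, under assumed details (a toy 1-D dataset, a 1-nearest-neighbour predictor, truncation selection with bit-flip mutation) that are not those of the paper's EDS method: a bitmask over training rows is evolved so that a predictor fit only on the selected rows scores best on a validation set.

```python
# Sketch: evolving a training-data subset mask (EDS-style, illustrative only).
import numpy as np

rng = np.random.default_rng(7)

def fitness(mask, X_tr, y_tr, X_va, y_va):
    """Negative validation MSE of a 1-NN predictor trained on masked rows."""
    if mask.sum() < 1:
        return -np.inf
    Xs, ys = X_tr[mask], y_tr[mask]
    idx = np.argmin(np.abs(X_va[:, None] - Xs[None, :]), axis=1)
    return -float(np.mean((y_va - ys[idx]) ** 2))

def evolve(X_tr, y_tr, X_va, y_va, pop=20, gens=30, p_mut=0.05):
    n = len(X_tr)
    population = rng.random((pop, n)) < 0.5          # random initial bitmasks
    for _ in range(gens):
        scores = np.array([fitness(m, X_tr, y_tr, X_va, y_va)
                           for m in population])
        elite = population[np.argsort(scores)[::-1][:pop // 2]]
        children = elite.copy()
        children ^= rng.random(children.shape) < p_mut   # bit-flip mutation
        population = np.vstack([elite, children])
    scores = np.array([fitness(m, X_tr, y_tr, X_va, y_va)
                       for m in population])
    return population[np.argmax(scores)]

# Clean signal plus deliberately corrupted (noisy) training rows:
X_tr = np.linspace(0, 1, 40)
y_tr = np.sin(2 * np.pi * X_tr)
y_tr[::4] += rng.normal(scale=3.0, size=10)          # inject noisy rows
X_va = np.linspace(0.01, 0.99, 25)
y_va = np.sin(2 * np.pi * X_va)

best = evolve(X_tr, y_tr, X_va, y_va)
print(int(best.sum()), "of", len(X_tr), "training rows selected")
```

Under the paper's hypothesis, the evolved mask should tend to drop the corrupted rows, leaving a cleaner training set for the predictor.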