DeepLOB: Deep Convolutional Neural Networks for Limit Order Books
We develop a large-scale deep learning model to predict price movements from
limit order book (LOB) data of cash equities. The architecture utilises
convolutional filters to capture the spatial structure of the limit order books
as well as LSTM modules to capture longer time dependencies. The proposed
network outperforms all existing state-of-the-art algorithms on the benchmark
LOB dataset [1]. In a more realistic setting, we test our model by using one
year of market quotes from the London Stock Exchange and the model delivers a
remarkably stable out-of-sample prediction accuracy for a variety of
instruments. Importantly, our model translates well to instruments which were
not part of the training set, indicating the model's ability to extract
universal features. In order to better understand these features and to go
beyond a "black box" model, we perform a sensitivity analysis to understand the
rationale behind the model predictions and reveal the components of LOBs that
are most relevant. The ability to extract robust features which translate well
to other instruments is an important property of our model which has many other
applications.
Comment: 12 pages, 9 figures
The limits of statistical significance of Hawkes processes fitted to financial data
Many fits of Hawkes processes to financial data look rather good but most of
them are not statistically significant. This raises the question of what part
of market dynamics this model is able to account for exactly. We document the
accuracy of such processes as one varies the time interval of calibration and
compare the performance of various types of kernels made up of sums of
exponentials. Because of their around-the-clock opening times, FX markets are
ideally suited to our aim as they allow us to avoid the complications of the
long daily overnight closures of equity markets. One can achieve statistical
significance according to three simultaneous tests provided that one uses
kernels with two exponentials for fitting an hour at a time, and two or three
exponentials for full days, while longer periods could not be fitted within
statistical satisfaction because of the non-stationarity of the endogenous
process. Fitted timescales are relatively short and the endogeneity factor is
high but sub-critical at about 0.8.
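
To make the quantities in this abstract concrete, the following is a minimal sketch of a univariate Hawkes process whose kernel is a sum of exponentials. It assumes the common parameterisation phi(t) = sum_k alpha_k * exp(-beta_k * t), under which the endogeneity factor (branching ratio) is n = sum_k alpha_k / beta_k and the process is sub-critical when n < 1. The function names, the parameter values, and the event times are hypothetical and chosen only for illustration; they are not the authors' code or their fitted values.

```python
import math


def branching_ratio(alphas, betas):
    """Endogeneity factor n = sum_k alpha_k / beta_k, i.e. the integral of
    the kernel phi(t) = sum_k alpha_k * exp(-beta_k * t) over t >= 0."""
    return sum(a / b for a, b in zip(alphas, betas))


def intensity(t, events, mu, alphas, betas):
    """Conditional intensity lambda(t) = mu + sum_{t_i < t} phi(t - t_i)."""
    total = mu
    for t_i in events:
        if t_i < t:
            dt = t - t_i
            total += sum(a * math.exp(-b * dt) for a, b in zip(alphas, betas))
    return total


def log_likelihood(events, T, mu, alphas, betas):
    """Log-likelihood of sorted event times on [0, T]:
    sum_i log lambda(t_i) - integral_0^T lambda(s) ds,
    using the standard recursion for exponential kernels."""
    ll = -mu * T
    # r[k] holds sum_{j < i} exp(-beta_k * (t_i - t_j)) for the current event i
    r = [0.0] * len(alphas)
    prev = None
    for t in events:
        if prev is not None:
            gap = t - prev
            r = [math.exp(-b * gap) * (rk + 1.0) for rk, b in zip(r, betas)]
        lam = mu + sum(a * rk for a, rk in zip(alphas, r))
        ll += math.log(lam)
        # integral of this event's kernel contribution from t up to T
        ll -= sum(a / b * (1.0 - math.exp(-b * (T - t)))
                  for a, b in zip(alphas, betas))
        prev = t
    return ll


# Hypothetical two-exponential kernel (illustrative values, not fitted ones)
alphas = [0.5, 0.06]
betas = [1.0, 0.2]
mu = 0.3
events = [0.5, 1.2, 1.3, 2.7]  # made-up event times

n = branching_ratio(alphas, betas)
ll = log_likelihood(events, 3.0, mu, alphas, betas)
```

Here the illustrative parameters are picked so that n comes out close to the value of about 0.8 quoted in the abstract. In a calibration, one would maximise `log_likelihood` over `mu`, `alphas` and `betas` for each time window and then check `branching_ratio` against the critical value 1.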
- …