7 research outputs found

    Carry Forward Modeling for High-Frequency Limit-Order Executions: An Emerging Market Perspective

    In this study, we estimate the order execution probability of limit orders in a limit-order book (LOB) and analyze its determinants using high-frequency LOB data from the National Stock Exchange (NSE) of India. For this purpose, we propose an algorithm that estimates limit-order execution times. Using a survival function with a log-normal distribution, this study analyzes the significant determinants of limit-order execution times. The average execution probability is found to be higher for stocks belonging to the information technology and telecom sectors. The limit-order execution probability increases with a larger bid–ask spread, a smaller limit-order size, and a deeper opposite order book. On the other hand, multiple factors, including price aggressiveness, inferior price, limit-order size, and spread, have a direct impact on execution times. The findings could help traders understand the various factors influencing the execution probability and execution time of limit orders. This study is unique in that it models limit-order execution using high-frequency tick-by-tick trading data for emerging markets, such as the NSE of India.
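    The abstract's core idea can be illustrated with a minimal sketch: fit a log-normal distribution to limit-order execution times and read the execution probability off the implied survival function, S(t) = P(time to execution > t). The data below are synthetic, not from the NSE study.

```python
import numpy as np
from scipy import stats

# Synthetic execution times (seconds) standing in for the real LOB sample.
rng = np.random.default_rng(0)
exec_times = rng.lognormal(mean=1.0, sigma=0.8, size=5_000)

# Fit a log-normal with location fixed at 0, since times are positive.
shape, loc, scale = stats.lognorm.fit(exec_times, floc=0)

# Survival probability: chance an order is still unexecuted after t seconds.
t = 10.0
survival = stats.lognorm.sf(t, shape, loc=loc, scale=scale)
execution_prob = 1.0 - survival  # probability of execution within t seconds
print(f"P(executed within {t:.0f}s) = {execution_prob:.3f}")
```

    In a full analysis, the determinants named in the abstract (spread, order size, depth of the opposite book) would enter as covariates of an accelerated-failure-time model rather than a single unconditional fit.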

    Predicting stock price changes based on the limit order book: a survey

    This survey starts with a general overview of strategies for stock price change prediction based on market data, and in particular Limit Order Book (LOB) data. The main discussion is devoted to a systematic analysis, comparison, and critical evaluation of the state-of-the-art studies on stock price movement prediction based on LOB data. LOB and Order Flow data are two of the most valuable information sources available to traders on the stock markets. Academic researchers are actively exploring the application of different quantitative methods and algorithms to this type of data to predict stock price movements. With the advancements in machine learning, and subsequently in deep learning, the complexity and computational intensity of these models have grown, as has the claimed predictive power. Some researchers claim accuracy of stock price movement prediction well in excess of 80%. These models are now commonly employed by automated market-making programs to set bid and ask quotes. If these results were also applicable to arbitrage trading strategies, those algorithms could make a fortune for their developers. Thus, the open question is whether these results could be used to generate buy and sell signals exploitable with active trading. This survey paper is intended to answer that question by reviewing these results and scrutinising their reliability. The ultimate conclusion from this analysis is that although considerable progress has been achieved in this direction, even the state-of-the-art models cannot guarantee a consistent profit in active trading. Taking this into account, several suggestions for future research in this area are formulated along three dimensions: input data, model architecture, and experimental setup. In particular, from the input data perspective, it is critical that the dataset is properly processed, up-to-date, and sufficiently large for training the particular model.
From the model architecture perspective, even though deep learning models demonstrate stronger performance than classical models, they are also more prone to over-fitting. To avoid over-fitting, it is suggested to optimize the feature space, as well as the number of layers and neurons, and to apply dropout. The over-fitting problem can also be addressed by optimising the experimental setup in several ways: introducing an early stopping mechanism; saving the best weights of the model achieved during training; and testing the model on out-of-sample data, which should be separated from the validation and training samples. Finally, it is suggested to always conduct trading simulations under realistic market conditions, considering transaction costs, bid–ask spreads, and market impact.
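    Two of the over-fitting safeguards the survey recommends, early stopping and saving the best weights, can be sketched framework-agnostically. Here "weights" is any serializable model state, and the validation-loss trajectory is made up for illustration.

```python
import copy

class EarlyStopping:
    """Stop training once validation loss stops improving; keep best weights."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience      # epochs to wait after last improvement
        self.min_delta = min_delta    # minimum loss decrease that counts
        self.best_loss = float("inf")
        self.best_weights = None
        self.wait = 0

    def step(self, val_loss, weights):
        """Record one epoch's validation loss; return True to stop training."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.best_weights = copy.deepcopy(weights)  # checkpoint best weights
            self.wait = 0
            return False
        self.wait += 1
        return self.wait >= self.patience

# Usage with a hypothetical validation-loss trajectory:
stopper = EarlyStopping(patience=3)
losses = [0.9, 0.7, 0.6, 0.65, 0.66, 0.67, 0.68]
for epoch, loss in enumerate(losses):
    if stopper.step(loss, weights={"epoch": epoch}):
        break
print(stopper.best_loss, stopper.best_weights)  # 0.6 {'epoch': 2}
```

    Deep learning frameworks ship equivalents (e.g. a Keras `EarlyStopping` callback with `restore_best_weights=True`); the point of the sketch is only the mechanism, not a particular API.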

    Mid-Price Movement Prediction in Limit Order Books Using Feature Engineering and Machine Learning

    The increasing complexity of financial trading in recent years has revealed the need for methods that can capture its underlying dynamics. An efficient way to organize this chaotic system is by constructing limit-order-book ordering mechanisms that operate under price and time priority. The limit order book can be analyzed using linear and nonlinear models. The thesis develops novel methods for the identification of limit order book characteristics that provide traders and market makers an information edge in their trading. A good proxy for traders and market makers is the prediction of mid-price movement, which is the main target of this thesis. The contributions of this thesis fall chronologically into three parts. The first part is the introduction to the literature of the first publicly available limit order book dataset for high-frequency trading for the task of mid-price movement prediction. This dataset comes together with an experimental protocol that uses methods inspired by ridge regression and a single-layer feed-forward neural network as classifiers. These classifiers use state-of-the-art limit order book features as inputs for the target task. The next contribution is the use and development of a wide range of technical and quantitative indicators for the task of mid-price movement prediction via an extensive feature selection process, which identifies the features that improve predictive performance. The results suggest that a newly introduced quantitative feature, based on an adaptive logistic regression model for online learning, was selected first according to several criteria, which operate on entropy, linear discriminant analysis, and least mean square error. The third contribution is the introduction of econometric features as inputs to deep learning models for the task of mid-price movement prediction.
An extensive comparison against other state-of-the-art hand-crafted features and fully automated feature extraction processes is provided. Furthermore, a new experimental protocol is developed for the task of mid-price prediction to overcome the problem of time irregularities that characterizes high-frequency data. The results suggest that advanced hand-crafted features, such as econometric indicators, can predict movements of proxies such as the mid-price.
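    The prediction target recurring through this abstract can be made concrete with a short sketch: compute the mid-price from the best bid and ask, then label each step as up, stationary, or down by the sign of the next mid-price change. The quotes below are synthetic.

```python
import numpy as np

# Synthetic best-bid / best-ask series standing in for real LOB snapshots.
best_bid = np.array([100.0, 100.1, 100.1, 100.0, 100.2])
best_ask = np.array([100.2, 100.3, 100.3, 100.2, 100.4])

mid = (best_bid + best_ask) / 2.0  # mid-price series

# Label each step by the sign of the next mid-price change; a small
# threshold absorbs floating-point noise into the "stationary" class.
theta = 1e-6
diff = np.diff(mid)
labels = np.where(diff > theta, 1, np.where(diff < -theta, -1, 0))
print(labels)  # [ 1  0 -1  1]
```

    In practice the threshold is chosen relative to the tick size or a smoothed mid-price, so that micro-fluctuations are not labeled as movements.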

    Volatility modeling and limit-order book analytics with high-frequency data

    The vast amount of information characterizing today's high-frequency financial datasets poses both opportunities and challenges. Among the opportunities, existing methods can be employed to provide new insights and a better understanding of the market's complexity from different perspectives, while new methods, capable of fully exploiting all the information embedded in high-frequency datasets and of addressing new issues, can be devised. The challenges are driven by data complexity: limit-order book datasets consist of hundreds of thousands of events that interact with each other and affect the event-flow dynamics. This dissertation aims at improving our understanding of the effective applicability of machine learning methods to mid-price movement prediction, of the nature of long-range autocorrelations in financial time series, and of the econometric modeling and forecasting of volatility dynamics in high-frequency settings. Our results show that simple machine learning methods can be successfully employed for mid-price forecasting; moreover, when adopting methods that rely on the natural tensor representation of financial time series, the inter-temporal connections captured by this convenient representation are shown to be relevant for the prediction of future mid-price movements. Furthermore, by using ultra-high-frequency order book data over a considerably long period, a quantitative characterization of the long-range autocorrelation is achieved by extracting the so-called scaling exponent. By jointly considering duration series of both inter- and cross-events, for different stocks, and separately for the bid and ask sides, long-range autocorrelations are found to be ubiquitous and qualitatively homogeneous. With respect to the scaling exponent, evidence of three cross-overs is found, and complex heterogeneous associations with a number of relevant economic variables are discussed.
Lastly, the use of copulas as the main ingredient for modeling and forecasting realized measures of volatility is explored. The modeling background resembles, but generalizes, the well-known Heterogeneous Autoregressive (HAR) model. In-sample and out-of-sample analyses, based on several performance measures, statistical tests, and robustness checks, show forecasting improvements of copula-based modeling over the HAR benchmark.
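    The HAR benchmark mentioned above can be sketched in a few lines: daily realized variance is regressed on its own daily value and on trailing weekly (5-day) and monthly (22-day) averages, then used for one-step-ahead forecasting. The series below is simulated, not from the dissertation.

```python
import numpy as np

# Simulated daily realized-variance series with a slow seasonal drift.
rng = np.random.default_rng(1)
T = 500
rv = np.abs(rng.normal(1.0, 0.2, T)) + 0.1 * np.sin(np.arange(T) / 20)

def trailing_mean(x, w):
    """Mean of the last w observations at each index t (inclusive)."""
    return np.array([x[max(0, t - w + 1): t + 1].mean() for t in range(len(x))])

rv_d = rv                       # daily component
rv_w = trailing_mean(rv, 5)     # weekly component
rv_m = trailing_mean(rv, 22)    # monthly component

# HAR regression: RV_{t+1} = b0 + b1*RV_d(t) + b2*RV_w(t) + b3*RV_m(t), by OLS.
X = np.column_stack([np.ones(T - 1), rv_d[:-1], rv_w[:-1], rv_m[:-1]])
y = rv[1:]
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

forecast = X[-1] @ beta  # one-step-ahead HAR forecast
print(beta, forecast)
```

    The copula-based models in the dissertation generalize this linear cascade by modeling the joint dependence of the components; the sketch only fixes the benchmark they are compared against.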