370 research outputs found

    Essays on model-based clustering for macroeconomic and financial data

    Get PDF
    This thesis consists of three independent chapters. It deals with model-based clustering in different frameworks and with different approaches. The data employed are also quite different, especially between the first chapter and the other two. The clustering procedures are applied to economic time series and can serve as a foundation for further research. In the first chapter we apply a clustering procedure to detect trend changes in macroeconomic data, focusing on the GDP time series for the G-7 countries. Two popular trend-cycle decompositions (the Beveridge and Nelson decomposition and the Hodrick and Prescott filter) are considered in a preliminary step of the analysis, and we stress the differences between the two methods in terms of the inferred clustering. A finite mixture of regression models is considered to capture different patterns and changes in GDP slopes over time. This approach can also be used to detect structural breaks or change points, offering a probabilistic alternative to existing approaches. We also discuss international changes in the GDP distribution for the G-7 countries, highlighting similarities (e.g., in break dates), with the aim of adding insight into the economic integration among countries. We find that the model is able to represent the economic path of each country and that, by looking at changes in the slope of the long-run trend component of GDP, we can identify change points, which we also compare with alternative approaches. In the second chapter we provide an empirical analysis of the main univariate and multivariate stylized facts in the return series of two of the largest cryptocurrencies, Ethereum and Bitcoin. A Markov-switching vector autoregression model is considered to further explore the dynamic relationships between cryptocurrencies and other financial assets, such as gold, the S&P and oil. 
We document volatility clustering, a rapid decay of the autocorrelation function, excess kurtosis and, in the multivariate setting, little cross-correlation across the series except for contemporaneous returns. The model represents the univariate and multivariate stylized facts well, giving insight into the considered cryptocurrencies as pure financial assets; moreover, we find a relationship between the response variable and both the autoregressive part and (some of) the exogenous variables considered. Finally, in the third chapter we introduce multivariate models for analyzing the stock return series of Italian football teams such as AS Roma, FC Juventus and SS Lazio, in order to describe the relationships across these series and to model the evolution of the stock returns over time in a very particular setting; indeed, the stock returns of a football team can be influenced both by football performances (national and international) and by non-football events, such as a change in management or the purchase of a superstar footballer. A natural way to model the dependence over time is through hidden Markov models and their generalization, hidden semi-Markov models, which relax the assumption on the so-called sojourn distribution of the hidden states. For the conditional distributions of the observed data (i.e., the emission distributions) we use the multivariate leptokurtic-normal distribution, a generalization of the multivariate normal with an additional parameter β that describes the excess kurtosis. Furthermore, some multivariate stylized facts are also investigated. Parameter estimation is performed by an Expectation-Maximization (EM)-type algorithm that maximizes the log-likelihood function, allowing us to treat the classification problem as a missing-data problem. 
R has been employed as software; packages such as flexmix (Grün et al., 2007; Grün and Leisch, 2008; Leisch, 2004b), nhmsar (Ailliot et al., 2015) and mhsmm (O'Connell et al., 2011), alongside some custom functions, have been used for the computational procedures. As already noted, this thesis consists of three independent chapters; to avoid misunderstanding, each chapter has its own notation.
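The finite mixture of regressions mentioned in the abstract is typically fitted by an EM algorithm of the kind implemented in flexmix. As an illustrative sketch only (the thesis itself works in R; the function below and its initialisation scheme are my own simplifications, not the thesis's code), a minimal Python/numpy version for a mixture of simple linear regressions looks like:

```python
import numpy as np

def fit_mixture_of_regressions(x, y, K=2, n_iter=50):
    """EM for a K-component mixture of simple linear regressions.

    Illustrative sketch only; the thesis uses the R package flexmix.
    Initialisation is a crude hard split on the quantiles of y.
    """
    n = len(y)
    X = np.column_stack([np.ones(n), x])            # design: intercept + slope
    edges = np.quantile(y, np.linspace(0, 1, K + 1))
    labels = np.clip(np.searchsorted(edges[1:-1], y, side="right"), 0, K - 1)
    resp = np.eye(K)[labels]                        # responsibilities, shape (n, K)
    betas, sigmas, pis = np.zeros((K, 2)), np.ones(K), np.full(K, 1.0 / K)
    for _ in range(n_iter):
        for k in range(K):                          # M-step: weighted least squares
            w = resp[:, k]
            Xw = X * w[:, None]
            betas[k] = np.linalg.solve(Xw.T @ X, Xw.T @ y)
            res = y - X @ betas[k]
            sigmas[k] = np.sqrt((w * res ** 2).sum() / w.sum())
            pis[k] = w.mean()
        dens = np.empty((n, K))                     # E-step: posterior probabilities
        for k in range(K):
            res = y - X @ betas[k]
            dens[:, k] = pis[k] * np.exp(-0.5 * (res / sigmas[k]) ** 2) / sigmas[k]
        resp = dens / dens.sum(axis=1, keepdims=True)
    return betas, sigmas, pis
```

Each row of `betas` holds one component's (intercept, slope); applied to a long-run GDP trend, a switch between components with different slopes is the kind of signal the first chapter reads as a change point.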

    Forecasting Volatility in Financial Time Series

    Get PDF

    Limit order books in statistical arbitrage and anomaly detection

    Full text link
    This thesis proposes methods exploiting the vast informational content of limit order books (LOBs). 
The first part of this thesis discovers LOB inefficiencies that are sources of statistical arbitrage for high-frequency traders. Chapter 1 develops new theoretical relationships between cross-listed stocks, so their prices are arbitrage free. Price deviations are captured by a novel strategy that is then evaluated in a new backtesting environment enabling the study of latency and its importance for high-frequency traders. Chapter 2 empirically demonstrates the existence of lead-lag arbitrage at high frequency. Lead-lag relationships have been well documented in the past, but no study has shown their true economic potential. An original econometric model is proposed to forecast returns on the lagging asset, and does so accurately out-of-sample, resulting in short-lived arbitrage opportunities. In both chapters, the discovered LOB inefficiencies are shown to be profitable, thus providing a better understanding of high-frequency traders' activities. The second part of this thesis investigates anomalous patterns in LOBs. Chapter 3 studies the performance of machine learning methods in the detection of fraudulent orders. Because of the large amount of LOB data generated daily, trade frauds are challenging to catch, and very few cases are available to fit detection models. A novel unsupervised deep learning–based framework is proposed to discern abnormal LOB behavior in this difficult context. It is asset independent and can evolve alongside markets, providing better fraud detection capabilities to market regulators.
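Lead-lag relationships of the kind Chapter 2 exploits are commonly first diagnosed with a lagged cross-correlation profile. The sketch below is a generic Python/numpy diagnostic, not the chapter's econometric model; the function name and setup are illustrative assumptions:

```python
import numpy as np

def lead_lag_profile(leader, lagger, max_lag=10):
    """Correlation of leader[t] with lagger[t + k] for k = 0..max_lag.

    A peak at some k > 0 suggests the leader anticipates the lagger;
    exploiting it requires trading within roughly k periods, which is
    why latency matters for high-frequency strategies.
    """
    n = len(leader)
    return np.array([np.corrcoef(leader[: n - k], lagger[k:])[0, 1]
                     for k in range(max_lag + 1)])
```

`np.argmax` of the profile then estimates the lag at which the lagging asset responds.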

    Quantitative methods in high-frequency financial econometrics: modeling univariate and multivariate time series

    Get PDF

    Essays in Global Commodity Prices and Realised Volatility

    Get PDF
    This thesis consists of three substantive chapters together with an Introduction and a Conclusion. The first substantive chapter (Chapter 1) examines whether high-frequency financial and speculative variables convey information that improves monthly predictions of an aggregate measure of commodity prices (the S&P GSCI), comparing their Root Mean Squared Error (RMSE) to that of the usual AR(1) benchmark. Mixed Data Sampling (MIDAS) models allow us to obtain forecasts while keeping variables at their original frequencies, and therefore to exploit the richness of high-frequency data. The evidence suggests that MIDAS models estimated recursively, and their analogous monthly version, capture some of the predictive information contained in the speculative variables described by agricultural managed-money spread positions. The most interesting finding, larger RMSE reductions during the crisis period, is an improvement in prediction accuracy from the use of speculative positions, suggesting that speculation contains information that helps in forecasting commodity prices. The second substantive chapter (Chapter 2) focuses on forecasting the daily realised volatility of the Bloomberg Commodity Index Excess Return (BCOM) using a Heterogeneous Autoregressive (HAR) model and competing models that include an Implied Volatility (IV) measure from either the commodity or the US stock market. The former uses the IV of at-the-money call options on the Dow Jones-UBS Commodity Index published by DataStream, while the latter uses the US stock market VIX. Realised volatility is measured by three different proxies: absolute returns and two range-based estimators, one based on Parkinson (1980) and the other on Rogers and Satchell (1991). Both are constructed from open, close, high and low daily prices. 
In-sample results for the 28/07/2011 to 31/10/2017 period show that the IV coefficient estimates are small but statistically significant, suggesting that IV is a biased estimator of future realised volatility. The models used to obtain the one-day-ahead out-of-sample forecasts from 03/03/2016 to 31/10/2017 were estimated dynamically over a rolling window. To compare the forecasting accuracy of the models, their respective Root Mean Squared Errors (RMSE) were computed. These show that the HAR specification does a good job of forecasting realised volatility, offering better forecasts than the IV measures and popular benchmark models such as GARCH(1,1) and E-GARCH(1,1). The third substantive chapter (Chapter 3) investigates the linear Granger-causal relationship between a popular proxy of 'excess speculation' (Working's T index) and the weekly log realised volatility and log returns of wheat futures prices. It also examines the impact of managed-money spreading positions, a novel measure of speculation, on causality in wheat futures. Following the Granger and vector autoregression (VAR) methodology, I estimate bivariate VAR regressions. The findings show a statistically significant unidirectional linear causality between the speculative measures and both wheat log returns and the log realised volatility proxy (the Rogers and Satchell range-price estimator). Interestingly, causality runs from managed-money spreading positions to log volatility and log returns, but in the opposite direction for Working's T index.
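The range-based estimators and the HAR regressors named in this abstract follow standard textbook formulas: Parkinson uses (ln(H/L))²/(4 ln 2), Rogers-Satchell uses ln(H/C)ln(H/O) + ln(L/C)ln(L/O), and HAR regresses tomorrow's realised volatility on its daily, weekly (5-day) and monthly (22-day) averages. A hedged Python/numpy sketch (function names are illustrative; the thesis's exact data handling is not reproduced):

```python
import numpy as np

def parkinson(high, low):
    """Parkinson (1980) range estimator of daily variance."""
    return np.log(high / low) ** 2 / (4.0 * np.log(2.0))

def rogers_satchell(open_, high, low, close):
    """Rogers and Satchell (1991) estimator; robust to a non-zero drift."""
    return (np.log(high / close) * np.log(high / open_)
            + np.log(low / close) * np.log(low / open_))

def har_design(rv):
    """HAR regressors for target rv[t]: rv[t-1], and the means of the
    previous 5 and 22 observations, plus an intercept column."""
    rv = np.asarray(rv, float)
    d = rv[21:-1]                                               # daily lag
    w = np.array([rv[t - 4:t + 1].mean() for t in range(21, len(rv) - 1)])
    m = np.array([rv[t - 21:t + 1].mean() for t in range(21, len(rv) - 1)])
    X = np.column_stack([np.ones(len(d)), d, w, m])
    y = rv[22:]
    return X, y
```

Regressing `y` on `X` by OLS over a rolling window gives the one-day-ahead HAR forecasts whose RMSEs the chapter compares against the IV-augmented and GARCH benchmarks.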

    Characterising and modeling the co-evolution of transportation networks and territories

    Full text link
    The identification of structuring effects of transportation infrastructure on territorial dynamics remains an open research problem. This issue is one aspect of approaches to the complexity of territorial dynamics, within which territories and networks would be co-evolving. The aim of this thesis is to challenge this view of interactions between networks and territories, at both the conceptual and empirical level, by integrating them in simulation models of territorial systems. Comment: Doctoral dissertation (2017), Université Paris 7 Denis Diderot. Translated from French. Several papers compose this PhD thesis; overlap with: arXiv:{1605.08888, 1608.00840, 1608.05266, 1612.08504, 1706.07467, 1706.09244, 1708.06743, 1709.08684, 1712.00805, 1803.11457, 1804.09416, 1804.09430, 1805.05195, 1808.07282, 1809.00861, 1811.04270, 1812.01473, 1812.06008, 1908.02034, 2012.13367, 2102.13501, 2106.11996}

    Models, Simulations, and the Reduction of Complexity

    Get PDF
    Modern science is a model-building activity. But how are models constructed? How are they related to theories and data? How do they explain complex scientific phenomena, and what role do computer simulations play? To address these questions, which are highly relevant to scientists as well as to philosophers of science, 8 leading natural, engineering and social scientists reflect upon their modeling work, and 8 philosophers provide a commentary.

    Stories from different worlds in the universe of complex systems: A journey through microstructural dynamics and emergent behaviours in the human heart and financial markets

    Get PDF
    A physical system is said to be complex if it exhibits unpredictable structures, patterns or regularities emerging from microstructural dynamics involving a large number of components. The study of complex systems, known as complexity science, is maturing into an independent and multidisciplinary area of research that seeks to understand microscopic interactions and macroscopic emergence across a broad spectrum of systems, such as the human brain and the economy, by combining specific modelling techniques, data analytics, statistics and computer simulations. In this dissertation we examine two different complex systems, the human heart and financial markets, and present various research projects addressing specific problems in these areas. Cardiac fibrillation is a widespread pathology in which the periodic planar electrical conduction across the cardiac tissue is disrupted and replaced by fast and disorganised electrical waves. In spite of a century-long history of research, numerous debates and disputes on the mechanisms of cardiac fibrillation remain unresolved, while the outcomes of clinical treatments remain far from satisfactory. In this dissertation we use cellular automata and mean-field models to qualitatively replicate the onset and maintenance of cardiac fibrillation from the interactions among neighboring cells and the underlying topology of the cardiac tissue. We use these models to study the transition from paroxysmal to persistent atrial fibrillation, the mechanisms through which the gap-junction enhancer drug Rotigaptide terminates cardiac fibrillation, and how focal and circuital drivers of fibrillation may co-exist as projections of transmural electrical activities. Financial markets are hubs in which heterogeneous participants, such as humans and algorithms, adopt different strategic behaviors to exchange financial assets. 
In recent decades the widespread adoption of algorithmic trading, the electronification of financial transactions, increased competition among trading venues and the use of sophisticated financial instruments have driven the transformation of financial markets into a global and interconnected complex system. In this thesis we introduce agent-based and state-space models to describe specific microstructural dynamics in the stock and foreign exchange markets. We use these models to replicate the emergence of cross-currency correlations from the interactions between heterogeneous participants in the currency market, and to disentangle the relationships between price fluctuations, market liquidity and demand/supply imbalances in the stock market.
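The cellular-automaton approach to fibrillation described above can be illustrated with a minimal Greenberg-Hastings-style excitable medium. This toy sketch (assumed details: a short cyclic cell state, a von Neumann neighbourhood; it is not the thesis's calibrated cardiac model) shows how an excitation wave propagates and leaves refractory tissue behind it:

```python
import numpy as np

def step(grid, n_states=4):
    """One synchronous update of an excitable-media cellular automaton.

    State 0 = resting, 1 = excited, 2..n_states-1 = refractory;
    excited/refractory cells advance by one step and wrap back to rest,
    resting cells fire when a von Neumann neighbour is excited.
    """
    new = np.where(grid > 0, (grid + 1) % n_states, 0)
    exc = grid == 1
    nbr = np.zeros_like(exc)
    nbr[1:, :] |= exc[:-1, :]     # excitation arriving from above
    nbr[:-1, :] |= exc[1:, :]     # ... from below
    nbr[:, 1:] |= exc[:, :-1]     # ... from the left
    nbr[:, :-1] |= exc[:, 1:]     # ... from the right
    new[(grid == 0) & nbr] = 1
    return new
```

Seeding one column with excited cells produces a plane wave, the CA analogue of normal periodic conduction; re-entrant circuits, the analogue of fibrillatory drivers, appear when a wavefront is broken against refractory tissue.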

    Essays in business cycle measurement.

    Get PDF
    This dissertation is concerned with the issue of economic fluctuations; the following related topics are analysed. Co-integration and the NAIRU hypothesis: the theoretical implications of different classes of models, some implying that the NAIRU is a structural parameter that can only be influenced by supply-side measures, others that the attainable level of unemployment is also a function of demand variables, are first discussed; co-integration techniques (the Engle-Granger and the Johansen procedures) are then used to test the NAIRU hypothesis; the more powerful maximum likelihood method developed by Johansen shows that the unemployment rate is co-integrated with supply and demand variables as well as with a combination of the two. Supply versus demand shocks as the driving force of business cycles: using two measures of productivity growth (the Solow residual and the dual residual from the cost function), competing theories of the cycle are tested in a number of OECD countries; the issue of market structure and its relevance for explaining economic fluctuations is also addressed; the empirical evidence refutes the "stronger" real business cycle (RBC) hypothesis that denies any role for demand shocks. Aggregate versus sectoral shocks: their relative importance in the UK economy is evaluated by estimating a vector autoregression (VAR) of the output growth rates of 19 industrial sectors and performing a factor analysis on the innovations; the one-factor model performs quite well when applied to the British data, implying that there is an aggregate shock that can account for a high percentage of the fluctuations of output over the cycle. The "seasonal cycle" in the UK economy: the quantitative importance of seasonal fluctuations and the existence of a "seasonal cycle" whose main features are very similar to those of the conventional business cycle are documented by running regressions with seasonal dummies and band spectrum regressions; a one-sector, neo-classical model of capital accumulation in which seasonal preferences are explicitly incorporated (the coefficient of risk aversion depending on the season) is then set up; the model is not rejected by the data, confirming that seasonality is a feature to be explained within the economic model.
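The VAR-based causality analyses these abstracts rely on reduce, in the simplest bivariate case, to comparing a restricted and an unrestricted autoregression. A minimal lag-1 illustration in Python/numpy (a generic textbook Granger F-test, not this dissertation's multi-sector specification; the function name is invented):

```python
import numpy as np

def granger_f_lag1(x, y):
    """F-statistic for 'x Granger-causes y' with a single lag.

    Compares the SSR of y[t] on (1, y[t-1]) with the SSR when x[t-1]
    is added; one restriction, so F ~ F(1, n - 3) under the null.
    """
    Y = y[1:]
    n = len(Y)
    Xr = np.column_stack([np.ones(n), y[:-1]])           # restricted model
    Xu = np.column_stack([np.ones(n), y[:-1], x[:-1]])   # unrestricted model
    def ssr(M):
        beta = np.linalg.lstsq(M, Y, rcond=None)[0]
        return float(np.sum((Y - M @ beta) ** 2))
    sr, su = ssr(Xr), ssr(Xu)
    return (sr - su) / (su / (n - Xu.shape[1]))
```

A large statistic rejects the null that lagged x carries no predictive content for y; running the test in both directions distinguishes unidirectional from bidirectional causality, as in the wheat-futures analysis above.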
