Essays on model-based clustering for macroeconomic and financial data
This thesis comprises three independent chapters. It deals with model-based
clustering in different frameworks and with different approaches; the data
employed also differ considerably, especially between the first chapter and the
other two. In each case the clustering procedures are applied to economic time
series, and they can serve as a foundation for further research.
In the first chapter we apply a clustering procedure to detect trend changes
in macroeconomic data, focusing on the GDP time series for the G-7 countries.
Two popular trend-cycle decompositions (i.e., the Beveridge and Nelson decomposition and the Hodrick and Prescott filter) are considered in a preliminary step of
the analysis, and we stress the differences between the two methods in terms of
the inferred clustering. A finite mixture of regression models is considered to
show different patterns and changes in GDP slopes over time. This approach
can also be used to detect structural breaks or change points, offering a
probabilistic alternative to existing approaches. We also discuss
international changes in the GDP distribution for the G-7 countries, highlighting
similarities, e.g., in break dates, aiming to add insight into the economic
integration among countries. We find that our model is able to represent the
economic path of each country and that, by looking at the changes in slope of the
long-run trend component of GDP, we are able to investigate change points, also
in comparison with alternative approaches.
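The slope-change idea of this chapter can be illustrated with a toy finite mixture of two linear regressions fitted by EM. The sketch below is a minimal Python stand-in (the thesis itself works in R, e.g., with flexmix) on synthetic data with a hypothetical trend break at t = 50; all parameter values are invented for illustration.

```python
import math
import random

random.seed(42)

# Synthetic trend series with a hypothetical slope change at t = 50:
# slope 0.8 before the break, 0.2 after (values invented for illustration).
t = list(range(100))
y = [0.8 * s + random.gauss(0, 1.0) if s < 50
     else 40.0 + 0.2 * (s - 50) + random.gauss(0, 1.0) for s in t]

def wls_line(x, yy, w):
    """Weighted least squares fit of yy = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, yy)) / sw
    b = (sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, yy))
         / sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)))
    return my - b * mx, b

def norm_pdf(r, sigma):
    return math.exp(-0.5 * (r / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# Two components: (intercept, slope, sigma); deliberately rough starting values.
params = [(0.0, 1.0, 5.0), (40.0, 0.0, 5.0)]
weights = [0.5, 0.5]

for _ in range(50):
    # E-step: posterior responsibility of each component for each point.
    resp = []
    for xi, yi in zip(t, y):
        dens = [weights[k] * norm_pdf(yi - (a + b * xi), s)
                for k, (a, b, s) in enumerate(params)]
        tot = sum(dens)
        resp.append([d / tot for d in dens])
    # M-step: weighted regression and residual scale per component.
    new_params = []
    for k in range(2):
        w = [r[k] for r in resp]
        a, b = wls_line(t, y, w)
        s = math.sqrt(sum(wi * (yi - (a + b * xi)) ** 2
                          for wi, xi, yi in zip(w, t, y)) / sum(w))
        new_params.append((a, b, s))
    params = new_params
    weights = [sum(r[k] for r in resp) / len(resp) for k in range(2)]

slopes = sorted(b for _, b, _ in params)
print(slopes)  # two regimes with clearly different slopes
```

Each component's fitted slope corresponds to one trend regime, so a change in the dominant component over time signals a change point in the long-run trend.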
In the second chapter we provide an empirical analysis of the main univariate
and multivariate stylized facts in the return series of two of the largest
cryptocurrencies, namely Ethereum and Bitcoin. A Markov-switching vector
autoregression model is considered to further explore the dynamic relationships
between cryptocurrencies and other financial assets, such as gold, the S&P and oil. We
document the presence of volatility clustering, a rapid decay of the autocorrelation
function, excess kurtosis and little multivariate cross-correlation across
the series, except for contemporaneous returns. The model represents the
univariate and multivariate stylized facts well, giving insight into the considered
cryptocurrencies as pure financial assets; moreover, we find a relationship between
the response variable, the autoregressive part and (some of) the exogenous
variables considered.
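The univariate stylized facts named here (fat tails, weak return autocorrelation, volatility clustering) can be checked numerically. Below is a minimal Python sketch on returns simulated from a GARCH(1,1)-style recursion; the parameters are illustrative, not estimates from the thesis.

```python
import math
import random

random.seed(0)

# Simulate returns with volatility clustering via a GARCH(1,1)-style recursion
# (illustrative parameters, not estimates from the thesis).
omega, alpha, beta = 0.05, 0.15, 0.80
n, h, r = 5000, 1.0, []
for _ in range(n):
    prev = r[-1] ** 2 if r else 0.0
    h = omega + alpha * prev + beta * h   # conditional variance
    r.append(math.sqrt(h) * random.gauss(0.0, 1.0))

def excess_kurtosis(x):
    m = sum(x) / len(x)
    s2 = sum((v - m) ** 2 for v in x) / len(x)
    m4 = sum((v - m) ** 4 for v in x) / len(x)
    return m4 / s2 ** 2 - 3.0

def acf(x, lag):
    m = sum(x) / len(x)
    num = sum((x[i] - m) * (x[i + lag] - m) for i in range(len(x) - lag))
    return num / sum((v - m) ** 2 for v in x)

fat_tails = excess_kurtosis(r)        # positive: heavier tails than a normal
ret_ac = acf(r, 1)                    # close to zero: returns barely autocorrelated
sq_ac = acf([v * v for v in r], 1)    # positive: volatility clusters over time
print(fat_tails, ret_ac, sq_ac)
```

The same diagnostics computed on the actual Bitcoin and Ethereum return series would reveal the stylized facts the chapter documents.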
Finally, in the third chapter we introduce multivariate models for analyzing
several stock return series of Italian football teams, such as AS Roma, FC
Juventus and SS Lazio, in order to describe the relationships across these series
and to model the evolution over time of the stock returns in a very particular
framework; in fact, the stock returns of a football team can be influenced both
by football performances (national and international) and by non-football events,
such as a change in management or the purchase of a superstar footballer. A natural
way to model the dependence over time is to use hidden Markov models
and their generalization, hidden semi-Markov models, which relax the assumption
on the so-called sojourn distribution of the hidden states. For the conditional
distributions of the observed data (i.e., the emission distribution) we use the
multivariate leptokurtic-normal distribution, a generalization of the multivariate
normal, with an additional parameter β which describes the excess of kurtosis.
Furthermore, some multivariate stylized facts are also investigated.
Parameter estimation is performed via an Expectation-Maximization (EM)-type
algorithm that maximizes the log-likelihood function, allowing us to treat the
classification problem as a missing-data problem.
R has been employed as the software environment; packages such as flexmix (Grün et al.,
2007; Grün and Leisch, 2008; Leisch, 2004b), nhmsar (Ailliot et al., 2015) and
mhsmm (O'Connell et al., 2011), alongside some custom functions, have been
used for the computational procedures.
As noted above, this thesis comprises three independent chapters; to avoid
confusion, each chapter has its own notation.
Limit order books in statistical arbitrage and anomaly detection
This thesis proposes methods exploiting the vast informational content of limit order books (LOBs).
The first part of this thesis discovers LOB inefficiencies that are sources of statistical arbitrage for high-frequency traders. Chapter 1 develops new theoretical relationships between cross-listed stocks, so that their prices are arbitrage free. Price deviations are captured by a novel strategy that is then evaluated in a new backtesting environment enabling the study of latency and its importance for high-frequency traders. Chapter 2 empirically demonstrates the existence of lead-lag arbitrage at high frequency. Lead-lag relationships have been well documented in the past, but no study has shown their true economic potential. An original econometric model is proposed to forecast returns on the lagging asset, and does so accurately out-of-sample, resulting in short-lived arbitrage opportunities. In both chapters, the discovered LOB inefficiencies are shown to be profitable, thus providing a better understanding of high-frequency traders' activities. The second part of this thesis investigates anomalous patterns in LOBs. Chapter 3 studies the performance of machine learning methods in the detection of fraudulent orders. Because of the large amount of LOB data generated daily, trade frauds are challenging to catch, and very few cases are available to fit detection models. A novel unsupervised deep learning–based framework is proposed to discern abnormal LOB behavior in this difficult context. It is asset independent and can evolve alongside markets, providing better fraud detection capabilities to market regulators.
Essays in Global Commodity Prices and Realised Volatility
This thesis consists of three substantive chapters and an Introduction and a Conclusion. The first substantive chapter (Chapter 1) examines whether high-frequency financial and speculative variables convey information that improves monthly predictions of an aggregate measure of commodity prices (the S&P GSCI), comparing their Root Mean Squared Error (RMSE) to that of the usual AR(1) benchmark. Mixed Data Sampling (MIDAS) models allow us to obtain forecasts while keeping variables at their original frequencies, and therefore to exploit the richness of high-frequency data. The evidence suggests that MIDAS models estimated recursively, and their analogous monthly version, capture some of the predictive information contained in the speculative variables, described by the agricultural managed money spread positions. The most interesting finding, larger RMSE reductions during the crisis period, is an improvement in prediction accuracy from the use of speculative positions. This suggests that speculation contains information that helps in forecasting commodity prices.
The second substantive chapter (Chapter 2) focuses on the ability to forecast the daily Realised Volatility of the Bloomberg Commodity Index Excess return (BCOM) using a Heterogeneous Autoregressive (HAR) model and competing models that include an Implied Volatility (IV) measure from either the commodity or the US stock market. The former uses the IV of at-the-money call options on the Dow Jones-UBS Commodity Index published by DataStream, while the latter uses the US stock market VIX. Realised Volatility is measured by three different proxies: absolute returns and two range-based estimators, one based on Parkinson (1980) and the other on Rogers and Satchell (1991). Both range-based estimators are constructed from open, close, high and low daily prices. In-sample results for the 28/07/2011 to 31/10/2017 period show that the estimated IV coefficients are small but statistically significant, suggesting the IV is a biased estimator of future Realised Volatility. The models used to obtain the one-day-ahead out-of-sample forecasts from 03/03/2016 to 31/10/2017 were estimated dynamically over a rolling window. To compare the forecasting accuracy of the models, their respective Root Mean Squared Errors (RMSE) were computed. These show that the HAR specification does a good job of forecasting Realised Volatility, offering better forecasts than the IV measures and popular benchmark models such as GARCH(1,1) and E-GARCH(1,1).
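For reference, the two range-based estimators named above have simple closed forms. The following Python sketch computes one-day variance estimates from hypothetical OHLC prices; the numbers are invented for illustration.

```python
import math

def parkinson(high, low):
    """Parkinson (1980) range-based variance estimate for a single day."""
    return (math.log(high / low) ** 2) / (4.0 * math.log(2.0))

def rogers_satchell(open_, high, low, close):
    """Rogers and Satchell (1991) estimator; robust to a non-zero drift."""
    u = math.log(high / open_)    # normalized high
    d = math.log(low / open_)     # normalized low
    c = math.log(close / open_)   # open-to-close log return
    return u * (u - c) + d * (d - c)

# One day of hypothetical OHLC prices (illustrative values only).
o, hi, lo, cl = 100.0, 102.0, 99.0, 101.0
var_p = parkinson(hi, lo)
var_rs = rogers_satchell(o, hi, lo, cl)
print(math.sqrt(var_p), math.sqrt(var_rs))  # daily volatility in log-return units
```

Both estimators use only one day's price range, which is why they are far less noisy proxies of daily Realised Volatility than the squared or absolute close-to-close return.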
The third substantive chapter (Chapter 3) investigates the linear Granger-causal relationship between a popular proxy of 'excess speculation' (Working's T index) and the weekly log realised volatility and log returns of wheat futures prices. It also examines the impact of managed money spreading positions, as a novel measure of speculation, on causality in wheat futures. Following the Granger and Vector Autoregression (VAR) methodology, I estimate bivariate VAR regressions. The findings show a statistically significant unidirectional linear causality between the speculative measures and both wheat log returns and the log realised volatility proxy, the Rogers and Satchell range-price estimator. Interestingly, causality runs from managed money spreading positions to log volatility and log returns, but in the opposite direction for the Working's T index.
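The bivariate Granger test used in this chapter amounts to comparing a restricted and an unrestricted lag regression with an F-statistic. The Python sketch below illustrates the mechanics on simulated data where x leads y by one period; it is a generic one-lag illustration, not the thesis's estimation code.

```python
import random

random.seed(1)

# Simulated weekly series: x "Granger-causes" y with a one-period delay
# (coefficients invented for illustration).
n = 300
x = [random.gauss(0, 1)]
y = [random.gauss(0, 1)]
for t in range(1, n):
    x.append(0.5 * x[-1] + random.gauss(0, 1))
    y.append(0.3 * y[-1] + 0.6 * x[t - 1] + random.gauss(0, 1))

def ols_rss(Y, Xcols):
    """Residual sum of squares of OLS of Y on Xcols plus an intercept."""
    X = [[1.0] + [col[i] for col in Xcols] for i in range(len(Y))]
    k = len(X[0])
    # Normal equations: X'X beta = X'Y, solved by Gaussian elimination.
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, Y)) for i in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            f = xtx[j][i] / xtx[i][i]
            xtx[j] = [a - f * b for a, b in zip(xtx[j], xtx[i])]
            xty[j] -= f * xty[i]
    beta = [0.0] * k
    for i in reversed(range(k)):
        beta[i] = (xty[i] - sum(xtx[i][j] * beta[j] for j in range(i + 1, k))) / xtx[i][i]
    return sum((yi - sum(b * xi for b, xi in zip(beta, row))) ** 2
               for yi, row in zip(Y, X))

Y = y[1:]
rss_r = ols_rss(Y, [y[:-1]])          # restricted: y on its own lag only
rss_u = ols_rss(Y, [y[:-1], x[:-1]])  # unrestricted: add lagged x
df = len(Y) - 3                       # observations minus estimated parameters
F = (rss_r - rss_u) / (rss_u / df)    # one restriction, so q = 1
print(F)  # a large F means lagged x helps predict y: Granger causality
```

Running the same comparison in the opposite direction (lagged y added to the x equation) distinguishes unidirectional from bidirectional causality, which is the pattern the chapter reports.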
Characterising and modeling the co-evolution of transportation networks and territories
The identification of the structuring effects of transportation infrastructure on
territorial dynamics remains an open research problem. This issue is one aspect
of approaches to the complexity of territorial dynamics, within which
territories and networks are understood to co-evolve. The aim of this thesis is to
challenge this view of the interactions between networks and territories, both at
the conceptual and the empirical level, by integrating them into simulation models of
territorial systems.
Comment: Doctoral dissertation (2017), Université Paris 7 Denis Diderot.
Translated from French. Several papers compose this PhD thesis; overlap with:
arXiv:{1605.08888, 1608.00840, 1608.05266, 1612.08504, 1706.07467,
1706.09244, 1708.06743, 1709.08684, 1712.00805, 1803.11457, 1804.09416,
1804.09430, 1805.05195, 1808.07282, 1809.00861, 1811.04270, 1812.01473,
1812.06008, 1908.02034, 2012.13367, 2102.13501, 2106.11996}
Models, Simulations, and the Reduction of Complexity
Modern science is a model-building activity. But how are models constructed? How are they related to theories and data? How do they explain complex scientific phenomena, and what role do computer simulations play? To address these questions, which are highly relevant to scientists as well as to philosophers of science, eight leading natural, engineering and social scientists reflect upon their modeling work, and eight philosophers provide a commentary.
Stories from different worlds in the universe of complex systems: A journey through microstructural dynamics and emergent behaviours in the human heart and financial markets
A physical system is said to be complex if it exhibits unpredictable structures, patterns or regularities emerging from microstructural dynamics involving a large number of components. The study of complex systems, known as complexity science, is maturing into an independent and multidisciplinary area of research that seeks to understand microscopic interactions and macroscopic emergence across a broad spectrum of systems, such as the human brain and the economy, by combining specific modelling techniques, data analytics, statistics and computer simulations. In this dissertation we examine two different complex systems, the human heart and financial markets, and present various research projects addressing specific problems in these areas.
Cardiac fibrillation is a diffuse pathology in which the periodic planar electrical conduction across the cardiac tissue is disrupted and replaced by fast and disorganised electrical waves. In spite of a century-long history of research, numerous debates and disputes on the mechanisms of cardiac fibrillation are still unresolved while the outcomes of clinical treatments remain far from satisfactory. In this dissertation we use cellular automata and mean-field models to qualitatively replicate the onset and maintenance of cardiac fibrillation from the interactions among neighboring cells and the underlying topology of the cardiac tissue. We use these models to study the transition from paroxysmal to persistent atrial fibrillation, the mechanisms through which the gap-junction enhancer drug Rotigaptide terminates cardiac fibrillation and how focal and circuital drivers of fibrillation may co-exist as projections of transmural electrical activities.
Financial markets are hubs in which heterogeneous participants, such as humans and algorithms, adopt different strategic behaviors to exchange financial assets. In recent decades the widespread adoption of algorithmic trading, the electronification of financial transactions, the increased competition among trading venues and the use of sophisticated financial instruments drove the transformation of financial markets into a global and interconnected complex system. In this thesis we introduce agent-based and state-space models to describe specific microstructural dynamics in the stock and foreign exchange markets. We use these models to replicate the emergence of cross-currency correlations from the interactions between heterogeneous participants in the currency market and to disentangle the relationships between price fluctuations, market liquidity and demand/supply imbalances in the stock market.
Essays in business cycle measurement.
This dissertation is concerned with economic fluctuations; the following related topics are analysed. Co-integration and the NAIRU hypothesis: the theoretical implications of different classes of models, some implying that the NAIRU is a structural parameter that can only be influenced by supply-side measures, others that the attainable level of unemployment is also a function of demand variables, are first discussed; co-integration techniques (the Engle-Granger and the Johansen procedures) are then used to test the NAIRU hypothesis; the more powerful maximum-likelihood method developed by Johansen shows that the unemployment rate is co-integrated with supply and demand variables separately, as well as with a combination of the two. Supply versus demand shocks as the driving force of business cycles: using two measures of productivity growth (the Solow residual and the dual residual from the cost function), competing theories of the cycle are tested in a number of OECD countries; the issue of market structure and its relevance for explaining economic fluctuations is also addressed; the empirical evidence refutes the "stronger" real business cycle (RBC) hypothesis that denies the role of demand shocks. Aggregate versus sectoral shocks: their relative importance in the UK economy is evaluated by estimating a vector autoregression (VAR) of the output growth rates of 19 industrial sectors and performing a factor analysis on the innovations; the one-factor model performs quite well when applied to the British data, implying that there is an aggregate shock that can account for a high percentage of the fluctuations of output over the cycle. The "seasonal cycle" in the UK economy: the quantitative importance of seasonal fluctuations and the existence of a "seasonal cycle", whose main features are very similar to those of the conventional business cycle, are documented by running regressions with seasonal dummies and band-spectrum regressions; a one-sector, neo-classical model of capital accumulation in which seasonal preferences are explicitly incorporated (the coefficient of risk aversion depending on the season) is then set up; the model is not rejected by the data, confirming that seasonality is a feature to be explained within the economic model.