77,086 research outputs found

    Essays on model-based clustering for macroeconomic and financial data

    This thesis consists of three independent chapters. It deals with model-based clustering in different frameworks and with different approaches. The data employed also differ considerably, especially between the first chapter and the others. The clustering procedures are applied to economic time series and can serve as a foundation for further research. In the first chapter we apply a clustering procedure to detect trend changes in macroeconomic data, focusing on the GDP time series of the G-7 countries. Two popular trend-cycle decompositions (i.e., the Beveridge and Nelson decomposition and the Hodrick and Prescott filter) are considered in a preliminary step of the analysis, and we stress the differences between the two methods in terms of the inferred clustering. A finite mixture of regression models is considered to show different patterns and changes in GDP slopes over time. This approach can also be used to detect structural breaks or change points, and it is an alternative to existing approaches in a probabilistic framework. We also discuss international changes in the GDP distribution for the G-7 countries, highlighting similarities, e.g., in break dates, with the aim of adding insight into the economic integration among countries. We find that our model is able to represent the economic paths of every country and, by looking at the changes in slope of the long-term trend component of GDP, we are able to investigate change points and compare them with alternative approaches. In the second chapter we provide an empirical analysis of the main univariate and multivariate stylized facts in the return series of two of the largest cryptocurrencies, namely Ethereum and Bitcoin. A Markov-Switching Vector Autoregression model is considered to further explore the dynamic relationships between cryptocurrencies and other financial assets, such as gold, S&P and oil. We find evidence of volatility clustering, a rapid decay of the autocorrelation function, excess kurtosis and, in the multivariate case, little cross-correlation across the series, except for contemporaneous returns. The model represents the univariate and multivariate stylized facts well, giving insight into the considered cryptocurrencies as pure financial assets; moreover, we find a relationship between the response variable and the autoregressive part and (some of) the exogenous variables considered. Finally, in the third chapter we introduce multivariate models for analyzing the stock return series of several Italian football teams, such as AS Roma, FC Juventus and SS Lazio, in order to describe the relationship across these series and to model the evolution of the stock returns over time in a very particular framework: stock returns of a football team can be influenced both by football performance (national and international) and by non-football events, such as a change in management or the purchase of a superstar footballer. A natural way to model the dependence over time is to use hidden Markov models and their generalization, hidden semi-Markov models, which relax the assumption on the so-called sojourn distribution of the hidden states. For the conditional distributions of the observed data (i.e., the emission distributions) we use the multivariate leptokurtic-normal distribution, a generalization of the multivariate normal with an additional parameter β that describes the excess kurtosis. Furthermore, some multivariate stylized facts are also investigated. Parameter estimation is performed by an Expectation-Maximization (EM) type algorithm that maximizes the log-likelihood function, allowing us to treat a classification problem as a missing data problem. R has been employed as software; packages such as flexmix (Grün et al., 2007; Grün and Leisch, 2008; Leisch, 2004b), nhmsar (Ailliot et al., 2015) and mhsmm (O’Connell et al., 2011), alongside some custom functions, have been used for the computations. As noted, this thesis consists of three independent chapters; to avoid misunderstanding, every chapter has its own notation.

    Beta-Product Poisson-Dirichlet Processes

    Time series data may exhibit clustering over time and, in a multiple time series context, the clustering behavior may differ across the series. This paper is motivated by the Bayesian nonparametric modeling of the dependence between the clustering structures and the distributions of different time series. We follow a Dirichlet process mixture approach and introduce a new class of multivariate dependent Dirichlet processes (DDP). The proposed DDP are represented in terms of a vector of stick-breaking processes with dependent weights. The weights are beta random vectors that determine different and dependent clustering effects along the dimension of the DDP vector. We discuss some theoretical properties and provide an efficient Markov chain Monte Carlo algorithm for posterior computation. The effectiveness of the method is illustrated with a simulation study and an application to the United States and the European Union industrial production indexes.

    Multivariate Approaches to Classification in Extragalactic Astronomy

    Clustering objects into synthetic groups is a natural activity of any science. Astrophysics is no exception and is now facing a deluge of data. For galaxies, the century-old Hubble classification and the Hubble tuning fork are still largely in use, together with numerous mono- or bivariate classifications most often made by eye. However, a classification must be driven by the data, and sophisticated multivariate statistical tools are used more and more often. In this paper we review these different approaches in order to situate them in the general context of unsupervised and supervised learning. We emphasize the astrophysical outcomes of these studies to show that multivariate analyses provide an obvious path toward a renewal of our classification of galaxies and are invaluable tools to investigate the physics and evolution of galaxies. Comment: Open Access paper. http://www.frontiersin.org/milky_way_and_galaxies/10.3389/fspas.2015.00003/abstract (doi: 10.3389/fspas.2015.00003)

    Time Series Cluster Kernel for Learning Similarities between Multivariate Time Series with Missing Data

    Similarity-based approaches represent a promising direction for time series analysis. However, many such methods rely on parameter tuning, and some have shortcomings if the time series are multivariate (MTS), due to dependencies between attributes, or if the time series contain missing data. In this paper, we address these challenges within the powerful context of kernel methods by proposing the robust time series cluster kernel (TCK). The approach taken leverages the missing data handling properties of Gaussian mixture models (GMM) augmented with informative prior distributions. An ensemble learning approach is exploited to ensure robustness to parameters by combining the clustering results of many GMMs to form the final kernel. We evaluate the TCK on synthetic and real data and compare it to other state-of-the-art techniques. The experimental results demonstrate that the TCK is robust to parameter choices, provides competitive results for MTS without missing data and outstanding results for missing data. Comment: 23 pages, 6 figures