77,086 research outputs found
Essays on model-based clustering for macroeconomic and financial data
This thesis consists of three independent chapters. It deals with model-based
clustering in different frameworks and with different approaches. The data
employed also differ considerably, especially between the first chapter and the others.
The clustering procedures are applied to economic time series, and they
can serve as a foundation for further research topics.
In the first chapter we apply a clustering procedure to detect trend changes
in macroeconomic data, focusing on the GDP time series for the G-7 countries.
Two popular trend-cycle decompositions (i.e., the Beveridge and Nelson decomposition and the Hodrick and Prescott filter) are considered in a preliminary step of
the analysis, and we stress the differences between the two methods in terms of
the inferred clustering. A finite mixture of regression models is considered to
show different patterns and changes in GDP slopes over time. This approach
can also be used to detect structural breaks or change points, and it is an alternative to existing approaches in a probabilistic framework. We also discuss
international changes in GDP distribution for the G-7 countries, highlighting
similarities, e.g., in break dates, with the aim of adding insight into the economic
integration among countries. We find that our model is able to represent
the economic path of every country and that, by looking at the changes in slope of the
long-term trend component of GDP, we are able to investigate change points and
compare them with those from alternative approaches.
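As a rough illustration of the preliminary trend-extraction step, the sketch below computes a Hodrick and Prescott trend in plain Python by solving (I + lam * D'D) tau = y, where D is the second-difference matrix. It is not the thesis code: the function name hp_trend and the toy series are hypothetical, and the default lam = 1600 is only the value conventionally used for quarterly data, assumed here for illustration. The mixture-of-regressions clustering applied afterwards is not shown.

```python
def hp_trend(y, lam=1600.0):
    """Return the Hodrick-Prescott trend of the sequence y.

    Minimises sum((y - tau)^2) + lam * sum((second difference of tau)^2),
    which amounts to solving (I + lam * D'D) tau = y.
    """
    n = len(y)
    if n < 3:
        return [float(v) for v in y]
    # Build A = I + lam * D'D as a dense matrix (fine for short series).
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for r in range(n - 2):
        row = {r: 1.0, r + 1: -2.0, r + 2: 1.0}  # r-th row of D
        for i, di in row.items():
            for j, dj in row.items():
                a[i][j] += lam * di * dj
    b = [float(v) for v in y]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            if f != 0.0:
                for c in range(col, n):
                    a[r][c] -= f * a[col][c]
                b[r] -= f * b[col]
    # Back substitution.
    tau = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = b[r] - sum(a[r][c] * tau[c] for c in range(r + 1, n))
        tau[r] = s / a[r][r]
    return tau


# Toy series (made up): the cycle is the series minus its trend.
series = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0]
trend = hp_trend(series, lam=10.0)
cycle = [obs - t for obs, t in zip(series, trend)]
```

A purely linear series has zero second differences, so the filter returns it unchanged; changes in the slope of the extracted trend are what a subsequent clustering step would look at.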
In the second chapter we provide an empirical analysis of the main univariate
and multivariate stylized facts of the return series of two of the largest cryptocurrencies, namely
Ethereum and Bitcoin. A Markov-Switching Vector Autoregression model is considered to further explore the dynamic relationships between
cryptocurrencies and other financial assets, such as gold, S&P and oil. We
find evidence of volatility clustering, a rapid decay of the autocorrelation
function, excess kurtosis and little multivariate cross-correlation across
the series, except for contemporaneous returns. The model represents the
univariate and multivariate stylized facts well, giving insight into the considered
cryptocurrencies as pure financial assets; moreover, we find a relationship between
the response variable and the autoregressive part and (some of) the exogenous
variables considered.
Finally, in the third chapter we introduce multivariate models for analyzing
the stock return series of several Italian football teams, such as AS Roma, FC
Juventus and SS Lazio, in order to describe the relationship across these series
and to model the evolution over time of the stock returns in a very particular
framework: the stock returns of a football team can be influenced both by
football performance (national and international) and by non-football events,
such as a change in management or the purchase of a superstar footballer. A natural
way to model the dependence over time is to use hidden Markov models
and their generalization, hidden semi-Markov models, which relax the assumption on
the so-called sojourn distribution of the hidden states. As for the conditional
distributions of the observed data (i.e., the emission distributions), we use the
multivariate leptokurtic-normal distribution, a generalization of the multivariate
normal with an additional parameter β that describes the excess kurtosis.
Furthermore, some multivariate stylized facts are also investigated.
Parameter estimation is performed with an Expectation-Maximization (EM)-type
algorithm, which maximizes the log-likelihood function and allows us to treat
the classification problem as a missing-data problem.
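To make the missing-data view concrete, here is a minimal sketch of EM for a two-component univariate Gaussian mixture, where the unobserved cluster labels play the role of the missing data. It is deliberately simpler than the mixture-of-regressions, Markov-switching and hidden (semi-)Markov models of the thesis, and the function names, initialisation and toy data are assumptions made for illustration only.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(data, n_iter=100):
    """Fit a two-component Gaussian mixture by EM (needs at least two points).

    Returns (weights, means, variances, responsibilities); responsibilities
    follow the order of `data`.
    """
    data = [float(x) for x in data]
    n = len(data)
    ordered = sorted(data)
    half = n // 2
    # Crude initialisation (an assumption): split the sorted data in two.
    means = [sum(ordered[:half]) / half, sum(ordered[half:]) / (n - half)]
    centre = sum(data) / n
    var0 = max(sum((x - centre) ** 2 for x in data) / n, 1e-6)
    variances = [var0, var0]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each (missing) label.
        resp = []
        for x in data:
            dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens] if total > 0.0 else [0.5, 0.5])
        # M-step: weighted maximum-likelihood updates.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(spread, 1e-6)  # floor avoids degenerate components
    return weights, means, variances, resp


# Toy data (made up): two well-separated groups.
sample = [0.0, 0.1, -0.1, 0.2, -0.2, 10.0, 10.1, 9.9, 10.2, 9.8]
weights, means, variances, resp = em_two_gaussians(sample)
labels = [0 if r[0] >= r[1] else 1 for r in resp]  # classification step
```

The classification is read off the responsibilities from the E-step: each observation is assigned to the component with the largest posterior probability.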
R was used as the software environment; packages such as flexmix (Grün et al.,
2007; Grün and Leisch, 2008; Leisch, 2004b), nhmsar (Ailliot et al., 2015) and
mhsmm (O'Connell et al., 2011), alongside some custom functions, were
used for the computations.
As noted above, this thesis consists of three independent chapters; to
avoid misunderstanding, each chapter has its own notation.
Beta-Product Poisson-Dirichlet Processes
Time series data may exhibit clustering over time and, in a multiple time
series context, the clustering behavior may differ across the series. This
paper is motivated by the Bayesian nonparametric modeling of the dependence
between the clustering structures and the distributions of different time
series. We follow a Dirichlet process mixture approach and introduce a new
class of multivariate dependent Dirichlet processes (DDP). The proposed DDP are
represented in terms of a vector of stick-breaking processes with dependent
weights. The weights are beta random vectors that determine different and
dependent clustering effects along the dimension of the DDP vector. We discuss
some theoretical properties and provide an efficient Markov chain Monte Carlo
algorithm for posterior computation. The effectiveness of the method is
illustrated with a simulation study and an application to the United States and
European Union industrial production indexes.
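For readers unfamiliar with stick-breaking, the sketch below draws the weights of a single, standard truncated Dirichlet process: each weight is a Beta(1, alpha) fraction of the stick left over by the previous breaks. It does not implement the dependent beta vectors proposed in the paper; the function name, the truncation level and the concentration value are assumptions chosen for illustration.

```python
import random


def stick_breaking_weights(alpha, n_atoms, rng):
    """Draw truncated stick-breaking weights with Beta(1, alpha) breaks.

    w_k = v_k * prod_{j<k} (1 - v_j); the last atom receives the leftover
    mass so that the truncated weights sum to one.
    """
    remaining = 1.0
    weights = []
    for _ in range(n_atoms - 1):
        v = rng.betavariate(1.0, alpha)  # fraction broken off the stick
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


rng = random.Random(0)  # fixed seed for reproducibility
weights = stick_breaking_weights(alpha=2.0, n_atoms=25, rng=rng)
```

A small alpha concentrates most of the mass on the first few atoms (few clusters), while a large alpha spreads it over many atoms.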
Multivariate Approaches to Classification in Extragalactic Astronomy
Clustering objects into synthetic groups is a natural activity of any
science. Astrophysics is not an exception and is now facing a deluge of data.
For galaxies, the century-old Hubble classification and the Hubble tuning
fork are still largely in use, together with numerous mono- or bivariate
classifications most often made by eye. However, a classification must be
driven by the data, and sophisticated multivariate statistical tools are used
more and more often. In this paper we review these different approaches in
order to situate them in the general context of unsupervised and supervised
learning. We emphasize the astrophysical outcomes of these studies to show that
multivariate analyses provide an obvious path toward a renewal of our
classification of galaxies and are invaluable tools to investigate the physics
and evolution of galaxies.
Comment: Open Access paper.
http://www.frontiersin.org/milky_way_and_galaxies/10.3389/fspas.2015.00003/abstract
DOI: 10.3389/fspas.2015.00003
Time Series Cluster Kernel for Learning Similarities between Multivariate Time Series with Missing Data
Similarity-based approaches represent a promising direction for time series
analysis. However, many such methods rely on parameter tuning, and some have
shortcomings if the time series are multivariate (MTS), due to dependencies
between attributes, or the time series contain missing data. In this paper, we
address these challenges within the powerful context of kernel methods by
proposing the robust time series cluster kernel (TCK). The approach
taken leverages the missing data handling properties of Gaussian mixture models
(GMM) augmented with informative prior distributions. An ensemble learning
approach is exploited to ensure robustness to parameters by combining the
clustering results of many GMMs to form the final kernel.
We evaluate the TCK on synthetic and real data and compare it to other
state-of-the-art techniques. The experimental results demonstrate that the TCK
is robust to parameter choices, provides competitive results for MTS without
missing data and outstanding results for missing data.
Comment: 23 pages, 6 figures
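The final combination step can be sketched as follows, assuming the kernel value for two series is the sum, over ensemble members, of the inner product of their posterior cluster-membership vectors. Fitting the GMMs with informative priors and handling missing values are not shown; the function name ensemble_kernel and the toy membership matrices are hypothetical.

```python
def ensemble_kernel(posteriors):
    """Combine clustering results from an ensemble into a kernel matrix.

    `posteriors` holds one entry per ensemble member; each entry is a list
    of n rows, row i being the cluster-membership probabilities of series i
    under that member. Members may use different numbers of clusters.
    """
    n = len(posteriors[0])
    kernel = [[0.0] * n for _ in range(n)]
    for post in posteriors:
        for i in range(n):
            for j in range(i, n):
                value = sum(p * q for p, q in zip(post[i], post[j]))
                kernel[i][j] += value
                if i != j:
                    kernel[j][i] += value  # keep the matrix symmetric
    return kernel


# Toy ensemble (made up): three series, two members with hard assignments.
member_a = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
member_b = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
kernel = ensemble_kernel([member_a, member_b])
```

With hard assignments, each entry simply counts how many ensemble members place the two series in the same cluster, which is why averaging over many differently configured members reduces sensitivity to any single parameter choice.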