ClickClust: An R Package for Model-Based Clustering of Categorical Sequences
The R package ClickClust is a new piece of software devoted to finite mixture modeling and model-based clustering of categorical sequences. As a special kind of time series, categorical sequences, also known as categorical time series, exhibit a time-dependent nature and are traditionally modeled by means of Markov chains. Clustering categorical sequences is an important problem with multiple applications; one of the best-known is grouping sequences of visited sites or web pages, also known as clickstreams, which helps discover common navigation patterns and routes taken by users. This popular application is recognized in the package title ClickClust. The paper discusses the methodological and algorithmic foundations of the package, which are based on finite mixtures of Markov models. The number of Markov chain states can often be large, leading to high-dimensional transition probability matrices. The high number of model parameters can affect clustering performance severely. As a remedy, backward and forward selection algorithms are proposed for grouping states. This extends the original clustering problem to a biclustering framework. Other capabilities of ClickClust include estimation of the variance-covariance matrix corresponding to model parameter estimates, prediction of future states visited, and construction of a display named click-plot that helps illustrate the obtained clustering solutions. All available functions and the utility of the package are thoroughly discussed and illustrated on multiple examples.
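As a rough illustration of the Markov-chain building block the abstract refers to, the sketch below estimates a first-order transition probability matrix from a handful of categorical sequences. It is written in Python rather than R, does not use or reproduce the ClickClust API, and the function name, page labels, and uniform fallback for unseen states are all assumptions made for the example. A mixture model would fit one such matrix per cluster.

```python
from collections import Counter


def transition_matrix(sequences, states):
    """Maximum-likelihood first-order transition probabilities.

    Hypothetical helper for illustration; not part of ClickClust.
    """
    counts = Counter()
    for seq in sequences:
        # Count each observed move from one state to the next.
        for a, b in zip(seq, seq[1:]):
            counts[(a, b)] += 1
    matrix = {}
    for a in states:
        row_total = sum(counts[(a, b)] for b in states)
        # Assumption: states with no outgoing transitions get a uniform row.
        matrix[a] = {
            b: (counts[(a, b)] / row_total if row_total else 1.0 / len(states))
            for b in states
        }
    return matrix


# Made-up clickstreams over three made-up page labels.
clicks = [["home", "search", "item"], ["home", "search", "search", "item"]]
P = transition_matrix(clicks, ["home", "search", "item"])
```

With many states the matrix has one row per state, which is the parameter growth that motivates grouping states in the abstract.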
Joint Estimation of Multiple Graphical Models from High Dimensional Time Series
In this manuscript we consider the problem of jointly estimating multiple
graphical models in high dimensions. We assume that the data are collected from
n subjects, each of which consists of T possibly dependent observations. The
graphical models of subjects vary, but are assumed to change smoothly
corresponding to a measure of closeness between subjects. We propose a
kernel-based method for jointly estimating all graphical models. Theoretically, under
a double asymptotic framework, where both (T,n) and the dimension d can
increase, we provide the explicit rate of convergence in parameter estimation.
It characterizes the strength one can borrow across different individuals and
the impact of data dependence on parameter estimation. Empirically, experiments on
both synthetic and real resting state functional magnetic resonance imaging
(rs-fMRI) data illustrate the effectiveness of the proposed method.
Comment: 40 pages
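The abstract does not state which kernel or estimator is used, so the following is only a minimal sketch of the general idea of borrowing strength across subjects: per-subject covariance matrices are averaged with weights that decay as subjects move away from a target subject. The Gaussian kernel, the scalar closeness variable `u`, the bandwidth, and both function names are assumptions introduced for illustration.

```python
import math


def kernel_weights(u, u0, bandwidth):
    """Normalized Gaussian kernel weights around the target value u0 (assumed kernel)."""
    raw = [math.exp(-0.5 * ((ui - u0) / bandwidth) ** 2) for ui in u]
    total = sum(raw)
    return [r / total for r in raw]


def smoothed_covariance(covs, weights):
    """Weighted average of per-subject covariance matrices (lists of lists)."""
    d = len(covs[0])
    return [
        [sum(w * c[i][j] for w, c in zip(weights, covs)) for j in range(d)]
        for i in range(d)
    ]


# Made-up closeness values and 2x2 covariance matrices for three subjects.
u = [0.0, 0.5, 1.0]
covs = [
    [[1.0, 0.2], [0.2, 1.0]],
    [[1.0, 0.4], [0.4, 1.0]],
    [[1.0, 0.8], [0.8, 1.0]],
]
w = kernel_weights(u, u0=0.0, bandwidth=0.5)
S = smoothed_covariance(covs, w)
```

A smoothed matrix like `S` could then be passed to any sparse precision-matrix estimator; that step is left out here because the abstract does not specify it.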
Estimating Time-Varying Effective Connectivity in High-Dimensional fMRI Data Using Regime-Switching Factor Models
Recent studies on analyzing dynamic brain connectivity rely on sliding-window
analysis or time-varying coefficient models which are unable to capture both
smooth and abrupt changes simultaneously. Emerging evidence suggests
state-related changes in brain connectivity where dependence structure
alternates between a finite number of latent states or regimes. Another
challenge is inference of full-brain networks with a large number of nodes. We
employ a Markov-switching dynamic factor model in which the state-driven
time-varying connectivity regimes of high-dimensional fMRI data are
characterized by lower-dimensional common latent factors, following a
regime-switching process. It enables a reliable, data-adaptive estimation of
change-points of connectivity regimes and the massive dependencies associated
with each regime. We consider the switching VAR to quantify the dynamic
effective connectivity. We propose a three-step estimation procedure: (1)
extracting the factors using principal component analysis (PCA), (2)
identifying dynamic connectivity states using the factor-based switching vector
autoregressive (VAR) models in a state-space formulation using the Kalman filter
and the expectation-maximization (EM) algorithm, and (3) constructing the
high-dimensional connectivity metrics for each state based on subspace
estimates. Simulation results show that our proposed estimator outperforms the
K-means clustering of time-windowed coefficients, providing more accurate
estimation of regime dynamics and connectivity metrics in high-dimensional
settings. Applications to analyzing resting-state fMRI data identify dynamic
changes in brain states during rest, and reveal distinct directed connectivity
patterns and modular organization in resting-state networks across different
states.
Comment: 21 pages
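For contrast with the regime-switching model, the sliding-window analysis mentioned at the start of this abstract can be sketched in a few lines. This is not the authors' estimator: it only shows how a fixed-width window produces a time-varying correlation between two signals, and the window width, function names, and toy series are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sliding_correlation(x, y, window):
    """Correlation of x and y inside each consecutive window of the given width."""
    return [
        pearson(x[t:t + window], y[t:t + window])
        for t in range(len(x) - window + 1)
    ]


# Toy signals: y tracks x in the first half and reverses in the second half.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 1, 2, 3, 3, 2, 1, 0]
r = sliding_correlation(x, y, window=4)
```

The windows that straddle the reversal give intermediate values, which illustrates why a fixed window blurs an abrupt change that a switching model is meant to locate.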