Joke Heylen, Modeling variability in time profiles: Teasing apart amplitude and shape.
Supervisor: Prof. dr. E. Ceulemans. Cosupervisor: Prof. dr. I. Van Mechelen.
Many research domains witness an increasing interest in the study of dynamical processes. For this purpose, variables are measured at consecutive time points, leading to time profiles. Time profiles can differ from each other in shape and amplitude. While amplitude pertains to the general intensity of the process, shape may reflect critical information on its fundamental dynamic characteristics. Therefore, when modeling time profiles, disentangling these two aspects of variability and accounting for them is a major challenge.
The starting point for this dissertation is K-spectral centroid clustering (KSC; Yang and Leskovec, 2011). KSC assigns each time profile in the data to one out of K clusters, each related to a reference profile that reflects the typical shape of the time profiles in the cluster in question. Moreover, KSC models amplitude differences within the clusters by providing an amplitude coefficient for each profile, which reflects its overall intensity.
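The alternating structure of KSC described above (assign each profile to the best-fitting reference shape, then update each reference shape through an eigenvalue decomposition) can be illustrated with a minimal NumPy sketch. This is only an illustration under stated assumptions, not the implementation used in the dissertation. It assumes that all profiles are nonzero, that the scale-invariant distance is 1 minus the squared cosine between a profile and a reference profile, and that the time-shift alignment of the original method by Yang and Leskovec (2011) is left out. The function name `ksc_no_shift` and its parameters are hypothetical.

```python
import numpy as np


def ksc_no_shift(X, init_labels, n_clusters, max_iter=100):
    """Simplified KSC sketch: shape clusters plus per-profile amplitudes.

    X           : (n_profiles, n_timepoints) array of nonzero time profiles
    init_labels : initial cluster index for every profile (assumption: user-supplied)
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(init_labels).copy()
    n_timepoints = X.shape[1]

    # Scale every profile to unit length so that only its shape matters.
    U = X / np.linalg.norm(X, axis=1, keepdims=True)
    centroids = np.zeros((n_clusters, n_timepoints))

    for _ in range(max_iter):
        # Update step: the reference profile of a cluster is the eigenvector
        # belonging to the smallest eigenvalue of M = sum_i (I - u_i u_i^T).
        for k in range(n_clusters):
            members = U[labels == k]
            if len(members) == 0:
                continue  # keep the previous reference profile for an empty cluster
            M = len(members) * np.eye(n_timepoints) - members.T @ members
            _, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
            mu = eigvecs[:, 0]
            if mu.sum() < 0:
                mu = -mu  # resolve the sign ambiguity of the eigenvector
            centroids[k] = mu

        # Assignment step: scale-invariant squared distance 1 - cos^2.
        dist2 = 1.0 - (U @ centroids.T) ** 2
        new_labels = dist2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels

    # Amplitude coefficient: projection of the raw profile on its unit-length reference.
    amplitudes = np.einsum("ij,ij->i", X, centroids[labels])
    return labels, centroids, amplitudes


# Toy data: two shapes (decreasing, increasing), each at three intensities.
X = np.array([
    [4, 3, 2, 1, 0], [8, 6, 4, 2, 0], [12, 9, 6, 3, 0],
    [0, 1, 2, 3, 4], [0, 2, 4, 6, 8], [0, 3, 6, 9, 12],
])
labels, centroids, amplitudes = ksc_no_shift(X, [0, 0, 0, 1, 1, 0], n_clusters=2)
```

In this toy example the last profile starts in the wrong cluster and is reassigned during the first iteration. Profiles that share a shape end up in the same cluster, and their amplitude coefficients differ only by the intensity factor.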
As KSC was developed within the field of machine learning, the first goal of this dissertation is to examine the validity and usefulness of this method when applied to psychological research data. In Chapter 1 we analyze emotional intensity profiles that were collected by asking participants to report on different types of anger-eliciting events. Using KSC, we identify two distinct shape clusters, namely early-blooming and late-blooming time profiles. Both the clustering and the amplitude coefficients could be meaningfully related to event importance and emotion regulation strategies.
In spite of these promising results, we found evidence in a number of other datasets that the performance of KSC may be heavily disturbed by outlying time profiles. Due to the iterative nature of the KSC procedure and the eigenvalue-decomposition-based update of the reference profiles, time profiles with a high overall amplitude and a deviating shape may have a negative impact on both the reference profiles of the clusters and the partitioning of the time profiles. To overcome this outlier problem, the second goal of this dissertation is to develop a robust version of KSC, robKSC, which we propose in Chapter 2. For this purpose, we combine KSC with ideas from ROBPCA (Hubert, Rousseeuw & Vanden Branden, 2005).
KSC focuses on two-mode data, whereas psychological time profile data often have a more complex structure. Therefore, a third goal of this doctoral thesis is to develop KSC extensions for more complex data structures. In Chapter 3 we propose KSC-N, an extension for the clustering of hierarchical time profile data (e.g., time profiles nested in persons). This method parsimoniously captures individual differences in profile repertoire by inducing person clusters and by deriving, for each person cluster separately, the profile types that occur most often for the people in it. In Chapter 4 we focus on multivariate time profiles and analyze data from an intervention study in which a set of depression symptoms is monitored across time for a set of patients (i.e., three-way three-mode data). To obtain an insightful overview of how the symptom severity time profiles vary as a function of both individuals and symptoms, we develop 2M-KSC. 2M-KSC simultaneously assigns the individuals to a few person clusters and the symptoms to a few symptom clusters. Each combination of a person cluster and a symptom cluster is further associated with a single reference profile.
Introduction
Chapter 1: Variability in anger intensity profiles: Structure and predictive basis
Chapter 2: RobKSC: A robust method for clustering time profile data
Chapter 3: KSC-N: Clustering of hierarchical time profile data
Chapter 4: Two-mode K-Spectral Centroid analysis for studying multivariate
General conclusion and discussion
Number of pages: 120
Status: published