
    Approaches to Sample Size Determination for Multivariate Data: Applications to PCA and PLS-DA of Omics Data

    Sample size determination is a fundamental step in the design of experiments. Methods for sample size determination are abundant for univariate analysis methods, but scarce in the multivariate case. Omics data are multivariate in nature and are commonly investigated using multivariate statistical methods, such as principal component analysis (PCA) and partial least-squares discriminant analysis (PLS-DA). No simple approaches to sample size determination exist for PCA and PLS-DA. In this paper we introduce important concepts and offer strategies for estimating the (minimally) required sample size when planning experiments to be analyzed using PCA and/or PLS-DA.
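
    As a rough, hypothetical illustration of one possible strategy (not necessarily the procedure proposed in the paper), the Python sketch below estimates a learning curve for PLS-DA: pilot data are resampled at increasing sample sizes and cross-validated classification accuracy is tracked until it stops improving. The pilot data, candidate sizes, and settings are all placeholders.

```python
# Illustrative learning-curve approach to sample size planning for PLS-DA
# (a generic sketch, not the specific procedure proposed in the paper).
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(1)

# Placeholder pilot data: 40 samples, 200 variables, two classes.
pilot_X = rng.normal(size=(40, 200))
pilot_y = np.repeat([0, 1], 20)
pilot_X[pilot_y == 1, :5] += 1.0          # small class difference in 5 variables

def cv_accuracy(X, y, n_comp=2, folds=5):
    """Cross-validated accuracy of PLS-DA (PLS regression on 0/1 labels)."""
    acc = []
    for tr, te in StratifiedKFold(folds, shuffle=True, random_state=0).split(X, y):
        model = PLSRegression(n_components=n_comp).fit(X[tr], y[tr])
        pred = (model.predict(X[te]).ravel() > 0.5).astype(int)
        acc.append(np.mean(pred == y[te]))
    return np.mean(acc)

# Resample the pilot data (class-balanced) at increasing sample sizes; accuracy
# that no longer improves suggests the sample size is (minimally) sufficient.
for n in (20, 30, 40):
    idx = np.concatenate([
        rng.choice(np.where(pilot_y == c)[0], size=n // 2, replace=True)
        for c in (0, 1)
    ])
    print(n, round(cv_accuracy(pilot_X[idx], pilot_y[idx]), 2))
```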

    Comparison of Estimation Procedures for Multilevel AR(1) Models

    To estimate a time series model for multiple individuals, a multilevel model may be used. In this paper we compare two estimation methods for the autocorrelation in multilevel AR(1) models, namely maximum likelihood estimation (MLE) and Bayesian Markov chain Monte Carlo (MCMC) estimation. Furthermore, we examine the difference between modeling fixed and random individual parameters. To this end, we perform a simulation study with a fully crossed design, in which we vary the length of the time series (10 or 25), the number of individuals per sample (10 or 25), the mean of the autocorrelation (-0.6 to 0.6 inclusive, in steps of 0.3), and the standard deviation of the autocorrelation (0.25 or 0.40). We found that the random estimators of the population autocorrelation show less bias and higher power than the fixed estimators. As expected, the random estimators profit strongly from a higher number of individuals, while this effect is small for the fixed estimators. The fixed estimators profit slightly more from a higher number of time points than the random estimators. When possible, random estimation is preferred to fixed estimation. The difference between MLE and Bayesian estimation is nearly negligible: Bayesian estimation shows a smaller bias, but MLE shows a smaller variability (i.e., standard deviation of the parameter estimates). Finally, better results are found for a higher number of individuals and time points, and for a lower individual variability of the autocorrelation. The effect of the size of the autocorrelation differs between outcome measures.
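
    The following Python sketch mimics one cell of the simulation design described above (individual autocorrelations drawn from a normal distribution, short series per person) and computes naive per-person "fixed" estimates. It is an assumption-laden illustration, not the authors' code; a random-effects (multilevel) estimator would instead pool all series in one hierarchical model.

```python
# Sketch of the simulation design described above: individual AR(1)
# autocorrelations drawn from N(mu, sd), short series per person, and a
# naive per-person ("fixed") estimate of each autocorrelation.
import numpy as np

rng = np.random.default_rng(7)
n_persons, n_time = 25, 25          # one cell of the fully crossed design
mu_phi, sd_phi = 0.3, 0.25          # population mean and SD of the autocorrelation

def simulate_person(phi, t):
    y = np.zeros(t)
    for i in range(1, t):
        y[i] = phi * y[i - 1] + rng.normal()
    return y

true_phi = np.clip(rng.normal(mu_phi, sd_phi, n_persons), -0.95, 0.95)
data = [simulate_person(p, n_time) for p in true_phi]

# "Fixed" estimator: separate least-squares AR(1) fit per person, then average.
fixed_phi = [np.dot(y[:-1], y[1:]) / np.dot(y[:-1], y[:-1]) for y in data]
print("true mean phi:", mu_phi,
      " mean of per-person estimates:", round(np.mean(fixed_phi), 2))
# A random-effects (multilevel) estimator would pool these series in a single
# hierarchical model (e.g., via MLE or Bayesian MCMC), as compared in the paper.
```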

    A tutorial on regression-based norming of psychological tests with GAMLSS

    A norm-referenced score expresses the position of an individual test taker in the reference population, thereby enabling a proper interpretation of the test score. Such normed scores are derived from test scores obtained from a sample of the reference population. Typically, multiple reference populations exist for a test, namely when the norm-referenced scores depend on individual characteristics such as age (and sex). To derive normed scores, regression-based norming has gained great popularity. The advantages of this method over traditional norming are its flexible nature, yielding potentially more realistic norms, and its efficiency, requiring potentially smaller sample sizes to achieve the same precision. In this tutorial, we introduce the reader to regression-based norming using generalized additive models for location, scale, and shape (GAMLSS), an approach that has been useful in norm estimation for various psychological tests. We discuss the rationale of regression-based norming, theoretical properties of GAMLSS, and their relationships to other regression-based norming models. In six steps, we describe how to: (a) design a normative study to gather proper normative sample data; (b) select a proper GAMLSS model for an empirical scale; (c) derive the desired normed scores for the scale from the fitted model, including those for a composite scale; and (d) visualize the results to gain insight into the properties of the scale. Following these steps yields regression-based norms with GAMLSS for a psychological test, as we illustrate with normative data of the IDS-2 intelligence test. The complete R code and data set are provided as online supplemental material.
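
    GAMLSS itself is an R package; as a language-neutral sketch of the core idea behind steps (b) and (c), the Python code below fits a simple Gaussian model in which both the mean and the (log) standard deviation of the raw score depend linearly on age, and then converts a raw score into an age-conditional percentile. The data, model form, and starting values are placeholders and far simpler than a full GAMLSS fit.

```python
# Minimal stand-in for regression-based norming (the paper uses the R package
# gamlss; this sketch only illustrates the underlying idea). Assume a Gaussian
# score distribution whose mean and log-SD are linear in age, fit by maximum
# likelihood, and convert a raw score to a percentile given the test taker's age.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
age = rng.uniform(6, 16, 500)                       # placeholder normative sample
score = 20 + 2.5 * age + rng.normal(0, 1 + 0.2 * age)

def neg_loglik(theta):
    b0, b1, g0, g1 = theta
    mu = b0 + b1 * age
    sigma = np.exp(g0 + g1 * age)
    return -np.sum(norm.logpdf(score, mu, sigma))

start = [score.mean(), 0.0, np.log(score.std()), 0.0]
fit = minimize(neg_loglik, x0=start, method="Nelder-Mead",
               options={"maxiter": 2000})
b0, b1, g0, g1 = fit.x

def normed_percentile(raw, a):
    """Percentile of a raw score relative to same-age peers."""
    return 100 * norm.cdf(raw, b0 + b1 * a, np.exp(g0 + g1 * a))

print(round(normed_percentile(55, 12.0), 1))
```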

    Considering Horn’s parallel analysis from a random matrix theory point of view

    Horn’s parallel analysis is a widely used method for assessing the number of principal components and common factors. We discuss the theoretical foundations of parallel analysis for principal components based on a covariance matrix by making use of arguments from random matrix theory. In particular, we show that (i) for the first component, parallel analysis is an inferential method equivalent to the Tracy–Widom test, (ii) its use to test high-order eigenvalues is equivalent to the use of the joint distribution of the eigenvalues, and thus should be discouraged, and (iii) a formal test for higher-order components can be obtained based on a Tracy–Widom approximation. We illustrate the performance of the two testing procedures using simulated data generated under both a principal component model and a common factors model. For the principal component model, the Tracy–Widom test performs consistently in all conditions, while parallel analysis shows unpredictable behavior for higher-order components. For the common factor model, including major and minor factors, both procedures are heuristic approaches, with variable performance. We conclude that the Tracy–Widom procedure is preferred over parallel analysis for statistically testing the number of principal components based on a covariance matrix.
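
    For readers unfamiliar with the mechanics of parallel analysis, the Python sketch below implements a basic covariance-based version: observed eigenvalues are retained while they exceed a high quantile of eigenvalues obtained from simulated noise data of the same dimensions. It is a generic illustration rather than the exact procedure evaluated in the paper; the Tracy–Widom test discussed above would replace the simulation step for the first eigenvalue with an asymptotic distribution.

```python
# Sketch of Horn's parallel analysis for principal components: compare observed
# covariance-matrix eigenvalues with eigenvalues from simulated noise data of
# the same size and marginal standard deviations.
import numpy as np

def parallel_analysis(X, n_sim=200, quantile=0.95, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    obs = np.sort(np.linalg.eigvalsh(np.cov(X, rowvar=False)))[::-1]
    sim = np.empty((n_sim, p))
    for s in range(n_sim):
        Z = rng.normal(size=(n, p)) * X.std(axis=0, ddof=1)
        sim[s] = np.sort(np.linalg.eigvalsh(np.cov(Z, rowvar=False)))[::-1]
    threshold = np.quantile(sim, quantile, axis=0)
    return int(np.sum(obs > threshold)), obs, threshold

# Example: two-component structure in 10 variables.
rng = np.random.default_rng(1)
factor_scores = rng.normal(size=(300, 2))
loadings = rng.normal(size=(2, 10))
X = factor_scores @ loadings + rng.normal(scale=0.7, size=(300, 10))
print("retained components:", parallel_analysis(X)[0])
```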

    Trajectories of Emotion Recognition Training in Virtual Reality and Predictors of Improvement for People with a Psychotic Disorder

    Meta-analyses have found that social cognition training (SCT) has large effects on the emotion recognition ability of people with a psychotic disorder. Virtual reality (VR) could be a promising tool for delivering SCT. Presently, it is unknown how improvements in emotion recognition develop during (VR-)SCT, which factors impact improvement, and how improvements in VR relate to improvement outside VR. Data were extracted from task logs from a pilot study and randomized controlled trials on VR-SCT (n = 55). Using mixed-effects generalized linear models, we examined: (a) the effect of treatment session (1-5) on VR accuracy and VR response time for correct answers; (b) main effects and moderation of participant and treatment characteristics on VR accuracy; and (c) the association between baseline performance on the Ekman 60 Faces task and accuracy in VR, and the interaction of Ekman 60 Faces change scores (i.e., post-treatment minus baseline) with treatment session. Accounting for the task difficulty level and the type of presented emotion, participants became more accurate at the VR task (b = 0.20, p < 0.001) and faster at providing correct answers (b = -0.10, p < 0.001) as treatment sessions progressed. Overall emotion recognition accuracy in VR decreased with age (b = -0.34, p = 0.009); however, no significant interactions between any of the moderator variables and treatment session were found. An association between baseline Ekman 60 Faces performance and VR accuracy was found (b = 0.04, p = 0.006), but no significant interaction between difference scores and treatment session. Emotion recognition accuracy improved during VR-SCT, but improvements in VR may not generalize to non-VR tasks and daily life.
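
    As a hedged illustration of the kind of model described (not the authors' analysis code), the sketch below fits a linear mixed-effects model with a random intercept per participant to simulated, trial-level placeholder data using statsmodels; the paper's analyses use mixed-effects generalized linear models with additional predictors and outcomes.

```python
# Minimal sketch of a mixed-effects analysis of session effects, assuming a
# long-format trial-level data set with placeholder column names. For brevity
# this fits a linear mixed model to (log) response time with a random intercept
# per participant.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n_participants, n_trials = 30, 60
df = pd.DataFrame({
    "participant": np.repeat(np.arange(n_participants), n_trials),
    "session": np.tile(rng.integers(1, 6, n_trials), n_participants),
    "difficulty": np.tile(rng.integers(1, 4, n_trials), n_participants),
})
person_effect = rng.normal(0, 0.3, n_participants)[df["participant"]]
df["log_rt"] = (1.5 - 0.10 * df["session"] + 0.05 * df["difficulty"]
                + person_effect + rng.normal(0, 0.2, len(df)))

model = smf.mixedlm("log_rt ~ session + difficulty", df, groups=df["participant"])
print(model.fit().summary())
```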

    Insight Into Individual Differences in Emotion Dynamics With Clustering

    Studying emotion dynamics through time series models is becoming increasingly popular in the social sciences. Across individuals, dynamics can be rather heterogeneous. To enable comparisons and generalizations of dynamics across groups of individuals, one needs sophisticated tools that express the essential similarities and differences. One way to proceed is to identify subgroups of people characterized by qualitatively similar emotion dynamics through dynamic clustering. So far, such methods have assumed equal generating processes for the individuals within a cluster. To avoid this overly restrictive assumption, we outline a probabilistic clustering approach based on a mixture model that clusters on individuals’ vector autoregressive coefficients. We evaluate the performance of the method and compare it with a nonprobabilistic method in a simulation study. The usefulness of the methods is illustrated using 366 ecological momentary assessment time series with external measures of depression and anxiety.
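
    To make the ingredients concrete, the Python sketch below illustrates a simple two-step variant of this idea on placeholder data: per-individual VAR(1) coefficient matrices are estimated by least squares, and their vectorized coefficients are then clustered with a Gaussian mixture model. The published approach is more refined; everything below is an illustrative assumption.

```python
# Sketch of the two-step idea behind clustering on VAR coefficients:
# (1) estimate a VAR(1) coefficient matrix per individual by least squares,
# (2) cluster the vectorized coefficients with a Gaussian mixture model.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)

def var1_coefficients(Y):
    """Least-squares estimate of A in Y_t = A @ Y_{t-1} + e_t, vectorized."""
    X, Z = Y[1:], Y[:-1]
    B, *_ = np.linalg.lstsq(Z, X, rcond=None)   # B = A transposed
    return B.T.ravel()

# Placeholder data: 60 individuals, 3 emotions, 80 time points, two true regimes.
def simulate(A, t=80, d=3):
    Y = np.zeros((t, d))
    for i in range(1, t):
        Y[i] = A @ Y[i - 1] + rng.normal(scale=0.5, size=d)
    return Y

A1 = 0.4 * np.eye(3)
A2 = 0.4 * np.eye(3) + 0.3 * np.eye(3, k=1)
series = [simulate(A1) for _ in range(30)] + [simulate(A2) for _ in range(30)]

coefs = np.array([var1_coefficients(Y) for Y in series])
labels = GaussianMixture(n_components=2, random_state=0).fit_predict(coefs)
print(np.bincount(labels))           # roughly 30 / 30 if the regimes separate
```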

    Inter-Individual Differences in Multivariate Time-Series: Latent Class Vector-Autoregressive Modeling

    Theories of emotion regulation posit the existence of individual differences in emotion dynamics. Current multi-subject time-series models account for differences in dynamics across individuals only to a very limited extent. This results in an aggregation that may apply poorly at the individual level. We present the exploratory method of latent class vector-autoregressive modeling (LCVAR), which extends time-series models to include clustering of individuals with similar dynamic processes. LCVAR can identify individuals with similar emotion dynamics in intensive time series, which may be of unequal length. The method performs excellently under a range of simulated conditions. The value of identifying clusters in time series is illustrated using affect measures of 410 individuals, assessed at over 70 time points per individual. LCVAR discerned six clusters of distinct emotion dynamics with regard to diurnal patterns and augmentation and blunting processes among eight emotions.
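
    For illustration only, the following compact EM sketch fits a mixture of VAR(1) models with one coefficient matrix and one isotropic noise variance per latent class, and it accepts series of unequal length; the published LCVAR method is considerably more general (intercepts, richer noise structure, dedicated model selection). All modelling choices here are simplifying assumptions.

```python
# Compact EM sketch for a latent class VAR(1) mixture: each class k has its own
# coefficient matrix A_k and isotropic noise variance s2_k; individuals are
# softly assigned to classes via posterior responsibilities.
import numpy as np
from scipy.special import logsumexp

def em_lcvar(series, k, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    d = series[0].shape[1]
    A = [rng.normal(scale=0.1, size=(d, d)) for _ in range(k)]
    s2 = np.ones(k)
    log_pi = np.full(k, -np.log(k))
    for _ in range(n_iter):
        # E-step: per-individual log-likelihood under each class.
        ll = np.zeros((len(series), k))
        for i, Y in enumerate(series):
            X, Z = Y[1:], Y[:-1]
            for j in range(k):
                resid = X - Z @ A[j].T
                ll[i, j] = -0.5 * (resid.size * np.log(2 * np.pi * s2[j])
                                   + np.sum(resid ** 2) / s2[j])
        log_r = log_pi + ll
        log_r -= logsumexp(log_r, axis=1, keepdims=True)
        r = np.exp(log_r)
        # M-step: mixing weights and weighted least squares per class.
        log_pi = np.log(r.mean(axis=0) + 1e-12)
        for j in range(k):
            num = sum(r[i, j] * (Y[1:].T @ Y[:-1]) for i, Y in enumerate(series))
            den = sum(r[i, j] * (Y[:-1].T @ Y[:-1]) for i, Y in enumerate(series))
            A[j] = num @ np.linalg.pinv(den)
            sse = sum(r[i, j] * np.sum((Y[1:] - Y[:-1] @ A[j].T) ** 2)
                      for i, Y in enumerate(series))
            tot = sum(r[i, j] * Y[1:].size for i, Y in enumerate(series))
            s2[j] = sse / tot + 1e-12
    return r.argmax(axis=1)
```

    Applied to a list of per-individual arrays of shape (time points, variables), such as the `series` list from the previous sketch, `em_lcvar(series, k=2)` returns hard class assignments derived from the posterior responsibilities.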

    Bayesian Gaussian distributional regression models for more efficient norm estimation

    A test score on a psychological test is usually expressed as a normed score, representing its position relative to test scores in a reference population. These norms typically depend on predictors such as age. The test score distribution conditional on the predictors is estimated using regression, which may require large normative samples to estimate the relationships between the predictors and the distribution characteristics properly. In this study, we examine to what extent this burden can be alleviated by using prior information in the estimation of new norms with Bayesian Gaussian distributional regression. In a simulation study, we investigate to what extent this norm estimation is more efficient and how robust it is to deviations from the prior model. We varied the prior type, the prior misspecification, and the sample size. In our simulated conditions, using a fixed-effects prior resulted in more efficient norm estimation than a weakly informative prior, as long as the prior misspecification was not age dependent. With the proposed method and reasonable prior information, the same norm precision can be achieved with a smaller normative sample, at least in empirical problems similar to our simulated conditions. This may help test developers achieve cost-efficient, high-quality norms. The method is illustrated using empirical normative data from the IDS-2 intelligence test.
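
    Full Bayesian distributional regression would typically be fitted with MCMC; as a minimal stand-in that still conveys how prior information can stabilize norm estimation in a small new sample, the sketch below computes a posterior mode (penalized maximum likelihood) for a Gaussian model with age-dependent mean and log standard deviation, shrinking the coefficients towards values assumed to come from an earlier norming study. All prior values and data are placeholders.

```python
# Illustration of prior-informed norm estimation via a posterior mode (MAP), as
# a stand-in for full Bayesian distributional regression: the coefficients of an
# age-dependent Gaussian model are shrunk towards values from an earlier norming
# study (a "fixed effects prior"). Prior values and data below are placeholders.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(2)
age = rng.uniform(6, 16, 80)                     # small new normative sample
score = 21 + 2.4 * age + rng.normal(0, 1 + 0.2 * age)

prior_mean = np.array([20.0, 2.5, 0.0, 0.07])    # from an earlier norming study
prior_sd = np.array([2.0, 0.3, 0.5, 0.05])

def neg_log_posterior(theta):
    b0, b1, g0, g1 = theta
    mu, sigma = b0 + b1 * age, np.exp(g0 + g1 * age)
    log_lik = np.sum(norm.logpdf(score, mu, sigma))
    log_prior = np.sum(norm.logpdf(theta, prior_mean, prior_sd))
    return -(log_lik + log_prior)

fit = minimize(neg_log_posterior, x0=prior_mean, method="Nelder-Mead",
               options={"maxiter": 2000})
print("posterior-mode coefficients:", np.round(fit.x, 2))
```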

    Switching Principal Component Analysis for Modeling Means and Covariance Changes Over Time

    Many psychological theories predict that cognitions, affect, action tendencies, and other variables change across time in mean level as well as in covariance structure. Often such changes are rather abrupt, because they are caused by sudden events. To capture such changes, one may repeatedly measure the variables under study for a single individual and examine whether the resulting multivariate time series contains a number of phases with different means and covariance structures. The latter task is challenging, however. First, in many cases it is unknown how many phases there are and when new phases start. Second, often a rather large number of variables is involved, complicating the interpretation of the covariance pattern within each phase. To take up this challenge, we present switching principal component analysis (PCA). Switching PCA detects phases of consecutive observations or time points (in single-subject data) with similar means and/or covariation structures, and performs a PCA per phase to yield insight into the covariance structure of that phase. An algorithm for fitting switching PCA solutions as well as a model selection procedure are presented and evaluated in a simulation study. Finally, we analyze empirical data on cardiorespiratory recordings.
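
    A rough sketch of the per-phase modelling step, under the assumption that candidate phase boundaries are already given: each phase is centred and summarized by its own PCA. Detecting the boundaries and selecting the number of phases and components is precisely what the switching PCA algorithm and its model selection procedure add; the code below is only an illustration with placeholder data.

```python
# Rough sketch of the per-phase modelling step in switching PCA: given candidate
# phase boundaries for a single subject's multivariate time series, each phase
# is centred and summarised by its own PCA.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)

# Placeholder series: 6 variables, 300 time points, structural change at t = 150.
phase1 = rng.normal(size=(150, 2)) @ rng.normal(size=(2, 6))
phase2 = rng.normal(size=(150, 2)) @ rng.normal(size=(2, 6)) + 1.0
Y = np.vstack([phase1, phase2]) + rng.normal(scale=0.3, size=(300, 6))

def per_phase_pca(Y, boundaries, n_comp=2):
    """Fit a separate PCA per phase; boundaries are the start indices of phases 2..K."""
    results = []
    for start, stop in zip([0] + boundaries, boundaries + [len(Y)]):
        segment = Y[start:stop]
        pca = PCA(n_components=n_comp).fit(segment - segment.mean(axis=0))
        results.append((start, stop, segment.mean(axis=0),
                        pca.explained_variance_ratio_))
    return results

for start, stop, mean, evr in per_phase_pca(Y, boundaries=[150]):
    print(f"phase {start}-{stop}: mean level {np.round(mean, 1)}, "
          f"variance explained {np.round(evr, 2)}")
```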