745 research outputs found

    Group Factor Analysis

    Factor analysis provides linear factors that describe relationships between individual variables of a data set. We extend this classical formulation into linear factors that describe relationships between groups of variables, where each group represents either a set of related variables or a data set. The model also naturally extends canonical correlation analysis to more than two sets, in a way that is more flexible than previous extensions. Our solution is formulated as variational inference of a latent variable model with structural sparsity, and it consists of two hierarchical levels: the higher level models the relationships between the groups, whereas the lower level models the observed variables given the higher level. We show that the resulting solution solves the group factor analysis problem accurately, outperforming alternative factor-analysis-based solutions as well as more straightforward implementations of group factor analysis. The method is demonstrated on two life science data sets, one on brain activation and the other on systems biology, illustrating its applicability to the analysis of different types of high-dimensional data sources.

    Bayesian Group Factor Analysis

    We introduce a factor analysis model that summarizes the dependencies between observed variable groups, instead of dependencies between individual variables as standard factor analysis does. A group may correspond to one view of the same set of objects, one of many data sets tied by co-occurrence, or a set of alternative variables collected from statistics tables to measure one property of interest. We show that by assuming group-wise sparse factors, active in a subset of the sets, the variation can be decomposed into factors explaining relationships between the sets and factors explaining away set-specific variation. We formulate the assumptions in a Bayesian model which provides the factors, and apply the model to two data analysis tasks, in neuroimaging and chemical systems biology. Comment: 9 pages, 5 figures
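
    The group-wise sparsity assumption in the abstract above can be illustrated with a small simulation. This is a minimal sketch, not the authors' model or inference procedure: the function name `sample_gfa`, the binary `activity` matrix, and the Gaussian choices for factors, loadings and noise are all assumptions made for illustration. It only draws data in which each factor loads on a subset of the groups.

```python
import numpy as np

def sample_gfa(n_samples, group_dims, activity, noise_std=0.1, seed=0):
    """Draw data from a linear factor model with group-wise sparse loadings.

    activity[m][k] is 1 if factor k is active in group m, else 0.
    (Hypothetical helper; the binary mask is an illustrative assumption.)
    """
    rng = np.random.default_rng(seed)
    activity = np.asarray(activity, dtype=float)
    n_groups, n_factors = activity.shape
    # Latent factors shared by all groups.
    z = rng.standard_normal((n_samples, n_factors))
    groups, loadings = [], []
    for m, dim in enumerate(group_dims):
        # Zero out the whole loading row of every factor inactive in group m.
        w = rng.standard_normal((n_factors, dim)) * activity[m][:, None]
        x = z @ w + noise_std * rng.standard_normal((n_samples, dim))
        groups.append(x)
        loadings.append(w)
    return groups, loadings, z

# Factor 0 is shared by both groups; factors 1 and 2 are group-specific.
activity = [[1, 1, 0],
            [1, 0, 1]]
groups, loadings, z = sample_gfa(200, [5, 4], activity)
print([g.shape for g in groups])
```

    In this sketch, a factor active in several groups describes a relationship between them, while a factor active in a single group absorbs variation specific to that group.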

    Deep Latent Variable Model for Longitudinal Group Factor Analysis

    In many scientific problems such as video surveillance, modern genomic analysis, and clinical studies, data are often collected from diverse domains across time and exhibit time-dependent heterogeneous properties. It is important not only to integrate data from multiple sources (called multi-view data), but also to incorporate time dependency for a deep understanding of the underlying system. Latent factor models are popular tools for exploring multi-view data. However, it is frequently observed that these models do not perform well for complex systems and that they are not applicable to time-series data. Therefore, we propose a generative model based on a variational autoencoder and a recurrent neural network to infer the latent dynamic factors for multivariate time-series data. This approach allows us to identify the disentangled latent embeddings across multiple modalities while accounting for the time factor. We apply our proposed model to three datasets, on which we demonstrate the effectiveness and the interpretability of the model.

    Mozart is still blue: a comparison of sensory and verbal scales to describe qualities in music

    An experiment was carried out to assess the use of non-verbal sensory scales for evaluating perceived music qualities, by comparing them with the analogous verbal scales. Participants were divided into two groups; one group (SV) completed a set of non-verbal scale responses and then a set of verbal scale responses to short musical extracts. A second group (VS) completed the experiment in the reverse order. Our hypothesis was that the ratings of the SV group can provide information unmediated (or less mediated) by verbal association in a much stronger way than the VS group. Factor analysis performed separately on the SV group, the VS group and for all participants shows a recurring patterning of the majority of sensory scales versus the verbal scales into different factors. Such results suggest that the sensory scale items are indicative of a different semantic structure than the verbal scales in describing music, and so they are indexing different qualities (perhaps ineffable), making them potentially special contributors to understanding musical experience.

    Ranking and Clustering Australian University Research Performance, 1998-2002

    This paper clusters and ranks the research performance of thirty-seven Australian universities over the period 1998-2002. Research performance is measured according to audited numbers of PhD completions, publications and grants (in accordance with rules established by the Department of Education, Science and Training) and analysed in both total and per academic staff terms. Hierarchical cluster analysis supports a binary division between fifteen higher and twenty-two lower-performing universities, with the specification in per academic staff terms identifying the self-designated research intensive "Group of Eight" (Go8) universities, plus several others in the better-performing group. Factor analysis indicates that the top-three research performers are the Universities of Melbourne, Sydney and Queensland in terms of total research performance and the Universities of Melbourne, Adelaide and Western Australia in per academic staff terms. Keywords: higher education, hierarchical cluster analysis, research performance, factor analysis

    Characterizing unknown events in MEG data with group factor analysis

    Many current neuroscientific experiments can be seen as data analysis problems with two or more data sources: brain activity and stimulus features or, as in this paper, activity of two brains. These setups have been analyzed with Canonical Correlation Analysis or its multiple-source probabilistic extension Group Factor Analysis, which capture statistical dependencies between the data sources in correlating components. We relax the assumption of global correlations and search for correlating signals related to discrete events. The assumption is that the sources correlate only during events with known timings, inferred from a stimulus stream for instance, but the type or nature of each event is not known. The unsupervised modelling of the events can then be viewed as a generalization of conditional averaging. We apply the model to two-person MEG measurements, in a demonstration task of identifying which of the two persons utters a word. Peer reviewed

    What to do when scalar invariance fails: The extended alignment method for multi-group factor analysis comparison of latent means across many groups

    Scalar invariance is an unachievable ideal that in practice can only be approximated, often using potentially questionable approaches such as partial invariance based on a stepwise selection of parameter estimates with large modification indices. Study 1 demonstrates an extension of the power and flexibility of the alignment approach for comparing latent factor means in large-scale studies (30 OECD countries, 8 factors, 44 items, N = 249,840), for which scalar invariance is typically not supported in the traditional confirmatory factor analysis approach to measurement invariance (CFA-MI). Importantly, we introduce an alignment-within-CFA (AwC) approach, transforming alignment from a largely exploratory tool into a confirmatory tool, and enabling analyses that previously have not been possible with alignment (testing the invariance of uniquenesses and factor variances/covariances; multiple-group MIMIC models; contrasts on latent means) and structural equation models more generally. Specifically, it also allowed a comparison of gender differences in a 30-country MIMIC AwC (i.e., a SEM with gender as a covariate) and a 60-group AwC CFA (i.e., 30 countries × 2 genders) analysis. Study 2, a simulation study following up issues raised in Study 1, showed that latent means were more accurately estimated with alignment than with the scalar CFA-MI, and particularly with partial invariance scalar models based on the heavily criticized stepwise selection strategy. In summary, alignment augmented by AwC provides applied researchers from diverse disciplines considerable flexibility to address substantively important issues when the traditional CFA-MI scalar model does not fit the data.

    A penalized likelihood-based framework for single and multiple-group factor analysis models

    Penalized factor analysis is an efficient technique that produces a factor loading matrix with many zero elements thanks to the introduction of sparsity-inducing penalties within the estimation process. Penalized models are generally less prone to instability in the estimation process and are easier to interpret and generalize than their unpenalized counterparts. However, sparse solutions and stable model selection procedures are only possible if the employed penalty is singular (non-differentiable) at the origin, which poses certain theoretical and computational challenges. This thesis proposes a general penalized likelihood-based estimation approach for normal linear factor analysis models. The framework builds upon differentiable approximations of non-differentiable penalties and a theoretically founded definition of degrees of freedom. The employed optimization algorithm exploits second-order analytical derivative information and is integrated with an automatic tuning parameter selection procedure that finds the optimal value of the tuning parameter without resorting to grid searches. Some theoretical aspects of the penalized estimator are discussed. The proposed approach is evaluated in an extensive simulation study and illustrated using a psychometric data set. As a meaningful addition, the illustrated framework is extended to multiple-group factor analysis models, which are commonly used in cross-national surveys. The employed penalty simultaneously induces sparsity and cross-group equality of loadings and intercepts. The automatic procedure proves particularly useful in this challenging context, as it allows for the estimation of the multiple tuning parameters that compose the penalty term in a fast, stable and efficient way. The merits of the proposed technique are demonstrated through numerical and empirical examples. All the necessary routines are integrated into the R package GJRM to enhance reproducible research and transparent dissemination of results.
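
    The idea of replacing a penalty that is non-differentiable at the origin with a differentiable approximation can be sketched as follows. The abstract does not say which approximation the thesis uses, so the square-root smoothing of the absolute value shown here, the function names, and the constant `eps` are assumptions chosen for illustration. This is not the GJRM implementation.

```python
import numpy as np

def smooth_abs(theta, eps=1e-8):
    """Differentiable surrogate for |theta| (assumed form: sqrt(theta^2 + eps))."""
    return np.sqrt(np.square(theta) + eps)

def smooth_abs_grad(theta, eps=1e-8):
    """Derivative of the surrogate; well defined at theta = 0, unlike d|theta|/dtheta."""
    return theta / np.sqrt(np.square(theta) + eps)

def smooth_lasso_penalty(loadings, eta, eps=1e-8):
    """Lasso-type penalty on a loading matrix, with tuning parameter eta."""
    return eta * np.sum(smooth_abs(loadings, eps))

# Hypothetical 2 x 2 loading matrix, used only to exercise the functions.
loadings = np.array([[1.0, -2.0],
                     [0.0, 0.5]])
print(smooth_lasso_penalty(loadings, eta=0.5))
print(smooth_abs_grad(loadings))
```

    Because the surrogate has derivatives everywhere, gradient- and Hessian-based optimizers can be applied to the penalized objective; a smaller `eps` tracks the absolute value more closely at the cost of sharper curvature near zero.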