11 research outputs found

    Combining electro- and magnetoencephalography data using directional archetypal analysis

    Metastable microstates in electro- and magnetoencephalographic (EEG and MEG) measurements are usually determined using a modified k-means that accounts for polarity-invariant states. However, hard state-assignment approaches assume that the brain traverses microstates in a discrete rather than continuous fashion. We present multimodal, multisubject directional archetypal analysis as a scale- and polarity-invariant extension of archetypal analysis using a loss function based on the Watson distribution. With this method, EEG/MEG microstates are modeled using subject- and modality-specific archetypes that are representative, distinct topographic maps between which the brain continuously traverses. Archetypes are specified as convex combinations of unit-norm input data based on a shared generator matrix, thus assuming that the timing of neural responses to stimuli is consistent across subjects and modalities. The input data are reconstructed as convex combinations of archetypes using a subject- and modality-specific continuous archetypal mixing matrix. We showcase the model on synthetic data and an openly available face-perception event-related potential data set with concurrently recorded EEG and MEG. In synthetic and unimodal experiments, we compare our model to conventional Euclidean multisubject archetypal analysis. We also contrast our model with a directional clustering model with discrete state assignments to highlight the advantages of modeling state trajectories rather than hard assignments. We find that our approach successfully models scale- and polarity-invariant data, such as microstates, accounting for intersubject and intermodal variability. The model is readily extendable to other modalities, ensuring component correspondence while elucidating spatiotemporal signal variability.
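    The core construction shared by the archetypal models in this list — archetypes formed as convex combinations of the data via a generator matrix, and each sample reconstructed as a convex combination of the archetypes via a mixing matrix — can be sketched in a few lines. This is a minimal Euclidean illustration with random (unfitted) weight matrices, not the directional Watson-based model of the paper; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 6, 3, 2                    # samples, features, archetypes
X = rng.random((n, d))

# Generator matrix C: each archetype is a convex combination of data points.
C = rng.random((n, k))
C /= C.sum(axis=0, keepdims=True)    # columns sum to 1 (convexity)
archetypes = C.T @ X                 # k x d matrix of archetypes

# Mixing matrix S: each sample is a convex combination of archetypes.
S = rng.random((k, n))
S /= S.sum(axis=0, keepdims=True)    # columns sum to 1 (convexity)
X_hat = S.T @ archetypes             # n x d reconstruction of the data
```

    Fitting the model amounts to optimizing C and S under these convexity constraints so that X_hat approximates X.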

    A data-driven classification of 3D foot types by archetypal shapes based on landmarks

    The taxonomy of foot shapes or other parts of the body is important, especially for design purposes. We propose a methodology based on archetypoid analysis (ADA) that overcomes the weaknesses of previous methodologies used to establish typologies. ADA is an objective, data-driven methodology that seeks extreme patterns: the archetypal profiles in the data. ADA also explains the data as percentages of the archetypal patterns, which makes this technique understandable and accessible even for non-experts. Clustering techniques are usually considered for establishing taxonomies, but we show that finding the purest or most extreme patterns is more appropriate than using the central points returned by clustering techniques. We apply the methodology to an anthropometric database of 775 3D right foot scans representing the Spanish adult female and male population for footwear design. Each foot is described by a 5626 × 3 configuration matrix of landmarks. No multivariate features are used for establishing the taxonomy; instead, all the information gathered from the 3D scanning is employed. We use ADA for shapes described by landmarks. Women's and men's feet are analyzed separately. We obtain 3 archetypal feet for both men and women. These archetypal feet could not have been recovered using multivariate techniques.

    Finding archetypal patterns for binary questionnaires

    Archetypal analysis is an exploratory tool that explains a set of observations as mixtures of pure (extreme) patterns. If the patterns are actual observations of the sample, we refer to them as archetypoids. For the first time, we propose using archetypoid analysis for binary observations. This tool can contribute to the understanding of a binary data set, as in the multivariate case. We illustrate the advantages of the proposed methodology in a simulation study and two applications, one exploring objects (rows) and the other exploring items (columns): the first is related to determining student skill-set profiles and the second to describing item response functions.

    Archetype analysis: A new subspace outlier detection approach

    The problem of detecting outliers in multivariate data sets with continuous numerical features is addressed by a new method. This method combines projections into relevant subspaces obtained by archetype analysis with a nearest-neighbor algorithm, through an appropriate ensemble of the results. Our method is able to detect an anomaly in a simple data set with a linear correlation of two features, while other methods fail to recognize that anomaly. Our method performs among the best in an extensive comparison with 23 state-of-the-art outlier detection algorithms on several benchmark data sets. Finally, a novel industrial data set is introduced, and an outlier analysis is carried out to improve the fit of footwear, since this kind of analysis has never been fully exploited in the anthropometric field. Funding for open access charge: CRUE-Universitat Jaume
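    The nearest-neighbor scoring stage of such a detector can be sketched as follows. This is a generic k-NN distance outlier score on a toy data set with one planted anomaly, not the paper's full archetype-projection ensemble; the function name and the choice of k are illustrative.

```python
import numpy as np

def knn_outlier_scores(X, k=3):
    """Mean distance to the k nearest neighbours; larger = more anomalous."""
    # Full pairwise Euclidean distance matrix.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)          # a point is not its own neighbour
    return np.sort(D, axis=1)[:, :k].mean(axis=1)

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))             # dense Gaussian cloud
X = np.vstack([X, [[8.0, 8.0]]])         # one planted anomaly at index 50
scores = knn_outlier_scores(X)
```

    In the paper's setting the scores would be computed in several archetype-derived subspaces and then combined by an ensemble rule.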

    Classifying top economists using archetypoid analysis

    Updating the study by Seiler and Wohlrabe (2013), we use archetypoid analysis to classify top economists. The approach allows us to identify typical characteristics of extreme (archetypal) values in a multivariate data set. In contrast to its predecessor, archetypal analysis, archetypoids always represent actual observed units in the data. Using bibliometric data on 776 top economists, we identify four archetypoids. These types represent solid, low, top, and diligent performers. Each economist is assigned to one or more of these archetypoids.

    Archetypal analysis with missing data: see all samples by looking at a few based on extreme profiles

    In this paper we propose several methodologies for handling missing or incomplete data in archetype analysis (AA) and archetypoid analysis (ADA). AA seeks to find archetypes, which are convex combinations of data points, and to approximate the samples as mixtures of those archetypes. In ADA, the representative archetypal data belong to the sample, i.e. they are actual data points. With the proposed procedures, missing data are neither discarded nor filled in beforehand by imputation, and the theoretical properties regarding the location of archetypes are guaranteed, unlike in previous approaches. The new procedures adapt the AA algorithm either by considering the missing values in the computation of the solution or by skipping them. In the first case, the solutions of previous approaches are modified in order to fulfill the theory, and a new procedure is proposed in which the missing values are updated by the fitted values. In the second case, the procedure is based on the estimation of dissimilarities between samples and the projection of these dissimilarities into a new space, where AA or ADA is applied, and those results are used to provide a solution in the original space. A comparative analysis is carried out in a simulation study, with favorable results. The methodology is also applied to two real data sets: a well-known climate data set and a global development data set. We illustrate how these unsupervised methodologies allow complex data to be understood, even by non-experts.
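    The "skip the missing values" idea — evaluating the reconstruction error only over observed entries, so missing cells need neither discarding nor prior imputation — can be illustrated with a masked loss. This is a minimal sketch with a random stand-in reconstruction, not the paper's fitting algorithm; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.random((5, 4))                     # data matrix
observed = np.ones_like(X, dtype=bool)
observed[0, 1] = observed[3, 2] = False    # two entries marked missing

X_hat = rng.random((5, 4))                 # stand-in AA reconstruction
# Residual sum of squares over observed entries only; the two missing
# cells contribute nothing to the objective.
loss = np.sum((X - X_hat)[observed] ** 2)
full_loss = np.sum((X - X_hat) ** 2)       # for comparison: all entries
```

    Inside an AA solver, this masked objective replaces the full Frobenius loss at every update of the generator and mixing matrices.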

    Archetypal shapes based on landmarks and extension to handle missing data

    Archetype and archetypoid analysis are extended to shapes. The objective is to find representative shapes. Archetypal shapes are pure (extreme) shapes. We focus on the case where the shape of an object is represented by a configuration matrix of landmarks. As shape space is not a vector space, we work in the tangent space, the linearized space about the mean shape. Each observation is then approximated by a convex combination of actual observations (archetypoids) or of archetypes, which are themselves convex combinations of observations in the data set. These tools can contribute to the understanding of shapes, as in the usual multivariate case, since they lie somewhere between clustering and matrix factorization methods. A new simplex visualization tool is also proposed to provide a picture of the archetypal analysis results. We also propose new algorithms for performing archetypal analysis with missing data and its extension to incomplete shapes. A well-known data set is used to illustrate the methodologies developed. The proposed methodology is applied to an apparel design problem in children.
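    The tangent-space step can be sketched for the simplest case: preshapes (centered, unit-norm landmark vectors) live on a sphere, and the tangent space at the mean shape is its orthogonal complement. This toy projection ignores rotation alignment (Procrustes fitting) and treats shapes as flat vectors; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

def preshape(v):
    """Center and scale a landmark vector to unit norm (a 'preshape')."""
    v = v - v.mean()
    return v / np.linalg.norm(v)

m = preshape(rng.random(10))     # stand-in mean shape (the pole)
x = preshape(rng.random(10))     # one observed shape

# Project x onto the tangent space at m: remove the component along m.
t = x - (m @ x) * m
```

    Once all shapes are mapped to tangent vectors like t, ordinary (Euclidean) archetypal analysis can be run in that linearized space.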

    Bayesian Modelling of Functional Whole Brain Connectivity


    Graph learning and its applications : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science, Massey University, Albany, Auckland, New Zealand

    Since graph features consider the correlations between two data points, they provide high-order information, i.e. more complex correlations than the low-order information obtained from individual data points, and have therefore attracted much attention in real applications. The key to graph feature extraction is graph construction. Previous studies have demonstrated that the quality of the graph usually determines the effectiveness of the graph feature. However, the graph is usually constructed from the original data, which often contain noise and redundancy. To address this issue, graph learning iteratively adjusts the graph and the model parameters, thereby improving the quality of the graph and outputting optimal model parameters. As a result, graph learning has become a very popular research topic in traditional machine learning and deep learning. Although previous graph learning methods have been applied in many fields by adding a graph regularization term to the objective function, they still have issues to be addressed. This thesis focuses on graph learning, aiming to overcome the drawbacks of previous methods for different applications. We list the proposed methods as follows.
    • We propose a traditional graph learning method under supervised learning that considers the robustness and interpretability of graph learning. Specifically, we propose utilizing self-paced learning to assign important samples large weights, conducting feature selection to remove redundant features, and learning a graph matrix from a low-dimensional representation of the original data to preserve its local structure. As a consequence, both important samples and useful features are used to select support vectors in the SVM framework.
    • We propose a traditional graph learning method under semi-supervised learning that explores parameter-free fusion in graph learning. Specifically, we first employ the discrete wavelet transform and the Pearson correlation coefficient to obtain multiple fully connected Functional Connectivity brain Networks (FCNs) for every subject, and then learn a sparsely connected FCN for every subject. Finally, the ℓ1-SVM is employed to learn the important features and conduct disease diagnosis.
    • We propose a deep graph learning method that considers graph fusion in graph learning. Specifically, we first employ the Simple Linear Iterative Clustering (SLIC) method to obtain multi-scale features for every image, and then design a new graph fusion method to fine-tune the features at every scale. As a result, multi-scale feature fine-tuning, graph learning, and feature learning are embedded into a unified framework.
    All proposed methods are evaluated on real-world data sets by comparison with state-of-the-art methods. Experimental results demonstrate that our methods outperform all comparison methods.
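    The graph regularization term mentioned above is commonly the Laplacian quadratic form tr(FᵀLF), which penalizes embeddings that differ across strongly connected nodes. The sketch below builds an unnormalized Laplacian from a random symmetric affinity matrix and checks the standard identity tr(FᵀLF) = ½ Σᵢⱼ Wᵢⱼ‖fᵢ − fⱼ‖²; it is a generic illustration, not code from the thesis, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
# Symmetric nonnegative affinity (graph) matrix W with zero diagonal.
W = rng.random((n, n))
W = (W + W.T) / 2
np.fill_diagonal(W, 0)

L = np.diag(W.sum(axis=1)) - W       # unnormalized graph Laplacian
F = rng.random((n, 2))               # stand-in node embedding / model output

reg = np.trace(F.T @ L @ F)          # graph regularization term
# Equivalent pairwise smoothness penalty: 0.5 * sum_ij W_ij ||f_i - f_j||^2
pair = 0.5 * sum(W[i, j] * np.sum((F[i] - F[j]) ** 2)
                 for i in range(n) for j in range(n))
```

    In graph learning, this term is added to the task loss and W itself is updated jointly with the model parameters.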