14 research outputs found

    Clusterwise Independent Component Analysis (C-ICA): using fMRI resting state networks to cluster subjects and find neurofunctional subtypes

    Background: fMRI resting state networks (RSNs) are used to characterize brain disorders. They also show extensive heterogeneity across patients. Identifying systematic differences between RSNs in patients, i.e. discovering neurofunctional subtypes, may further increase our understanding of disease heterogeneity. Currently, no methodology is available to estimate neurofunctional subtypes and their associated RSNs simultaneously. New method: We present an unsupervised learning method for fMRI data, called Clusterwise Independent Component Analysis (C-ICA). This enables the clustering of patients into neurofunctional subtypes based on differences in shared ICA-derived RSNs. The parameters are estimated simultaneously, which leads to an improved estimation of subtypes and their associated RSNs. Results: In five simulation studies, the C-ICA model is successfully validated using both artificially and realistically simulated data (N = 30-40). The successful performance of the C-ICA model is also illustrated on an empirical data set consisting of Alzheimer's disease patients and elderly control subjects (N = 250). C-ICA is able to uncover a meaningful clustering that partially matches (balanced accuracy = .72) the diagnostic labels and identifies differences in RSNs between the Alzheimer and control cluster. Comparison with other methods: Both in the simulation study and the empirical application, C-ICA yields better results than competing clustering methods (i.e., a two-step clustering procedure based on single-subject ICAs and a Group ICA plus dual regression variant thereof) that do not simultaneously estimate a clustering and associated RSNs. Indeed, the overall mean adjusted Rand Index, a measure of cluster recovery, equals 0.65 for C-ICA and ranges from 0.27 to 0.46 for competing methods. Conclusions: The successful performance of C-ICA indicates that it is a promising method to extract neurofunctional subtypes from multi-subject resting-state fMRI data.
This method can be applied to fMRI scans of patient groups to study (neurofunctional) subtypes, which may eventually further increase understanding of disease heterogeneity. Multivariate analysis of psychological dat
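
The abstract above reports cluster recovery with the adjusted Rand Index (ARI). As a rough illustration of what that number measures, here is a minimal standard-library sketch of the usual pair-counting ARI formula. The function name and the toy labels are assumptions made for illustration; this is not the authors' code.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting adjusted Rand Index between two partitions."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: partitions agree trivially

    # Contingency counts: cells, row margins, column margins.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions put everything in one cluster
    return (sum_cells - expected) / (max_index - expected)


# Toy example: a relabeled but identical partition scores 1.0.
same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# A partition that splits every true cluster scores below zero.
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

An ARI of 1 means perfect recovery and values near 0 mean agreement no better than chance, which is the scale on which the reported 0.65 versus 0.27 to 0.46 should be read.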

    CLADAG 2021 BOOK OF ABSTRACTS AND SHORT PAPERS

    The book collects the short papers presented at the 13th Scientific Meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society (SIS). The meeting has been organized by the Department of Statistics, Computer Science and Applications of the University of Florence, under the auspices of the Italian Statistical Society and the International Federation of Classification Societies (IFCS). CLADAG is a member of the IFCS, a federation of national, regional, and linguistically-based classification societies. It is a non-profit, non-political scientific organization, whose aims are to further classification research

    Potential of psychological information to support knowledge discovery in consumer debt analysis

    In this work, we develop a Data Mining framework to explore the multifaceted nature of consumer indebtedness. Data Mining, with its numerous techniques and methods, serves as a powerful toolbox to handle the sensitivity of these data and explore the psychological aspects of this social phenomenon. We begin with a series of transformations that deal with any inconsistencies the data may contain and, more importantly, capture the essential psychological information hidden in the data and represent it in a new feature space as behavioural data. Then, we propose a novel consensus clustering framework to uncover patterns of consumer behaviour, which draws upon the ability of cluster ensembles to reveal robust clusters from difficult datasets. Our Homals Consensus successfully models the relationships between different clusterings in the cluster ensemble and manages to uncover representative clusters that are more suitable for explaining the complex patterns of a socio-economic dataset. Finally, under a supervised learning approach, the behavioural aspects of consumer indebtedness are assessed. In more detail, we take advantage of the flexibility Neural Networks provide in determining their architecture in order to propose a novel Neural Network solution, named TopDNN, that can handle non-linearities in the data and takes into account the extracted behavioural knowledge by incorporating it in the model. Together, these steps sketch an elaborate framework that can reveal the potential of the behavioural data to support Knowledge Discovery in Consumer Debt Analysis on the one hand and the ability of Data Mining to supplement existing models and theories of a complex and sensitive nature on the other

    An exploration of methodologies to improve semi-supervised hierarchical clustering with knowledge-based constraints

    Clustering algorithms with constraints (also known as semi-supervised clustering algorithms) have been introduced to the field of machine learning as a significant variant of conventional unsupervised clustering algorithms. They have been demonstrated to achieve better performance because they integrate prior knowledge during the clustering process, which enables them to uncover relevant, useful information from the data being clustered. However, the development of semi-supervised hierarchical clustering techniques is still an open and active area of investigation. The majority of current semi-supervised clustering algorithms are developed as partitional clustering (PC) methods, and only a few research efforts have been made to develop semi-supervised hierarchical clustering methods. The aim of this research is to enhance hierarchical clustering (HC) algorithms based on prior knowledge, by adopting novel methodologies. [Continues.]

    Value Measurement for New Product Category: a Conjoint Approach to Eliciting Value Structure

    The ability to measure value from the customer's point of view is central to the determination of market offerings: customers will only buy the equivalent of perceived value, and companies can only offer benefits that cost less to provide than customers are willing to pay. Conjoint analysis is the most popular individual-level value measurement method to determine the relative impact of product or service attributes on preferences and other dependent variables. This research focuses on how value measurement can be made more accurate and more reliable by measuring the relative influence of selected methodological variations on performance in prediction and on stability of value structure, and by grouping customers with similar value structure into segments which respond to product stimuli in a similar manner. Influences of the type of attributes included in the conjoint task, of the factorial design used to construct the product profiles, of the type and form of model, of the time of measurement, and of the type of cluster-based segmentation method are evaluated. Data was gathered with a questionnaire that controlled for methodological variations, and with a notebook computer as the measurement object. One repeated measurement was taken. The study was conducted in two phases. In Phase I, influences of methodological variations on accuracy in prediction and on the respective value structure were examined. In Phase II, different cluster-based segmentation methods (hierarchical clustering (HIC), non-hierarchical clustering (NHC), and fuzzy c-means clustering (FUC)) and the corresponding conjoint models were evaluated for their performance in prediction and in comparison with individual-level conjoint models. Results show the best models for a variety of design parameters are traditional individual-level, main-effects-only conjoint models. Neither modeling of interactions nor segment-level conjoint models were able to improve on prediction.
Best segment-level conjoint models were obtained with a fuzzy clustering method, worst models were obtained with k-means and the most fuzzy clustering approach. In conclusion, conjoint analysis reveals itself as a reliable method to measure individual customer value. It seems more rewarding for improvement of accuracy in prediction to apply repeated measures, or gather additional data about the respondent, than to attempt improvement on methodological variations with a single measurement
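
The abstract above contrasts k-means with fuzzy c-means at different degrees of fuzziness. As a hedged illustration of what "more fuzzy" means, the sketch below implements the textbook fuzzy c-means membership update for a single observation. The function name, the fuzzifier argument `m`, and the toy distances are assumptions for illustration; the study's actual settings are not given in the abstract.

```python
def fcm_memberships(distances, m=2.0):
    """Standard fuzzy c-means memberships of one point, given its distances
    to each cluster centre and a fuzzifier m > 1."""
    if m <= 1.0:
        raise ValueError("the fuzzifier m must be greater than 1")

    # A point sitting exactly on one or more centres belongs fully to them.
    on_centre = [d == 0 for d in distances]
    if any(on_centre):
        share = 1.0 / sum(on_centre)
        return [share if hit else 0.0 for hit in on_centre]

    # u_k is proportional to d_k ** (-2 / (m - 1)), normalised to sum to 1.
    exponent = 2.0 / (m - 1.0)
    weights = [d ** -exponent for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


# Toy point at distance 1 from centre A and distance 2 from centre B.
moderate = fcm_memberships([1.0, 2.0], m=2.0)    # clearly favours A
very_fuzzy = fcm_memberships([1.0, 2.0], m=101)  # memberships close to equal
```

As `m` approaches 1 the memberships approach a hard, k-means-like assignment, and as `m` grows they flatten towards equal shares. In that sense k-means and a very fuzzy solution sit at opposite ends of the same scale.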

    A comparison of the CAR and DAGAR spatial random effects models with an application to diabetics rate estimation in Belgium

    When hierarchically modelling an epidemiological phenomenon on a finite collection of sites in space, one must always take a latent spatial effect into account in order to capture the correlation structure that links the phenomenon to the territory. In this work, we compare two autoregressive spatial models that can be used for this purpose: the classical CAR model and the more recent DAGAR model. Unlike the former, the latter has a desirable property: its ρ parameter can be naturally interpreted as the average neighbor pair correlation and, in addition, this parameter can be directly estimated when the effect is modelled using a DAGAR rather than a CAR structure. As an application, we model the diabetics rate in Belgium in 2014 and show the adequacy of these models in predicting the response variable when no covariates are available
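
For readers unfamiliar with the CAR structure mentioned above, the following sketch builds the precision matrix of the commonly used proper CAR prior, Q = τ(D − ρW), for a small neighbourhood graph. The parameterisation, function name, and toy three-site graph are assumptions for illustration; the abstract does not state which CAR variant the authors use.

```python
def car_precision(adjacency, rho, tau=1.0):
    """Precision matrix tau * (D - rho * W) of a proper CAR prior.

    adjacency: symmetric 0/1 neighbourhood matrix W with a zero diagonal.
    rho: spatial dependence parameter, kept strictly between -1 and 1 here.
    tau: overall precision scale.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]  # diagonal of D: neighbour counts
    return [
        [
            tau * ((degrees[i] if i == j else 0.0) - rho * adjacency[i][j])
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy example: three sites on a line, site 1 neighbouring sites 0 and 2.
W = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
Q = car_precision(W, rho=0.5)
```

In this construction ρ scales the off-diagonal entries of the precision matrix rather than the correlations themselves, which is one way to see why it lacks the direct neighbor-correlation reading that the abstract attributes to the DAGAR ρ.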

    A Statistical Approach to the Alignment of fMRI Data

    Multi-subject functional Magnetic Resonance Imaging studies are critical. Anatomical and functional structure varies across subjects, so image alignment is necessary. We define a probabilistic model to describe functional alignment. By imposing a prior distribution, such as the matrix Fisher-von Mises distribution, on the orthogonal transformation parameter, the anatomical information is embedded in the estimation of the parameters, i.e., by penalizing the combination of spatially distant voxels. Real applications show an improvement in the classification and interpretability of the results compared to various functional alignment methods
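
To make the role of the prior concrete, here is a short derivation sketch under assumptions that the abstract does not spell out: subject i's data matrix X_i (time points by voxels) is aligned to a shared matrix M by an orthogonal matrix R_i with isotropic Gaussian noise of variance σ², and the prior has a location matrix F and concentration k. All symbols are illustrative.

```latex
% Assumed model: rotated subject data equal a shared template plus noise.
X_i R_i = M + E_i, \qquad R_i^\top R_i = I, \qquad
\operatorname{vec}(E_i) \sim \mathcal{N}(0, \sigma^2 I)

% Assumed prior on the orthogonal parameter (matrix von Mises-Fisher form).
p(R_i) \propto \exp\{ k \operatorname{tr}(F^\top R_i) \}

% Because \|X_i R_i\|_F = \|X_i\|_F, the log-posterior is, up to a constant,
\log p(R_i \mid X_i, M) = \operatorname{tr}\!\Big( R_i^\top
  \big( \sigma^{-2} X_i^\top M + k F \big) \Big) + \text{const}

% With the singular value decomposition
\sigma^{-2} X_i^\top M + k F = U D V^\top ,
% the maximiser over orthogonal matrices is
\hat{R}_i = U V^\top .
```

With k = 0 this reduces to the ordinary orthogonal Procrustes solution. A location matrix F whose entries shrink with the spatial distance between voxels would pull the estimate away from mixing distant voxels, which is the penalizing effect the abstract describes.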

    Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain

    The present paper explores the technical efficiency of four hotels from Teixeira Duarte Group, a renowned Portuguese hotel chain. An efficiency ranking of these four hotel units located in Portugal is established using Stochastic Frontier Analysis. This methodology makes it possible to discriminate between measurement error and systematic inefficiencies in the estimation process, which in turn allows the main causes of inefficiency to be investigated. Several suggestions for improving efficiency are offered for each hotel studied.
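
The separation of measurement error from systematic inefficiency mentioned above is usually expressed through a composed error term. The block below is the standard textbook form of a production frontier, given here as an assumed illustration; the abstract does not state the functional form or the distributions the authors chose.

```latex
% Log-output of unit i: frontier plus noise minus inefficiency.
\ln y_i = x_i^\top \beta + v_i - u_i

% Symmetric noise (measurement error, luck):
v_i \sim \mathcal{N}(0, \sigma_v^2)

% One-sided inefficiency, e.g. half-normal (an assumption):
u_i \ge 0, \qquad u_i \sim \mathcal{N}^{+}(0, \sigma_u^2)

% Technical efficiency of unit i, between 0 and 1:
\mathrm{TE}_i = \exp(-u_i)
```

Ranking units by their estimated TE_i is one way an efficiency ranking such as the one described could be obtained.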