841 research outputs found

    A Survey on Evolutionary Co-Clustering Formulations for Mining Time-Varying Data Using Sparsity Learning

    ABSTRACT: Traditional clustering and feature selection methods treat the data matrix as static. In many applications, however, the data matrices evolve smoothly over time. A simple approach to learning from these time-evolving data matrices is to analyze them separately, but such a strategy ignores the time-dependent nature of the underlying data. Two formulations are proposed for evolutionary co-clustering and feature selection based on fused Lasso regularization. The evolutionary co-clustering formulation is able to identify smoothly varying structure embedded in the matrices along the temporal dimension, since the formulation allows smoothness constraints to be imposed over the temporal dimension of the data matrices. The evolutionary feature selection formulation can uncover features shared across clusterings of the time-evolving data matrices.
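    The fused Lasso regularizer behind both formulations penalizes absolute differences between consecutive time steps, which is what encourages smooth temporal variation. A minimal sketch of the penalty (the function name, regularization weight, and toy data are illustrative assumptions, not taken from the survey):

```python
import numpy as np

def fused_lasso_penalty(V, lam=1.0):
    # Fused Lasso (total variation) term: sum of absolute differences
    # between successive time steps, with time indexed along axis 0.
    return lam * np.abs(np.diff(V, axis=0)).sum()

# Smoothly varying embeddings are penalized less than abruptly changing ones.
smooth = np.linspace(0.0, 1.0, 5).reshape(-1, 1)    # gradual drift over time
jumpy = np.array([[0.0], [1.0], [0.0], [1.0], [0.0]])  # abrupt changes
```

With these toy sequences, the smooth trajectory accumulates a total variation of 1.0 while the oscillating one accumulates 4.0, so the regularizer favors the smooth solution.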

    Dynamic Tensor Clustering

    Dynamic tensor data are becoming prevalent in numerous applications. Existing tensor clustering methods either fail to account for the dynamic nature of the data, or are inapplicable to a general-order tensor. Also, there is often a gap between statistical guarantees and computational efficiency in existing tensor clustering solutions. In this article, we aim to bridge this gap by proposing a new dynamic tensor clustering method, which takes into account both sparsity and fusion structures, and enjoys strong statistical guarantees as well as high computational efficiency. Our proposal is based upon a new structured tensor factorization that encourages both sparsity and smoothness in parameters along the specified tensor modes. Computationally, we develop a highly efficient optimization algorithm that benefits from substantial dimension reduction. In theory, we first establish a non-asymptotic error bound for the estimator from the structured tensor factorization. Built upon this error bound, we then derive the rate of convergence of the estimated cluster centers, and show that the estimated clusters recover the true cluster structures with high probability. Moreover, our proposed method can be naturally extended to co-clustering of multiple modes of the tensor data. The efficacy of our approach is illustrated via simulations and a brain dynamic functional connectivity analysis from an Autism spectrum disorder study.
    Comment: Accepted at Journal of the American Statistical Association
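    The sparsity component of such structured factorizations is commonly obtained by soft-thresholding the factor updates inside an alternating power iteration. The following is a hedged sketch of that generic building block for the matrix (order-2) case only; it is not the authors' algorithm, and the function names and parameters are illustrative:

```python
import numpy as np

def soft_threshold(v, lam):
    # Proximal operator of the L1 penalty: shrink magnitudes by lam, clip at zero.
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def sparse_rank1(X, lam=0.1, iters=50):
    # Alternating power iteration with soft-thresholding: a standard
    # building block for sparse matrix/tensor factorization.
    u, v = np.ones(X.shape[0]), np.ones(X.shape[1])
    for _ in range(iters):
        u = soft_threshold(X @ v, lam)
        u /= np.linalg.norm(u) or 1.0
        v = soft_threshold(X.T @ u, lam)
        v /= np.linalg.norm(v) or 1.0
    return u, v
```

On a noiseless rank-1 matrix built from sparse factors, the thresholded iteration zeroes out exactly the entries that are zero in the true factors.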

    Transfer Learning via Contextual Invariants for One-to-Many Cross-Domain Recommendation

    The rapid proliferation of new users and items on the social web has aggravated the gray-sheep user/long-tail item challenge in recommender systems. Historically, cross-domain co-clustering methods have successfully leveraged shared users and items across dense and sparse domains to improve inference quality. However, they rely on shared rating data and cannot scale to multiple sparse target domains (i.e., the one-to-many transfer setting). This, combined with the increasing adoption of neural recommender architectures, motivates us to develop scalable neural layer-transfer approaches for cross-domain learning. Our key intuition is to guide neural collaborative filtering with domain-invariant components shared across the dense and sparse domains, improving the user and item representations learned in the sparse domains. We leverage contextual invariances across domains to develop these shared modules, and demonstrate that with user-item interaction context, we can learn-to-learn informative representation spaces even with sparse interaction data. We show the effectiveness and scalability of our approach on two public datasets and a massive transaction dataset from Visa, a global payments technology company (19% Item Recall, 3x faster vs. training separate models for each domain). Our approach is applicable to both implicit and explicit feedback settings.
    Comment: SIGIR 202
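    Layer transfer of this kind can be sketched as copying the domain-invariant context module from a model trained on the dense source domain into each sparse-domain model, while user and item embeddings stay domain-specific. A minimal NumPy illustration under those assumptions (the dictionary layout and all names are invented for illustration, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def init_model(n_users, n_items, d=8):
    # Toy recommender parameters: embeddings are domain-specific,
    # the context module is intended to be shared across domains.
    return {
        "user_emb": rng.normal(size=(n_users, d)),  # domain-specific
        "item_emb": rng.normal(size=(n_items, d)),  # domain-specific
        "context_mlp": rng.normal(size=(d, d)),     # domain-invariant
    }

def transfer(source, target):
    # One-to-many layer transfer: reuse the context module trained on the
    # dense source domain; keep the target domain's own embeddings.
    target = dict(target)
    target["context_mlp"] = source["context_mlp"].copy()
    return target
```

Because only the shared module is copied, the same source model can seed any number of sparse target domains.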

    Interpretable Machine Learning for Electro-encephalography

    While behavioral, genetic and psychological markers can provide important information about brain health, research in this area over the last decades has focused heavily on imaging devices such as magnetic resonance imaging (MRI) to provide non-invasive information about cognitive processes. Unfortunately, MRI-based approaches, which capture the slow changes in blood oxygenation levels, cannot capture electrical brain activity, which plays out on a time scale up to three orders of magnitude faster. Electroencephalography (EEG), which has been available in clinical settings for over 60 years, measures brain activity through rapidly changing electrical potentials recorded non-invasively on the scalp. Compared to MRI-based research into neurodegeneration, EEG-based research has, over the last decade, received much less interest from the machine learning community. Yet EEG in combination with sophisticated machine learning offers great potential, such that neglecting this source of information, compared to MRI or genetics, is not warranted. When collaborating with clinical experts, the ability to link any results provided by machine learning to the existing body of research is especially important, as it ultimately provides an intuitive or interpretable understanding. Here, interpretable means that medical experts can translate the insights provided by a statistical model into a working hypothesis relating to brain function. To this end, our first contribution is a method for ultra-sparse regression, applied to EEG data in order to identify a small subset of important diagnostic markers highlighting the main differences between healthy brains and brains affected by Parkinson's disease. Our second contribution builds on the idea that in Parkinson's disease, impaired functioning of the thalamus causes changes in the complexity of the EEG waveforms.
    The thalamus is a small region in the center of the brain that is affected early in the course of the disease. Furthermore, the thalamus is believed to function as a pacemaker - akin to the conductor of an orchestra - so that changes in complexity are expressed and quantifiable in the EEG. We use these changes in complexity to show their association with future cognitive decline. In our third contribution we propose an extension of archetypal analysis embedded in a deep neural network. This generative version of archetypal analysis learns a representation in which every sample of a data set can be decomposed into a weighted sum of extreme representatives, the so-called archetypes. This opens up the interesting possibility of interpreting a data set relative to its most extreme representatives, whereas clustering algorithms describe a data set relative to its most average representatives. For Parkinson's disease, we show with deep archetypal analysis that healthy brains produce archetypes which are different from those produced by brains affected by neurodegeneration.
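    The decomposition at the core of archetypal analysis can be shown in a few lines: a sample is expressed as a convex combination of archetypes, so its mixture weights are nonnegative and sum to one. A toy sketch with known archetypes (the thesis's deep, generative version learns the archetypes and the representation jointly; everything below is an illustrative assumption):

```python
import numpy as np

# Three archetypes (extreme representatives) at the corners of a 2-D simplex.
Z = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])
w_true = np.array([0.2, 0.5, 0.3])  # convex mixture weights
x = w_true @ Z                      # observed sample inside the simplex

# Recover the weights: solve Z^T w = x together with the sum-to-one
# constraint as one small least-squares system.
A = np.vstack([Z.T, np.ones(3)])
b = np.append(x, 1.0)
w, *_ = np.linalg.lstsq(A, b, rcond=None)
```

Interpreting `w` tells us how close the sample is to each extreme representative, which is the reading the thesis contrasts with cluster-centroid (most average) descriptions.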

    Bayesian Approaches For Modeling Variation

    A core focus of statistics is determining how much of the variation in data may be attributed to the signal of interest, and how much to noise. When the sources of variation are many and complex, a Bayesian approach to data analysis offers a number of advantages. In this thesis, we propose and implement new Bayesian methods for modeling variation in two general settings. The first setting is high-dimensional linear regression where the unknown error variance is also of interest. Here, we show that a commonly used class of conjugate shrinkage priors can lead to underestimation of the error variance. We then extend the Spike-and-Slab Lasso (SSL, Rockova and George, 2018) to the unknown-variance case, using an alternative, independent prior framework. This extended procedure outperforms both the fixed-variance approach and alternative penalized likelihood methods on simulated and real data. For the second setting, we move from univariate response data where the predictors are known to multivariate response data in which potential predictors are unobserved. In this setting, we first consider the problem of biclustering, where a motivating example is to find subsets of genes which have similar expression in a subset of patients. For this task, we propose a new biclustering method called Spike-and-Slab Lasso Biclustering (SSLB). SSLB utilizes the SSL prior to find a doubly-sparse factorization of the data matrix via a fast EM algorithm. Applied to both a microarray dataset and a single-cell RNA-sequencing dataset, SSLB recovers biologically meaningful signal in the data. The second problem we consider in this setting is nonlinear factor analysis. The goal here is to find low-dimensional, unobserved "factors" which drive the variation in the high-dimensional observed data in a potentially nonlinear fashion.
    For this purpose, we develop factor analysis BART (faBART), an MCMC algorithm which alternates sampling from the posterior of (a) the factors and (b) a functional approximation to the mapping from the factors to the data. The latter step utilizes Bayesian Additive Regression Trees (BART, Chipman et al., 2010). On a variety of simulation settings, we demonstrate that with only the observed data as input, faBART is able to recover both the unobserved factors and the nonlinear mapping.
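    The Spike-and-Slab Lasso prior underlying both the regression procedure and SSLB mixes two Laplace densities: a sharp spike concentrated at zero and a diffuse slab. The posterior probability that a coefficient comes from the slab grows with its magnitude, which is what produces the adaptive shrinkage. A hedged numerical sketch (the hyperparameter values are illustrative, not the thesis's choices):

```python
import numpy as np

def laplace_pdf(beta, lam):
    # Density of a Laplace(0, 1/lam) distribution.
    return (lam / 2.0) * np.exp(-lam * np.abs(beta))

def slab_probability(beta, lam0=20.0, lam1=0.5, theta=0.5):
    # Posterior probability that beta is drawn from the diffuse slab
    # (rate lam1) rather than the concentrated spike at zero (rate lam0).
    slab = theta * laplace_pdf(beta, lam1)
    spike = (1.0 - theta) * laplace_pdf(beta, lam0)
    return slab / (slab + spike)
```

Small coefficients are attributed to the spike and shrunk hard toward zero, while large coefficients are attributed to the slab and left nearly unpenalized.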