
    Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data

    Subsequence clustering of multivariate time series is a useful tool for discovering repeated patterns in temporal data. Once these patterns have been discovered, seemingly complicated datasets can be interpreted as a temporal sequence of only a small number of states, or clusters. For example, raw sensor data from a fitness-tracking application can be expressed as a timeline of a select few actions (e.g., walking, sitting, running). However, discovering these patterns is challenging because it requires simultaneous segmentation and clustering of the time series. Furthermore, interpreting the resulting clusters is difficult, especially when the data is high-dimensional. Here we propose a new method of model-based clustering, which we call Toeplitz Inverse Covariance-based Clustering (TICC). Each cluster in the TICC method is defined by a correlation network, or Markov random field (MRF), characterizing the interdependencies between different observations in a typical subsequence of that cluster. Based on this graphical representation, TICC simultaneously segments and clusters the time series data. We solve the TICC problem through alternating minimization, using a variation of the expectation maximization (EM) algorithm. We derive closed-form solutions to efficiently solve the two resulting subproblems in a scalable way, through dynamic programming and the alternating direction method of multipliers (ADMM), respectively. We validate our approach by comparing TICC to several state-of-the-art baselines in a series of synthetic experiments, and we then demonstrate on an automobile sensor dataset how TICC can be used to learn interpretable clusters in real-world scenarios. Comment: This revised version fixes two small typos in the published version.
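    The segmentation half of the alternating minimization lends itself to a compact illustration. The sketch below is not the authors' code: it assumes a precomputed (T x K) matrix of per-cluster negative log-likelihoods and a switching penalty beta, and shows the dynamic program that trades goodness of fit against temporal consistency when assigning each time point to a cluster.

```python
import numpy as np

def assign_clusters(neg_log_lik, beta):
    """neg_log_lik: (T, K) per-cluster costs; beta: penalty for switching clusters."""
    T, K = neg_log_lik.shape
    cost = np.zeros((T, K))             # cost[t, k]: best total cost ending in cluster k at time t
    prev = np.zeros((T, K), dtype=int)  # back-pointers for the optimal path
    cost[0] = neg_log_lik[0]
    for t in range(1, T):
        # transition cost from cluster j to cluster k: beta unless j == k
        trans = cost[t - 1][:, None] + beta * (1.0 - np.eye(K))
        prev[t] = np.argmin(trans, axis=0)
        cost[t] = neg_log_lik[t] + trans[prev[t], np.arange(K)]
    path = np.empty(T, dtype=int)       # backtrack the cheapest cluster sequence
    path[-1] = int(np.argmin(cost[-1]))
    for t in range(T - 1, 0, -1):
        path[t - 1] = prev[t, path[t]]
    return path
```

    The ADMM step that re-estimates each cluster's block-Toeplitz inverse covariance is omitted here; in the full algorithm the two steps alternate until the assignments stop changing.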

    Fast Optimal Transport Averaging of Neuroimaging Data

    Knowing how the human brain is anatomically and functionally organized at the level of a group of healthy individuals or patients is the primary goal of neuroimaging research. Yet computing an average of brain imaging data defined over a voxel grid or a triangulation remains a challenge. Data are large, the geometry of the brain is complex, and between-subject variability leads to spatially or temporally non-overlapping effects of interest. To address the problem of variability, data are commonly smoothed before group linear averaging. In this work we build on ideas originally introduced by Kantorovich to propose a new algorithm that can efficiently average non-normalized data defined over arbitrary discrete domains using transportation metrics. We show how Kantorovich means can be linked to Wasserstein barycenters in order to take advantage of an entropic smoothing approach. This leads to a smooth convex optimization problem and an algorithm with strong convergence guarantees. We illustrate the versatility of this tool and its empirical behavior on functional neuroimaging data, functional MRI and magnetoencephalography (MEG) source estimates, defined on voxel grids and triangulations of the folded cortical surface. Comment: Information Processing in Medical Imaging (IPMI), Jun 2015, Isle of Skye, United Kingdom. Springer, 2015.
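    As a rough illustration of the entropic smoothing idea, the sketch below implements the standard iterative-Bregman-projection barycenter for normalized histograms on a shared discrete domain; the paper's extension to non-normalized data is not reproduced here, and all names are assumptions.

```python
import numpy as np

def entropic_barycenter(hists, M, reg=1e-2, weights=None, n_iter=500):
    """hists: (n, d) probability vectors; M: (d, d) ground cost; reg: entropic regularization."""
    n, d = hists.shape
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights)
    K = np.exp(-M / reg)                      # Gibbs kernel
    u = np.ones((n, d))
    v = np.ones((n, d))
    tiny = 1e-300                             # guard against division by zero
    for _ in range(n_iter):
        u = hists / np.maximum((K @ v.T).T, tiny)            # scale toward each input marginal
        Ktu = np.maximum((K.T @ u.T).T, tiny)                # (n, d)
        p = np.exp((w[:, None] * np.log(Ktu)).sum(axis=0))   # weighted geometric mean = barycenter
        v = p[None, :] / Ktu
    return p
```

    Larger `reg` produces a smoother barycenter, which is the kind of smoothing the abstract contrasts with pre-smoothing the data before linear averaging.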

    Parameter estimation for biochemical reaction networks using Wasserstein distances

    We present a method for estimating parameters in stochastic models of biochemical reaction networks by fitting steady-state distributions using Wasserstein distances. We simulate a reaction network at different parameter settings and train a Gaussian process to learn the Wasserstein distance between observations and the simulator output for all parameters. We then use Bayesian optimization to find parameters minimizing this distance based on the trained Gaussian process. The effectiveness of our method is demonstrated on the three-stage model of gene expression and a genetic feedback loop for which moment-based methods are known to perform poorly. Our method is applicable to any simulator model of stochastic reaction networks, including Brownian Dynamics. Comment: 22 pages, 8 figures. Slight modifications/additions to the text; added new section (Section 4.4) and Appendix.
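    A minimal sketch of this loop is given below, assuming a one-dimensional parameter, a hypothetical `simulate(theta)` function returning steady-state samples, and scipy/scikit-learn as stand-ins; the surrogate and the lower-confidence-bound acquisition are one plausible instantiation, not necessarily the authors' exact choices.

```python
import numpy as np
from scipy.stats import wasserstein_distance
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def fit_parameter(simulate, observed, bounds, n_init=5, n_iter=20, seed=0):
    """Minimize the Wasserstein distance between observed and simulated samples."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    thetas = list(rng.uniform(lo, hi, size=n_init))
    dists = [wasserstein_distance(observed, simulate(t)) for t in thetas]
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    for _ in range(n_iter):
        gp.fit(np.array(thetas)[:, None], np.array(dists))        # surrogate of the distance surface
        grid = np.linspace(lo, hi, 200)[:, None]
        mu, sd = gp.predict(grid, return_std=True)
        theta_next = float(grid[np.argmin(mu - sd), 0])            # lower confidence bound
        thetas.append(theta_next)
        dists.append(wasserstein_distance(observed, simulate(theta_next)))
    return thetas[int(np.argmin(dists))]
```

    For a vector of parameters the same loop applies, with the grid search replaced by a proper acquisition optimizer.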

    Machine-learning of atomic-scale properties based on physical principles

    We briefly summarize the kernel regression approach, as used recently in materials modelling, to fitting functions, particularly potential energy surfaces, and highlight how the linear algebra framework can be used to both predict and train from linear functionals of the potential energy, such as the total energy and atomic forces. We then give a detailed account of the Smooth Overlap of Atomic Positions (SOAP) representation and kernel, showing how it arises from an abstract representation of smooth atomic densities, and how it is related to several popular density-based representations of atomic structure. We also discuss recent generalisations that allow fine control of correlations between different atomic species, prediction and fitting of tensorial properties, and also how to construct structural kernels, applicable to comparing entire molecules or periodic systems, that go beyond an additive combination of local environments.
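    To make the fitting framework concrete, here is a bare-bones kernel ridge regression on per-structure descriptor vectors. The descriptors (e.g., a SOAP vector per structure) are assumed to come from elsewhere, and the single Gaussian kernel stands in for the additive combination of local-environment kernels and the force (derivative) observations discussed above.

```python
import numpy as np

def gaussian_kernel(X1, X2, sigma=1.0):
    """Squared-exponential kernel between two sets of descriptor vectors."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_energies(X_train, y_train, sigma=1.0, lam=1e-6):
    """Kernel ridge regression of total energies; returns a predictor for new structures."""
    K = gaussian_kernel(X_train, X_train, sigma)
    alpha = np.linalg.solve(K + lam * np.eye(len(y_train)), y_train)
    return lambda X_new: gaussian_kernel(X_new, X_train, sigma) @ alpha
```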

    Generative Embedding for Model-Based Classification of fMRI Data

    Decoding models, such as those underlying multivariate classification algorithms, have been increasingly used to infer cognitive or clinical brain states from measures of brain activity obtained by functional magnetic resonance imaging (fMRI). The practicality of current classifiers, however, is restricted by two major challenges. First, due to the high data dimensionality and low sample size, algorithms struggle to separate informative from uninformative features, resulting in poor generalization performance. Second, popular discriminative methods such as support vector machines (SVMs) rarely afford mechanistic interpretability. In this paper, we address these issues by proposing a novel generative-embedding approach that incorporates neurobiologically interpretable generative models into discriminative classifiers. Our approach extends previous work on trial-by-trial classification for electrophysiological recordings to subject-by-subject classification for fMRI and offers two key advantages over conventional methods: it may provide more accurate predictions by exploiting discriminative information encoded in ‘hidden’ physiological quantities such as synaptic connection strengths; and it affords mechanistic interpretability of clinical classifications. Here, we introduce generative embedding for fMRI using a combination of dynamic causal models (DCMs) and SVMs. We propose a general procedure of DCM-based generative embedding for subject-wise classification, provide a concrete implementation, and suggest good-practice guidelines for unbiased application of generative embedding in the context of fMRI. We illustrate the utility of our approach with a clinical example in which we classify moderately aphasic patients and healthy controls using a DCM of thalamo-temporal regions during speech processing. Generative embedding achieves a near-perfect balanced classification accuracy of 98% and significantly outperforms conventional activation-based and correlation-based methods. This example demonstrates how disease states can be detected with very high accuracy and, at the same time, be interpreted mechanistically in terms of abnormalities in connectivity. We envisage that future applications of generative embedding may provide crucial advances in dissecting spectrum disorders into physiologically better-defined subgroups.
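    Once the per-subject generative models have been fitted, the generative-embedding pipeline itself reduces to a few lines. In the sketch below, `theta` is a placeholder for the (n_subjects x n_params) matrix of estimated DCM parameters such as connection strengths; the linear SVM, scaling, and stratified cross-validation are one plausible instantiation rather than the authors' exact protocol.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold, cross_val_score

def classify_subjects(theta, labels, n_splits=5):
    """theta: per-subject generative-model parameter estimates; labels: e.g. patient vs control."""
    clf = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    scores = cross_val_score(clf, theta, labels, cv=cv, scoring="balanced_accuracy")
    return scores.mean()
```

    Because the features are model parameters rather than raw voxels, the weights of a fitted linear SVM can be read back onto individual connections, which is where the mechanistic interpretability comes from.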

    Reducing Crowding by Weakening Inhibitory Lateral Interactions in the Periphery with Perceptual Learning

    We investigated whether lateral masking in the near-periphery, due to inhibitory lateral interactions at an early level of central visual processing, could be weakened by perceptual learning and whether learning transferred to an untrained, higher-level lateral masking known as crowding. The trained task was contrast detection of a Gabor target presented in the near periphery (4°) in the presence of co-oriented and co-aligned high-contrast Gabor flankers, with target-to-flanker separations along the vertical axis varying from 2λ to 8λ. We found both suppressive and facilitatory lateral interactions at target-to-flanker distances (2λ-4λ and 8λ, respectively) that were larger than those found in the fovea. Training reduced suppression but did not increase facilitation. Most importantly, we found that learning reduced crowding and improved contrast sensitivity, but had no effect on visual acuity (VA). These results suggest a different pattern of connectivity in the periphery with respect to the fovea, as well as a different modulation of this connectivity via perceptual learning, which not only reduces low-level lateral masking but also reduces crowding. These results have important implications for the rehabilitation of low-vision patients who must use peripheral vision to perform tasks, such as reading and refined figure-ground segmentation, that normally sighted subjects perform in the fovea.
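    For readers unfamiliar with the stimuli, the sketch below generates a single Gabor patch with an assumed parameterisation; target and flankers would be patches of this kind placed at vertical separations of 2λ to 8λ (multiples of the carrier wavelength). This is illustrative only, not the authors' stimulus code.

```python
import numpy as np

def gabor(size=128, wavelength=16.0, sigma=20.0, contrast=1.0, orientation=0.0):
    """Gabor patch: a sinusoidal carrier under a Gaussian envelope (orientation in radians)."""
    y, x = np.mgrid[-size // 2:size // 2, -size // 2:size // 2].astype(float)
    xr = x * np.cos(orientation) + y * np.sin(orientation)
    carrier = np.cos(2.0 * np.pi * xr / wavelength)            # wavelength plays the role of lambda
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return contrast * carrier * envelope
```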