5,358 research outputs found

    GC-Flow: A Graph-Based Flow Network for Effective Clustering

    Graph convolutional networks (GCNs) are \emph{discriminative models} that directly model the class posterior $p(y|\mathbf{x})$ for semi-supervised classification of graph data. While effective for classification, a GCN used as a representation-learning approach often extracts node representations that miss information useful for clustering, because the two objectives differ. In this work, we design normalizing flows that replace GCN layers, leading to a \emph{generative model} that models both the class-conditional likelihood $p(\mathbf{x}|y)$ and the class prior $p(y)$. The resulting neural network, GC-Flow, retains the graph convolution operations while being equipped with a Gaussian-mixture representation space. It enjoys two benefits: it not only maintains the predictive power of a GCN, but also produces well-separated clusters, owing to the structuring of the representation space. We demonstrate these benefits on a variety of benchmark data sets. Moreover, we show that additional parameterization, such as that of the adjacency matrix used for graph convolutions, yields further improvement in clustering. Comment: ICML 2023. Code is available at https://github.com/xztcwang/GCFlo
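    To make the construction concrete, here is a minimal, hypothetical sketch of the idea as the abstract describes it, not the released GCFlow code: an invertible coupling layer whose scale and shift networks see graph-convolved features, with a Gaussian-mixture latent space so that the class posterior follows from Bayes' rule. The layer sizes, unit-variance mixture components, and two-block depth are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class GraphCoupling(nn.Module):
    """Affine coupling layer whose conditioner applies a graph convolution."""
    def __init__(self, dim, adj):
        super().__init__()
        self.adj = adj                       # normalized adjacency, shape (N, N)
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, 64), nn.ReLU(),
            nn.Linear(64, 2 * (dim - self.half)))

    def forward(self, x):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        h = self.adj @ x1                    # graph convolution on the kept half
        s, t = self.net(h).chunk(2, dim=-1)
        s = torch.tanh(s)                    # bounded log-scale for stability
        z2 = x2 * s.exp() + t
        return torch.cat([z2, x1], dim=-1), s.sum(dim=-1)  # swap halves

class GCFlowSketch(nn.Module):
    def __init__(self, dim, n_classes, adj, n_blocks=2):
        super().__init__()
        self.blocks = nn.ModuleList(GraphCoupling(dim, adj) for _ in range(n_blocks))
        self.means = nn.Parameter(torch.randn(n_classes, dim))  # mixture means
        self.log_prior = nn.Parameter(torch.zeros(n_classes))   # class prior p(y)

    def log_joint(self, x):
        """log p(x, y) for every class y, via the change-of-variables formula."""
        z, logdet = x, 0.0
        for blk in self.blocks:
            z, ld = blk(z)
            logdet = logdet + ld
        # Unit-variance Gaussian component per class (a simplifying assumption).
        log_comp = -0.5 * ((z.unsqueeze(1) - self.means) ** 2).sum(-1)
        log_py = torch.log_softmax(self.log_prior, dim=0)
        return log_comp + log_py + logdet.unsqueeze(1)

# Classification uses Bayes' rule: p(y|x) is the softmax over log p(x, y).
```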

    From patterned response dependency to structured covariate dependency: categorical-pattern-matching

    Data generated from a system of interest typically consists of measurements on an ensemble of subjects across multiple response and covariate features, and is naturally represented by one response matrix paired with one covariate matrix. Each of these two matrices usually contains heterogeneous data types: continuous, discrete and categorical. Here a matrix serves as a practical platform that keeps hidden dependencies among subjects and among features intact on its lattice. Response dependency and covariate dependency are computed individually and expressed through multiscale blocks via a newly developed computing paradigm named Data Mechanics. We propose a categorical pattern-matching approach to establish causal linkages in the form of information flows from patterned response dependency to structured covariate dependency. The strength of an information flow is evaluated by applying combinatorial information theory. This unified platform for system knowledge discovery is illustrated on five data sets. In each case, an information flow is demonstrated as an organization of discovered knowledge loci via emergent, visible and readable heterogeneity. This unified approach fundamentally addresses many long-standing issues in data analysis, including statistical modeling, multiple responses, renormalization and feature selection, without imposing man-made structures or distribution assumptions. The results reported here support the idea that linking patterns of response dependency to structures of covariate dependency is the philosophical foundation underlying data-driven computing and learning in the sciences. Comment: 32 pages, 10 figures, 3 box pictures
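    The pattern-matching step lends itself to a small illustration. The sketch below is a loose stand-in, not the Data Mechanics paradigm itself: subjects are clustered separately on the response matrix and on the covariate matrix, and the strength of the resulting information flow is scored with mutual information between the two categorical block patterns. The hierarchical clustering and the MI estimator are assumed substitutes for the paper's machinery.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.metrics import mutual_info_score

def block_pattern(matrix, n_blocks):
    """Hierarchically cluster subjects (rows) into categorical blocks."""
    Z = linkage(matrix, method="ward")
    return fcluster(Z, t=n_blocks, criterion="maxclust")

# Synthetic example: responses driven by two covariate directions plus noise.
rng = np.random.default_rng(0)
covariates = rng.normal(size=(100, 8))
responses = covariates[:, :2] @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(100, 5))

resp_labels = block_pattern(responses, n_blocks=4)    # patterned response dependency
cov_labels = block_pattern(covariates, n_blocks=4)    # structured covariate dependency

# High mutual information means the covariate blocks "explain" the response blocks.
print("information-flow strength:", mutual_info_score(resp_labels, cov_labels))
```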

    Semi-Supervised Generation with Cluster-aware Generative Models

    Deep generative models trained with large amounts of unlabelled data have proven to be powerful within the domain of unsupervised learning. Many real-life data sets contain a small number of labelled data points, which are typically disregarded when training generative models. We propose the Cluster-aware Generative Model, which uses unlabelled information to infer a latent representation that models the natural clustering of the data, and additional labelled data points to refine this clustering. The generative performance of the model improves significantly when labelled information is exploited, obtaining a log-likelihood of -79.38 nats on permutation-invariant MNIST, while also achieving competitive semi-supervised classification accuracies. The model can also be trained fully unsupervised, and still improves the log-likelihood relative to related methods.
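    A minimal sketch of the cluster-aware idea, assuming a single stochastic layer and a Gaussian-mixture prior (the paper's model is richer): a categorical cluster variable selects the prior component of the latent code; it is marginalized out for unlabelled data and clamped to the label when one is available.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ClusterVAE(nn.Module):
    def __init__(self, x_dim, z_dim, n_clusters):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU(),
                                 nn.Linear(256, 2 * z_dim))
        self.cls = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU(),
                                 nn.Linear(256, n_clusters))
        self.dec = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(),
                                 nn.Linear(256, x_dim))
        self.prior_mu = nn.Parameter(torch.randn(n_clusters, z_dim))  # one mean per cluster

    def elbo_given_y(self, x, y):
        """ELBO with the cluster variable fixed to y; x assumed binarized (e.g. MNIST)."""
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        rec = -F.binary_cross_entropy_with_logits(self.dec(z), x, reduction="none").sum(-1)
        # KL between q(z|x) and the cluster-specific prior N(prior_mu[y], I).
        kl = 0.5 * (logvar.exp() + (mu - self.prior_mu[y]) ** 2 - 1 - logvar).sum(-1)
        return rec - kl

    def loss(self, x, y=None):
        logits = self.cls(x)
        if y is not None:                       # labelled: clamp y, add a classification term
            return -self.elbo_given_y(x, y).mean() + F.cross_entropy(logits, y)
        q_y = logits.softmax(-1)                # unlabelled: marginalize over clusters
        elbos = torch.stack(
            [self.elbo_given_y(x, torch.full((x.shape[0],), k, dtype=torch.long))
             for k in range(q_y.shape[1])], dim=1)
        entropy = -(q_y * q_y.clamp_min(1e-8).log()).sum(-1)
        return -((q_y * elbos).sum(-1) + entropy).mean()
```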

    Auxiliary Deep Generative Models

    Deep generative models parameterized by neural networks have recently achieved state-of-the-art performance in unsupervised and semi-supervised learning. We extend deep generative models with auxiliary variables, which improve the variational approximation. The auxiliary variables leave the generative model unchanged but make the variational distribution more expressive. Inspired by the structure of the auxiliary variable, we also propose a model with two stochastic layers and skip connections. Our findings suggest that more expressive and properly specified deep generative models converge faster with better results. We show state-of-the-art performance within semi-supervised learning on the MNIST, SVHN and NORB datasets. Comment: Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 2016, JMLR: Workshop and Conference Proceedings volume 48
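    The construction can be summarized in a few lines. The following is a sketch under simplifying assumptions (unsupervised, one stochastic layer, diagonal Gaussians), not the full semi-supervised model: the generative model p(x|z)p(z) is left unchanged, while inference uses q(a|x)q(z|a,x), with a term log p(a|x,z) closing the bound.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def sample(mu, logvar):
    """Reparameterized diagonal-Gaussian sample."""
    return mu + torch.randn_like(mu) * (0.5 * logvar).exp()

def log_normal(x, mu, logvar):
    """Diagonal-Gaussian log density, summed over the last dimension."""
    return -0.5 * (logvar + (x - mu) ** 2 / logvar.exp() + math.log(2 * math.pi)).sum(-1)

class AuxiliaryVAE(nn.Module):
    def __init__(self, x_dim, z_dim, a_dim, h=256):
        super().__init__()
        self.q_a = nn.Sequential(nn.Linear(x_dim, h), nn.ReLU(), nn.Linear(h, 2 * a_dim))
        self.q_z = nn.Sequential(nn.Linear(x_dim + a_dim, h), nn.ReLU(), nn.Linear(h, 2 * z_dim))
        self.p_x = nn.Sequential(nn.Linear(z_dim, h), nn.ReLU(), nn.Linear(h, x_dim))
        self.p_a = nn.Sequential(nn.Linear(x_dim + z_dim, h), nn.ReLU(), nn.Linear(h, 2 * a_dim))

    def elbo(self, x):
        mu_a, lv_a = self.q_a(x).chunk(2, -1)
        a = sample(mu_a, lv_a)                                # a ~ q(a|x)
        mu_z, lv_z = self.q_z(torch.cat([x, a], -1)).chunk(2, -1)
        z = sample(mu_z, lv_z)                                # z ~ q(z|a,x)
        rec = -F.binary_cross_entropy_with_logits(self.p_x(z), x, reduction="none").sum(-1)
        mu_pa, lv_pa = self.p_a(torch.cat([x, z], -1)).chunk(2, -1)
        return (rec                                           # log p(x|z)
                + log_normal(z, torch.zeros_like(z), torch.zeros_like(z))  # log p(z)
                + log_normal(a, mu_pa, lv_pa)                 # log p(a|x,z)
                - log_normal(a, mu_a, lv_a)                   # -log q(a|x)
                - log_normal(z, mu_z, lv_z)).mean()           # -log q(z|a,x)
```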

    Diffeomorphic Transformations for Time Series Analysis: An Efficient Approach to Nonlinear Warping

    The proliferation and ubiquity of temporal data across many disciplines have sparked interest in similarity, classification and clustering methods specifically designed to handle time series data. A core issue when dealing with time series is determining their pairwise similarity, i.e., the degree to which a given time series resembles another. Traditional distance measures such as the Euclidean are not well-suited due to the time-dependent nature of the data. Elastic metrics such as dynamic time warping (DTW) offer a promising approach, but are limited by their computational complexity, non-differentiability and sensitivity to noise and outliers. This thesis proposes novel elastic alignment methods that use parametric and diffeomorphic warping transformations as a means of overcoming the shortcomings of DTW-based metrics. The proposed method is differentiable and invertible, well-suited for deep learning architectures, robust to noise and outliers, computationally efficient, and expressive and flexible enough to capture complex patterns. Furthermore, a closed-form solution was developed for the gradient of these diffeomorphic transformations, which allows an efficient search in the parameter space and leads to better solutions at convergence. Leveraging the benefits of these closed-form diffeomorphic transformations, this thesis proposes a suite of advancements that include: (a) an enhanced temporal transformer network for time series alignment and averaging, (b) a deep-learning-based time series classification model that simultaneously aligns and classifies signals with high accuracy, (c) an incremental time series clustering algorithm that is warping-invariant, scalable and able to operate under limited computational and time resources, and finally, (d) a normalizing flow model that enhances the flexibility of affine transformations in coupling and autoregressive layers. Comment: PhD Thesis, defended at the University of Navarra on July 17, 2023. 277 pages, 8 chapters, 1 appendix
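    As a toy illustration of the core idea, not the thesis' closed-form diffeomorphisms: the sketch below replaces DTW with a parametric, differentiable, strictly monotone (hence invertible) time warp fitted by gradient descent. The softmax-increment parameterization of the warp is an assumed stand-in for the thesis' transformation family.

```python
import math
import torch

def warp(params):
    """Strictly increasing grid on [0, 1] built from softmax-normalized increments."""
    inc = torch.softmax(params, dim=0)         # positive steps summing to 1
    grid = torch.cumsum(inc, dim=0) - inc[0]   # starts at 0
    return grid / grid[-1]                     # ends exactly at 1

def resample(series, grid):
    """Differentiable linear interpolation of `series` at warped positions."""
    pos = grid * (len(series) - 1)
    lo = pos.floor().long().clamp(max=len(series) - 2)
    frac = pos - lo
    return series[lo] * (1 - frac) + series[lo + 1] * frac

# Align a nonlinearly warped sine to its reference by minimizing squared error.
t = torch.linspace(0, 1, 100)
target = torch.sin(2 * math.pi * t)
source = torch.sin(2 * math.pi * t ** 1.5)      # nonlinearly warped copy

params = torch.zeros(100, requires_grad=True)
opt = torch.optim.Adam([params], lr=0.05)
for _ in range(300):
    opt.zero_grad()
    loss = ((resample(source, warp(params)) - target) ** 2).mean()
    loss.backward()
    opt.step()
```

    Because the warp is parametric and differentiable, it drops into deep architectures and avoids DTW's quadratic alignment cost, at the price of restricting alignments to the chosen transformation family.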

    Hybrid Models with Deep and Invertible Features

    We propose a neural hybrid model consisting of a linear model defined on a set of features computed by a deep, invertible transformation (i.e., a normalizing flow). An attractive property of our model is that both p(features), the density of the features, and p(targets | features), the predictive distribution, can be computed exactly in a single feed-forward pass. We show that our hybrid model, despite the invertibility constraints, achieves accuracy similar to that of purely predictive models. Moreover, the generative component remains a good model of the input features despite the hybrid optimization objective. This offers additional capabilities, such as detection of out-of-distribution inputs and semi-supervised learning. The availability of the exact joint density p(targets, features) also allows us to compute many quantities readily, making our hybrid model a useful building block for downstream applications of probabilistic deep learning. Comment: ICML 2019
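    A schematic of the hybrid construction (the layer sizes and tiny two-coupling flow below are placeholders, not the paper's architecture): an invertible flow yields an exact log p(x) by the change-of-variables formula, and a linear softmax head on the same flow features yields p(targets | features), so a single forward pass gives the exact joint density.

```python
import math
import torch
import torch.nn as nn

class Coupling(nn.Module):
    """Affine coupling layer: invertible, with a cheap log-determinant."""
    def __init__(self, dim):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(nn.Linear(self.half, 64), nn.ReLU(),
                                 nn.Linear(64, 2 * (dim - self.half)))

    def forward(self, x):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.net(x1).chunk(2, dim=-1)
        s = torch.tanh(s)                                  # bounded log-scale
        return torch.cat([x2 * s.exp() + t, x1], dim=-1), s.sum(dim=-1)

class HybridModel(nn.Module):
    def __init__(self, dim, n_classes, n_layers=2):
        super().__init__()
        self.flow = nn.ModuleList(Coupling(dim) for _ in range(n_layers))
        self.head = nn.Linear(dim, n_classes)              # linear model on flow features

    def forward(self, x):
        z, logdet = x, 0.0
        for layer in self.flow:
            z, ld = layer(z)
            logdet = logdet + ld
        # Exact input density under a standard-normal base distribution.
        log_px = (-0.5 * (z ** 2 + math.log(2 * math.pi))).sum(-1) + logdet
        return self.head(z), log_px                        # logits for p(y|x), exact log p(x)

# log p(y, x) = log_softmax(logits)[y] + log_px; unusually low log_px values
# can flag out-of-distribution inputs, as the abstract describes.
```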