
    Efficient Approximation for Large-Scale Kernel Clustering Analysis

    Kernel k-means is useful for clustering nonlinearly separable data, but it is hard to scale to large datasets because of its quadratic complexity in the number of samples. In this paper, we propose an approach that uses a low-dimensional feature approximation of the Gaussian kernel function so that a fast linear k-means solver can perform nonlinear kernel k-means. The approach combines the efficiency of the linear solver with the nonlinear partitioning ability of kernel clustering. Experimental results show that the proposed approach is much more efficient than a standard kernel k-means solver while achieving similar clustering performance.
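    The abstract does not spell out which low-dimensional approximation the paper uses. The sketch below is an assumption, not the paper's method: it uses scikit-learn's random Fourier features (RBFSampler) to approximate the Gaussian kernel and then runs a fast linear k-means solver on the resulting features, which is one common way to realize this recipe.

    ```python
    # Minimal sketch of the general idea (assumed construction, not the paper's):
    # approximate the Gaussian kernel with low-dimensional random Fourier
    # features, then run a fast *linear* k-means solver on those features.
    import numpy as np
    from sklearn.datasets import make_moons
    from sklearn.kernel_approximation import RBFSampler
    from sklearn.cluster import MiniBatchKMeans

    X, _ = make_moons(n_samples=10_000, noise=0.05, random_state=0)

    # Map the data into an n_components-dimensional space whose inner products
    # approximate the Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2).
    phi = RBFSampler(gamma=5.0, n_components=200, random_state=0)
    Z = phi.fit_transform(X)

    # Linear k-means on the approximate features behaves like kernel k-means
    # on X, but with cost linear (not quadratic) in the number of samples.
    labels = MiniBatchKMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(Z)
    ```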

    Selective inference after convex clustering with $\ell_1$ penalization

    Classical inference methods notoriously fail when applied to data-driven test hypotheses or inference targets. Instead, dedicated methodologies are required to obtain statistical guarantees for these selective inference problems. Selective inference is particularly relevant post-clustering, typically when testing a difference in mean between two clusters. In this paper, we address convex clustering with $\ell_1$ penalization by leveraging related selective inference tools for regression, based on Gaussian vectors conditioned to polyhedral sets. In the one-dimensional case, we prove a polyhedral characterization of obtaining given clusters, which enables us to propose a test procedure with statistical guarantees. This characterization also allows us to provide a computationally efficient regularization path algorithm. We then extend the test procedure and guarantees to multi-dimensional clustering with $\ell_1$ penalization, and to more general multi-dimensional clusterings that aggregate one-dimensional ones. With various numerical experiments, we validate our statistical guarantees and demonstrate the power of our methods to detect differences in mean between clusters. Our methods are implemented in the R package poclin. Comment: 40 pages, 8 figures
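    The abstract does not state the objective function it works with. For orientation, a commonly used convex clustering formulation with an $\ell_1$ fusion penalty is written below; this is an assumption for illustration and may differ from the paper's exact weights and notation. Observations whose fitted centroids coincide at the chosen penalty level are assigned to the same cluster.

    ```latex
    % Assumed standard convex clustering objective with an \ell_1 fusion penalty;
    % the paper's exact weights and notation may differ.
    \min_{\mu_1,\dots,\mu_n \in \mathbb{R}^p} \;
      \frac{1}{2}\sum_{i=1}^{n} \lVert x_i - \mu_i \rVert_2^2
      \;+\; \lambda \sum_{i<j} w_{ij}\, \lVert \mu_i - \mu_j \rVert_1
    ```

    In the one-dimensional case ($p = 1$) the penalty reduces to $\lambda \sum_{i<j} w_{ij} \lvert \mu_i - \mu_j \rvert$, which is the setting in which the abstract's polyhedral characterization is proved.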

    Roto-Translation Covariant Convolutional Networks for Medical Image Analysis

    We propose a framework for rotation- and translation-covariant deep learning using $SE(2)$ group convolutions. The group product of the special Euclidean motion group $SE(2)$ describes how a concatenation of two roto-translations results in a net roto-translation. We encode this geometric structure into convolutional neural networks (CNNs) via $SE(2)$ group convolution layers, which fit into the standard 2D CNN framework and allow us to deal generically with rotated input samples without the need for data augmentation. We introduce three layers: a lifting layer, which lifts a 2D (vector-valued) image to an $SE(2)$-image, i.e., 3D (vector-valued) data whose domain is $SE(2)$; a group convolution layer from and to an $SE(2)$-image; and a projection layer from an $SE(2)$-image to a 2D image. The lifting and group convolution layers are $SE(2)$-covariant (the output roto-translates with the input). The final projection layer, a maximum intensity projection over rotations, makes the full CNN rotation invariant. We show on three different problems in histopathology, retinal imaging, and electron microscopy that the proposed group CNNs achieve state-of-the-art performance without data augmentation by rotation, and with increased performance compared to standard CNNs that do rely on augmentation. Comment: 8 pages, 2 figures, 1 table, accepted at MICCAI 2018
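    As a rough illustration only (a simplified sketch under my own assumptions, not the paper's implementation, and omitting the group convolution layer from $SE(2)$-image to $SE(2)$-image), the lifting and projection layers can be mimicked by convolving an image with rotated copies of a filter and then max-projecting over orientations:

    ```python
    # Simplified sketch (assumed) of the lifting and projection layers described
    # above: convolve a 2D image with rotated copies of a filter to obtain an
    # SE(2) "image" indexed by (theta, x, y), then take a maximum intensity
    # projection over theta for an (approximately) rotation-invariant response.
    import numpy as np
    from scipy.ndimage import rotate
    from scipy.signal import fftconvolve

    def lift(image, kernel, n_orientations=8):
        """Lifting layer: 2D image -> SE(2) image of shape (n_orientations, H, W)."""
        responses = []
        for k in range(n_orientations):
            angle = 360.0 * k / n_orientations
            rk = rotate(kernel, angle, reshape=False, order=1)  # rotated filter copy
            responses.append(fftconvolve(image, rk, mode="same"))
        return np.stack(responses, axis=0)

    def project(se2_image):
        """Projection layer: maximum intensity projection over the rotation axis."""
        return se2_image.max(axis=0)

    # Toy usage: an oriented gradient filter applied to a random image.
    img = np.random.rand(64, 64)
    edge = np.array([[1.0, 0.0, -1.0]] * 3)  # simple horizontal-gradient filter
    se2 = lift(img, edge, n_orientations=8)  # SE(2)-image, shape (8, 64, 64)
    out = project(se2)                       # rotation-invariant 2D response
    ```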

    Quantum Cosmological Relational Model of Shape and Scale in 1-d

    Relational particle models are useful toy models for quantum cosmology and the problem of time in quantum general relativity. This paper shows how to extend existing work on concrete examples of relational particle models in 1-d to include a notion of scale. This is useful as regards forming a tight analogy with quantum cosmology and with the emergent semiclassical time and hidden time approaches to the problem of time. The paper furthermore shows that the correspondence between relational particle models and classical and quantum cosmology can be strengthened by judicious choices of the mechanical potential. This gives relational particle mechanics models with analogues of spatial curvature, cosmological constant, dust and radiation terms. A number of these models are then tractable at the quantum level. These models can be used to study important issues 1) in canonical quantum gravity: the problem of time, the semiclassical approach to it, and timeless approaches to it (such as the naive Schrödinger interpretation and records theory); and 2) in quantum cosmology: the investigation of uniform states, robustness, and the qualitative understanding of the origin of structure formation. Comment: References and some more motivation added

    Land use mapping and modelling for the Phoenix Quadrangle

    The author has identified the following significant results. The mapping of generalized land use (level 1) from ERTS 1 images was shown to be feasible with better than 95% accuracy in the Phoenix quadrangle. The accuracy of level 2 mapping in urban areas is still a problem. Updating existing maps also proved to be feasible, especially for water categories and agricultural uses; however, expanding urban growth has presented accuracy problems. ERTS 1 film images indicated where areas of change were occurring, thus helping to focus more detailed investigation. ERTS color composite transparencies provided a cost-effective source of information for land use mapping of very large regions at small map scales.

    Onset of an outline map to get a hold on the wildwood of clustering methods

    The domain of cluster analysis is a meeting point for a rich multidisciplinary encounter, with cluster-analytic methods being studied and developed in discrete mathematics, numerical analysis, statistics, data analysis and data science, and computer science (including machine learning, data mining, and knowledge discovery), to name but a few. The other side of the coin, however, is that the domain suffers from a major accessibility problem, as well as from being fragmented across many largely isolated islands. As a way out, the present paper offers an outline map for the clustering domain as a whole, which takes the form of an overarching conceptual framework and a common language. With this framework we wish to contribute to structuring the domain, to characterizing methods that have often been developed and studied in quite different contexts, to identifying links between them, and to introducing a frame of reference for optimally setting up cluster analyses in data-analytic practice. Comment: 33 pages, 4 figures