15 research outputs found

    A graph-cut approach to image segmentation using an affinity graph based on l0-sparse representation of features

    We propose a graph-cut based image segmentation method that constructs an affinity graph using l0 sparse representation. We first compute an oversegmentation of the image and associate a collection of features with each segment, which we call a superpixel. We find the sparse representation of each set of features over the dictionary of all features by solving an l0-minimization problem. The connection information between superpixels is then encoded as the non-zero representation coefficients, and the affinity of connected superpixels is derived from the corresponding representation error. This provides an l0 affinity graph with useful long-range and sparsity properties, and a suitable graph cut yields a segmentation. Experimental results on the BSD database demonstrate that our method provides semantically meaningful regions even with a constant segmentation number, and that it achieves very competitive quantitative results.
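
    The affinity-graph construction described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the l0 problem is approximated greedily with orthogonal matching pursuit, each superpixel is reduced to a single unit-norm feature column, and the weight exp(-error) is a hypothetical choice, since the abstract does not give the affinity formula. The names `omp` and `sparse_affinity` are made up for this sketch.

```python
import numpy as np

def omp(D, y, k):
    """Greedy approximation of an l0 problem: represent y with at most k columns of D."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    used = np.zeros(D.shape[1], dtype=bool)
    for _ in range(k):
        # Select the unused atom most correlated with the current residual.
        scores = np.abs(D.T @ residual)
        scores[used] = -np.inf
        j = int(np.argmax(scores))
        support.append(j)
        used[j] = True
        # Refit all selected atoms by least squares and update the residual.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
        if np.linalg.norm(residual) < 1e-10:
            break
    return support, coef, residual

def sparse_affinity(F, k=2):
    """F has shape (d, n): one unit-norm feature column per superpixel (assumption)."""
    n = F.shape[1]
    W = np.zeros((n, n))
    for i in range(n):
        # Dictionary of all other features; a feature never represents itself.
        others = np.delete(np.arange(n), i)
        support, _, residual = omp(F[:, others], F[:, i], k)
        # Hypothetical weight: small representation error gives affinity near 1.
        W[i, others[support]] = np.exp(-np.linalg.norm(residual))
    # Symmetrize so the result can be used as an undirected affinity graph.
    return np.maximum(W, W.T)

# Toy data: three features in the (e1, e2) plane and three in the (e3, e4) plane.
F = np.zeros((4, 6))
for c, a in enumerate([0.1, 0.9, 1.7]):
    F[0, c], F[1, c] = np.cos(a), np.sin(a)
for c, a in enumerate([0.3, 1.1, 2.0]):
    F[2, 3 + c], F[3, 3 + c] = np.cos(a), np.sin(a)
W = sparse_affinity(F, k=2)
```

    In this toy setup the two groups lie in orthogonal planes, so the non-zero entries of `W` connect only features from the same plane. A graph cut or spectral clustering step would then operate on `W`.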

    Subspace Clustering and Active Learning with Constraints

    Data representations can often be high-dimensional, whether it is due to the large number of collected / recorded features or due to how the data sources (e.g. images, texts) are processed. It is often the case that the main structure of the data can be summarised well in a lower dimensional subspace or multiple lower dimensional subspaces. Subspace clustering addresses the problem of simultaneously uncovering multiple subspace structures in the data and grouping the data according to their underlying subspace structures. The first contribution of this thesis is the development of a Subspace Clustering with Active Learning (SCAL) framework that is designed for Subspace Clustering. This framework allows clustering performance to improve in an effective and efficient manner over time, with the need to query only a small amount of labelling information. It also has the potential to be applied to more general subspace clustering methods, which has been further explored and developed in our next methodological contribution. The second contribution of this thesis is a unified active learning and constrained clustering framework for spectral-based subspace clustering methods. In this work, we propose a spectral-based subspace clustering methodology named Weighted Sparse Simplex Representation (WSSR). It has been demonstrated to have favourable performance against state-of-the-art spectral-based subspace clustering methods on both synthetic and real data. We also propose a flexible weighting scheme that can incorporate external information into the problem formulation, which leads to a constrained clustering extension of WSSR. We show that it can be applied in conjunction with our previously proposed SCAL strategy when labelling information can be queried sequentially. The third contribution of this thesis is the development of an algebraic subspace clustering methodology – Minimum Angle Clustering (MAC). 
    It is motivated by the application of clustering Amazon products based on their titles when represented using the TF-IDF matrix, which is both sparse and high-dimensional. The proposed methodology is composed of two stages. In the first stage, it identifies a large number of subspaces in the data through the Reduced Row Echelon Form technique. In the second stage, we propose a new subspace proximity measure to construct an affinity matrix for the formed subspaces before spectral clustering is applied to obtain the final cluster labels. The proposed methodology has been shown to enjoy competitive performance against a number of well-established subspace clustering and document clustering techniques on the application of clustering Amazon product names.
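
    The second stage, building an affinity matrix over subspaces, might look roughly like the sketch below. The thesis's own proximity measure is not stated in this abstract, so the smallest principal angle between two subspaces is used here as a stand-in, and the function names are hypothetical.

```python
import numpy as np

def orthonormal_basis(X, tol=1e-10):
    """Orthonormal basis for the column span of X, obtained from the SVD."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, s > tol]

def min_principal_angle(A, B):
    """Smallest principal angle (radians) between the column spans of A and B."""
    Qa, Qb = orthonormal_basis(A), orthonormal_basis(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    sigma = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return float(np.arccos(np.clip(sigma.max(), -1.0, 1.0)))

def subspace_affinity(bases):
    """Symmetric affinity matrix: cosine of the smallest principal angle (stand-in measure)."""
    m = len(bases)
    W = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            W[i, j] = W[j, i] = np.cos(min_principal_angle(bases[i], bases[j]))
    return W
```

    The resulting matrix can be passed to any spectral clustering routine that accepts a precomputed affinity. Subspaces that share a direction receive affinity close to 1, and orthogonal subspaces receive affinity close to 0.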

    l

    We propose an l0 sparsity based approach to remove additive white Gaussian noise from a given image. To achieve this goal, we combine a local prior and a global prior to recover the noise-free values of pixels. The local prior depends on the neighborhood relationships within a search window and helps maintain edges and smoothness. The global prior is generated from a hierarchical l0 sparse representation and helps eliminate redundant information and preserve global consistency. In addition, to make the correlations between pixels more meaningful, we adopt Principal Component Analysis to measure the similarities, which both reduces the computational complexity and improves the accuracy. Experiments on the benchmark image set show that the proposed approach can achieve superior performance to state-of-the-art approaches, both in accuracy and in perceptual quality, in removing zero-mean additive white Gaussian noise.

    Joint Hypergraph Learning and Sparse Regression for Feature Selection

    In this paper, we propose a unified framework for improved structure estimation and feature selection. Most existing graph-based feature selection methods utilise a static representation of the structure of the available data based on the Laplacian matrix of a simple graph. Here, on the other hand, we perform data structure learning and feature selection simultaneously. To improve the estimation of the manifold representing the structure of the selected features, we use a higher order description of the neighbourhood structures present in the available data using hypergraph learning. This allows those features which participate in the most significant higher order relations to be selected, and the remainder discarded, through a sparsification process. We formulate a single objective function to capture and regularise the hypergraph weight estimation and feature selection processes. Finally, we present an optimization algorithm to recover the hypergraph weights and a sparse set of feature selection indicators. This process offers a number of advantages. First, by adjusting the hypergraph weights, we preserve high-order neighborhood relations reflected in the original data, which cannot be modeled by a simple graph. Moreover, our objective function captures the global discriminative structure of the features in the data. Comprehensive experiments on 9 benchmark data sets show that our method achieves statistically significant improvement over state-of-the-art feature selection methods, supporting the effectiveness of the proposed method.

    Robust and Efficient Data Clustering with Signal Processing on Graphs

    Data is pervasive in today's world and has actually been for quite some time. With the increasing volume of data to process, there is a need for techniques that are faster than, and at least as accurate as, those we already have. In particular, the last decade saw the rapid growth of social networks and ubiquitous sensing (through smartphones and the Internet of Things). These phenomena, together with progress in bioinformatics and traffic monitoring, pushed forward the research on graph analysis and called for more efficient techniques. Clustering is an important field of machine learning because it belongs to the unsupervised techniques (i.e., one does not need to possess a ground truth about the data to start learning). With it, one can extract meaningful patterns from large data sources without requiring an expert to annotate a portion of the data, which can be very costly. However, the clustering techniques designed so far all tend to be computationally demanding and have trouble scaling with the size of today's problems. The emergence of Graph Signal Processing (GSP), which attempts to apply traditional signal processing techniques to graphs instead of time, provided additional tools for efficient graph analysis. By considering the clustering assignment as a signal lying on the nodes of the graph, one may now apply the tools of GSP to the improvement of graph clustering and more generally data clustering at large. In this thesis, we present several techniques using some of the latest developments of GSP in order to improve the scalability of clustering, while aiming for an accuracy resembling that of Spectral Clustering, a famous graph clustering technique that possesses a solid mathematical intuition. On the one hand, we explore the benefits of random signal filtering, from both a practical and a theoretical perspective, for the determination of the eigenvectors of the graph Laplacian.
    In practice, this attempt requires the design of polynomial approximations of the step function, for which we provided an accelerated heuristic. We used this series of work in order to reduce the complexity of dynamic graph clustering, the problem of defining a partition at each snapshot of a graph that evolves over time. We also used them to propose a fast method for the determination of the subspace generated by the first eigenvectors of any symmetric matrix. This element is useful for clustering, as it serves in Spectral Clustering, but it goes beyond that since it also serves in graph visualization (with Laplacian Eigenmaps) and data mining (with Principal Components Projection). On the other hand, we were inspired by the latest works on graph filter localization in order to propose an extremely fast clustering technique. We tried to perform clustering by only using graph filtering and combining the results in order to obtain a partition of the nodes. These different contributions are completed by experiments using both synthetic datasets and real-world problems. Since we think that research should be shared in order to progress, all the experiments made in this thesis are publicly available on my personal GitHub account.

    Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems

    Advances in artificial intelligence (AI) are fueling a new paradigm of discoveries in natural sciences. Today, AI has started to advance natural sciences by improving, accelerating, and enabling our understanding of natural phenomena at a wide range of spatial and temporal scales, giving rise to a new area of research known as AI for science (AI4Science). Being an emerging research paradigm, AI4Science is unique in that it is an enormous and highly interdisciplinary area. Thus, a unified and technical treatment of this field is needed yet challenging. This work aims to provide a technically thorough account of a subarea of AI4Science; namely, AI for quantum, atomistic, and continuum systems. These areas aim at understanding the physical world from the subatomic (wavefunctions and electron density), atomic (molecules, proteins, materials, and interactions), to macro (fluids, climate, and subsurface) scales and form an important subarea of AI4Science. A unique advantage of focusing on these areas is that they largely share a common set of challenges, thereby allowing a unified and foundational treatment. A key common challenge is how to capture physics first principles, especially symmetries, in natural systems by deep learning methods. We provide an in-depth yet intuitive account of techniques to achieve equivariance to symmetry transformations. We also discuss other common technical challenges, including explainability, out-of-distribution generalization, knowledge transfer with foundation and large language models, and uncertainty quantification. To facilitate learning and education, we provide categorized lists of resources that we found to be useful. We strive to be thorough and unified and hope this initial effort may trigger more community interest and effort to further advance AI4Science.

    LIPIcs, Volume 274, ESA 2023, Complete Volume

    LIPIcs, Volume 274, ESA 2023, Complete Volume