Data driven estimation of Laplace-Beltrami operator
Approximations of Laplace-Beltrami operators on manifolds through graph
Laplacians have become popular tools in data analysis and machine learning.
These discretized operators usually depend on bandwidth parameters whose tuning
remains a theoretical and practical problem. In this paper, we address this
problem for the unnormalized graph Laplacian by establishing an oracle
inequality that opens the door to a well-founded data-driven procedure for the
bandwidth selection. Our approach relies on recent results by Lacour and
Massart [LM15] on the so-called Lepski's method.
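As a rough illustration of the bandwidth dependence mentioned in this abstract, the sketch below builds an unnormalized graph Laplacian L = D - W from a point cloud using Gaussian weights. The kernel form exp(-d^2 / (2 h^2)), the function name, the sample points, and the candidate bandwidths are all assumptions made for illustration; the paper's exact normalization and its Lepski-type selection rule are not reproduced here.

```python
import numpy as np


def unnormalized_graph_laplacian(points, h):
    """Return L = D - W for Gaussian weights with bandwidth h (assumed kernel)."""
    points = np.asarray(points, dtype=float)
    # Pairwise squared Euclidean distances between all sample points.
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Gaussian affinity; the 2 * h**2 scaling is one common convention.
    W = np.exp(-sq_dist / (2.0 * h ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    D = np.diag(W.sum(axis=1))  # degree matrix
    return D - W


# Hypothetical data: 50 points sampled on the unit circle (a 1-D manifold).
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=50)
circle = np.column_stack([np.cos(theta), np.sin(theta)])

# One Laplacian per candidate bandwidth; a selection rule would compare these.
laplacians = {h: unnormalized_graph_laplacian(circle, h) for h in (0.1, 0.3, 1.0)}
```

Each candidate bandwidth yields a different operator, which is why a principled, data-driven way of choosing h matters.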
Semantic distillation: a method for clustering objects by their contextual specificity
Techniques for data-mining, latent semantic analysis, contextual search of
databases, etc. have long been developed by computer scientists working on
information retrieval (IR). Experimental scientists, from all disciplines,
having to analyse large collections of raw experimental data (astronomical,
physical, biological, etc.) have developed powerful methods for their
statistical analysis and for clustering, categorising, and classifying objects.
Finally, physicists have developed a theory of quantum measurement, unifying
the logical, algebraic, and probabilistic aspects of queries into a single
formalism. The purpose of this paper is twofold: first to show that when
formulated at an abstract level, problems from IR, from statistical data
analysis, and from physical measurement theories are very similar and hence can
profitably be cross-fertilised, and, secondly, to propose a novel method of
fuzzy hierarchical clustering, termed semantic distillation and strongly
inspired by the theory of quantum measurement, which we developed to
analyse raw data coming from various types of experiments on DNA arrays. We
illustrate the method by analysing DNA array experiments and clustering the
genes of the array according to their specificity.
Comment: Accepted for publication in Studies in Computational Intelligence,
Springer-Verla
Signaux stationnaires sur graphe : étude d'un cas réel (Stationary signals on graphs: a real-world case study)
Based on a real geographical dataset, we apply the stationarity characterisation of a graph signal through the analysis of its spectral decomposition. In the process, we identify possible sources of non-stationarity and elaborate on the impact of the graph used to model the structural coherence of the data.
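To make the spectral view of stationarity concrete, here is a minimal sketch under one commonly used characterisation, assumed here rather than taken from the paper: a graph signal is treated as stationary when its covariance is diagonalised by the eigenvectors of the graph Laplacian. The path graph, the function names, and both covariance matrices are hypothetical examples.

```python
import numpy as np


def path_graph_laplacian(n):
    """Unnormalized Laplacian of a path graph on n vertices."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return np.diag(W.sum(axis=1)) - W


def spectral_diagonal_ratio(cov, laplacian):
    """Share of covariance energy on the diagonal in the graph Fourier basis."""
    _, U = np.linalg.eigh(laplacian)  # columns of U form the graph Fourier basis
    spectral_cov = U.T @ cov @ U      # covariance of the spectral coefficients
    return np.sum(np.diag(spectral_cov) ** 2) / np.sum(spectral_cov ** 2)


n = 8
L = path_graph_laplacian(n)
eigvals, U = np.linalg.eigh(L)

# Stationary by construction: covariance is a function of the Laplacian.
stationary_cov = U @ np.diag(np.exp(-eigvals)) @ U.T
# Non-stationary example: independent vertices with unequal variances.
nonstationary_cov = np.diag(np.arange(1.0, n + 1.0))

ratio_stationary = spectral_diagonal_ratio(stationary_cov, L)
ratio_nonstationary = spectral_diagonal_ratio(nonstationary_cov, L)
```

A ratio close to one indicates uncorrelated spectral coefficients. Because the ratio depends on the Laplacian supplied, changing the graph changes the verdict, which is the kind of graph dependence the abstract points to.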
Making Laplacians commute
In this paper, we construct multimodal spectral geometry by finding a pair of
closest commuting operators (CCO) to a given pair of Laplacians. The CCOs are
jointly diagonalizable and hence have the same eigenbasis. Our construction
naturally extends classical data analysis tools based on spectral geometry,
such as diffusion maps and spectral clustering. We provide several synthetic
and real examples of applications in dimensionality reduction, shape analysis,
and clustering, demonstrating that our method better captures the inherent
structure of multi-modal data.
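The link between commuting and sharing an eigenbasis can be checked numerically. The sketch below only measures how far two Laplacians are from commuting, using the Frobenius norm of their commutator; it does not implement the closest-commuting-operator optimisation itself. The two small graphs, the helper names, and the spectra assigned to A and B are hypothetical.

```python
import numpy as np


def laplacian_from_edges(n, edges):
    """Unnormalized Laplacian of an undirected, unweighted graph."""
    W = np.zeros((n, n))
    for i, j in edges:
        W[i, j] = W[j, i] = 1.0
    return np.diag(W.sum(axis=1)) - W


def commutator_norm(A, B):
    """Frobenius norm of AB - BA; zero exactly when A and B commute."""
    return np.linalg.norm(A @ B - B @ A, ord="fro")


# Two "modalities" on the same four vertices: a path and a star.
L_path = laplacian_from_edges(4, [(0, 1), (1, 2), (2, 3)])
L_star = laplacian_from_edges(4, [(0, 1), (0, 2), (0, 3)])
gap = commutator_norm(L_path, L_star)  # nonzero: no common eigenbasis

# Two operators built on the same eigenbasis commute by construction.
_, U = np.linalg.eigh(L_path)
A = U @ np.diag([0.0, 1.0, 2.0, 3.0]) @ U.T
B = U @ np.diag([0.0, 0.5, 4.0, 1.5]) @ U.T
gap_shared = commutator_norm(A, B)  # zero up to rounding error
```

In this picture, the construction described in the abstract looks for modified operators with a vanishing commutator that stay as close as possible to the original pair.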
Hearing the clusters in a graph: A distributed algorithm
We propose a novel distributed algorithm to cluster graphs. The algorithm
recovers the solution obtained from spectral clustering without the need for
expensive eigenvalue/vector computations. We prove that, by propagating waves
through the graph, a local fast Fourier transform yields the local component of
every eigenvector of the Laplacian matrix, thus providing clustering
information. For large graphs, the proposed algorithm is orders of magnitude
faster than random walk based approaches. We prove the equivalence of the
proposed algorithm to spectral clustering and derive convergence rates. We
demonstrate the benefit of using this decentralized clustering algorithm for
community detection in social graphs, accelerating distributed estimation in
sensor networks and efficient computation of distributed multi-agent search
strategies.
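For reference, the centralized baseline whose solution the abstract says is recovered can be sketched in a few lines. This is plain spectral bisection using the sign of the Fiedler vector, not the distributed wave-propagation algorithm, and the example graph (two triangles joined by one edge) and function name are hypothetical.

```python
import numpy as np


def fiedler_partition(W):
    """Split a connected graph in two using the sign of the Fiedler vector."""
    L = np.diag(W.sum(axis=1)) - W  # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L)     # eigenvalues are returned in ascending order
    fiedler = vecs[:, 1]            # eigenvector of the second-smallest eigenvalue
    return fiedler > 0


# Two triangles {0, 1, 2} and {3, 4, 5} joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
W = np.zeros((6, 6))
for i, j in edges:
    W[i, j] = W[j, i] = 1.0

labels = fiedler_partition(W)  # boolean cluster label per vertex
```

The full eigendecomposition used here is the step the abstract describes as expensive and replaces with local wave propagation and a fast Fourier transform at each node.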