A Kriging procedure for processes indexed by graphs
We provide a new kriging procedure for processes on graphs. Based on the construction of Gaussian random processes indexed by graphs, we extend to this framework the usual linear prediction method for spatial random fields, known as kriging. We provide the expression of the estimator of such a random field at unobserved locations, as well as a control of the prediction error.
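For orientation, the classical simple-kriging predictor that this abstract refers to as "the usual linear prediction method" can be written as follows. This is a sketch of the textbook zero-mean case with known covariance, not the graph-indexed estimator derived in the paper; the symbols $Z_O$, $\Sigma_O$, $c_0$ and $s_0$ are notation assumed here for illustration.

```latex
% Assumed notation: Z_O = (Z(s_1), ..., Z(s_n))^T are the observed values,
% \Sigma_O = Cov(Z_O), and c_0 = Cov(Z_O, Z(s_0)) for an unobserved location s_0.
\[
  \hat{Z}(s_0) = c_0^{\top} \Sigma_O^{-1} Z_O ,
  \qquad
  \mathbb{E}\bigl[(Z(s_0) - \hat{Z}(s_0))^2\bigr]
    = \operatorname{Var}\bigl(Z(s_0)\bigr) - c_0^{\top} \Sigma_O^{-1} c_0 .
\]
```

The second expression is the prediction error variance of the best linear predictor, which is the kind of quantity a "control of the prediction error" would bound once the covariance has to be estimated.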
A Statistical Test of Heterogeneous Subgraph Densities to Assess Clusterability
Determining whether a graph displays a clustered structure before subjecting it to any cluster detection technique has recently gained attention in the literature. Attempting to group graph vertices into clusters when a graph does not have a clustered structure is not only a waste of time; it also leads to misleading conclusions. To address this problem, we introduce a novel statistical test, the δ-test, which is based on comparisons of local and global densities. Our goal is to assess whether a given graph meets the necessary conditions to be meaningfully summarized by clusters of vertices. We empirically explore our test's behavior under a number of graph structures. We also compare it to other recently published tests. From a theoretical standpoint, our test is more general, versatile and transparent than recently published competing techniques. It is based on the examination of intuitive quantities, applies equally to weighted and unweighted graphs, and allows comparisons across graphs. More importantly, it does not rely on any distributional assumptions, other than the universally accepted definition of a clustered graph. Empirically, our test is shown to be more responsive to graph structure than other competing tests.
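The "local and global densities" mentioned in this abstract can be illustrated with a minimal sketch. It only computes the two kinds of density for an unweighted graph; it is not the test statistic from the paper, and the function name `density` and the toy graph are assumptions made for this example.

```python
def density(nodes, edges):
    """Fraction of possible node pairs within `nodes` that are joined by an edge.

    `edges` is an iterable of (u, v) pairs of an undirected, unweighted graph.
    Hypothetical helper for illustration only.
    """
    nodes = set(nodes)
    k = len(nodes)
    if k < 2:
        return 0.0
    # Count only edges whose two endpoints both lie inside the node set.
    inside = sum(1 for u, v in edges if u in nodes and v in nodes)
    return inside / (k * (k - 1) / 2)


# Toy graph: two triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]

global_density = density(range(6), edges)   # 7 edges out of 15 possible pairs
local_a = density({0, 1, 2}, edges)         # density inside the first triangle
local_b = density({3, 4, 5}, edges)         # density inside the second triangle
```

In this toy graph the local densities are much higher than the global density, which is the intuitive signature of a clustered structure.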
Revisiting clustering as matrix factorisation on the Stiefel manifold
This paper studies clustering for possibly high-dimensional data (e.g. images, time series, gene expression data, and many other settings), and rephrases it as low-rank matrix estimation in the PAC-Bayesian framework. Our approach leverages the well-known Burer-Monteiro factorisation strategy from large-scale optimisation, in the context of low-rank estimation. Moreover, our Burer-Monteiro factors are shown to lie on a Stiefel manifold. We propose a new generalized Bayesian estimator for this problem and prove novel prediction bounds for clustering. We also devise a componentwise Langevin sampler on the Stiefel manifold to compute this estimator.
Unbound states in quantum heterostructures
In this review, we report on the electronic continuum states of semiconductor Quantum Wells and Quantum Dots and highlight the decisive part played by the virtual bound states in the optical properties of these structures. The two-particle continuum states of Quantum Dots control the decoherence of the excited electron-hole states. The part played by Auger scattering in Quantum Dots is also discussed.
Tests for Gaussian graphical models
Gaussian graphical models are promising tools for analysing genetic networks. In many applications, biologists have some knowledge of the genetic network and may want to assess the quality of their model using gene expression data. We therefore introduce a novel procedure for testing the neighborhoods of a Gaussian graphical model. It is based on the connection between the local Markov property and conditional regression of a Gaussian random variable. Adapting recent results on tests for high-dimensional Gaussian linear models, we prove that the testing procedure inherits appealing theoretical properties. Moreover, it is applicable and computationally feasible in a high-dimensional setting, where the number of nodes may be much larger than the number of observations. A large part of the study is devoted to illustrating and discussing applications to simulated data and to biological data.
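The connection between the local Markov property and conditional regression that this abstract relies on is a standard identity for Gaussian vectors, sketched below. The notation ($X$, $\Sigma$, $K$) is assumed here for illustration and is not taken from the paper.

```latex
% Assumed notation: X = (X_1, ..., X_p) ~ N(0, \Sigma) with precision matrix K = \Sigma^{-1}.
\[
  \mathbb{E}\bigl[X_i \mid X_j,\ j \neq i\bigr]
    = -\sum_{j \neq i} \frac{K_{ij}}{K_{ii}}\, X_j ,
  \qquad
  \operatorname{Var}\bigl(X_i \mid X_j,\ j \neq i\bigr) = \frac{1}{K_{ii}} .
\]
```

A regression coefficient of $X_i$ on $X_j$ is zero exactly when $K_{ij} = 0$, that is, when $j$ is not a neighbor of $i$ in the graph. Testing a candidate neighborhood of node $i$ therefore amounts to testing whether the coefficients outside that neighborhood vanish in a Gaussian linear model.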
Conditional-mean least-squares fitting of Gaussian Markov random fields to Gaussian fields
This article discusses the following problem, often encountered when analyzing spatial lattice data: how can one construct a Gaussian Markov random field (GMRF) on a lattice that reflects well the spatial-covariance properties present either in data or in prior knowledge? The Markov property on a spatial lattice implies spatial dependence expressed conditionally, which allows intuitively appealing site-by-site model building. There are also cases, such as in biological network analysis, where the Markov property has a deep scientific significance. Moreover, the model is often important for computational efficiency of Markov chain Monte Carlo algorithms. In this article, we introduce a new criterion to fit a GMRF to a given Gaussian field, where the Gaussian field is characterized by its spatial covariances. We establish that this criterion is computationally appealing, that it can be used on both regular and irregular lattices, and that both stationary and nonstationary fields can be fitted. © 2007 Elsevier B.V. All rights reserved
Community detection in dense random networks
We formalize the problem of detecting a community in a network as testing whether a given (random) graph contains a subgraph that is unusually dense. Specifically, we observe an undirected and unweighted graph on N nodes. Under the null hypothesis, the graph is a realization of an Erdős–Rényi graph with connection probability p0. Under the (composite) alternative, there is an unknown subgraph of n nodes where the probability of connection is p1>p0. We derive a detection lower bound for detecting such a subgraph in terms of N, n, p0, p1 and exhibit a test that achieves that lower bound. We do this both when p0 is known and when it is unknown. We also consider the problem of testing in polynomial time. As an aside, we consider the problem of detecting a clique, which is intimately related to the planted clique problem. Our focus in this paper is on the quasi-normal regime where np0 is either bounded away from zero, or tends to zero slowly.
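The null and alternative hypotheses described in this abstract can be simulated with a short sketch, together with one simple statistic for the case where p0 is known: the standardized total edge count. This is an illustration under stated assumptions, not necessarily the optimal test exhibited in the paper; the function names `sample_graph` and `total_edge_z`, and the choice to plant the dense subgraph on the first n nodes, are hypothetical.

```python
import math
import random


def sample_graph(N, p0, n=0, p1=None, seed=0):
    """Sample an undirected graph on nodes 0..N-1 as a set of edges (i, j), i < j.

    Every pair is connected with probability p0, except pairs inside the
    planted subgraph {0, ..., n-1}, which are connected with probability p1.
    With n=0 (or p1=None) this is a plain Erdős–Rényi graph (the null model).
    """
    rng = random.Random(seed)
    edges = set()
    for i in range(N):
        for j in range(i + 1, N):
            # Since i < j, the condition j < n means both endpoints are planted.
            p = p1 if (p1 is not None and j < n) else p0
            if rng.random() < p:
                edges.add((i, j))
    return edges


def total_edge_z(edges, N, p0):
    """Standardized edge count: (W - M*p0) / sqrt(M*p0*(1-p0)), M = N(N-1)/2.

    Under the null with known p0 this is approximately standard normal for
    large M; large positive values suggest an unusually dense subgraph.
    """
    M = N * (N - 1) // 2
    return (len(edges) - M * p0) / math.sqrt(M * p0 * (1 - p0))


null_graph = sample_graph(N=200, p0=0.1, seed=1)
planted_graph = sample_graph(N=200, p0=0.1, n=40, p1=0.5, seed=1)
z_null = total_edge_z(null_graph, 200, 0.1)
z_planted = total_edge_z(planted_graph, 200, 0.1)
```

A statistic based on the total edge count is only sensitive when the planted subgraph is large enough to shift the overall count; small but very dense subgraphs call for scan-type statistics instead.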