Minimal spanning forests

Authors: Russell Lyons, Yuval Peres, Oded Schramm
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2006
Minimal spanning forests on infinite graphs are weak limits of minimal spanning trees from finite subgraphs. These limits can be taken with free or wired boundary conditions and are denoted FMSF (free minimal spanning forest) and WMSF (wired minimal spanning forest), respectively. The WMSF is also the union of the trees that arise from invasion percolation started at all vertices. We show that on any Cayley graph where critical percolation has no infinite clusters, all the component trees in the WMSF have one end a.s. In $\mathbb{Z}^d$ this was proved by Alexander [Ann. Probab. 23 (1995) 87--104], but a different method is needed for the nonamenable case. We also prove that the WMSF components are ``thin'' in a different sense, namely, on any graph, each component tree in the WMSF has $p_{\mathrm{c}}=1$ a.s., where $p_{\mathrm{c}}$ denotes the critical probability for having an infinite cluster in Bernoulli percolation. On the other hand, the FMSF is shown to be ``thick'': on any connected graph, the union of the FMSF and independent Bernoulli percolation (with arbitrarily small parameter) is a.s. connected. In conjunction with a recent result of Gaboriau, this implies that in any Cayley graph, the expected degree of the FMSF is at least the expected degree of the FSF (the weak limit of uniform spanning trees). We also show that the number of infinite clusters for Bernoulli($p_{\mathrm{u}}$) percolation is at most the number of components of the FMSF, where $p_{\mathrm{u}}$ denotes the critical probability for having a unique infinite cluster. Finally, an example is given to show that the minimal spanning tree measure does not have negative associations. Comment: Published at http://dx.doi.org/10.1214/009117906000000269 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org
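As a rough illustration of the finite objects whose weak limits the abstract describes, the sketch below samples a minimal spanning tree on an n-by-n box of the square lattice with i.i.d. uniform edge weights, using Kruskal's algorithm. It is not taken from the paper: the function names (`grid_edges`, `minimal_spanning_tree`, `sample_mst`), the choice of the planar box, and the seed parameter are all assumptions made for the example. The box is used with its own edges only, which corresponds to the free boundary condition; a wired version would additionally identify the boundary vertices into a single vertex.

```python
import random


def grid_edges(n):
    """Edges of the n x n box of the square lattice (free boundary).

    Hypothetical helper for this sketch, not part of the paper.
    """
    edges = []
    for x in range(n):
        for y in range(n):
            if x + 1 < n:
                edges.append(((x, y), (x + 1, y)))
            if y + 1 < n:
                edges.append(((x, y), (x, y + 1)))
    return edges


def minimal_spanning_tree(vertices, weighted_edges):
    """Kruskal's algorithm; weighted_edges holds (weight, u, v) triples."""
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    # Scan edges in increasing weight; keep an edge iff it joins two components.
    for w, u, v in sorted(weighted_edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v, w))
    return tree


def sample_mst(n, seed=0):
    """Sample i.i.d. Uniform[0, 1) weights on the box and return its MST."""
    rng = random.Random(seed)
    vertices = [(x, y) for x in range(n) for y in range(n)]
    weighted = [(rng.random(), u, v) for u, v in grid_edges(n)]
    return vertices, minimal_spanning_tree(vertices, weighted)


if __name__ == "__main__":
    vertices, tree = sample_mst(5)
    print(len(vertices), "vertices,", len(tree), "tree edges")
```

Because the weights are continuous, they are distinct with probability one, so the minimal spanning tree of each finite box is unique; that uniqueness is what makes the tree a well-defined function of the weights.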

arXiv.org e-Print Archive

CiteSeerX

Crossref

Overview of Random Forest Methodology and Practical Guidance with Emphasis on Computational Biology and Bioinformatics

Authors: Anne-Laure Boulesteix, Silke Janitza, Jochen Kruppa, Inke R. König
Publication date: 25/07/2012
The Random Forest (RF) algorithm by Leo Breiman has become a standard data analysis tool in bioinformatics. It has shown excellent performance in settings where the number of variables is much larger than the number of observations, can cope with complex interaction structures as well as highly correlated variables, and returns measures of variable importance. This paper synthesizes ten years of RF development with emphasis on applications to bioinformatics and computational biology. Special attention is given to practical aspects such as the selection of parameters, available RF implementations, and important pitfalls and biases of RF and its variable importance measures (VIMs). The paper surveys recent developments of the methodology relevant to bioinformatics as well as some representative examples of RF applications in this context and possible directions for future research.

Open Access LMU