163,997 research outputs found
Colouring the Square of the Cartesian Product of Trees
We prove upper and lower bounds on the chromatic number of the square of the
Cartesian product of trees. The bounds are equal if each tree has even maximum
degree.
Bounding right-arm rotation distances
Rotation distance measures the difference in shape between binary trees of
the same size by counting the minimum number of rotations needed to transform
one tree to the other. We describe several types of rotation distance where
restrictions are put on the locations where rotations are permitted, and
provide upper bounds on distances between trees with a fixed number of nodes
with respect to several families of these restrictions. These bounds are sharp
in a certain asymptotic sense and are obtained by relating each restricted
rotation distance to the word length of elements of Thompson's group F with
respect to different generating sets, including both finite and infinite
generating sets.

Comment: 30 pages, 11 figures. This revised version corrects some typos, gives
clearer proofs of the lower-bound results, and has an improved figure.
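The abstract above counts minimum rotations between binary trees. As a concrete illustration (not the paper's method, which works via word length in Thompson's group F), here is a minimal Python sketch that computes the unrestricted rotation distance between small trees by breadth-first search; a tree is `None` (a leaf) or a pair `(left, right)`:

```python
from collections import deque

def rotations(t):
    """Yield every tree obtainable from t by a single rotation at any node.
    A tree is None (leaf) or a pair (left, right)."""
    if t is None:
        return
    left, right = t
    if left is not None:
        a, b = left
        yield (a, (b, right))      # right rotation at the root: ((A,B),C) -> (A,(B,C))
    if right is not None:
        b, c = right
        yield ((left, b), c)       # left rotation at the root: (A,(B,C)) -> ((A,B),C)
    for l2 in rotations(left):     # rotations deeper inside the left subtree
        yield (l2, right)
    for r2 in rotations(right):    # rotations deeper inside the right subtree
        yield (left, r2)

def rotation_distance(s, t):
    """Minimum number of rotations transforming s into t, by BFS.
    Exponential state space, so feasible only for small trees."""
    if s == t:
        return 0
    seen = {s}
    queue = deque([(s, 0)])
    while queue:
        cur, d = queue.popleft()
        for nxt in rotations(cur):
            if nxt == t:
                return d + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, d + 1))
```

The restricted variants the paper studies (e.g. rotations permitted only along the right arm) would correspond to filtering the moves `rotations` generates; that restriction is not implemented here.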
Lower bounds for bootstrap percolation on Galton-Watson trees
Bootstrap percolation is a cellular automaton modelling the spread of an
'infection' on a graph. In this note, we prove a family of lower bounds on the
critical probability for r-neighbour bootstrap percolation on Galton–Watson
trees in terms of moments of the offspring distributions. With this result we
confirm a conjecture of Bollobás, Gunderson, Holmgren, Janson and Przykucki.
We also show that these bounds are best possible up to positive constants not
depending on the offspring distribution.

Comment: 7 pages.
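The cellular automaton the abstract refers to is simple to state: an uninfected vertex becomes infected once at least r of its neighbours are infected, and infection never recovers. A minimal sketch of that update rule on an arbitrary graph (the paper's setting is Galton–Watson trees, but the dynamics are the same):

```python
from collections import deque

def bootstrap_percolation(neighbours, initially_infected, r):
    """Run r-neighbour bootstrap percolation to its fixed point and
    return the final infected set.

    neighbours: dict mapping each vertex to a list of its neighbours.
    """
    infected = set(initially_infected)
    count = {v: 0 for v in neighbours}   # infected-neighbour counts
    queue = deque(initially_infected)
    while queue:
        v = queue.popleft()              # each infected vertex is processed once
        for u in neighbours[v]:
            if u in infected:
                continue
            count[u] += 1
            if count[u] >= r:            # threshold reached: u becomes infected
                infected.add(u)
                queue.append(u)
    return infected

# Example: a 5-vertex path with r = 2; infecting the alternating set
# {0, 2, 4} percolates to the whole path.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(sorted(bootstrap_percolation(path, {0, 2, 4}, 2)))  # -> [0, 1, 2, 3, 4]
```

In the random setting of the paper, each vertex is initially infected independently with probability p, and the critical probability is the p at which the root becomes infected with probability 1/2.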
Optimization bounds from the branching dual
We present a general method for obtaining strong bounds for discrete optimization problems that is based on a concept of branching duality. It can be applied when no useful integer programming model is available, and we illustrate this with the minimum bandwidth problem. The method strengthens a known bound for a given problem by formulating a dual problem whose feasible solutions are partial branching trees. It solves the dual problem with a “worst-bound” local search heuristic that explores neighboring partial trees. After proving some optimality properties of the heuristic, we show that it substantially improves known combinatorial bounds for the minimum bandwidth problem with a modest amount of computation. It also obtains significantly tighter bounds than depth-first and breadth-first branching, demonstrating that the dual perspective can lead to better branching strategies when the objective is to find valid bounds.

Accepted manuscript.
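The abstract does not give enough detail to reproduce the branching-dual heuristic itself, but the minimum bandwidth problem it uses as illustration can be stated concretely: find a linear ordering of the vertices minimizing the longest stretch of any edge. A brute-force exact baseline (not the paper's method, and exponential, so tiny instances only):

```python
from itertools import permutations

def bandwidth(n, edges):
    """Exact minimum bandwidth of a graph on vertices 0..n-1 by brute
    force over all vertex orderings: the smallest b such that the
    vertices can be placed on a line with every edge spanning at most
    b positions. O(n!) -- usable only for very small n."""
    if not edges:
        return 0
    best = n - 1
    for perm in permutations(range(n)):
        pos = {v: i for i, v in enumerate(perm)}      # vertex -> position
        width = max(abs(pos[u] - pos[v]) for u, v in edges)
        best = min(best, width)
    return best
```

Methods like the one in the abstract matter precisely because this exact computation is hopeless beyond a handful of vertices, and good lower bounds let branching procedures prune.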
On PAC-Bayesian Bounds for Random Forests
Existing guarantees in terms of rigorous upper bounds on the generalization
error for the original random forest algorithm, one of the most frequently used
machine learning methods, are unsatisfying. We discuss and evaluate various
PAC-Bayesian approaches to derive such bounds. The bounds do not require
additional hold-out data, because the out-of-bag samples from the bagging in
the training process can be exploited. A random forest predicts by taking a
majority vote of an ensemble of decision trees. The first approach is to bound
the error of the vote by twice the error of the corresponding Gibbs classifier
(classifying with a single member of the ensemble selected at random). However,
this approach does not take into account the effect of averaging out of errors
of individual classifiers when taking the majority vote. This effect provides a
significant boost in performance when the errors are independent or negatively
correlated, but when the correlations are strong the advantage from taking the
majority vote is small. The second approach based on PAC-Bayesian C-bounds
takes dependencies between ensemble members into account, but it requires
estimating correlations between the errors of the individual classifiers. When
the correlations are high or the estimation is poor, the bounds degrade. In our
experiments, we compute generalization bounds for random forests on various
benchmark data sets. Because the individual decision trees already perform
well, their predictions are highly correlated and the C-bounds do not lead to
satisfactory results. For the same reason, the bounds based on the analysis of
Gibbs classifiers are typically superior and often reasonably tight. Bounds
based on a validation set, which come at the cost of a smaller training set,
gave better performance guarantees but worse performance in most experiments.