
    On PAC-Bayesian Bounds for Random Forests

    Existing guarantees in terms of rigorous upper bounds on the generalization error for the original random forest algorithm, one of the most frequently used machine learning methods, are unsatisfying. We discuss and evaluate various PAC-Bayesian approaches to derive such bounds. The bounds do not require additional hold-out data, because the out-of-bag samples from the bagging in the training process can be exploited. A random forest predicts by taking a majority vote of an ensemble of decision trees. The first approach is to bound the error of the vote by twice the error of the corresponding Gibbs classifier (classifying with a single member of the ensemble selected at random). However, this approach does not take into account the effect of the errors of individual classifiers averaging out when taking the majority vote. This effect provides a significant boost in performance when the errors are independent or negatively correlated, but when the correlations are strong the advantage from taking the majority vote is small. The second approach, based on PAC-Bayesian C-bounds, takes dependencies between ensemble members into account, but it requires estimating correlations between the errors of the individual classifiers. When the correlations are high or the estimation is poor, the bounds degrade. In our experiments, we compute generalization bounds for random forests on various benchmark data sets. Because the individual decision trees already perform well, their predictions are highly correlated and the C-bounds do not lead to satisfactory results. For the same reason, the bounds based on the analysis of Gibbs classifiers are typically superior and often reasonably tight. Bounds based on a validation set, which come at the cost of a smaller training set, gave better performance guarantees but worse performance in most experiments.
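
    A minimal numeric sketch of the first approach, under simplifying assumptions that are not the paper's exact construction: the PAC-Bayes-kl bound is applied to the Gibbs risk with a uniform posterior over the trees (so the KL term between posterior and prior vanishes), the smallest out-of-bag sample size crudely stands in for the effective sample size, and the majority-vote bound is taken as twice the Gibbs bound. Per-tree out-of-bag error estimates are assumed to be already available.

        # Sketch only: per-tree out-of-bag (OOB) error estimates are assumed given.
        import math

        def kl_binary(p, q):
            """Binary KL divergence kl(p || q)."""
            eps = 1e-12
            p, q = min(max(p, eps), 1 - eps), min(max(q, eps), 1 - eps)
            return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

        def kl_inverse_upper(p_hat, rhs, tol=1e-9):
            """Largest q >= p_hat with kl(p_hat || q) <= rhs, by binary search."""
            lo, hi = p_hat, 1.0 - 1e-12
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                lo, hi = (mid, hi) if kl_binary(p_hat, mid) <= rhs else (lo, mid)
            return hi

        def majority_vote_bound(oob_errors, oob_counts, delta=0.05):
            """Twice the PAC-Bayes-kl bound on the Gibbs risk; uniform posterior,
            so KL(Q || P) = 0, and n is crudely the smallest OOB sample size."""
            gibbs_hat = sum(oob_errors) / len(oob_errors)   # empirical Gibbs risk
            n = min(oob_counts)
            rhs = math.log(2.0 * math.sqrt(n) / delta) / n
            return min(1.0, 2.0 * kl_inverse_upper(gibbs_hat, rhs))

        # Made-up OOB statistics for a 100-tree forest, purely for illustration:
        errors = [0.12 + 0.01 * (i % 5) for i in range(100)]
        counts = [3600] * 100
        print(majority_vote_bound(errors, counts))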

    Moment-Based Spectral Analysis of Random Graphs with Given Expected Degrees

    In this paper, we analyze the limiting spectral distribution of the adjacency matrix of a random graph ensemble, proposed by Chung and Lu, in which a given expected degree sequence $\overline{w}_n^{T} = (w^{(n)}_1, \ldots, w^{(n)}_n)$ is prescribed on the ensemble. Let $\mathbf{a}_{i,j} = 1$ if there is an edge between the nodes $\{i,j\}$ and zero otherwise, and consider the normalized random adjacency matrix of the graph ensemble: $\mathbf{A}_n = [\mathbf{a}_{i,j}/\sqrt{n}]_{i,j=1}^{n}$. The empirical spectral distribution of $\mathbf{A}_n$, denoted by $\mathbf{F}_n(\cdot)$, is the empirical measure putting a mass $1/n$ at each of the $n$ real eigenvalues of the symmetric matrix $\mathbf{A}_n$. Under some technical conditions on the expected degree sequence, we show that with probability one, $\mathbf{F}_n(\cdot)$ converges weakly to a deterministic distribution $F(\cdot)$. Furthermore, we fully characterize this distribution by providing explicit expressions for the moments of $F(\cdot)$. We apply our results to well-known degree distributions, such as power-law and exponential. The asymptotic expressions of the spectral moments in each case provide significant insights about the bulk behavior of the eigenvalue spectrum.
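
    As a small numerical illustration of the objects defined above (not of the paper's explicit moment formulas), one can sample a graph from the Chung-Lu ensemble for a chosen expected degree sequence, form the normalized adjacency matrix, and inspect the empirical spectral moments; the power-law-like weights and graph size below are arbitrary choices.

        # Sketch: sample a Chung-Lu graph with expected degrees w, form A_n = A / sqrt(n),
        # and compute the empirical spectral moments m_k = (1/n) * sum_i lambda_i^k.
        import numpy as np

        rng = np.random.default_rng(0)
        n = 1500
        w = np.arange(1, n + 1) ** -0.5                  # power-law-like weights
        w *= 5.0 * n / w.sum()                           # rescale so the mean expected degree is ~5

        # Chung-Lu edge probabilities p_ij = min(w_i * w_j / sum(w), 1), no self-loops.
        P = np.minimum(np.outer(w, w) / w.sum(), 1.0)
        A = np.triu((rng.random((n, n)) < P).astype(float), k=1)
        A = A + A.T                                      # symmetric 0/1 adjacency matrix

        A_n = A / np.sqrt(n)
        eigs = np.linalg.eigvalsh(A_n)                   # real eigenvalues of the normalized matrix

        for k in range(1, 5):                            # empirical spectral moments
            print(k, (eigs ** k).mean())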

    Next nearest neighbour Ising models on random graphs

    This paper develops results for the next nearest neighbour Ising model on random graphs. Besides being an essential ingredient in classic models for frustrated systems, second neighbour interactions arise naturally in several applications such as the colour diversity problem and graphical games. We demonstrate ensembles of random graphs, including regular connectivity graphs, that have a periodic variation of free energy with either the ratio of nearest to next nearest couplings or the mean number of nearest neighbours. When the coupling ratio is an integer, paramagnetic phases can be found at zero temperature. This is shown to be related to the locked or unlocked nature of the interactions. For anti-ferromagnetic couplings, spin glass phases are demonstrated at low temperature. The interaction structure is formulated as a factor graph, and the solution on a tree is developed. The replica symmetric and energetic one-step replica symmetry breaking solutions are developed using the cavity method. Within these frameworks we calculate the phase diagram and demonstrate the existence of dynamical transitions at zero temperature for cases of anti-ferromagnetic coupling on regular and inhomogeneous random graphs. Comment: 55 pages, 15 figures, version 2 with minor revisions, to be published in J. Stat. Mech.
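
    The sketch below illustrates only the energy function of such a model, not the factor-graph, cavity, or replica analysis: Ising spins on a random regular graph (generated here with networkx) with nearest-neighbour coupling J1 and next-nearest-neighbour coupling J2, relaxed by a few Metropolis sweeps. The couplings, degree, size and temperature are arbitrary illustrative values.

        # Toy sketch of H = -J1 * sum_<ij> s_i s_j - J2 * sum_<<ij>> s_i s_j
        # on a random regular graph, with single-spin-flip Metropolis updates.
        import numpy as np
        import networkx as nx

        rng = np.random.default_rng(1)
        n, degree = 400, 3
        J1, J2, beta = -1.0, -0.5, 2.0       # anti-ferromagnetic couplings, inverse temperature

        G = nx.random_regular_graph(degree, n, seed=1)
        A = nx.to_numpy_array(G)             # nearest-neighbour adjacency
        A2 = ((A @ A) > 0).astype(float)     # pairs joined by a length-two walk ...
        np.fill_diagonal(A2, 0.0)
        A2 = np.clip(A2 - A, 0.0, 1.0)       # ... that are not already first neighbours

        def energy(s):
            return -0.5 * (J1 * s @ A @ s + J2 * s @ A2 @ s)

        s = rng.choice([-1.0, 1.0], size=n)
        for sweep in range(100):             # Metropolis sweeps
            for i in rng.integers(0, n, size=n):
                dE = 2.0 * s[i] * (J1 * A[i] @ s + J2 * A2[i] @ s)
                if dE <= 0 or rng.random() < np.exp(-beta * dE):
                    s[i] = -s[i]

        print("energy per spin:", energy(s) / n)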

    Machine learning techniques in joint default assessment

    This paper studies the consequences of capturing non-linear dependence among the covariates that drive the default of different obligors and the overall riskiness of their credit portfolio. Joint defaults are modeled, without loss of generality, with the classical Bernoulli mixture model. Using an application to a credit card dataset, we show that, even when Machine Learning techniques perform only slightly better than Logistic Regression in classifying individual defaults as a function of the covariates, they do outperform it at the portfolio level. This happens because they capture linear and non-linear dependence among the covariates, whereas Logistic Regression only captures linear dependence. The ability of Machine Learning methods to capture non-linear dependence among the covariates produces higher default correlation compared with Logistic Regression. As a consequence, on our data, Logistic Regression underestimates the riskiness of the credit portfolio.
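
    A toy illustration of the portfolio-level point on synthetic data, not the paper's credit card dataset: obligors in a portfolio share a systematic factor (a Bernoulli mixture), the true default probability depends on the covariates through an interaction term, and Logistic Regression is compared with gradient boosting, chosen here as one possible Machine Learning method, by simulating the portfolio loss distribution from each model's fitted probabilities.

        # Synthetic Bernoulli mixture: obligors share a systematic factor z, and the true
        # default probability is non-linear (an interaction) in the covariates.
        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.ensemble import GradientBoostingClassifier

        rng = np.random.default_rng(42)

        def draw_covariates(size):
            z = rng.normal()                          # common factor shared within a draw
            return z + rng.normal(size=(size, 2))     # two covariates per obligor

        def true_pd(X):
            return 1.0 / (1.0 + np.exp(-(-2.5 + 1.5 * X[:, 0] * X[:, 1])))

        X_train = np.vstack([draw_covariates(200) for _ in range(100)])
        y_train = rng.binomial(1, true_pd(X_train))

        logit = LogisticRegression().fit(X_train, y_train)
        gbm = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

        def portfolio_losses(model, n_portfolios=500, size=1000):
            """Total defaults per portfolio; defaults are Bernoulli given the model's PDs."""
            losses = np.empty(n_portfolios)
            for k in range(n_portfolios):
                Xk = draw_covariates(size)
                losses[k] = rng.binomial(1, model.predict_proba(Xk)[:, 1]).sum()
            return losses

        for name, model in [("logistic regression", logit), ("gradient boosting", gbm)]:
            L = portfolio_losses(model)
            print(name, "mean:", L.mean(), "99% quantile:", np.quantile(L, 0.99))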

    Smoothed Complexity Theory

    Smoothed analysis is a new way of analyzing algorithms introduced by Spielman and Teng (J. ACM, 2004). Classical methods like worst-case or average-case analysis have accompanying complexity classes, like P and AvgP, respectively. While worst-case or average-case analysis gives us a means to talk about the running time of a particular algorithm, complexity classes allow us to talk about the inherent difficulty of problems. Smoothed analysis is a hybrid of worst-case and average-case analysis and compensates for some of their drawbacks. Despite its success for the analysis of single algorithms and problems, there is no embedding of smoothed analysis into computational complexity theory, which is necessary to classify problems according to their intrinsic difficulty. We propose a framework for smoothed complexity theory, define the relevant classes, and prove some first hardness results (for bounded halting and tiling) and tractability results (for binary optimization problems, graph coloring, and satisfiability). Furthermore, we discuss extensions and shortcomings of our model and relate it to semi-random models. Comment: to be presented at MFCS 201
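
    As a toy empirical illustration of the smoothed measure itself, not of the complexity classes defined in the paper, the sketch below takes the number of inversions (a proxy for insertion-sort running time) as the cost and evaluates the maximum, over a few base inputs in [0,1]^n, of the expected cost under Gaussian perturbations of magnitude sigma; as sigma grows the measure moves from the worst case towards the average case. The cost function, input domain and parameters are arbitrary choices.

        # Smoothed cost of a toy problem: cost(x) = number of inversions in x;
        # the smoothed cost at noise level sigma is the max over base inputs of
        # the expected cost after adding Gaussian perturbations.
        import numpy as np

        rng = np.random.default_rng(0)
        n = 200

        def inversions(x):
            """Number of pairs i < j with x[i] > x[j]."""
            return int(np.sum(np.triu(x[:, None] > x[None, :], k=1)))

        def smoothed_cost(base_inputs, sigma, trials=20):
            """max over base inputs of the mean cost under N(0, sigma^2) perturbations."""
            return max(
                np.mean([inversions(x + sigma * rng.normal(size=n)) for _ in range(trials)])
                for x in base_inputs
            )

        # Near-worst-case base inputs: decreasing sequences in [0, 1]^n.
        bases = [np.sort(rng.random(n))[::-1] for _ in range(3)]
        for sigma in [0.0, 0.01, 0.1, 1.0]:
            print(sigma, smoothed_cost(bases, sigma))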