33 research outputs found

    Chromatic PAC-Bayes Bounds for Non-IID Data: Applications to Ranking and Stationary β\beta-Mixing Processes

    Full text link
    Pac-Bayes bounds are among the most accurate generalization bounds for classifiers learned from independently and identically distributed (IID) data, and it is particularly so for margin classifiers: there have been recent contributions showing how practical these bounds can be either to perform model selection (Ambroladze et al., 2007) or even to directly guide the learning of linear classifiers (Germain et al., 2009). However, there are many practical situations where the training data show some dependencies and where the traditional IID assumption does not hold. Stating generalization bounds for such frameworks is therefore of the utmost interest, both from theoretical and practical standpoints. In this work, we propose the first - to the best of our knowledge - Pac-Bayes generalization bounds for classifiers trained on data exhibiting interdependencies. The approach undertaken to establish our results is based on the decomposition of a so-called dependency graph that encodes the dependencies within the data, in sets of independent data, thanks to graph fractional covers. Our bounds are very general, since being able to find an upper bound on the fractional chromatic number of the dependency graph is sufficient to get new Pac-Bayes bounds for specific settings. We show how our results can be used to derive bounds for ranking statistics (such as Auc) and classifiers trained on data distributed according to a stationary {\ss}-mixing process. In the way, we show how our approach seemlessly allows us to deal with U-processes. As a side note, we also provide a Pac-Bayes generalization bound for classifiers learned on data from stationary φ\varphi-mixing distributions.Comment: Long version of the AISTATS 09 paper: http://jmlr.csail.mit.edu/proceedings/papers/v5/ralaivola09a/ralaivola09a.pd

    Bounds for graph regularity and removal lemmas

    Get PDF
    We show, for any positive integer k, that there exists a graph in which any equitable partition of its vertices into k parts has at least ck^2/\log^* k pairs of parts which are not \epsilon-regular, where c,\epsilon>0 are absolute constants. This bound is tight up to the constant c and addresses a question of Gowers on the number of irregular pairs in Szemer\'edi's regularity lemma. In order to gain some control over irregular pairs, another regularity lemma, known as the strong regularity lemma, was developed by Alon, Fischer, Krivelevich, and Szegedy. For this lemma, we prove a lower bound of wowzer-type, which is one level higher in the Ackermann hierarchy than the tower function, on the number of parts in the strong regularity lemma, essentially matching the upper bound. On the other hand, for the induced graph removal lemma, the standard application of the strong regularity lemma, we find a different proof which yields a tower-type bound. We also discuss bounds on several related regularity lemmas, including the weak regularity lemma of Frieze and Kannan and the recently established regular approximation theorem. In particular, we show that a weak partition with approximation parameter \epsilon may require as many as 2^{\Omega(\epsilon^{-2})} parts. This is tight up to the implied constant and solves a problem studied by Lov\'asz and Szegedy.Comment: 62 page

    Sparse Volterra and Polynomial Regression Models: Recoverability and Estimation

    Full text link
    Volterra and polynomial regression models play a major role in nonlinear system identification and inference tasks. Exciting applications ranging from neuroscience to genome-wide association analysis build on these models with the additional requirement of parsimony. This requirement has high interpretative value, but unfortunately cannot be met by least-squares based or kernel regression methods. To this end, compressed sampling (CS) approaches, already successful in linear regression settings, can offer a viable alternative. The viability of CS for sparse Volterra and polynomial models is the core theme of this work. A common sparse regression task is initially posed for the two models. Building on (weighted) Lasso-based schemes, an adaptive RLS-type algorithm is developed for sparse polynomial regressions. The identifiability of polynomial models is critically challenged by dimensionality. However, following the CS principle, when these models are sparse, they could be recovered by far fewer measurements. To quantify the sufficient number of measurements for a given level of sparsity, restricted isometry properties (RIP) are investigated in commonly met polynomial regression settings, generalizing known results for their linear counterparts. The merits of the novel (weighted) adaptive CS algorithms to sparse polynomial modeling are verified through synthetic as well as real data tests for genotype-phenotype analysis.Comment: 20 pages, to appear in IEEE Trans. on Signal Processin

    Equitable defective coloring of sparse planar graphs

    Get PDF
    A graph has an equitable, defective k-coloring (an ED-k-coloring) if there is a k-coloring of V(G) that is defective (every vertex shares the same color with at most one neighbor) and equitable (the sizes of all color classes differ by at most one). A graph may have an ED-k-coloring, but no ED-(k + 1)-coloring. In this paper, we prove that planar graphs with minimum degree at least 2 and girth at least 10 are ED-k-colorable for any integer k \u3e= 3. The proof uses the method of discharging. We are able to simplify the normally lengthy task of enumerating forbidden substructures by using Hall\u27s Theorem, an unusual approach. Published by Elsevier B.V

    Equitable colorings of Kronecker products of graphs

    Get PDF
    AbstractFor a positive integer k, a graph G is equitably k-colorable if there is a mapping f:V(G)→{1,2,…,k} such that f(x)≠f(y) whenever xy∈E(G) and ||f−1(i)|−|f−1(j)||≤1 for 1≤i<j≤k. The equitable chromatic number of a graph G, denoted by χ=(G), is the minimum k such that G is equitably k-colorable. The equitable chromatic threshold of a graph G, denoted by χ=∗(G), is the minimum t such that G is equitably k-colorable for k≥t. The current paper studies equitable chromatic numbers of Kronecker products of graphs. In particular, we give exact values or upper bounds on χ=(G×H) and χ=∗(G×H) when G and H are complete graphs, bipartite graphs, paths or cycles
    corecore