Chromatic PAC-Bayes Bounds for Non-IID Data: Applications to Ranking and Stationary β-Mixing Processes
Pac-Bayes bounds are among the most accurate generalization bounds for
classifiers learned from independently and identically distributed (IID) data,
and this is particularly so for margin classifiers: there have been recent
contributions showing how practical these bounds can be either to perform model
selection (Ambroladze et al., 2007) or even to directly guide the learning of
linear classifiers (Germain et al., 2009). However, there are many practical
situations where the training data show some dependencies and where the
traditional IID assumption does not hold. Stating generalization bounds for
such frameworks is therefore of the utmost interest, both from theoretical and
practical standpoints. In this work, we propose what are, to the best of our
knowledge, the first Pac-Bayes generalization bounds for classifiers trained on data
exhibiting interdependencies. The approach undertaken to establish our results
is based on the decomposition of a so-called dependency graph, which encodes the
dependencies within the data, into sets of independent data by means of graph
fractional covers. Our bounds are very general, since being able to find an
upper bound on the fractional chromatic number of the dependency graph is
sufficient to get new Pac-Bayes bounds for specific settings. We show how our
results can be used to derive bounds for ranking statistics (such as Auc) and
classifiers trained on data distributed according to a stationary β-mixing
process. Along the way, we show how our approach seamlessly allows us to deal with
U-processes. As a side note, we also provide a Pac-Bayes generalization bound
for classifiers learned on data from stationary φ-mixing distributions. Comment: Long version of the AISTATS 09 paper:
http://jmlr.csail.mit.edu/proceedings/papers/v5/ralaivola09a/ralaivola09a.pd
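The decomposition idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' construction: it uses a plain greedy proper coloring rather than a fractional cover, and it relies only on the standard fact that the fractional chromatic number is at most the chromatic number. The 5-cycle dependency graph and the function names are hypothetical.

```python
def greedy_coloring(adj):
    """Give each vertex the smallest color not used by an already-colored neighbor."""
    color = {}
    for v in sorted(adj):
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


def independent_classes(adj):
    """Split the vertices into color classes; each class is an independent set."""
    classes = {}
    for v, c in greedy_coloring(adj).items():
        classes.setdefault(c, []).append(v)
    return [sorted(vs) for _, vs in sorted(classes.items())]


# Hypothetical dependency graph: example i depends on its two neighbors on a 5-cycle.
cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
classes = independent_classes(cycle5)

# The number of classes upper-bounds the chromatic number,
# and hence the fractional chromatic number, of the dependency graph.
print(classes, len(classes))
```

Within each class no two examples are adjacent in the dependency graph, so each class can be treated as a set of independent data. A fractional cover can do better than this: for the 5-cycle the fractional chromatic number is 5/2, whereas the greedy coloring uses 3 classes.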
Bounds for graph regularity and removal lemmas
We show, for any positive integer k, that there exists a graph in which any
equitable partition of its vertices into k parts has at least ck^2/\log^* k
pairs of parts which are not \epsilon-regular, where c,\epsilon>0 are absolute
constants. This bound is tight up to the constant c and addresses a question of
Gowers on the number of irregular pairs in Szemerédi's regularity lemma.
In order to gain some control over irregular pairs, another regularity lemma,
known as the strong regularity lemma, was developed by Alon, Fischer,
Krivelevich, and Szegedy. For this lemma, we prove a lower bound of
wowzer-type, which is one level higher in the Ackermann hierarchy than the
tower function, on the number of parts in the strong regularity lemma,
essentially matching the upper bound. On the other hand, for the induced graph
removal lemma, the standard application of the strong regularity lemma, we find
a different proof which yields a tower-type bound.
We also discuss bounds on several related regularity lemmas, including the
weak regularity lemma of Frieze and Kannan and the recently established regular
approximation theorem. In particular, we show that a weak partition with
approximation parameter \epsilon may require as many as
2^{\Omega(\epsilon^{-2})} parts. This is tight up to the implied constant and
solves a problem studied by Lovász and Szegedy. Comment: 62 pages
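To give a feel for the growth rates named in the abstract above, here is a small sketch of the tower function, the iterated logarithm \log^*, and a wowzer-type function. The indexing conventions (tower(1) = 2, wowzer(1) = 2, wowzer(k) = tower(wowzer(k-1))) are an assumption chosen for illustration; the paper may normalize these differently.

```python
import math


def tower(height, base=2):
    """Tower of exponents of the given height: tower(0) = 1, tower(h) = base ** tower(h - 1)."""
    value = 1
    for _ in range(height):
        value = base ** value
    return value


def log_star(n):
    """Iterated base-2 logarithm: how many times log2 is applied before the value drops to <= 1."""
    count = 0
    x = n
    while x > 1:
        x = math.log2(x)
        count += 1
    return count


def wowzer(k):
    """Assumed convention: wowzer(1) = 2 and wowzer(k) = tower(wowzer(k - 1)).
    Only feasible for k <= 3; wowzer(4) is a tower of height 65536."""
    value = 2
    for _ in range(k - 1):
        value = tower(value)
    return value


print([tower(h) for h in range(5)])
print(log_star(tower(4)))
print([wowzer(k) for k in range(1, 4)])
```

Under these conventions log_star inverts tower on its values, and wowzer iterates tower in the same way that tower iterates exponentiation, which is the sense in which it sits one level higher in the Ackermann hierarchy.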
Sparse Volterra and Polynomial Regression Models: Recoverability and Estimation
Volterra and polynomial regression models play a major role in nonlinear
system identification and inference tasks. Exciting applications ranging from
neuroscience to genome-wide association analysis build on these models with the
additional requirement of parsimony. This requirement has high interpretative
value, but unfortunately cannot be met by least-squares based or kernel
regression methods. To this end, compressed sampling (CS) approaches, already
successful in linear regression settings, can offer a viable alternative. The
viability of CS for sparse Volterra and polynomial models is the core theme of
this work. A common sparse regression task is initially posed for the two
models. Building on (weighted) Lasso-based schemes, an adaptive RLS-type
algorithm is developed for sparse polynomial regressions. The identifiability
of polynomial models is critically challenged by dimensionality. However,
following the CS principle, when these models are sparse, they could be
recovered by far fewer measurements. To quantify the sufficient number of
measurements for a given level of sparsity, restricted isometry properties
(RIP) are investigated in commonly met polynomial regression settings,
generalizing known results for their linear counterparts. The merits of the
novel (weighted) adaptive CS algorithms to sparse polynomial modeling are
verified through synthetic as well as real data tests for genotype-phenotype
analysis. Comment: 20 pages, to appear in IEEE Trans. on Signal Processing
Equitable defective coloring of sparse planar graphs
A graph has an equitable, defective k-coloring (an ED-k-coloring) if there is a k-coloring of V(G) that is defective (every vertex shares the same color with at most one neighbor) and equitable (the sizes of all color classes differ by at most one). A graph may have an ED-k-coloring, but no ED-(k + 1)-coloring. In this paper, we prove that planar graphs with minimum degree at least 2 and girth at least 10 are ED-k-colorable for any integer k ≥ 3. The proof uses the method of discharging. We are able to simplify the normally lengthy task of enumerating forbidden substructures by using Hall's Theorem, an unusual approach. Published by Elsevier B.V.
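The ED-k-coloring definition above can be checked mechanically. The following is a minimal sketch under stated assumptions: graphs are adjacency dictionaries, colors are the integers 0..k-1, and the function name `is_ed_coloring` and the 12-cycle example are hypothetical, not taken from the paper.

```python
from collections import Counter


def is_ed_coloring(adj, coloring, k):
    """Return True if `coloring` is an equitable, defective k-coloring of `adj`."""
    # All colors must come from {0, ..., k-1}.
    if any(not 0 <= coloring[v] < k for v in adj):
        return False
    # Defective: every vertex shares its color with at most one neighbor.
    for v, neighbors in adj.items():
        same = sum(1 for u in neighbors if coloring[u] == coloring[v])
        if same > 1:
            return False
    # Equitable: color class sizes (including empty classes) differ by at most one.
    sizes = Counter(coloring.values())
    counts = [sizes.get(c, 0) for c in range(k)]
    return max(counts) - min(counts) <= 1


# Hypothetical example: the 12-cycle is planar, has minimum degree 2 and girth 12.
cycle12 = {i: {(i - 1) % 12, (i + 1) % 12} for i in range(12)}

paired = {i: (i // 2) % 3 for i in range(12)}      # colors 0,0,1,1,2,2,0,0,1,1,2,2
blocks = {i: i // 4 for i in range(12)}            # colors 0,0,0,0,1,1,1,1,2,2,2,2

print(is_ed_coloring(cycle12, paired, 3))
print(is_ed_coloring(cycle12, blocks, 3))
```

The `paired` coloring passes because each vertex has exactly one same-colored neighbor and every class has four vertices. The `blocks` coloring is equitable but fails the defective condition, since interior vertices of each block share their color with both neighbors.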
Equitable colorings of Kronecker products of graphs
For a positive integer k, a graph G is equitably k-colorable if there is a mapping f: V(G) → {1, 2, …, k} such that f(x) ≠ f(y) whenever xy ∈ E(G) and ||f^{-1}(i)| − |f^{-1}(j)|| ≤ 1 for 1 ≤ i < j ≤ k. The equitable chromatic number of a graph G, denoted by χ_=(G), is the minimum k such that G is equitably k-colorable. The equitable chromatic threshold of a graph G, denoted by χ_=^*(G), is the minimum t such that G is equitably k-colorable for all k ≥ t. The current paper studies equitable chromatic numbers of Kronecker products of graphs. In particular, we give exact values or upper bounds on χ_=(G×H) and χ_=^*(G×H) when G and H are complete graphs, bipartite graphs, paths or cycles.
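The definitions above translate directly into a short check. This sketch builds the Kronecker (tensor) product of two graphs, in which (g, h) is adjacent to (g', h') exactly when gg' ∈ E(G) and hh' ∈ E(H), and tests whether a given map f is an equitable k-coloring. The helper names and the K_3 × K_3 example are hypothetical and only illustrate the definition; they do not reproduce any result of the paper.

```python
from collections import Counter
from itertools import product


def complete_graph(n):
    """Adjacency dictionary of the complete graph on vertices 0..n-1."""
    return {i: {j for j in range(n) if j != i} for i in range(n)}


def kronecker_product(adj_g, adj_h):
    """(g, h) is adjacent to (g2, h2) iff g ~ g2 in G and h ~ h2 in H."""
    return {
        (g, h): {(g2, h2) for g2 in adj_g[g] for h2 in adj_h[h]}
        for g, h in product(adj_g, adj_h)
    }


def is_equitable_coloring(adj, f, k):
    """Check that f maps into {1..k}, is proper, and has class sizes differing by at most 1."""
    if any(not 1 <= f[v] <= k for v in adj):
        return False
    if any(f[u] == f[v] for v in adj for u in adj[v]):
        return False
    sizes = Counter(f.values())
    counts = [sizes.get(i, 0) for i in range(1, k + 1)]
    return max(counts) - min(counts) <= 1


k3_x_k3 = kronecker_product(complete_graph(3), complete_graph(3))

# Color each vertex (g, h) by its first coordinate; adjacent vertices differ in g.
f = {(g, h): g + 1 for (g, h) in k3_x_k3}

print(is_equitable_coloring(k3_x_k3, f, 3))
print(is_equitable_coloring(k3_x_k3, f, 4))
```

The same map f fails for k = 4 because the fourth color class is empty while the others have three vertices each, which shows why the definition counts all k classes and not only the ones in use.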