Shattering Thresholds for Random Systems of Sets, Words, and Permutations

Author: Godbole Anant P.
Pinella Samantha
Zhuang Yan
Publication date: 06/05/2013

This paper considers a problem that relates to the theories of covering arrays, permutation patterns, Vapnik-Chervonenkis (VC) classes, and probability thresholds. Specifically, we want to find the number of subsets of [n]:={1,2,...,n} we need to randomly select, in a certain probability space, so as to "shatter" all t-subsets of [n]. Moving from subsets to words, we ask for the number of n-letter words on a q-letter alphabet that are needed to shatter all t-subwords of the q^n words of length n. Finally, specializing to t=3, we explore the number of random permutations of [n] needed to shatter all length-3 permutation patterns in specified positions. We uncover a very sharp zero-one probability threshold for the emergence of such shattering; Talagrand's isoperimetric inequality in product spaces is used as a key tool. Comment: 25 pages
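As an illustration of the set case, the sketch below checks whether a given family of subsets of [n] shatters every t-subset, using the standard VC-theory reading of "shatter" (the traces of the family on the t-subset give all 2^t subsets). The function names are hypothetical, and the paper's random selection model is not reproduced here; this is only a deterministic check.

```python
from itertools import combinations


def shatters(family, positions):
    """True if the traces of `family` on `positions` form all 2^t subsets."""
    window = frozenset(positions)
    traces = {frozenset(s) & window for s in family}
    return len(traces) == 2 ** len(window)


def shatters_all_t_subsets(family, n, t):
    """Check every t-subset of [n] = {1, ..., n} (illustrative helper)."""
    return all(shatters(family, c) for c in combinations(range(1, n + 1), t))


# Example: three subsets of [2] shatter each single element but not the pair.
small_family = [set(), {1}, {2}]
print(shatters_all_t_subsets(small_family, n=2, t=1))
print(shatters_all_t_subsets(small_family, n=2, t=2))
```

Adding the set {1, 2} to the example family would make it shatter the pair as well, since all four traces would then appear.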

arXiv.org e-Print Archive

CiteSeerX

Set Systems and Families of Permutations with Small Traces

Author: Cheong Otfried
Goaoc Xavier
Nicaud Cyril
Publication date: 01/01/2009

We study the maximum size of a set system on $n$ elements whose trace on any $b$ elements has size at most $k$. We show that if for some $b \ge i \ge 0$ the shatter function $f_R$ of a set system $([n],R)$ satisfies $f_R(b) < 2^i(b-i+1)$ then $|R| = O(n^i)$; this generalizes Sauer's Lemma on the size of set systems with bounded VC-dimension. We use this bound to delineate the main growth rates for the same problem on families of permutations, where the trace corresponds to the inclusion for permutations. This is related to a question of Raz on families of permutations with bounded VC-dimension that generalizes the Stanley-Wilf conjecture on permutations with excluded patterns.
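To make the shatter function concrete, here is a minimal sketch that computes f_R(b) by brute force (the largest number of distinct traces on any b elements) and evaluates the classical Sauer bound, the sum of binomial coefficients C(n, j) for j from 0 to d, for a set system of VC-dimension at most d. The function names and the example set system are assumptions made for illustration; the paper's generalized bound is not implemented.

```python
from itertools import combinations
from math import comb


def shatter_function(sets, n, b):
    """Largest number of distinct traces of `sets` on any b elements of [n]."""
    best = 0
    for window in combinations(range(1, n + 1), b):
        w = frozenset(window)
        best = max(best, len({frozenset(s) & w for s in sets}))
    return best


def sauer_bound(n, d):
    """Classical Sauer bound for VC-dimension at most d on n elements."""
    return sum(comb(n, j) for j in range(d + 1))


# Example system on [4]: the empty set and the four singletons.
R = [set()] + [{x} for x in range(1, 5)]
print(shatter_function(R, n=4, b=2))   # fewer than 4 traces: no pair is shattered
print(len(R), sauer_bound(4, 1))       # the example meets the bound with d = 1
```

In this example no 2-element set is shattered, so the VC-dimension is at most 1, and the family size equals the Sauer bound for d = 1.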

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Tight bounds on the maximum size of a set of permutations with bounded VC-dimension

Author: Agarwal
Bienstock
Cibulka
Davenport
Efrat
Füredi
Geneson
Hart
Jan Kynčl
Josef Cibulka
Klazar
Marcus
Matoušek
Nivasch
Pach
Pettie
Raz
Sharir
Tardos
Publication venue: 'Elsevier BV'
Publication date: 01/01/2012

The VC-dimension of a family P of n-permutations is the largest integer k such that the set of restrictions of the permutations in P on some k-tuple of positions is the set of all k! permutation patterns. Let r_k(n) be the maximum size of a set of n-permutations with VC-dimension k. Raz showed that r_2(n) grows exponentially in n. We show that r_3(n) = 2^{Theta(n log(alpha(n)))} and for every s >= 4, we have almost tight upper and lower bounds of the form 2^{n poly(alpha(n))}. We also study the maximum number p_k(n) of 1-entries in an n x n (0,1)-matrix with no (k+1)-tuple of columns containing all (k+1)-permutation matrices. We determine that p_3(n) = Theta(n alpha(n)) and that p_s(n) can be bounded by functions of the form n 2^{poly(alpha(n))} for every fixed s >= 4. We also show that for every positive s there is a slowly growing function zeta_s(m) (of the form 2^{poly(alpha(m))} for every fixed s >= 5) satisfying the following. For all positive integers n and B and every n x n (0,1)-matrix M with zeta_s(n)Bn 1-entries, the rows of M can be partitioned into s intervals so that at least B columns contain at least B 1-entries in each of the intervals. Comment: 22 pages, 4 figures, correction of the bound on r_3 in the abstract and other minor changes
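The definition of VC-dimension for permutation families can be checked by brute force on small inputs. The sketch below assumes permutations are given as tuples of distinct values and that a restriction is recorded as its order pattern; the helper names are hypothetical, and the search is exponential, so it is only meant to illustrate the definition.

```python
from itertools import combinations, permutations
from math import factorial


def pattern(perm, positions):
    """Order pattern of `perm` restricted to `positions` (0-based indices)."""
    values = [perm[i] for i in positions]
    ranks = {v: r for r, v in enumerate(sorted(values))}
    return tuple(ranks[v] for v in values)


def vc_dimension(perms):
    """Largest k such that some k positions show all k! patterns."""
    n = len(perms[0])
    best = 0
    for k in range(1, n + 1):
        shattered = any(
            len({pattern(p, pos) for p in perms}) == factorial(k)
            for pos in combinations(range(n), k)
        )
        if not shattered:
            break  # if no k-tuple is shattered, no larger tuple can be
        best = k
    return best


print(vc_dimension(list(permutations(range(3)))))  # all 3-permutations
print(vc_dimension([(0, 1, 2), (2, 1, 0)]))        # increasing and decreasing only
```

The early exit relies on monotonicity: restricting a shattered k-tuple of positions to any k-1 of them still yields all (k-1)! patterns.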

arXiv.org e-Print Archive

CiteSeerX

Elsevier - Publisher Connector

Crossref

An Active Learning Algorithm for Ranking from Pairwise Preferences with an Almost Optimal Query Complexity

Author: Ailon Nir
Publication date: 17/05/2011

We study the problem of learning to rank from pairwise preferences, and solve a long-standing open problem that has led to development of many heuristics but no provable results for our particular problem. Given a set $V$ of $n$ elements, we wish to linearly order them given pairwise preference labels. A pairwise preference label is obtained as a response, typically from a human, to the question "which is preferred, $u$ or $v$?" for two elements $u,v\in V$. We assume possible non-transitivity paradoxes which may arise naturally due to human mistakes or irrationality. The goal is to linearly order the elements from the most preferred to the least preferred, while disagreeing with as few pairwise preference labels as possible. Our performance is measured by two parameters: the loss and the query complexity (number of pairwise preference labels we obtain). This is a typical learning problem, with the exception that the space from which the pairwise preferences are drawn is finite, consisting of ${n\choose 2}$ possibilities only. We present an active learning algorithm for this problem, with query bounds significantly beating general (non-active) bounds for the same error guarantee, while almost achieving the information-theoretic lower bound. Our main construct is a decomposition of the input such that (i) each block incurs high loss at optimum, and (ii) the optimal solution respecting the decomposition is not much worse than the true optimum. The decomposition is done by adapting a recent result by Kenyon and Schudy for a related combinatorial optimization problem to the query-efficient setting. We thus settle an open problem posed by learning-to-rank theoreticians and practitioners: What is a provably correct way to sample preference labels? To further show the power and practicality of our solution, we show how to use it in concert with an SVM relaxation. Comment: Fixed a tiny error in theorem 3.1 statement
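The loss described in this abstract, the number of pairwise labels a linear order disagrees with, can be sketched as follows. Labels are assumed here to be (winner, loser) pairs, and the exhaustive search over all orders is a stand-in for illustration only; it is not the paper's active learning algorithm and it queries every label.

```python
from itertools import permutations


def disagreements(order, labels):
    """Count labels (winner, loser) where `order` ranks the loser first.

    `order` lists elements from most preferred to least preferred.
    """
    rank = {x: i for i, x in enumerate(order)}
    return sum(1 for winner, loser in labels if rank[winner] > rank[loser])


def best_order_bruteforce(elements, labels):
    """Exhaustive minimizer of the loss; feasible only for tiny inputs."""
    return min(permutations(elements), key=lambda o: disagreements(o, labels))


# A non-transitive cycle: a beats b, b beats c, c beats a.
cycle = [("a", "b"), ("b", "c"), ("c", "a")]
best = best_order_bruteforce("abc", cycle)
print(best, disagreements(best, cycle))
```

With a preference cycle like the one above, every linear order must disagree with at least one label, which is why the loss at the optimum can be nonzero.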

arXiv.org e-Print Archive

CiteSeerX

Class Proportion Estimation with Application to Multiclass Anomaly Rejection

Author: Sanderson Tyler
Scott Clayton
Publication date: 22/02/2014

This work addresses two classification problems that fall under the heading of domain adaptation, wherein the distributions of training and testing examples differ. The first problem studied is that of class proportion estimation, which is the problem of estimating the class proportions in an unlabeled testing data set given labeled examples of each class. Compared to previous work on this problem, our approach has the novel feature that it does not require labeled training data from one of the classes. This property allows us to address the second domain adaptation problem, namely, multiclass anomaly rejection. Here, the goal is to design a classifier that has the option of assigning a "reject" label, indicating that the instance did not arise from a class present in the training data. We establish consistent learning strategies for both of these domain adaptation problems, which to our knowledge are the first of their kind. We also implement the class proportion estimation technique and demonstrate its performance on several benchmark data sets. Comment: Accepted to AISTATS 2014. 15 pages. 2 figures

arXiv.org e-Print Archive

CiteSeerX

Unification and Logarithmic Space

Author: C. Dwork
J. Hartmanis
J.Y. Girard
K. Knight
K.J. Lange
O. Laurent
P. Baillot
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

We present an algebraic characterization of the complexity classes Logspace and NLogspace, using an algebra with a composition law based on unification. This new bridge between unification and complexity classes is inspired by proof theory and more specifically linear logic and Geometry of Interaction. We show how unification can be used to build a model of computation by means of specific subalgebras associated to finite permutation groups. We then prove that whether an observation (the algebraic counterpart of a program) accepts a word can be decided within logarithmic space. We also show that the construction can naturally represent pointer machines, an intuitive way of understanding logarithmic space computing.

arXiv.org e-Print Archive

CiteSeerX

Crossref

HAL AMU