Nearly Optimal Bounds for Sample-Based Testing and Learning of k-Monotone Functions
We study monotonicity testing of functions using sample-based algorithms,
which are only allowed to observe the function's value on points drawn
independently from the uniform distribution. A classic result by
Bshouty-Tamon (J. ACM 1996) proved that monotone functions can be learned
with […] samples, and it is not hard to show that this bound extends to
testing. Prior to our work, the only lower bound for this problem was […] in
the small-parameter regime ([…]), due to
Goldreich-Goldwasser-Lehman-Ron-Samorodnitsky (Combinatorica 2000). Thus,
the sample complexity of monotonicity testing was wide open outside this
regime. We resolve this question, obtaining a tight lower bound of […] for
all parameter values at most a sufficiently small constant. In fact, we prove
a much more general result, showing that the sample complexity of
k-monotonicity testing and learning for functions […] is […]. For testing
with one-sided error we show that the sample complexity is […].
Beyond the hypercube, we prove nearly tight bounds (up to polylog factors of
[…] in the exponent) of […] on the sample complexity of testing and learning
measurable k-monotone functions under product distributions. Our upper bound
improves upon the previous bound of […] by Harms-Yoshida (ICALP 2022) for
Boolean functions.
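The sample-based model described in this abstract can be illustrated with a small Python sketch. It is not the algorithm from the paper: it is a minimal one-sided tester, assuming a Boolean function on {0,1}^d given as a Python callable, that rejects only when two of its uniform samples form a comparable pair violating monotonicity. The names `leq` and `sample_based_monotonicity_test`, and the choice of `num_samples`, are hypothetical.

```python
import random
from itertools import combinations


def leq(x, y):
    """Coordinate-wise partial order on the hypercube {0,1}^d."""
    return all(a <= b for a, b in zip(x, y))


def sample_based_monotonicity_test(f, d, num_samples, rng=None):
    """Accept (True) unless the labeled samples witness a violation.

    The tester never chooses its own query points: it only sees f on
    points drawn independently and uniformly from {0,1}^d.
    """
    rng = rng or random.Random()
    points = [tuple(rng.randint(0, 1) for _ in range(d))
              for _ in range(num_samples)]
    labeled = [(x, f(x)) for x in points]
    for (x, fx), (y, fy) in combinations(labeled, 2):
        # A violation is a pair x <= y with f(x) > f(y), in either order.
        if leq(x, y) and fx > fy:
            return False
        if leq(y, x) and fy > fx:
            return False
    return True
```

Because it rejects only on an explicit violation, this sketch always accepts monotone functions. How many samples it needs to reject far-from-monotone functions is exactly the kind of question the abstract addresses, and the sketch makes no claim about it.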
Separations of Matroid Freeness Properties
Properties of Boolean functions on the hypercube invariant with respect to
linear transformations of the domain are among the most well-studied properties
in the context of property testing. In this paper, we study the fundamental
class of linear-invariant properties called matroid freeness properties. These
properties have been conjectured to essentially coincide with all testable
linear-invariant properties, and a recent sequence of works has established
testability for increasingly larger subclasses. One question left open,
however, is whether the infinitely many syntactically different properties
recently shown testable in fact correspond to new, semantically distinct ones.
This is a crucial issue since it has also been shown that there exist
subclasses of these properties for which an infinite set of syntactically
different representations collapse into one of a small, finite set of
properties, all previously known to be testable.
An important question is therefore to understand the semantics of matroid
freeness properties, and in particular when two syntactically different
properties are truly distinct. We shed light on this problem by developing a
method for determining the relation between two matroid freeness properties P
and Q. Furthermore, we show that there is a natural subclass of matroid
freeness properties such that for any two properties P and Q from this
subclass, a strong dichotomy must hold: either P is contained in Q or the two
properties are "well separated." As an application of this method, we exhibit
new, infinite hierarchies of testable matroid freeness properties such that at
each level of the hierarchy, there are functions that are far from all
functions lying in lower levels of the hierarchy. Our key technical tool is an
apparently new notion of maps between linear matroids, called matroid
homomorphisms, that might be of independent interest.
Property testing: theory and applications
Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003. By Sofya Raskhodnikova. Includes bibliographical references (p. 107-111).
Property testers are algorithms that distinguish inputs with a given property from those that are far from satisfying the property. Far means that many characters of the input must be changed before the property arises in it. Property testing was introduced by Rubinfeld and Sudan in the context of linearity testing and first studied in a variety of other contexts by Goldreich, Goldwasser and Ron. The query complexity of a property tester is the number of input characters it reads. This thesis is a detailed investigation of properties that are and are not testable with sublinear query complexity. We begin by characterizing properties of strings over the binary alphabet in terms of their formula complexity. Every such property can be represented by a CNF formula. We show that properties of n-bit strings defined by 2CNF formulas are testable with O(sqrt(n)) queries, whereas there are 3CNF formulas for which the corresponding properties require Omega(n) queries, even for adaptive tests.
We show that testing properties defined by 2CNF formulas is equivalent, with respect to the number of required queries, to several other function and graph testing problems. These problems include: testing whether Boolean functions over general partial orders are close to monotone, testing whether a set of vertices is close to one that is a vertex cover of a specific graph, and testing whether a set of vertices is close to a clique. Testing properties that are defined in terms of monotonicity has been extensively investigated in the context of the monotonicity of a sequence of integers and the monotonicity of a function over the m-dimensional hypercube {1, ..., a}^m. We study the query complexity of monotonicity testing of both Boolean and integer functions over general partial orders. We show upper and lower bounds for the general problem and for specific partial orders. A few of our intermediate results are of independent interest:
1. If strings with a property form a vector space, adaptive 2-sided error tests for the property have no more power than non-adaptive 1-sided error tests.
2. Random LDPC codes with linear distance and constant rate are not locally testable.
3. There exist graphs with many edge-disjoint induced matchings of linear size.
In the final part of the thesis, we initiate an investigation of property testing as applied to images. We study visual properties of discretized images represented by n x n matrices of binary pixel values. We obtain algorithms with the number of queries independent of n for several basic properties: being a half-plane, connectedness and convexity.
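The notion of being "far" from a property, used throughout this abstract, can be made concrete for the simplest case it mentions, monotonicity of a sequence of integers. The Python sketch below is an illustration and is not taken from the thesis. It computes the fewest entries that must change to make a sequence nondecreasing, using the standard fact that this equals the length of the sequence minus the length of its longest nondecreasing subsequence. The names `distance_to_monotone` and `is_eps_far` are hypothetical. The computation reads the whole input, so it defines the distance a tester is measured against rather than being a sublinear tester itself.

```python
from bisect import bisect_right


def distance_to_monotone(seq):
    """Fewest entries of seq to change so that it becomes nondecreasing.

    tails[i] holds the smallest possible last value of a nondecreasing
    subsequence of length i + 1 seen so far (patience-sorting style).
    """
    tails = []
    for v in seq:
        i = bisect_right(tails, v)  # bisect_right allows equal values
        if i == len(tails):
            tails.append(v)
        else:
            tails[i] = v
    return len(seq) - len(tails)


def is_eps_far(seq, eps):
    """True if more than an eps fraction of entries must be changed."""
    return distance_to_monotone(seq) > eps * len(seq)
```

For example, `distance_to_monotone([1, 3, 2, 4])` is 1, since changing the single entry 3 (or 2) suffices.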
Testing k-Monotonicity
A Boolean k-monotone function defined over a finite poset domain D alternates between the values 0 and 1 at most k times on any ascending chain in D. Therefore, k-monotone functions are natural generalizations of the classical monotone functions, which are the 1-monotone functions.
Motivated by the recent interest in k-monotone functions in the context of circuit complexity and learning theory, and by the central role that monotonicity testing plays in the context of property testing, we initiate a systematic study of k-monotone functions, in the property testing model. In this model, the goal is to distinguish functions that are k-monotone (or are close to being k-monotone) from functions that are far from being k-monotone.
Our results include the following:
1. We demonstrate a separation between testing k-monotonicity and testing monotonicity, on the hypercube domain {0,1}^d, for k >= 3;
2. We demonstrate a separation between testing and learning on {0,1}^d, for k = omega(log d): testing k-monotonicity can be performed with 2^{O(sqrt(d) * log d * log(1/eps))} queries, while learning k-monotone functions requires 2^{Omega(k * sqrt(d) * 1/eps)} queries (Blais et al. (RANDOM 2015)).
3. We present a tolerant test for functions f: [n]^d -> {0,1} with complexity independent of n, which makes progress on a problem left open by Berman et al. (STOC 2014).
Our techniques exploit the testing-by-learning paradigm, use novel applications of Fourier analysis on the grid [n]^d, and draw connections to distribution testing techniques.
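The definition of k-monotone functions at the start of this abstract can be checked by brute force on a small hypercube. The Python sketch below is an illustration and not a tester from the paper. It assumes the convention that a violating chain starts at a point of value 1, which is what makes the 1-monotone functions coincide with the monotone ones. The names `longest_alternating_chain` and `is_k_monotone` are hypothetical, and the running time is exponential in d.

```python
from itertools import product


def longest_alternating_chain(f, d):
    """Length of the longest ascending chain in {0,1}^d on which f
    alternates in value and whose first point has value 1.

    f must return 0 or 1. best[x][b] is the length of the longest such
    chain whose top point y satisfies y <= x and f(y) == b.
    """
    best = {}
    for x in sorted(product((0, 1), repeat=d), key=sum):
        below = [0, 0]
        # Every point below x lies below some point obtained from x by
        # clearing one bit, so those points carry all needed information.
        for i in range(d):
            if x[i] == 1:
                y = x[:i] + (0,) + x[i + 1:]
                below[0] = max(below[0], best[y][0])
                below[1] = max(below[1], best[y][1])
        v = f(x)
        if v == 1:
            own = below[0] + 1  # extend a chain ending in 0, or start one
        else:
            own = below[1] + 1 if below[1] > 0 else 0  # cannot start at 0
        cur = list(below)
        cur[v] = max(cur[v], own)
        best[x] = cur
    return max(best[(1,) * d])


def is_k_monotone(f, d, k):
    """True if no alternating chain starting at value 1 has more than k points."""
    return longest_alternating_chain(f, d) <= k
```

Under this convention a monotone function such as `lambda x: x[0]` is 1-monotone, while the anti-monotone `lambda x: 1 - x[0]` is 2-monotone but not 1-monotone.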
05291 Abstracts Collection -- Sublinear Algorithms
From 17.07.05 to 22.07.05, the Dagstuhl Seminar
05291 "Sublinear Algorithms" was held
in the International Conference and Research Center (IBFI),
Schloss Dagstuhl.
During the seminar, several participants presented their current
research, and ongoing work and open problems were discussed. Abstracts of
the presentations given during the seminar as well as abstracts of
seminar results and ideas are put together in this paper. The first section
describes the seminar topics and goals in general.
Links to extended abstracts or full papers are provided, if available.
Analyzing massive datasets with missing entries: models and algorithms
We initiate a systematic study of computational models to analyze algorithms for massive datasets with missing or erased entries and study the relationship of our models with existing algorithmic models for large datasets.
We focus on algorithms whose inputs are naturally represented as functions, codewords, or graphs. First, we generalize the property testing model, one of the most widely studied models of sublinear-time algorithms, to account for the presence of adversarially erased function values. We design efficient erasure-resilient property testing algorithms for several fundamental properties of real-valued functions such as monotonicity, Lipschitz property, convexity, and linearity.
We then investigate the problems of local decoding and local list decoding of codewords containing erasures. We show that, in some cases, these problems are strictly easier than the corresponding problems of decoding codewords containing errors. Moreover, we use this understanding to show a separation between our erasure-resilient property testing model and the (error) tolerant property testing model. The philosophical message of this separation is that errors occurring in large datasets are, in general, harder to deal with than erasures.
Finally, we develop models and notions to reason about algorithms that are intended to run on large graphs with missing edges. While running algorithms on large graphs containing several missing edges, it is desirable to output solutions that are close to the solutions output when there are no missing edges. With this motivation, we define average sensitivity, a robustness metric for graph algorithms. We discuss various useful features of our definition and design approximation algorithms with good average sensitivity bounds for several optimization problems on graphs. We also define a model of erasure-resilient sublinear-time graph algorithms and design an efficient algorithm for testing connectivity of graphs.
Testing Submodularity and Other Properties of Valuation Functions
We show that for any constant epsilon > 0 and p >= 1, it is possible to distinguish functions f: {0,1}^n -> [0,1] that are submodular from those that are epsilon-far from every submodular function in ell_p distance with a constant number of queries.
More generally, we extend the testing-by-implicit-learning framework of Diakonikolas et al. (2007) to show that every property of real-valued functions that is well-approximated in ell_2 distance by a class of k-juntas for some k = O(1) can be tested in the ell_p-testing model with a constant number of queries. This result, combined with a recent junta theorem of Feldman and Vondrák (2016), yields the constant-query testability of submodularity. It also yields constant-query testing algorithms for a variety of other natural properties of valuation functions, including fractionally additive (XOS) functions, OXS functions, unit demand functions, coverage functions, and self-bounding functions.
Mildly Exponential Lower Bounds on Tolerant Testers for Monotonicity, Unateness, and Juntas
We give the first super-polynomial (in fact, mildly exponential) lower bounds
for tolerant testing (equivalently, distance estimation) of monotonicity,
unateness, and juntas with a constant separation between the "yes" and "no"
cases. Specifically, we give
1. A […]-query lower bound for non-adaptive, two-sided tolerant
monotonicity testers and unateness testers when the "gap" parameter is
equal to […], for any […];
2. A […]-query lower bound for non-adaptive, two-sided tolerant junta
testers when the gap parameter is an absolute constant.
In the constant-gap regime no non-trivial prior lower bound was known for
monotonicity, the best prior lower bound known for unateness was […]
queries, and the best prior lower bound known for juntas was […] queries.
Comment: 20 pages, 1 figure