Pre-Reduction Graph Products: Hardnesses of Properly Learning DFAs and Approximating EDP on DAGs
The study of graph products is a major research topic and typically concerns
the term $f(G * H)$, e.g., to show that $f(G * H) = f(G) f(H)$. In this paper, we
study graph products in a non-standard form $f(R[G * H])$ where $R$ is a
"reduction", a transformation of any graph into an instance of an intended
optimization problem. We resolve some open problems as applications.
(1) A tight $n^{1-\epsilon}$-approximation hardness for the minimum
consistent deterministic finite automaton (DFA) problem, where $n$ is the
sample size (the consistency requirement is sketched after this abstract).
Due to Board and Pitt [Theoretical Computer Science 1992], this implies the
hardness of properly learning DFAs assuming $NP \neq RP$ (the weakest
possible assumption).
(2) A tight $n^{1/2-\epsilon}$ hardness for the edge-disjoint paths (EDP)
problem on directed acyclic graphs (DAGs), where $n$ denotes the number of
vertices.
(3) A tight hardness of packing vertex-disjoint $k$-cycles for large $k$.
(4) An alternative (and perhaps simpler) proof for the hardness of properly
learning DNF, CNF and intersection of halfspaces [Alekhnovich et al., FOCS 2004
and J. Comput. Syst. Sci. 2008].
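Result (1) concerns a simple-to-state search problem: given a sample of strings labeled accept/reject, find the smallest DFA consistent with every label. Below is a minimal sketch of the consistency predicate, with an illustrative textbook DFA encoding that is not taken from the paper:

```python
# Sketch: checking whether a candidate DFA is consistent with a labeled sample.
# The DFA encoding below is a common textbook one, not the paper's.

def accepts(delta, start, accepting, s):
    """Run the DFA (transition dict, start state, accepting set) on string s."""
    state = start
    for ch in s:
        state = delta[(state, ch)]
    return state in accepting

def consistent(delta, start, accepting, sample):
    """True iff the DFA accepts every positive and rejects every negative example."""
    return all(accepts(delta, start, accepting, s) == label
               for s, label in sample)

# Two-state DFA over {0,1} accepting strings with an even number of 1s.
delta = {(0, '0'): 0, (0, '1'): 1, (1, '0'): 1, (1, '1'): 0}
sample = [("", True), ("11", True), ("1", False), ("10", False)]
print(consistent(delta, 0, {0}, sample))  # True
```

The hardness result says that even approximating the size of the smallest consistent DFA within a factor $n^{1-\epsilon}$ is intractable.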
NP-hardness of circuit minimization for multi-output functions
Can we design efficient algorithms for finding fast algorithms? This question is captured by various circuit minimization problems, and algorithms for the corresponding tasks have significant practical applications. Following the work of Cook and Levin in the early 1970s, a central question is whether minimizing the circuit size of an explicitly given function is NP-complete. While this is known to hold in restricted models such as DNFs, making progress with respect to more expressive classes of circuits has been elusive.
In this work, we establish the first NP-hardness result for circuit minimization of total functions in the setting of general (unrestricted) Boolean circuits. More precisely, we show that computing the minimum circuit size of a given multi-output Boolean function $f : \{0,1\}^n \to \{0,1\}^m$ is NP-hard under many-one polynomial-time randomized reductions. Our argument builds on a simpler NP-hardness proof for the circuit minimization problem for (single-output) Boolean functions under an extended set of generators.
Complementing these results, we investigate the computational hardness of minimizing communication. We establish that several variants of this problem are NP-hard under deterministic reductions. In particular, unless P = NP, no polynomial-time computable function can approximate the deterministic two-party communication complexity of a partial Boolean function up to a polynomial. This has consequences for the class of structural results that one might hope to show about the communication complexity of partial functions.
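To make the minimized object concrete, here is a minimal sketch of a multi-output Boolean circuit as a straight-line program over an AND/OR/NOT basis, with circuit size being the gate count; the encoding and basis are illustrative, not the paper's:

```python
from itertools import product

# Sketch: a multi-output Boolean circuit as a straight-line program.
# Wires 0..n-1 are the inputs; each gate appends one new wire.

def evaluate(n_inputs, gates, outputs, x):
    """gates: list of ('AND', i, j), ('OR', i, j) or ('NOT', i); outputs: wire indices."""
    assert len(x) == n_inputs
    wires = list(x)
    for g in gates:
        if g[0] == 'AND':
            wires.append(wires[g[1]] & wires[g[2]])
        elif g[0] == 'OR':
            wires.append(wires[g[1]] | wires[g[2]])
        else:  # NOT
            wires.append(1 - wires[g[1]])
    return tuple(wires[o] for o in outputs)

# f(x0, x1) = (XOR(x0, x1), AND(x0, x1)): a half adder with four gates.
gates = [('AND', 0, 1),   # wire 2
         ('NOT', 2),      # wire 3
         ('OR', 0, 1),    # wire 4
         ('AND', 3, 4)]   # wire 5 = XOR(x0, x1)
for x in product((0, 1), repeat=2):
    print(x, evaluate(2, gates, outputs=(5, 2), x=x))
```

Circuit minimization asks for the straight-line program with the fewest gates whose outputs match a given truth table on every input.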
Tight Bounds on Proper Equivalence Query Learning of DNF
We prove a new structural lemma for partial Boolean functions $f$, which we
call the seed lemma for DNF. Using the lemma, we give the first subexponential
algorithm for proper learning of DNF in Angluin's Equivalence Query (EQ) model.
The algorithm has time and query complexity $2^{\tilde{O}(\sqrt{n})}$, which
is optimal. We also give a new result on certificates for DNF-size, a simple
algorithm for properly PAC-learning DNF, and new results on EQ-learning $\log n$-term DNF and decision trees.
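For readers unfamiliar with the EQ model: the learner repeatedly proposes a hypothesis, and the oracle either confirms equivalence or returns a counterexample. The sketch below runs that loop for a much simpler class (monotone conjunctions) with an exhaustive oracle; it illustrates only the query model, not the paper's subexponential DNF algorithm:

```python
from itertools import product

# Sketch: Angluin's Equivalence Query (EQ) protocol on a toy class,
# monotone conjunctions over n variables.

def eq_oracle(hypothesis, target, n):
    """Exhaustive oracle: return a counterexample, or None if equivalent."""
    for x in product((0, 1), repeat=n):
        if hypothesis(x) != target(x):
            return x
    return None

def learn_monotone_conjunction(target, n):
    S = set(range(n))  # start with the conjunction of all variables
    make_h = lambda S: (lambda x: all(x[i] for i in S))
    while (x := eq_oracle(make_h(S), target, n)) is not None:
        # Counterexamples are positive here, so drop variables set to 0 in x.
        S -= {i for i in S if x[i] == 0}
    return S

target = lambda x: x[0] and x[2]                # hidden conjunction x0 AND x2
print(learn_monotone_conjunction(target, n=4))  # {0, 2}
```

Since the hypothesis always conjoins a superset of the target's variables, every counterexample is positive, and the learner needs at most $n$ queries for this toy class.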
On the hardness of learning intersections of two halfspaces
We show that unless NP=RP, it is hard to (even) weakly PAC-learn intersection of two halfspaces in $\mathbb{R}^n$ using a hypothesis which is a function of up to ℓ halfspaces (linear threshold functions) for any integer ℓ. Specifically, we show that for every integer ℓ and an arbitrarily small constant ε>0, unless NP=RP, no polynomial time algorithm can distinguish whether there is an intersection of two halfspaces that correctly classifies a given set of labeled points in $\mathbb{R}^n$, or whether any function of ℓ halfspaces can correctly classify at most a 1/2+ε fraction of the points.
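The hypothesis class in question is easy to state: a point is labeled positive exactly when it satisfies both linear threshold constraints. A minimal sketch of that classifier and of the "fraction correctly classified" quantity from the statement (names and data are illustrative):

```python
import numpy as np

# Sketch: labeling points by an intersection of two halfspaces in R^n.
# A point x is positive iff w1.x >= b1 AND w2.x >= b2.

def intersection_classifier(w1, b1, w2, b2):
    return lambda x: (np.dot(w1, x) >= b1) and (np.dot(w2, x) >= b2)

def fraction_correct(clf, points, labels):
    return np.mean([clf(x) == y for x, y in zip(points, labels)])

rng = np.random.default_rng(0)
points = rng.normal(size=(100, 3))
w1, w2 = np.array([1.0, 0, 0]), np.array([0, 1.0, 0])
truth = intersection_classifier(w1, 0.0, w2, 0.0)
labels = [truth(x) for x in points]
print(fraction_correct(truth, points, labels))  # 1.0
```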
Harnessing the Power of Choices in Decision Tree Learning
We propose a simple generalization of standard and empirically successful
decision tree learning algorithms such as ID3, C4.5, and CART. These
algorithms, which have been central to machine learning for decades, are greedy
in nature: they grow a decision tree by iteratively splitting on the best
attribute. Our algorithm, Top-$k$, considers the $k$ best attributes as
possible splits instead of just the single best attribute. We demonstrate,
theoretically and empirically, the power of this simple generalization. We
first prove a greediness hierarchy theorem showing that for every $k$, Top-$(k+1)$ can be dramatically more powerful than Top-$k$: there
are data distributions for which the former achieves accuracy $1-\epsilon$,
whereas the latter only achieves accuracy $1/2+\epsilon$. We then
show, through extensive experiments, that Top-$k$ outperforms the two main
approaches to decision tree learning: classic greedy algorithms and more recent
"optimal decision tree" algorithms. On one hand, Top-$k$ consistently enjoys
significant accuracy gains over greedy algorithms across a wide range of
benchmarks. On the other hand, Top-$k$ is markedly more scalable than optimal
decision tree algorithms and is able to handle dataset and feature set sizes
that remain far beyond the reach of these algorithms.
Comment: NeurIPS 2023
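A minimal sketch of the split-selection step that distinguishes Top-$k$ from the classic greedy growers, using information gain on binary features; the function names and scoring details are illustrative rather than the authors' implementation:

```python
import numpy as np

# Sketch: Top-k split selection for binary features/labels. Greedy tree
# growers recurse on the single highest-scoring feature; Top-k instead
# branches over the k best candidates and keeps the best resulting tree.

def entropy(y):
    p = np.mean(y) if len(y) else 0.0
    return 0.0 if p in (0.0, 1.0) else -p*np.log2(p) - (1-p)*np.log2(1-p)

def gain(X, y, i):
    left, right = y[X[:, i] == 0], y[X[:, i] == 1]
    w = len(left) / len(y)
    return entropy(y) - w*entropy(left) - (1-w)*entropy(right)

def top_k_features(X, y, k):
    scores = [gain(X, y, i) for i in range(X.shape[1])]
    return np.argsort(scores)[::-1][:k]  # greedy ID3/C4.5/CART behavior is k = 1

X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 0], [1, 1, 0]])
y = np.array([0, 1, 0, 1])
print(top_k_features(X, y, k=2))  # feature 1 is the only informative split here
```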
PAC Quasi-automatizability of Resolution over Restricted Distributions
We consider principled alternatives to unsupervised learning in data mining
by situating the learning task in the context of the subsequent analysis task.
Specifically, we consider a query-answering (hypothesis-testing) task: In the
combined task, we decide whether an input query formula is satisfied over a
background distribution by using input examples directly, rather than invoking
a two-stage process in which (i) rules over the distribution are learned by an
unsupervised learning algorithm and (ii) a reasoning algorithm decides whether
or not the query formula follows from the learned rules. In a previous work
(2013), we observed that the learning task could satisfy numerous desirable
criteria in this combined context -- effectively matching what could be
achieved by agnostic learning of CNFs from partial information -- that are not
known to be achievable directly. In this work, we show that likewise, there are
reasoning tasks that are achievable in such a combined context that are not
known to be achievable directly (and indeed, have been seriously conjectured to
be impossible, cf. (Alekhnovich and Razborov, 2008)). Namely, we test for a
resolution proof of the query formula of a given size in quasipolynomial time
(that is, "quasi-automatizing" resolution). The learning setting we consider is
a partial-information, restricted-distribution setting that generalizes
learning parities over the uniform distribution from partial information,
another task that is known not to be achievable directly in various models (cf.
(Ben-David and Dichterman, 1998) and (Michael, 2010)).
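The proof system in question is resolution: from clauses $C \vee x$ and $D \vee \neg x$ one derives $C \vee D$, and a proof is a sequence of such derivations. A minimal sketch of the rule (the clause encoding is illustrative):

```python
# Sketch: the propositional resolution rule. Clauses are frozensets of
# nonzero ints, with v and -v the two literals of variable v.

def resolve(c1, c2, v):
    """Resolve c1 (containing v) with c2 (containing -v) on variable v."""
    assert v in c1 and -v in c2
    return (c1 - {v}) | (c2 - {-v})

# (x1 or x2) and (-x1 or x3) resolve on x1 to give (x2 or x3).
c1 = frozenset({1, 2})
c2 = frozenset({-1, 3})
print(sorted(resolve(c1, c2, 1)))  # [2, 3]

# A resolution proof of a query clause from a CNF is a sequence of such
# steps ending in (a subclause of) the query; its "size" is the number of
# clauses in the sequence, which is what the combined task bounds.
```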
Consistency-Checking Problems: A Gateway to Parameterized Sample Complexity
Recently, Brand, Ganian and Simonov introduced a parameterized refinement of
the classical PAC-learning sample complexity framework. A crucial outcome of
their investigation is that for a very wide range of learning problems, there
is a direct and provable correspondence between fixed-parameter
PAC-learnability (in the sample complexity setting) and the fixed-parameter
tractability of a corresponding "consistency checking" search problem (in the
setting of computational complexity). The latter can be seen as generalizations
of classical search problems where instead of receiving a single instance, one
receives multiple yes- and no-examples and is tasked with finding a solution
which is consistent with the provided examples.
Apart from a few initial results, consistency checking problems are almost
entirely unexplored from a parameterized complexity perspective. In this
article, we provide an overview of these problems and their connection to
parameterized sample complexity, with the primary aim of facilitating further
research in this direction. Afterwards, we establish the fixed-parameter
(in)tractability for some of the arguably most natural consistency checking
problems on graphs, and show that their complexity-theoretic behavior is
surprisingly different from that of classical decision problems. Our new
results cover consistency checking variants of problems as diverse as (k-)Path,
Matching, 2-Coloring, Independent Set and Dominating Set, among others.
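To make the consistency checking template concrete, here is a brute-force sketch for an Independent Set variant in which each example is a graph on a shared vertex set, labeled by whether the sought solution must be independent in it; this toy formalization is ours for illustration and is not necessarily the exact one studied in the article:

```python
from itertools import combinations

# Sketch: brute-force consistency checking search for Independent Set.
# We look for one size-k vertex set that is independent in every
# yes-example and NOT independent in any no-example.

def independent(S, edges):
    return not any(u in S and v in S for u, v in edges)

def consistent_solution(vertices, examples, k):
    for S in combinations(vertices, k):
        if all(independent(S, edges) == label for edges, label in examples):
            return set(S)
    return None

examples = [({(0, 1), (2, 3)}, True),   # yes: the solution must avoid these edges
            ({(0, 2)}, False)]          # no: the solution must contain edge (0, 2)
print(consistent_solution(range(4), examples, k=2))  # {0, 2}
```

Fixed-parameter tractability would replace the exhaustive loop over all size-$k$ subsets with an algorithm whose running time is $f(k)$ times a polynomial in the input size.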
A Strong Composition Theorem for Junta Complexity and the Boosting of Property Testers
We prove a strong composition theorem for junta complexity and show how such
theorems can be used to generically boost the performance of property testers.
The $\epsilon$-approximate junta complexity of a function $f$ is the
smallest integer $r$ such that $f$ is $\epsilon$-close to a function that
depends only on $r$ variables. A strong composition theorem states that if $f$
has large $\epsilon$-approximate junta complexity, then $g \circ f$ has even
larger $\epsilon'$-approximate junta complexity, even for $\epsilon' \gg \epsilon$. We develop a fairly complete understanding of this behavior,
proving that the junta complexity of $g \circ f$ is characterized by that of $f$
along with the multivariate noise sensitivity of $g$. For the important
case of symmetric functions $g$, we relate their multivariate noise sensitivity
to the simpler and well-studied case of univariate noise sensitivity.
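The definition above can be rendered directly as a (wildly inefficient) brute-force computation over truth tables; a minimal sketch with illustrative names:

```python
from itertools import combinations, product

# Sketch: brute-force epsilon-approximate junta complexity for a function
# given as a truth table over {0,1}^n. Exponential in n; illustration only.

def junta_complexity(f, n, eps):
    """Smallest r such that f is eps-close to some function of r variables."""
    points = list(product((0, 1), repeat=n))
    for r in range(n + 1):
        for vars_ in combinations(range(n), r):
            # Best r-junta on vars_: take the majority label on each
            # projection class; disagreements are the unavoidable errors.
            buckets = {}
            for x in points:
                buckets.setdefault(tuple(x[i] for i in vars_), []).append(f(x))
            errors = sum(min(b.count(0), b.count(1)) for b in buckets.values())
            if errors <= eps * len(points):
                return r
    return n

# f agrees with the dictator x0 everywhere except at the point (1,1,1).
f = lambda x: 0 if x == (1, 1, 1) else x[0]
print(junta_complexity(f, 3, eps=0.0))    # 3: f genuinely depends on every bit
print(junta_complexity(f, 3, eps=0.125))  # 1: f is 1/8-close to the dictator x0
```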
We then show how strong composition theorems yield boosting algorithms for
property testers: with a strong composition theorem for any class of functions,
a large-distance tester for that class is immediately upgraded into one for
small distances. Combining our contributions yields a booster for junta
testers, and with it new implications for junta testing. This is the first
boosting-type result in property testing, and we hope that the connection to
composition theorems adds compelling motivation to the study of both topics.
Comment: 44 pages, 1 figure, FOCS 2023