Multiplicative-Additive Proof Equivalence is Logspace-complete, via Binary Decision Trees
Given a logic presented in a sequent calculus, a natural question is that of
equivalence of proofs: to determine whether two given proofs are equated by any
denotational semantics, i.e., any categorical interpretation of the logic
compatible with its cut-elimination procedure. This notion can usually be
captured syntactically by a set of rule permutations.
Very generally, proofnets can be defined as combinatorial objects which
provide canonical representatives of equivalence classes of proofs. In
particular, the existence of proof nets for a logic provides a solution to the
equivalence problem of this logic. In certain fragments of linear logic, it is
possible to give a notion of proofnet with good computational properties,
making it a suitable representation of proofs for studying the cut-elimination
procedure, among other things.
It has recently been proved that there cannot be such a notion of proofnets
for the multiplicative (with units) fragment of linear logic, due to the
equivalence problem for this logic being Pspace-complete.
We investigate the multiplicative-additive (without unit) fragment of linear
logic and show it is closely related to binary decision trees: we build a
representation of proofs based on binary decision trees, reducing proof
equivalence to decision tree equivalence, and give a converse encoding of
binary decision trees as proofs. We get as our main result that the complexity
of the proof equivalence problem of the studied fragment is Logspace-complete.
Comment: arXiv admin note: text overlap with arXiv:1502.0199
Harnessing the Power of Choices in Decision Tree Learning
We propose a simple generalization of standard and empirically successful
decision tree learning algorithms such as ID3, C4.5, and CART. These
algorithms, which have been central to machine learning for decades, are greedy
in nature: they grow a decision tree by iteratively splitting on the best
attribute. Our algorithm, Top-k, considers the k best attributes as
possible splits instead of just the single best attribute. We demonstrate,
theoretically and empirically, the power of this simple generalization. We
first prove a greediness hierarchy theorem showing that for every k ∈ ℕ,
Top-(k+1) can be dramatically more powerful than Top-k: there
are data distributions for which the former achieves accuracy 1 - ε,
whereas the latter only achieves accuracy 1/2 + ε. We then
show, through extensive experiments, that Top-k outperforms the two main
approaches to decision tree learning: classic greedy algorithms and more recent
"optimal decision tree" algorithms. On one hand, Top-k consistently enjoys
significant accuracy gains over greedy algorithms across a wide range of
benchmarks. On the other hand, Top-k is markedly more scalable than optimal
decision tree algorithms and is able to handle dataset and feature set sizes
that remain far beyond the reach of these algorithms.
Comment: NeurIPS 202
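The idea of considering several top-ranked attributes at each node can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes binary features and labels, ranks attributes by information gain, and selects among the k candidate splits by training error. The names `top_k_tree` and `predict` are hypothetical, and the selection criterion is an assumption made for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gain(rows, labels, f):
    """Information gain of splitting on binary feature f."""
    left = [y for x, y in zip(rows, labels) if x[f] == 0]
    right = [y for x, y in zip(rows, labels) if x[f] == 1]
    n = len(labels)
    return (entropy(labels)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))


def top_k_tree(rows, labels, depth, k):
    """Return (tree, training_errors).

    A tree is either a label (leaf) or a tuple (feature, subtree0, subtree1).
    With k == 1 this reduces to an ordinary greedy learner.
    """
    leaf = Counter(labels).most_common(1)[0][0] if labels else 0
    leaf_err = sum(y != leaf for y in labels)
    if depth == 0 or leaf_err == 0:
        return leaf, leaf_err
    # Rank attributes by the greedy criterion, then try only the best k.
    ranked = sorted(range(len(rows[0])),
                    key=lambda f: gain(rows, labels, f), reverse=True)
    best = (leaf, leaf_err)
    for f in ranked[:k]:
        parts = []
        for v in (0, 1):
            sub = [(x, y) for x, y in zip(rows, labels) if x[f] == v]
            parts.append(top_k_tree([x for x, _ in sub],
                                    [y for _, y in sub], depth - 1, k))
        err = parts[0][1] + parts[1][1]
        if err < best[1]:  # assumption: choose among candidates by training error
            best = ((f, parts[0][0], parts[1][0]), err)
    return best


def predict(tree, x):
    """Follow the tree down to a leaf label."""
    while isinstance(tree, tuple):
        f, t0, t1 = tree
        tree = t1 if x[f] else t0
    return tree


# Label is x0 XOR x1; x2 is a decoy that looks best to a greedy learner.
rows = [(0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
labels = [0, 1, 1, 0]
greedy_tree, greedy_err = top_k_tree(rows, labels, depth=2, k=1)
top2_tree, top2_err = top_k_tree(rows, labels, depth=2, k=2)
```

On this toy dataset the greedy learner commits to the decoy attribute at the root, while allowing two candidates per node lets the search recover the XOR structure within the same depth budget.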
Properly Learning Decision Trees with Queries Is NP-Hard
We prove that it is NP-hard to properly PAC learn decision trees with
queries, resolving a longstanding open problem in learning theory (Bshouty
1993; Guijarro-Lavin-Raghavan 1999; Mehta-Raghavan 2002; Feldman 2016). While
there has been a long line of work, dating back to (Pitt-Valiant 1988),
establishing the hardness of properly learning decision trees from random
examples, the more challenging setting of query learners necessitates different
techniques and there were no previous lower bounds. En route to our main
result, we simplify and strengthen the best known lower bounds for a different
problem of Decision Tree Minimization (Zantema-Bodlaender 2000; Sieling 2003).
On a technical level, we introduce the notion of hardness distillation, which
we study for decision tree complexity but can be considered for any complexity
measure: for a function that requires large decision trees, we give a general
method for identifying a small set of inputs that is responsible for its
complexity. Our technique even rules out query learners that are allowed
constant error. This contrasts with existing lower bounds for the setting of
random examples which only hold for inverse-polynomial error.
Our result, taken together with a recent almost-polynomial time query
algorithm for properly learning decision trees under the uniform distribution
(Blanc-Lange-Qiao-Tan 2022), demonstrates the dramatic impact of distributional
assumptions on the problem.
Comment: 41 pages, 10 figures, FOCS 202
Truth Table Minimization of Computational Models
Complexity theory offers a variety of concise computational models for
computing boolean functions - branching programs, circuits, decision trees and
ordered binary decision diagrams to name a few. A natural question that arises
in this context with respect to any such model is this:
Given a function f:{0,1}^n \to {0,1}, can we compute the optimal complexity
of computing f in the computational model in question? (according to some
desirable measure).
A critical issue regarding this question is how exactly f is given, since a
more elaborate description of f allows the algorithm to use more computational
resources. Among the possible representations are black-box access to f (such
as in computational learning theory), a representation of f in the desired
computational model or a representation of f in some other model. One might
conjecture that if f is given as its complete truth table (i.e., a list of f's
values on each of its 2^n possible inputs), the most elaborate description
conceivable, then the optimal complexity in any computational model can be
computed efficiently, since the algorithm computing it can run in poly(2^n)
time. Several recent studies show
that this is far from the truth - some models have efficient and simple
algorithms that yield the desired result, others are believed to be hard, and
for some models this problem remains open.
In this thesis we will discuss the computational complexity of this question
regarding several common types of computational models. We shall present
several new hardness results and efficient algorithms, as well as new proofs
and extensions for known theorems, for variants of decision trees, formulas and
branching programs.
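For plain decision trees, the truth-table setting admits a simple dynamic program. The sketch below is a standard textbook-style recursion over restrictions of f, not necessarily an algorithm from the thesis. The function name `min_dt_size` and the convention that bit j of a table index holds the value of variable j are assumptions made for illustration. Because every subproblem is a restriction of f to a subcube, there are at most 3^n distinct subproblems, which is polynomial in the truth-table length 2^n.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def min_dt_size(table):
    """Minimum number of internal nodes of a decision tree computing f.

    `table` is a tuple of 0/1 values of length 2**n; entry i is f(x),
    where bit j of i is the value of variable j (an assumed convention).
    """
    if len(set(table)) == 1:
        return 0  # constant function: a single leaf suffices
    n = len(table).bit_length() - 1
    best = None
    for j in range(n):
        # Restrict f to x_j = 0 and x_j = 1; each half has n - 1 variables.
        lo = tuple(v for i, v in enumerate(table) if not (i >> j) & 1)
        hi = tuple(v for i, v in enumerate(table) if (i >> j) & 1)
        cost = 1 + min_dt_size(lo) + min_dt_size(hi)
        if best is None or cost < best:
            best = cost
    return best


xor2 = (0, 1, 1, 0)   # x0 XOR x1
and2 = (0, 0, 0, 1)   # x0 AND x1
proj = (0, 1, 0, 1)   # f(x0, x1) = x0
parity3 = tuple(bin(i).count("1") % 2 for i in range(8))
```

Memoizing on the restricted truth table itself means that identical subfunctions reached through different query orders are solved only once.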
Automated Segmentation of Large 3D Images of Nervous Systems Using a Higher-order Graphical Model
This thesis presents a new mathematical model for segmenting volume images. The model is an energy function defined on the state space of all possibilities to remove or preserve splitting faces from an initial over-segmentation of the 3D image into supervoxels. It decomposes into potential functions that are learned automatically from a small amount of empirical training data. The learning is based on features of the distribution of gray values in the volume image and on features of the geometry and topology of the supervoxel segmentation. To be able to extract these features from large 3D images that consist of several billion voxels, a new algorithm is presented that constructs a suitable representation of the geometry and topology of volume segmentations in a block-wise fashion, in log-linear runtime (in the number of voxels) and in parallel, using only a prescribed amount of memory. At the core of this thesis is the optimization problem of finding, for a learned energy function, a segmentation with minimal energy. This optimization problem is difficult because the energy function consists of 3rd and 4th order potential functions that are not submodular. For sufficiently small problems with 10,000 degrees of freedom, it can be solved to global optimality using Mixed Integer Linear Programming. For larger models with 10,000,000 degrees of freedom, an approximate optimizer is proposed and compared to state-of-the-art alternatives. Using these new techniques and a unified data structure for multi-variate data and functions, a complete processing chain for segmenting large volume images, from the restoration of the raw volume image to the visualization of the final segmentation, has been implemented in C++. 
Results are shown for an application in neuroscience, namely the segmentation of a part of the inner plexiform layer of rabbit retina in a volume image of 2048 x 1792 x 2048 voxels that was acquired by means of Serial Block Face Scanning Electron Microscopy (Denk and Horstmann, 2004) with a resolution of 22nm x 22nm x 30nm. The quality of the automated segmentation, as well as the improvement over a simpler model that does not take geometric context into account, is confirmed by a quantitative comparison with the gold standard.
Minimization of Decision Trees is Hard to Approximate
Electronic Colloquium on Computational Complexity, Report No. 54 (2002)
Decision trees are representations of discrete functions with widespread applications in, e.g., complexity theory and data mining and exploration. In these areas it is important to obtain decision trees of small size. The minimization problem for decision trees is known to be NP-hard. In this paper the problem is shown to be even hard to approximate up to any constant factor.