22 research outputs found

Hardness and inapproximability results for minimum verification set and minimum path decision tree problems

Author: Türker Uraz Cengiz
Yenigün Hüsnü
Publication venue: 'Sabanci University Information Center'
Publication date: 01/09/2012

Minimization of decision trees is a well-studied problem. In this work, we introduce two new problems related to minimization of decision trees: the minimum verification set (MinVS) problem and the minimum path decision tree (MinPathDT) problem. Decision tree problems ask the question "What is the unknown given object?". The MinVS problem, on the other hand, asks the question "Is the unknown object z?" for a given object z. Hence it is not an identification problem but a verification problem. The MinPathDT problem aims to construct a decision tree in which only the cost of the root-to-leaf path corresponding to a given object is minimized, whereas decision tree problems in general try to minimize the overall cost of the decision tree over all objects. MinVS and MinPathDT are therefore seemingly easier problems. However, in this work we prove that MinVS and MinPathDT are both NP-complete and cannot be approximated within a factor in o(lg n) unless P = NP.
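
As a rough illustration of the verification question, the sketch below assumes a formalization that the abstract does not spell out: each object is a row of test outcomes, and a verification set for z is a set of tests such that every other object differs from z on at least one chosen test. Under that assumption the task is a set-cover instance, and the code applies the standard greedy heuristic. It is not the authors' construction, and `greedy_verification_set` and the table layout are hypothetical.

```python
def greedy_verification_set(table, z):
    """Greedily pick tests that separate object z from every other object.

    table: dict mapping object name -> tuple of test outcomes (assumed format).
    Returns a list of test indices, or None if some object is
    indistinguishable from z on every test.
    """
    target = table[z]
    remaining = {name for name in table if name != z}
    chosen = []
    while remaining:
        best_test, best_hit = None, set()
        for t in range(len(target)):
            # Objects that test t tells apart from z.
            hit = {name for name in remaining if table[name][t] != target[t]}
            if len(hit) > len(best_hit):
                best_test, best_hit = t, hit
        if best_test is None:
            return None  # no test separates the remaining objects from z
        chosen.append(best_test)
        remaining -= best_hit
    return chosen


objects = {
    "a": (0, 0, 1),
    "b": (0, 1, 1),
    "c": (1, 0, 1),
    "d": (1, 1, 0),
}
print(greedy_verification_set(objects, "a"))  # [0, 1]
```

Greedy selection gives no better than a logarithmic guarantee in general, which is consistent with the o(lg n) inapproximability stated in the abstract.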

Sabanci University Research Database

Approximation Algorithms for Stochastic Boolean Function Evaluation and Stochastic Submodular Set Cover

Author: Deshpande Amol
Hellerstein Lisa
Kletenik Devorah
Publication date: 09/08/2013

Stochastic Boolean Function Evaluation is the problem of determining the value of a given Boolean function f on an unknown input x, when each bit x_i of x can only be determined by paying an associated cost c_i. The assumption is that x is drawn from a given product distribution, and the goal is to minimize the expected cost. This problem has been studied in Operations Research, where it is known as "sequential testing" of Boolean functions. It has also been studied in learning theory in the context of learning with attribute costs. We consider the general problem of developing approximation algorithms for Stochastic Boolean Function Evaluation. We give a 3-approximation algorithm for evaluating Boolean linear threshold formulas. We also present an approximation algorithm for evaluating CDNF formulas (and decision trees) achieving a factor of O(log kd), where k is the number of terms in the DNF formula, and d is the number of clauses in the CNF formula. In addition, we present approximation algorithms for simultaneous evaluation of linear threshold functions, and for ranking of linear functions. Our function evaluation algorithms are based on reductions to the Stochastic Submodular Set Cover (SSSC) problem. This problem was introduced by Golovin and Krause. They presented an approximation algorithm for the problem, called Adaptive Greedy. Our main technical contribution is a new approximation algorithm for the SSSC problem, which we call Adaptive Dual Greedy. It is an extension of the Dual Greedy algorithm for Submodular Set Cover due to Fujito, which is a generalization of Hochbaum's algorithm for the classical Set Cover Problem. We also give a new bound on the approximation achieved by the Adaptive Greedy algorithm of Golovin and Krause.
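
For intuition about the cost model, consider the simplest special case: f is the AND of independent bits, so testing can stop at the first bit found to be 0. The sketch below computes the expected cost of a fixed test order and checks, by brute force on a small made-up instance, the classical rule of testing in increasing order of c_i / (1 - p_i). This is not the approximation algorithm of the abstract; the function names and the numbers are illustrative assumptions.

```python
from itertools import permutations


def expected_cost(order, cost, p_true):
    """Expected cost of evaluating AND by testing bits in the given order.

    Bits are independent and bit i equals 1 with probability p_true[i].
    Testing stops at the first bit found to be 0.
    """
    total, prob_reached = 0.0, 1.0
    for i in order:
        total += prob_reached * cost[i]  # pay c_i only if bit i is reached
        prob_reached *= p_true[i]        # continue only if bit i was 1
    return total


def ratio_order(cost, p_true):
    """Order bits by cost per unit probability of stopping (needs p_true[i] < 1)."""
    return sorted(range(len(cost)), key=lambda i: cost[i] / (1.0 - p_true[i]))


cost = [4.0, 1.0, 2.0]
p_true = [0.5, 0.9, 0.2]
best = min(permutations(range(3)), key=lambda o: expected_cost(o, cost, p_true))
print(ratio_order(cost, p_true), list(best))  # [2, 0, 1] [2, 0, 1]
```

On this instance the ratio order and the brute-force optimum coincide, with expected cost 2.9.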

arXiv.org e-Print Archive

CiteSeerX

Crossref

Efficient Algorithms for Battleship

Author: da Fonseca Guilherme D.
Gerard Yan
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 10th International Conference on Fun with Algorithms (FUN 2021)
Publication date: 01/01/2020

We consider an algorithmic problem inspired by the Battleship game. In the variant of the problem that we investigate, there is a unique ship of shape $S \subset \mathbb{Z}^2$ which has been translated in the lattice $\mathbb{Z}^2$. We assume that a player has already hit the ship with a first shot and the goal is to sink the ship using as few shots as possible, that is, by minimizing the number of missed shots. While the player knows the shape $S$, which position of $S$ has been hit is not known. Given a shape $S$ of $n$ lattice points, the minimum number of misses that can be achieved in the worst case by any algorithm is called the Battleship complexity of the shape $S$ and denoted $c(S)$. We prove three bounds on $c(S)$, each considering a different class of shapes. First, we have $c(S) \leq n-1$ for arbitrary shapes and the bound is tight for parallelogram-free shapes. Second, we provide an algorithm that shows that $c(S) = O(\log n)$ if $S$ is an HV-convex polyomino. Third, we provide an algorithm that shows that $c(S) = O(\log \log n)$ if $S$ is a digital convex set. This last result is obtained through a novel discrete version of the Blaschke-Lebesgue inequality relating the area and the width of any convex body.

Comment: Conference version at 10th International Conference on Fun with Algorithms (FUN 2020)

arXiv.org e-Print Archive

HAL AMU

HAL Clermont Université

Dagstuhl Research Online Publication Server

On the Complexity of Searching in Trees: Average-case Minimization

Author: A. Garsia
A. Schäffer
D. Dereniowski
D. Knuth
E. Arkin
E. Laber
L. Hyafil
M. Adler
M. Garey
M. Lipman
O. Ibarra
P. Torre de la
R. Carmo
R. Kosaraju
Y. Ben-Asher
Publication date: 01/01/2009

We focus on the average-case analysis: A function w : V -> Z+ is given which defines the likelihood for a node to be the one marked, and we want the strategy that minimizes the expected number of queries. Prior to this paper, very little was known about this natural question and the complexity of the problem had remained so far an open question. We close this question and prove that the above tree search problem is NP-complete even for the class of trees with diameter at most 4. This results in a complete characterization of the complexity of the problem with respect to the diameter size. In fact, for diameter not larger than 3 the problem can be shown to be polynomially solvable using a dynamic programming approach. In addition we prove that the problem is NP-complete even for the class of trees of maximum degree at most 16. To the best of our knowledge, the only known result in this direction is that the tree search problem is solvable in O(|V| log |V|) time for trees with degree at most 2 (paths). We match the above complexity results with a tight algorithmic analysis. We first show that a natural greedy algorithm attains a 2-approximation. Furthermore, for the bounded degree instances, we show that any optimal strategy (i.e., one that minimizes the expected number of queries) performs at most O(\Delta(T) (log |V| + log w(T))) queries in the worst case, where w(T) is the sum of the likelihoods of the nodes of T and \Delta(T) is the maximum degree of T. We combine this result with a non-trivial exponential time algorithm to provide an FPTAS for trees with bounded degree.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Catalogo dei prodotti della ricerca

Publications at Bielefeld University

Archivio della Ricerca - Università di Salerno

Harnessing the Power of Choices in Decision Tree Learning

Author: Blanc Guy
Lange Jane
Pabbaraju Chirag
Sullivan Colin
Tan Li-Yang
Tiwari Mo
Publication date: 25/10/2023

We propose a simple generalization of standard and empirically successful decision tree learning algorithms such as ID3, C4.5, and CART. These algorithms, which have been central to machine learning for decades, are greedy in nature: they grow a decision tree by iteratively splitting on the best attribute. Our algorithm, Top-$k$, considers the $k$ best attributes as possible splits instead of just the single best attribute. We demonstrate, theoretically and empirically, the power of this simple generalization. We first prove a greediness hierarchy theorem showing that for every $k \in \mathbb{N}$, Top-$(k+1)$ can be dramatically more powerful than Top-$k$: there are data distributions for which the former achieves accuracy $1-\varepsilon$, whereas the latter only achieves accuracy $\frac{1}{2}+\varepsilon$. We then show, through extensive experiments, that Top-$k$ outperforms the two main approaches to decision tree learning: classic greedy algorithms and more recent "optimal decision tree" algorithms. On one hand, Top-$k$ consistently enjoys significant accuracy gains over greedy algorithms across a wide range of benchmarks. On the other hand, Top-$k$ is markedly more scalable than optimal decision tree algorithms and is able to handle dataset and feature set sizes that remain far beyond the reach of these algorithms.

Comment: NeurIPS 202
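
The idea of branching over the k best-scoring attributes can be sketched on a toy XOR-style dataset. The code below is a simplified illustration under stated assumptions, not the authors' implementation: features are binary, the greedy score is the misclassification count after one split (the algorithms named in the abstract use impurity measures instead), and the function returns only the number of training errors rather than the tree. All names and the dataset are hypothetical.

```python
from collections import Counter


def majority_errors(labels):
    """Training errors when predicting the most common label."""
    return len(labels) - max(Counter(labels).values()) if labels else 0


def split(rows, labels, feature):
    """Partition a dataset by the value (0 or 1) of one binary feature."""
    parts = ([], []), ([], [])
    for row, label in zip(rows, labels):
        parts[row[feature]][0].append(row)
        parts[row[feature]][1].append(label)
    return parts


def top_k_errors(rows, labels, depth, k):
    """Fewest training errors found by a depth-limited Top-k search."""
    base = majority_errors(labels)
    if depth == 0 or base == 0:
        return base

    def greedy_score(feature):
        (_, left), (_, right) = split(rows, labels, feature)
        return majority_errors(left) + majority_errors(right)

    # Keep only the k attributes that look best to the greedy criterion.
    candidates = sorted(range(len(rows[0])), key=greedy_score)[:k]
    best = base
    for feature in candidates:
        (lrows, llabels), (rrows, rlabels) = split(rows, labels, feature)
        errors = (top_k_errors(lrows, llabels, depth - 1, k)
                  + top_k_errors(rrows, rlabels, depth - 1, k))
        best = min(best, errors)
    return best


# Label is x0 XOR x1; x2 is a distractor that agrees with the label on 6 of 8 rows.
rows = [(0, 0, 0), (0, 0, 0), (0, 1, 1), (0, 1, 0),
        (1, 0, 1), (1, 0, 1), (1, 1, 0), (1, 1, 1)]
labels = [0, 0, 1, 1, 1, 1, 0, 0]
print(top_k_errors(rows, labels, 2, 1), top_k_errors(rows, labels, 2, 2))  # 2 0
```

With k = 1 the search commits to the distractor at the root and ends with 2 errors at depth 2; with k = 2 it also tries x0 at the root and fits the data exactly.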

arXiv.org e-Print Archive

On the Huffman and Alphabetic Tree Problem with General Cost Functions

Author: Fujiwara Hiroshi
Jacobs Tobias
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

We address generalized versions of the Huffman and Alphabetic Tree Problem where the cost caused by each individual leaf i, instead of being linear, depends on its depth in the tree by an arbitrary function. The objective is to minimize either the total cost or the maximum cost among all leaves. We review and extend the known results in this direction and devise a number of new algorithms and hardness proofs. It turns out that the Dynamic Programming approach for the Alphabetic Tree Problem can be extended to arbitrary cost functions, resulting in an optimal algorithm that runs in O(n^4) time and O(n^3) space. We identify classes of cost functions where the well-known trick to reduce the runtime by a factor of n via a "monotonicity" property can be applied. For the generalized Huffman Tree Problem we show that even the k-ary version can be solved by a generalized version of the Coin Collector Algorithm of Larmore and Hirschberg (in Proc. SODA'90, pp. 310-318, 1990) when the cost functions are nondecreasing and convex. Furthermore, we give an O(n^2 log n) algorithm for the worst case minimization variants of both the Huffman and Alphabetic Tree Problem with nondecreasing cost functions. Investigating the limits of computational tractability, we show that the Huffman Tree Problem in its full generality is inapproximable unless P = NP, regardless of whether the objective function is the sum of leaf costs or their maximum. The alphabetic version becomes NP-hard when the leaf costs are interdependent.

Published in Algorithmica 69(3): 582-604 (2014).
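
As a baseline for the generalized problems, the sketch below builds a classical binary Huffman tree (the linear-cost case) and then evaluates the total cost of the resulting leaf depths under any caller-supplied per-leaf cost function. It does not implement the algorithms of the abstract, and Huffman merging is only guaranteed optimal for the linear cost; `huffman_depths` and `total_cost` are hypothetical names.

```python
import heapq
from itertools import count


def huffman_depths(weights):
    """Return the depth of each leaf in a classical binary Huffman tree."""
    if len(weights) == 1:
        return [0]
    tie = count()  # tie-breaker so heap entries never compare leaf lists
    heap = [(w, next(tie), [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    depth = [0] * len(weights)
    while len(heap) > 1:
        w1, _, leaves1 = heapq.heappop(heap)
        w2, _, leaves2 = heapq.heappop(heap)
        merged = leaves1 + leaves2
        for leaf in merged:
            depth[leaf] += 1  # every leaf under the new node moves one level down
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return depth


def total_cost(weights, depths, f):
    """Sum of f(weight, depth) over all leaves; f is a caller-supplied cost."""
    return sum(f(w, d) for w, d in zip(weights, depths))


weights = [1, 1, 2, 3, 5]
depths = huffman_depths(weights)
print(depths, total_cost(weights, depths, lambda w, d: w * d))  # [4, 4, 3, 2, 1] 25
```

Passing a different `f`, for example `lambda w, d: w * 2 ** d`, evaluates the same tree under a nonlinear cost, for which this tree need not be optimal.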

Shinshu University Institutional Repository