45 research outputs found

    Greedy Algorithm for Set Cover in Context of Knowledge Discovery Problems

    Abstract. In this paper, some problems connected with the process of knowledge discovery are considered. These problems are reduced to the set cover problem. It is known that, under a plausible assumption on the class NP, the greedy algorithm is close to the best approximate polynomial algorithms for solving the set cover problem. Unfortunately, the performance ratio of this algorithm grows almost as the natural logarithm of the cardinality of the covered set. Instead of the usual greedy algorithm, we consider a greedy algorithm with threshold. This algorithm constructs a partial cover, which covers at least a fixed part (for example, 90%) of the set. We prove that the cardinality of the constructed partial cover is bounded from above by a linear function of Cmin, the minimal cardinality of an exact cover. In the case of a 90%-cover, for example, we can take the function 2.31 · Cmin + 1. This bound is independent of the cardinality of the covered set. Notice that the concept of a partial cover in the context of knowledge discovery problems is very close to the concept of an approximate reduct.
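The idea of a greedy algorithm with threshold can be illustrated with a minimal sketch. The abstract does not give the paper's exact procedure, so the stopping rule below (stop as soon as at least the requested fraction of the set is covered) is an assumption, and the names `greedy_partial_cover`, `universe`, `subsets`, and `fraction` are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Set


def greedy_partial_cover(
    universe: Iterable[Hashable],
    subsets: Dict[str, Set[Hashable]],
    fraction: float = 0.9,
) -> List[str]:
    """Greedily pick subsets until at least `fraction` of the universe is covered.

    Assumed stopping rule: the loop ends once the number of uncovered
    elements is at most (1 - fraction) * |universe|.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")

    uncovered = set(universe)
    # Largest number of elements allowed to stay uncovered
    # (tiny tolerance guards against floating-point rounding).
    allowed_uncovered = (1.0 - fraction) * len(uncovered) + 1e-9

    chosen: List[str] = []
    while len(uncovered) > allowed_uncovered:
        # Greedy step: take the subset covering the most uncovered elements.
        best = max(subsets, key=lambda name: len(subsets[name] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("the given subsets cannot reach the requested coverage")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


# Illustrative data (not from the paper).
example_universe = range(1, 11)
example_subsets = {
    "A": {1, 2, 3, 4, 5},
    "B": {6, 7, 8, 9},
    "C": {10},
    "D": {1, 6},
}
partial = greedy_partial_cover(example_universe, example_subsets, fraction=0.9)
exact = greedy_partial_cover(example_universe, example_subsets, fraction=1.0)
```

On this toy input the 90% cover stops one subset earlier than the exact cover, which is the kind of saving the partial-cover bound is about.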

    Bounds on Depth of Decision Trees Derived from Decision Rule Systems

    Systems of decision rules and decision trees are widely used as a means for knowledge representation, as classifiers, and as algorithms. They are among the most interpretable models for classifying and representing knowledge. The study of relationships between these two models is an important task of computer science. It is easy to transform a decision tree into a decision rule system. The inverse transformation is a more difficult task. In this paper, we study unimprovable upper and lower bounds on the minimum depth of decision trees derived from decision rule systems depending on the various parameters of these systems
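The easy direction mentioned above, from a decision tree to a decision rule system, can be sketched as follows: each root-to-leaf path becomes one rule. The `Leaf` and `Node` classes and the function `tree_to_rules` are hypothetical names chosen for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    decision: str  # decision attached to a terminal node


@dataclass
class Node:
    attribute: str               # attribute queried at this node
    children: Dict[int, "Tree"]  # attribute value -> subtree


Tree = Union[Leaf, Node]
Condition = Tuple[str, int]             # (attribute, value)
Rule = Tuple[List[Condition], str]      # (conditions, decision)


def tree_to_rules(tree: Tree, path: Tuple[Condition, ...] = ()) -> List[Rule]:
    """Return one rule per root-to-leaf path of the tree."""
    if isinstance(tree, Leaf):
        return [(list(path), tree.decision)]
    rules: List[Rule] = []
    for value, child in tree.children.items():
        # Extend the current path with the condition "attribute = value".
        rules.extend(tree_to_rules(child, path + ((tree.attribute, value),)))
    return rules


# Illustrative tree: decide "yes" only when a = 1 and b = 1.
example_tree = Node("a", {0: Leaf("no"), 1: Node("b", {0: Leaf("no"), 1: Leaf("yes")})})
example_rules = tree_to_rules(example_tree)
```

The inverse direction studied in the paper has no comparably short construction, which is why bounds on the resulting tree depth are of interest.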

    Greedy Algorithm for Inference of Decision Trees from Decision Rule Systems

    Decision trees and decision rule systems play important roles as classifiers, knowledge representation tools, and algorithms. They are easily interpretable models for data analysis, making them widely used and studied in computer science. Understanding the relationships between these two models is an important task in this field. There are well-known methods for converting decision trees into systems of decision rules. In this paper, we consider the inverse transformation problem, which is not so simple. Instead of constructing an entire decision tree, our study focuses on a greedy polynomial time algorithm that simulates the operation of a decision tree on a given tuple of attribute values. Comment: arXiv admin note: substantial text overlap with arXiv:2305.01721, arXiv:2302.0706

    Comparative Analysis of Deterministic and Nondeterministic Decision Trees for Decision Tables from Closed Classes

    In this paper, we consider classes of decision tables with many-valued decisions closed under operations of removal of columns, changing of decisions, permutation of columns, and duplication of columns. We study relationships among three parameters of these tables: the complexity of a decision table (if we consider the depth of decision trees, then the complexity of a decision table is the number of columns in it), the minimum complexity of a deterministic decision tree, and the minimum complexity of a nondeterministic decision tree. We consider a rough classification of functions characterizing these relationships and enumerate all seven possible types of relationships.

    A Local Approach to Studying the Time and Space Complexity of Deterministic and Nondeterministic Decision Trees

    In this paper, we study arbitrary infinite binary information systems, each of which consists of an infinite set called the universe and an infinite set of two-valued functions (attributes) defined on the universe. We consider the notion of a problem over an information system, which is described by a finite number of attributes and a mapping associating a decision to each tuple of attribute values. As algorithms for problem solving, we investigate deterministic and nondeterministic decision trees that use only attributes from the problem description. Nondeterministic decision trees are representations of decision rule systems that sometimes have less space complexity than the original rule systems. As time and space complexity, we study the depth and the number of nodes in the decision trees. In the worst case, with the growth of the number of attributes in the problem description, (i) the minimum depth of deterministic decision trees grows either as a logarithm or linearly, (ii) the minimum depth of nondeterministic decision trees either is bounded from above by a constant or grows linearly, (iii) the minimum number of nodes in deterministic decision trees has either polynomial or exponential growth, and (iv) the minimum number of nodes in nondeterministic decision trees has either polynomial or exponential growth. Based on these results, we divide the set of all infinite binary information systems into three complexity classes. This allows us to identify nontrivial relationships between deterministic decision trees and decision rule systems represented by nondeterministic decision trees. For each class, we study issues related to the time-space trade-off for deterministic and nondeterministic decision trees. Comment: arXiv admin note: substantial text overlap with arXiv:2201.0101

    Learning Decision Rules from Sets of Decision Trees

    This paper is devoted to the study of the problems of learning inner and general decision rules that are true for the maximum number of decision trees from a given set. Inner rules correspond to paths in decision trees from the root to terminal nodes. General rules are arbitrary rules that use attributes from the considered decision trees. We propose a polynomial time algorithm for the optimization of inner rules, show that the problem of optimization of general rules is NP-hard, and describe a heuristic for this problem. We compare the considered algorithm and heuristic experimentally on artificially generated datasets and on decision trees induced from them with the Gini index as a splitting criterion.

    Critical properties and complexity measures of read-once Boolean functions

    In this paper, we define a quasi-order on the set of read-once Boolean functions and show that this is a well-quasi-order. This implies that every parameter measuring complexity of the functions can be characterized by a finite set of minimal subclasses of read-once functions, where this parameter is unbounded. We focus on two parameters related to certificate complexity and characterize each of them in the terminology of minimal classes

    On Testing Membership to Maximal Consistent Extensions of Information Systems

    Abstract. This paper provides a new algorithm for testing membership to maximal consistent extensions of information systems. A maximal consistent extension of a given information system includes all objects corresponding to known attribute values which are consistent with all true and realizable rules extracted from the original information system. The algorithm presented here does not involve computing any rules, and has polynomial time complexity. This algorithm is based on a simpler criterion for membership testing than the algorithm described i

    WQO is decidable for factorial languages

    A language is factorial if it is closed under taking factors, i.e. contiguous subwords. Every factorial language can be described by an antidictionary, i.e. a minimal set of forbidden factors. We show that the problem of deciding whether a factorial language given by a finite antidictionary is well-quasi-ordered under the factor containment relation can be solved in polynomial time. We also discuss possible ways to extend our solution to permutations and graphs
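To make the antidictionary description concrete, here is a minimal sketch of membership testing: a word belongs to the factorial language exactly when it contains none of the forbidden factors. This only illustrates the definition and is not the paper's decision procedure for well-quasi-ordering. The function names and the example antidictionary are hypothetical, and `is_antichain` checks only a necessary condition for minimality (no forbidden word is a proper factor of another).

```python
from typing import Iterable


def in_factorial_language(word: str, antidictionary: Iterable[str]) -> bool:
    """True if `word` avoids every forbidden factor (contiguous subword)."""
    return not any(forbidden in word for forbidden in antidictionary)


def is_antichain(antidictionary: Iterable[str]) -> bool:
    """True if no forbidden word occurs as a proper factor of another one."""
    words = list(antidictionary)
    return not any(u != v and u in v for u in words for v in words)


# Illustrative antidictionary over the alphabet {a, b} (an assumption).
forbidden_factors = {"aa", "bab"}

member = in_factorial_language("abba", forbidden_factors)       # avoids "aa" and "bab"
non_member = in_factorial_language("abab", forbidden_factors)   # contains "bab"
```

Because the test uses only substring search, every factor of an accepted word is accepted as well, which is the closure property that makes the language factorial.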

    Decision rules derived from optimal decision trees with hypotheses

    Conventional decision trees use queries each of which is based on one attribute. In this study, we also examine decision trees that handle additional queries based on hypotheses. This kind of query is similar to the equivalence queries considered in exact learning. Earlier, we designed dynamic programming algorithms for the computation of the minimum depth and the minimum number of internal nodes in decision trees that have hypotheses. Modification of these algorithms considered in the present paper permits us to build decision trees with hypotheses that are optimal relative to the depth or relative to the number of the internal nodes. We compare the length and coverage of decision rules extracted from optimal decision trees with hypotheses and decision rules extracted from optimal conventional decision trees to choose the ones that are preferable as a tool for the representation of information. To this end, we conduct computer experiments on various decision tables from the UCI Machine Learning Repository. In addition, we also consider decision tables for randomly generated Boolean functions. The collected results show that the decision rules derived from decision trees with hypotheses in many cases are better than the rules extracted from conventional decision trees