    On the parity complexity measures of Boolean functions

    The parity decision tree model extends the decision tree model by allowing the computation of a parity function in one step. We prove that the deterministic parity decision tree complexity of any Boolean function is polynomially related to the non-deterministic complexity of the function or its complement. We also show that they are polynomially related to an analogue of the block sensitivity. We further study parity decision trees in their relations with an intermediate variant of the decision trees, as well as with communication complexity. Comment: submitted to TCS on 16-MAR-200
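
To make the model concrete, here is a minimal sketch of a parity decision tree in which each internal node queries the XOR of a subset of input bits in a single step. The class and function names (`Leaf`, `ParityNode`, `evaluate`, `depth`) are illustrative assumptions and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Leaf:
    value: int  # output of the tree, 0 or 1


@dataclass(frozen=True)
class ParityNode:
    subset: Tuple[int, ...]  # indices whose XOR is queried in one step
    if_zero: "Tree"          # subtree followed when the parity is 0
    if_one: "Tree"           # subtree followed when the parity is 1


Tree = Union[Leaf, ParityNode]


def evaluate(tree, x):
    """Follow one root-to-leaf path, answering each parity query on input x."""
    while isinstance(tree, ParityNode):
        parity = sum(x[i] for i in tree.subset) % 2
        tree = tree.if_one if parity else tree.if_zero
    return tree.value


def depth(tree):
    """Worst-case number of parity queries made by the tree."""
    if isinstance(tree, Leaf):
        return 0
    return 1 + max(depth(tree.if_zero), depth(tree.if_one))


# x0 XOR x1 XOR x2 with a single parity query.
XOR3 = ParityNode((0, 1, 2), Leaf(0), Leaf(1))

# (x0 XOR x1) AND x2 with two queries; a singleton subset is an ordinary bit query.
AND_OF_XOR = ParityNode((0, 1), Leaf(0), ParityNode((2,), Leaf(0), Leaf(1)))
```

Here `XOR3` has depth 1, whereas an ordinary decision tree needs depth 3 to compute the parity of three bits.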

    Learning using Local Membership Queries

    We introduce a new model of membership query (MQ) learning, where the learning algorithm is restricted to query points that are \emph{close} to random examples drawn from the underlying distribution. The learning model is intermediate between the PAC model (Valiant, 1984) and the PAC+MQ model (where the queries are allowed to be arbitrary points). Membership query algorithms are not popular among machine learning practitioners. Apart from the obvious difficulty of adaptively querying labelers, it has also been observed that querying \emph{unnatural} points leads to increased noise from human labelers (Lang and Baum, 1992). This motivates our study of learning algorithms that make queries that are close to examples generated from the data distribution. We restrict our attention to functions defined on the $n$-dimensional Boolean hypercube and say that a membership query is local if its Hamming distance from some example in the (random) training data is at most $O(\log(n))$. We show the following results in this model: (i) The class of sparse polynomials (with coefficients in R) over $\{0,1\}^n$ is polynomial time learnable under a large class of \emph{locally smooth} distributions using $O(\log(n))$-local queries. This class also includes the class of $O(\log(n))$-depth decision trees. (ii) The class of polynomial-sized decision trees is polynomial time learnable under product distributions using $O(\log(n))$-local queries. (iii) The class of polynomial size DNF formulas is learnable under the uniform distribution using $O(\log(n))$-local queries in time $n^{O(\log(\log(n)))}$. (iv) In addition we prove a number of results relating the proposed model to the traditional PAC model and the PAC+MQ model.
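
The locality condition on queries can be sketched as a simple Hamming-distance check. The helper names and the constant `c` hidden in the O(log n) radius are assumptions made for illustration; the abstract does not fix them.

```python
import math


def hamming(a, b):
    """Number of coordinates in which two equal-length bit tuples differ."""
    return sum(u != v for u, v in zip(a, b))


def log_radius(n, c=1.0):
    """Hypothetical concrete radius c * log2(n) standing in for O(log n)."""
    return math.floor(c * math.log2(n))


def is_local(query, examples, radius):
    """A query is local if it lies within `radius` of some training example."""
    return any(hamming(query, ex) <= radius for ex in examples)


n = 8
examples = [(0,) * n]            # a single toy training example
radius = log_radius(n)           # 3 when n = 8 and c = 1
near = (1, 1, 1, 0, 0, 0, 0, 0)  # distance 3 from the example
far = (1, 1, 1, 1, 0, 0, 0, 0)   # distance 4 from the example
```

With these toy values, `near` is an admissible local query and `far` is not.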

    Short-Range Interactions and Decision Tree-Based Protein Contact Map Predictor

    In this paper, we focus on protein contact map prediction, one of the most important intermediate steps of the protein folding problem. The objective of this research is to determine how short-range interactions can help a system based on decision trees learn about the correlation among the covalent structures of a protein's residues. We propose a solution to predict protein contact maps that combines the use of decision trees with a new input codification for short-range interactions. The method's performance was very satisfactory, improving the accuracy compared with using all the information in the protein sequence. For a globulin data set the method can predict contacts with a maximal accuracy of 43%. The presented predictive model illustrates that short-range interactions play the predominant role in determining protein structure.

    Fourier Growth of Parity Decision Trees

    We prove that for every parity decision tree of depth $d$ on $n$ variables, the sum of absolute values of Fourier coefficients at level $\ell$ is at most $d^{\ell/2} \cdot O(\ell \cdot \log(n))^{\ell}$. Our result is nearly tight for small values of $\ell$ and extends a previous Fourier bound for standard decision trees by Sherstov, Storozhenko, and Wu (STOC, 2021). As an application of our Fourier bounds, using the results of Bansal and Sinha (STOC, 2021), we show that the $k$-fold Forrelation problem has (randomized) parity decision tree complexity $\tilde{\Omega}(n^{1-1/k})$, while having quantum query complexity $\lceil k/2 \rceil$. Our proof follows a random-walk approach, analyzing the contribution of a random path in the decision tree to the level-$\ell$ Fourier expression. To carry the argument, we apply a careful cleanup procedure to the parity decision tree, ensuring that the value of the random walk is bounded with high probability. We observe that step sizes for the level-$\ell$ walks can be computed by the intermediate values of level $\le \ell-1$ walks, which calls for an inductive argument. Our approach differs from previous proofs of Tal (FOCS, 2020) and Sherstov, Storozhenko, and Wu (STOC, 2021) that relied on decompositions of the tree. In particular, for the special case of standard decision trees we view our proof as slightly simpler and more intuitive. In addition, we prove a similar bound for noisy decision trees of cost at most $d$, a model that was recently introduced by Ben-David and Blais (FOCS, 2020).
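
The quantity being bounded, the sum of absolute values of the Fourier coefficients at a given level, can be computed by brute force for small functions. The sketch below assumes the convention f: {0,1}^n → {-1,1} with characters χ_S(x) = (-1)^{Σ_{i∈S} x_i}; the function names are illustrative, and the enumeration is exponential in n, so it only serves to illustrate the definition.

```python
from fractions import Fraction
from itertools import combinations, product


def fourier_coefficient(f, n, subset):
    """Exact coefficient E_x[f(x) * chi_S(x)] over uniform x in {0,1}^n."""
    total = 0
    for x in product((0, 1), repeat=n):
        chi = -1 if sum(x[i] for i in subset) % 2 else 1
        total += f(x) * chi
    return Fraction(total, 2 ** n)


def level_l1_mass(f, n, level):
    """Sum of |f_hat(S)| over all subsets S of size `level`."""
    return sum(
        abs(fourier_coefficient(f, n, s))
        for s in combinations(range(n), level)
    )


def parity3(x):
    """Parity of three bits in the +/-1 convention (a depth-1 parity decision tree)."""
    return -1 if sum(x) % 2 else 1


def maj3(x):
    """Majority of three bits in the +/-1 convention."""
    return -1 if sum(x) >= 2 else 1
```

For `parity3`, all of the Fourier mass sits on the single coefficient at level 3. For `maj3`, the mass is split between levels 1 and 3.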

    On Rotation Distance of Rank Bounded Trees

    Computing the rotation distance between two binary trees with $n$ internal nodes efficiently (in $poly(n)$ time) is a long-standing open question in the study of height balancing in tree data structures. In this paper, we initiate the study of this problem bounding the rank of the trees given at the input (defined by Ehrenfeucht and Haussler (1989) in the context of decision trees). We define the rank-bounded rotation distance between two given binary trees $T_1$ and $T_2$ (with $n$ internal nodes) of rank at most $r$, denoted by $d_r(T_1,T_2)$, as the length of the shortest sequence of rotations that transforms $T_1$ to $T_2$ with the restriction that the intermediate trees must be of rank at most $r$. We show that the rotation distance problem reduces in polynomial time to the rank bounded rotation distance problem. This motivates the study of the problem in the combinatorial and algorithmic frontiers. Observing that trees with rank $1$ coincide exactly with skew trees (binary trees where every internal node has at least one leaf as a child), we show the following results in this frontier: We present an $O(n^2)$ time algorithm for computing $d_1(T_1,T_2)$, that is, when the given trees are skew trees and the intermediate trees are restricted to be skew as well (we call this variant the skew rotation distance problem). In particular, our techniques imply that for any two skew trees $d(T_1,T_2) \le n^2$. We show the following upper bound: for any two trees $T_1$ and $T_2$ of rank at most $r_1$ and $r_2$ respectively, we have that $d_r(T_1,T_2) \le n^2 (1+(2n+1)(r_1+r_2-2))$ where $r = \max\{r_1,r_2\}$. This bound is asymptotically tight for $r=1$. En route to our proof of the above theorems, we associate binary trees to permutations and bivariate polynomials, and prove several characterizations in the case of skew trees. Comment: 25 pages, 2 figures, Abstract shortened to meet arxiv requirement
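
The notion of rank used here can be illustrated with a short sketch. It follows the Ehrenfeucht and Haussler definition (a leaf has rank 0; an internal node takes the larger of its children's ranks, or that rank plus one when the two are equal). The tuple-based tree encoding and the function names are assumptions made for illustration only.

```python
# A tree is either None (a leaf) or a pair (left, right) of subtrees.


def rank(tree):
    """Ehrenfeucht-Haussler rank of a binary tree."""
    if tree is None:
        return 0
    left, right = rank(tree[0]), rank(tree[1])
    return left + 1 if left == right else max(left, right)


def is_skew(tree):
    """True if every internal node has at least one leaf as a child."""
    if tree is None:
        return True
    left, right = tree
    if left is not None and right is not None:
        return False
    return is_skew(left) and is_skew(right)


def rotate_right(tree):
    """Right rotation at the root: ((a, b), c) -> (a, (b, c))."""
    (a, b), c = tree
    return (a, (b, c))


skew = (((None, None), None), None)           # left-leaning skew tree, 3 internal nodes
complete = ((None, None), (None, None))       # complete tree of height 2
rotated = rotate_right(((None, None), None))  # gives (None, (None, None))
```

In this encoding `skew` has rank 1 and `complete` has rank 2, and the rotation shown maps one skew tree to another, which is the kind of step allowed when intermediate trees must stay at rank 1.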

    Sequent Calculus in the Topos of Trees

    Nakano's "later" modality, inspired by Gödel-Löb provability logic, has been applied in type systems and program logics to capture guarded recursion. Birkedal et al. modelled this modality via the internal logic of the topos of trees. We show that the semantics of the propositional fragment of this logic can be given by linear converse-well-founded intuitionistic Kripke frames, so this logic is a marriage of the intuitionistic modal logic KM and the intermediate logic LC. We therefore call this logic $\mathrm{KM}_{\mathrm{lin}}$. We give a sound and cut-free complete sequent calculus for $\mathrm{KM}_{\mathrm{lin}}$ via a strategy that decomposes implication into its static and irreflexive components. Our calculus provides deterministic and terminating backward proof-search, yields decidability of the logic and the coNP-completeness of its validity problem. Our calculus and decision procedure can be restricted to drop linearity and hence capture KM. Comment: Extended version, with full proof details, of a paper accepted to FoSSaCS 2015 (this version edited to fix some minor typos)

    Bayesian networks and decision trees in the diagnosis of female urinary incontinence

    This study compares the effectiveness of Bayesian networks versus Decision Trees in modeling the Integral Theory of Female Urinary Incontinence diagnostic algorithm. Bayesian networks and Decision Trees were developed and trained using data from 58 adult women presenting with urinary incontinence symptoms. A Bayesian Network was developed in collaboration with an expert specialist who regularly utilizes a non-automated diagnostic algorithm in clinical practice. The original Bayesian network was later refined using a more connected approach. Diagnoses determined from all automated approaches were compared with the diagnoses of a single human expert. In most cases, Bayesian networks were found to be at least as accurate as the Decision Tree approach. The refined Connected Bayesian Network was found to be more accurate than the Original Bayesian Network. The Original Bayesian Network accurately discriminated between diagnoses despite the small sample size. In contrast, the Connected and Decision Tree approaches were less able to discriminate between diagnoses. The Original Bayesian Network was found to provide an excellent basis for graphically communicating the correlation between symptoms and laxity defects in a given anatomical zone. Performance measures in both networks indicate that Bayesian networks could provide a potentially useful tool in the management of female pelvic floor dysfunction. Before the technique can be utilized in practice, well-established learning algorithms should be applied to improve network structure. A larger training data set should also improve network accuracy, sensitivity, and specificity.

    Decision support methods in diabetic patient management by insulin administration neural network vs. induction methods for knowledge classification

    Diabetes mellitus is now recognised as a major worldwide public health problem. At present, about 100 million people are registered as diabetic patients. Many clinical, social and economic problems occur as a consequence of insulin-dependent diabetes. Treatment attempts to prevent or delay complications by applying ‘optimal’ glycaemic control. Therefore, there is a continuous need for effective monitoring of the patient. Given the popularity of decision tree learning algorithms as well as neural networks for knowledge classification, which is further used for decision support, this paper examines their relative merits by applying one algorithm from each family to a medical problem: that of recommending a particular diabetes regime. For the purposes of this study, OC1, a descendant of Quinlan’s ID3 algorithm, was chosen as the decision tree learning algorithm, and a generating-shrinking algorithm for learning arbitrary classifications as the neural network algorithm. These systems were trained on 646 cases derived from two countries in Europe and were tested on 100 cases which were different from the original 646 cases.