On the parity complexity measures of Boolean functions
The parity decision tree model extends the decision tree model by allowing
the computation of a parity function in one step. We prove that the
deterministic parity decision tree complexity of any Boolean function is
polynomially related to the non-deterministic complexity of the function or its
complement. We also show that they are polynomially related to an analogue of
the block sensitivity. We further study parity decision trees in their
relations with an intermediate variant of the decision trees, as well as with
communication complexity.
Comment: submitted to TCS on 16-MAR-200
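The model described above admits a very small sketch: each internal node of a parity decision tree queries the XOR of a chosen subset of input bits and branches on the answer, so a single query can compute a parity that an ordinary decision tree would need many queries for. A minimal illustration (the class and field names are ours, not from the paper):

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class PDTNode:
    """Node of a parity decision tree: an internal node queries the
    parity (XOR) of a subset of input variables and branches on it;
    a leaf outputs a fixed bit."""
    subset: Tuple[int, ...] = ()   # variable indices to XOR (internal node)
    children: Optional[Tuple["PDTNode", "PDTNode"]] = None  # (parity 0, parity 1)
    label: int = 0                 # output bit at a leaf

    def evaluate(self, x: Tuple[int, ...]) -> int:
        if self.children is None:
            return self.label
        parity = sum(x[i] for i in self.subset) % 2
        return self.children[parity].evaluate(x)

# A depth-1 tree computing x0 XOR x1 XOR x2 with a single parity
# query -- a function an ordinary decision tree needs depth 3 for.
tree = PDTNode(subset=(0, 1, 2),
               children=(PDTNode(label=0), PDTNode(label=1)))
assert tree.evaluate((1, 1, 1)) == 1
assert tree.evaluate((1, 1, 0)) == 0
```

Setting `subset` to a single index recovers the standard decision tree model as a special case.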
Learning using Local Membership Queries
We introduce a new model of membership query (MQ) learning, where the
learning algorithm is restricted to query points that are \emph{close} to
random examples drawn from the underlying distribution. The learning model is
intermediate between the PAC model (Valiant, 1984) and the PAC+MQ model (where
the queries are allowed to be arbitrary points).
Membership query algorithms are not popular among machine learning
practitioners. Apart from the obvious difficulty of adaptively querying
labelers, it has also been observed that querying \emph{unnatural} points leads
to increased noise from human labelers (Lang and Baum, 1992). This motivates
our study of learning algorithms that make queries that are close to examples
generated from the data distribution.
We restrict our attention to functions defined on the n-dimensional Boolean
hypercube and say that a membership query is local if its Hamming distance from
some example in the (random) training data is at most . We show the
following results in this model:
(i) The class of sparse polynomials (with coefficients in R) over
is polynomial time learnable under a large class of \emph{locally smooth}
distributions using -local queries. This class also includes the
class of -depth decision trees.
(ii) The class of polynomial-sized decision trees is polynomial time
learnable under product distributions using -local queries.
(iii) The class of polynomial size DNF formulas is learnable under the
uniform distribution using -local queries in time
.
(iv) In addition we prove a number of results relating the proposed model to
the traditional PAC model and the PAC+MQ model.
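The locality notion defined above is just a Hamming-ball condition around the random training sample. A minimal sketch (the parameter name `r` is our stand-in for the symbol lost in extraction):

```python
import random

def hamming(x, y):
    """Hamming distance between two Boolean vectors of equal length."""
    return sum(a != b for a, b in zip(x, y))

def is_local(query, sample, r):
    """A membership query is r-local if it lies within Hamming
    distance r of SOME example in the training sample."""
    return any(hamming(query, x) <= r for x in sample)

# Toy setup: 5 random examples on the 16-dimensional hypercube.
n = 16
random.seed(0)
sample = [tuple(random.randint(0, 1) for _ in range(n)) for _ in range(5)]

# Flipping one bit of an observed example yields a 1-local query.
q = list(sample[0]); q[3] ^= 1
assert is_local(tuple(q), sample, r=1)
```

With r = 0 the learner may only query points it has already seen, while unrestricted r recovers arbitrary membership queries, matching the abstract's framing of the model as intermediate between PAC and PAC+MQ.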
Short-Range Interactions and Decision Tree-Based Protein Contact Map Predictor
In this paper, we focus on protein contact map prediction, one of the most
important intermediate steps of the protein folding problem. The objective of
this research is to know how short-range interactions can contribute to a
system based on decision trees to learn about the correlation among the
covalent structures of a protein's residues. We propose a solution to predict
protein contact maps that combines the use of decision trees with a new input
codification for short-range interactions. The method's performance was very
satisfactory, improving on the accuracy obtained when all the information of
the protein sequence is used. For a globulin data set the method can predict
contacts with a maximal accuracy of 43%. The presented predictive model
illustrates that short-range interactions play the predominant role in
determining protein structure.
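For readers unfamiliar with the object being predicted: a contact map is a binary matrix marking residue pairs that are spatially close in the folded structure. A minimal sketch under one common convention (contact iff C-alpha distance below 8 Angstroms; the paper's exact codification may differ):

```python
import math

def contact_map(coords, threshold=8.0):
    """Binary contact map from C-alpha coordinates: residues i and j
    are 'in contact' if their Euclidean distance is below the
    threshold. 8 Angstroms is a common convention, assumed here."""
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) < threshold:
                cmap[i][j] = cmap[j][i] = 1
    return cmap

# Three residues on a line, 5 Angstroms apart: adjacent pairs are in
# contact, the two endpoints (10 Angstroms apart) are not.
cmap = contact_map([(0, 0, 0), (5, 0, 0), (10, 0, 0)])
assert cmap[0][1] == 1 and cmap[0][2] == 0
```

A predictor such as the one described above outputs an estimate of this matrix from sequence-derived features rather than from 3D coordinates.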
Fourier Growth of Parity Decision Trees
We prove that for every parity decision tree of depth d on n variables, the sum of absolute values of Fourier coefficients at level ℓ is at most d^{ℓ/2} · O(ℓ · log(n))^ℓ. Our result is nearly tight for small values of ℓ and extends a previous Fourier bound for standard decision trees by Sherstov, Storozhenko, and Wu (STOC, 2021).
As an application of our Fourier bounds, using the results of Bansal and Sinha (STOC, 2021), we show that the k-fold Forrelation problem has (randomized) parity decision tree complexity Ω̃(n^{1-1/k}), while having quantum query complexity ⌈k/2⌉.
Our proof follows a random-walk approach, analyzing the contribution of a random path in the decision tree to the level-ℓ Fourier expression. To carry the argument, we apply a careful cleanup procedure to the parity decision tree, ensuring that the value of the random walk is bounded with high probability. We observe that step sizes for the level-ℓ walks can be computed by the intermediate values of level-(ℓ-1) walks, which calls for an inductive argument. Our approach differs from previous proofs of Tal (FOCS, 2020) and Sherstov, Storozhenko, and Wu (STOC, 2021) that relied on decompositions of the tree. In particular, for the special case of standard decision trees we view our proof as slightly simpler and more intuitive.
In addition, we prove a similar bound for noisy decision trees of cost at most d, a model that was recently introduced by Ben-David and Blais (FOCS, 2020).
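The quantity the theorem bounds, the level-ℓ Fourier mass (the sum of absolute values of the Fourier coefficients over all sets of a fixed size), can be computed by brute force for tiny functions; a sketch for illustration only, since it enumerates all 2^n inputs:

```python
from itertools import combinations, product

def fourier_level_mass(f, n, level):
    """Sum of |f-hat(S)| over all sets S of size `level`, where
    f-hat(S) = E_x[ f(x) * (-1)^{sum of x_i for i in S} ] and f maps
    {0,1}^n to {-1,+1}. Brute force: exponential in n."""
    points = list(product((0, 1), repeat=n))
    total = 0.0
    for S in combinations(range(n), level):
        coeff = sum(f(x) * (-1) ** sum(x[i] for i in S)
                    for x in points) / len(points)
        total += abs(coeff)
    return total

# Parity of 3 bits as a +/-1 function: all its Fourier weight sits on
# the single top coefficient, so level 3 has mass 1 and level 1 has 0.
parity = lambda x: (-1) ** sum(x)
assert abs(fourier_level_mass(parity, 3, 3) - 1.0) < 1e-9
assert fourier_level_mass(parity, 3, 1) == 0
```

The theorem above controls this mass for every function computed by a shallow parity decision tree, with the bound growing with the depth d and the level ℓ.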
On Rotation Distance of Rank Bounded Trees
Computing the rotation distance between two binary trees with n internal
nodes efficiently (in polynomial time) is a long standing open question in the
study of height balancing in tree data structures. In this paper, we initiate
the study of this problem bounding the rank of the trees given at the input
(defined by Ehrenfeucht and Haussler (1989) in the context of decision trees).
We define the rank-bounded rotation distance between two given binary trees
and (with internal nodes) of rank at most , denoted by
, as the length of the shortest sequence of rotations that
transforms to with the restriction that the intermediate trees must
be of rank at most . We show that the rotation distance problem reduces in
polynomial time to the rank bounded rotation distance problem. This motivates
the study of the problem in the combinatorial and algorithmic frontiers.
Observing that trees with rank one coincide exactly with skew trees (binary
trees where every internal node has at least one leaf as a child), we show the
following results in this frontier:
We present an time algorithm for computing , that is, when the given trees
are skew trees (we call this variant the skew rotation distance problem) and
the intermediate trees are restricted to be skew as well. In particular, our
techniques imply that for any two skew trees .
We show the following upper bound: for any two trees and of rank at most
and respectively, we have that: where . This bound is asymptotically
tight for .
En route our proof of the above theorems, we associate binary trees to
permutations and bivariate polynomials, and prove several characterizations in
the case of skew trees.
Comment: 25 pages, 2 figures, Abstract shortened to meet arxiv requirement
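The rank notion of Ehrenfeucht and Haussler referenced above has a short recursive definition: a leaf has rank 0, and an internal node has rank r+1 if both children have equal rank r, otherwise the larger child rank. Under it, skew trees are exactly the rank-one trees. A sketch (encoding trees as nested pairs, with `None` for a leaf, is our choice):

```python
def rank(tree):
    """Ehrenfeucht-Haussler rank of a binary tree. A tree is either
    None (a leaf, rank 0) or a pair (left, right); equal child ranks
    bump the rank by one, unequal ranks take the maximum."""
    if tree is None:
        return 0
    left, right = tree
    rl, rr = rank(left), rank(right)
    return rl + 1 if rl == rr else max(rl, rr)

# A right "spine" is a skew tree: every internal node has a leaf
# child, so its rank is 1, matching the characterisation above.
skew = (None, (None, (None, None)))
assert rank(skew) == 1

# A complete tree of height 2 has rank 2.
complete = ((None, None), (None, None))
assert rank(complete) == 2
```

Bounding the rank of the intermediate trees, as the abstract does, therefore restricts how "balanced" the trees passed through during a rotation sequence may become.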
Sequent Calculus in the Topos of Trees
Nakano's "later" modality, inspired by G\"{o}del-L\"{o}b provability logic,
has been applied in type systems and program logics to capture guarded
recursion. Birkedal et al. modelled this modality via the internal logic of the
topos of trees. We show that the semantics of the propositional fragment of
this logic can be given by linear converse-well-founded intuitionistic Kripke
frames, so this logic is a marriage of the intuitionistic modal logic KM and
the intermediate logic LC. We therefore call this logic KMlin. We give a
sound and cut-free complete sequent calculus for KMlin via a strategy that
decomposes
implication into its static and irreflexive components. Our calculus provides
deterministic and terminating backward proof-search and yields decidability of
the logic and the coNP-completeness of its validity problem. Our calculus and
decision procedure can be restricted to drop linearity and hence capture KM.
Comment: Extended version, with full proof details, of a paper accepted to
FoSSaCS 2015 (this version edited to fix some minor typos).
Bayesian networks and decision trees in the diagnosis of female urinary incontinence
This study compares the effectiveness of Bayesian networks versus Decision Trees in modeling the Integral Theory of Female Urinary Incontinence diagnostic algorithm. Bayesian networks and Decision Trees were developed and trained using data from 58 adult women presenting with urinary incontinence symptoms. A Bayesian Network was developed in collaboration with an expert specialist who regularly utilizes a non-automated diagnostic algorithm in clinical practice. The original Bayesian network was later refined using a more connected approach. Diagnoses determined from all automated approaches were compared with the diagnoses of a single human expert. In most cases, Bayesian networks were found to be at least as accurate as the Decision Tree approach. The refined Connected Bayesian Network was found to be more accurate than the Original Bayesian Network. The Original Bayesian Network accurately discriminated between diagnoses despite the small sample size; in contrast, the Connected and Decision Tree approaches were less able to discriminate between diagnoses. The Original Bayesian Network was found to provide an excellent basis for graphically communicating the correlation between symptoms and laxity defects in a given anatomical zone. Performance measures in both networks indicate that Bayesian networks could provide a potentially useful tool in the management of female pelvic floor dysfunction. Before the technique can be utilized in practice, well-established learning algorithms should be applied to improve network structure. A larger training data set should also improve network accuracy, sensitivity, and specificity.
Decision support methods in diabetic patient management by insulin administration: neural network vs. induction methods for knowledge classification
Diabetes mellitus is now recognised as a major worldwide
public health problem. At present, about 100
million people are registered as diabetic patients. Many
clinical, social and economic problems occur as a
consequence of insulin-dependent diabetes. Treatment
attempts to prevent or delay complications by applying
‘optimal’ glycaemic control. Therefore, there is a
continuous need for effective monitoring of the patient.
Given the popularity of decision tree learning
algorithms as well as neural networks for knowledge
classification which is further used for decision
support, this paper examines their relative merits by
applying one algorithm from each family on a medical
problem; that of recommending a particular diabetes
regime. For the purposes of this study, OC1, a
descendant of Quinlan's ID3 algorithm, was chosen as
the decision tree learning algorithm, and a
generating-shrinking algorithm for learning arbitrary
classifications as the neural network algorithm. These
systems were trained on 646 cases derived from two
countries in Europe and were tested on 100 cases
which were different from the original 646 cases.