Grafting for Combinatorial Boolean Model using Frequent Itemset Mining
This paper introduces the combinatorial Boolean model (CBM), which is defined
as the class of linear combinations of conjunctions of Boolean attributes. This
paper addresses the issue of learning CBM from labeled data. CBM is of high
knowledge interpretability but naïve learning of it requires exponentially
large computation time with respect to data dimension and sample size. To
overcome this computational difficulty, we propose an algorithm GRAB (GRAfting
for Boolean datasets), which efficiently learns CBM within the
$\ell_1$-regularized loss minimization framework. The key idea of GRAB is to
reduce the loss minimization problem to the weighted frequent itemset mining,
in which frequent patterns are efficiently computable. We employ benchmark
datasets to empirically demonstrate that GRAB is effective in terms of
computational efficiency, prediction accuracy, and knowledge discovery.
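A CBM hypothesis, a linear combination of conjunctions of Boolean attributes, can be sketched as follows. The representation (tuples of attribute indices mapped to real weights) and the toy model are illustrative assumptions, not the paper's implementation:

```python
def cbm_predict(x, model):
    """Evaluate a combinatorial Boolean model: a linear combination of
    conjunctions of Boolean attributes.  `model` maps a tuple of
    attribute indices (one conjunction) to its real-valued weight."""
    return sum(weight
               for indices, weight in model.items()
               if all(x[i] for i in indices))  # conjunction fires on x

# Hypothetical model: 0.8*(x0 AND x2) - 0.5*x1
model = {(0, 2): 0.8, (1,): -0.5}
print(cbm_predict([1, 0, 1], model))  # 0.8: only the first conjunction fires
```

Each conjunction acts as a binary feature, so learning a CBM amounts to selecting a sparse set of conjunctions and their weights, which is where the reduction to weighted frequent itemset mining enters.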
Learning DNF Expressions from Fourier Spectrum
Since its introduction by Valiant in 1984, PAC learning of DNF expressions
remains one of the central problems in learning theory. We consider this
problem in the setting where the underlying distribution is uniform, or more
generally, a product distribution. Kalai, Samorodnitsky and Teng (2009) showed
that in this setting a DNF expression can be efficiently approximated from its
"heavy" low-degree Fourier coefficients alone. This is in contrast to previous
approaches where boosting was used and thus Fourier coefficients of the target
function modified by various distributions were needed. This property is
crucial for learning of DNF expressions over smoothed product distributions, a
learning model introduced by Kalai et al. (2009) and inspired by the seminal
smoothed analysis model of Spielman and Teng (2001).
We introduce a new approach to learning (or approximating) polynomial
threshold functions that is based on creating a function with range [-1,1]
that approximately agrees with the unknown function on low-degree Fourier
coefficients. We then describe conditions under which this is sufficient for
learning polynomial threshold functions. Our approach yields a new, simple
algorithm for approximating any polynomial-size DNF expression from its "heavy"
low-degree Fourier coefficients alone. Our algorithm greatly simplifies the
proof of learnability of DNF expressions over smoothed product distributions.
We also describe an application of our algorithm to learning monotone DNF
expressions over product distributions. Building on the work of Servedio
(2001), we give an algorithm that runs in time \poly((s \cdot
\log{(s/\eps)})^{\log{(s/\eps)}}, n), where s is the size of the target DNF
expression and \eps is the accuracy. This improves on the \poly((s \cdot
\log{(ns/\eps)})^{\log{(s/\eps)} \cdot \log{(1/\eps)}}, n) bound of Servedio
(2001).
Comment: Appears in Conference on Learning Theory (COLT) 201
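The core idea, approximating a function from its "heavy" low-degree Fourier coefficients under the uniform distribution, can be sketched by brute force over {-1,1}^n. The function names, the toy one-term DNF, and the degree/threshold values are illustrative assumptions; the paper's algorithm does not enumerate all inputs:

```python
from itertools import combinations, product

def chi(S, x):
    """Character chi_S(x) = prod_{i in S} x_i over {-1,1}^n."""
    p = 1
    for i in S:
        p *= x[i]
    return p

def fourier_coefficient(f, S, n):
    """hat{f}(S) = E_x[f(x) * chi_S(x)] under the uniform distribution."""
    return sum(f(x) * chi(S, x)
               for x in product([-1, 1], repeat=n)) / 2 ** n

def heavy_low_degree_approx(f, n, degree, threshold):
    """Keep only the 'heavy' coefficients (|hat{f}(S)| >= threshold) of
    degree <= `degree` and return the sign of the truncated expansion."""
    heavy = {}
    for d in range(degree + 1):
        for S in combinations(range(n), d):
            c = fourier_coefficient(f, S, n)
            if abs(c) >= threshold:
                heavy[S] = c
    def g(x):
        return 1 if sum(c * chi(S, x) for S, c in heavy.items()) >= 0 else -1
    return g

# Toy target: the 1-term DNF (x0 AND x1), encoded with 1 = True, -1 = False
f = lambda x: 1 if x[0] == 1 and x[1] == 1 else -1
g = heavy_low_degree_approx(f, n=3, degree=2, threshold=0.1)
```

For this toy target the truncated expansion recovers f exactly, since all of its Fourier weight sits on sets of degree at most two.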
Translating between Horn Representations and their Characteristic Models
Characteristic models are an alternative, model-based representation for
Horn expressions. It has been shown that these two representations are
incomparable and each has its advantages over the other. It is therefore
natural to ask what is the cost of translating, back and forth, between these
representations. Interestingly, the same translation questions arise in
database theory, where it has applications to the design of relational
databases. This paper studies the computational complexity of these problems.
Our main result is that the two translation problems are equivalent under
polynomial reductions, and that they are equivalent to the corresponding
decision problem. Namely, translating is equivalent to deciding whether a given
set of models is the set of characteristic models for a given Horn expression.
We also relate these problems to the hypergraph transversal problem, a
well-known problem with applications elsewhere in AI and for which no
polynomial-time algorithm is known. It is shown that in general our translation
problems are at least as hard as the hypergraph transversal problem, and in a
special case they are equivalent to it.
Comment: See http://www.jair.org/ for any accompanying file
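The model-based view rests on a standard fact: the set of models of a Horn expression is closed under intersection, and the characteristic models are the unique minimal subset that generates all models by intersection. The sketch below illustrates this, with models encoded as frozensets of true variables (an assumption for illustration, not the paper's algorithm):

```python
def intersection_closure(models):
    """Models of a Horn expression are closed under bitwise AND.
    Compute the intersection closure of a set of models, each given
    as a frozenset of the variables it sets to true."""
    closure = set(models)
    changed = True
    while changed:
        changed = False
        for a in list(closure):
            for b in list(closure):
                c = a & b
                if c not in closure:
                    closure.add(c)
                    changed = True
    return closure

def characteristic_models(models):
    """Characteristic models: elements of the closure that are not the
    intersection of other models.  Since the closure is intersection-
    closed, checking pairwise intersections suffices."""
    closed = intersection_closure(models)
    chars = set()
    for m in closed:
        others = closed - {m}
        if not any((a & b) == m for a in others for b in others):
            chars.add(m)
    return chars

# {x0} = {x0,x1} & {x0,x2}, so only the two larger models are characteristic.
print(characteristic_models({frozenset({0, 1}), frozenset({0, 2})}))
```

Translating from a Horn expression to its characteristic models, or back, is exactly the pair of problems whose complexity the paper ties to the hypergraph transversal problem.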
A tableau method for the realizability and synthesis of reactive safety specifications
Reactive systems are systems that continuously interact with their environment. Since they are often critical systems, a failure or malfunction can have serious consequences, such as loss of human lives or large economic losses. Correctly modelling and verifying the behaviour of such a system is therefore crucial, and Linear-time Temporal Logic (LTL) together with the Realizability and Synthesis problem represents a promising approach for gaining confidence in the correctness of a reactive system. The Realizability and Synthesis problem asks whether there is a model that satisfies the given specification under all possible environment behaviours. Moreover, it can be seen as a game between two players: the player who controls the inputs of the system to be synthesized (the environment player) and the player who controls the outputs and tries to satisfy the specification for every environment behaviour (the system player).
In this Master's thesis, we present both a tableau decision method for deciding the realizability of specifications expressed in a safety fragment of LTL and a prototype that builds a Realizability Tableau from a safety specification. The prototype returns an open tableau (meaning the specification is realizable) or a closed tableau (when the specification is unrealizable). Finally, we discuss future work and some of the improvements that will be implemented.
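The game view of safety realizability can be sketched with the standard greatest-fixpoint computation over explicit states; this encoding of states and moves is an illustrative assumption, not the thesis's tableau method:

```python
def solve_safety_game(states, safe, env_moves, sys_moves):
    """Winning region of the system player in a safety game: the largest
    set W of safe states such that, for every environment input, the
    system can pick a successor that stays in W (greatest fixpoint)."""
    W = {s for s in states if safe(s)}
    while True:
        W2 = {s for s in W
              if all(any(t in W for t in sys_moves(s, e))
                     for e in env_moves(s))}
        if W2 == W:
            return W
        W = W2

# Tiny example: state 2 is unsafe; from state 1 the environment can force
# the system into state 2, so only state 0 is winning for the system.
states = {0, 1, 2}
safe = lambda s: s != 2
env_moves = lambda s: [0, 1]
sys_moves = lambda s, e: {0} if s == 0 else ({0} if e == 0 else {2})
print(solve_safety_game(states, safe, env_moves, sys_moves))  # {0}
```

The specification is realizable from an initial state exactly when that state lies in the winning region, which mirrors the open/closed distinction the tableau method reports.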
On the Learnability of Disjunctive Normal Form Formulas and Decision Trees
132 p. Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 1993.

The learnability of disjunctive normal form formulas and decision trees is investigated. Polynomial time algorithms are given, and nonlearnability results are obtained, for restricted versions of these general learning problems.

Polynomial time algorithms are presented for exactly learning (with membership and equivalence queries) read-twice DNF and read-k disjoint DNF. A read-twice DNF formula is a boolean formula in disjunctive normal form where each variable appears at most twice. A read-k disjoint DNF formula f is a DNF formula where each variable appears at most k times (for an arbitrary positive integer k) and every assignment to the variables satisfies at most one term of f. The read-k disjoint DNF result also applies to a generalization of this class, which we call read-k sat-j DNF.

For a similar learning protocol, it is shown that, assuming NP ≠ co-NP, there does not exist a polynomial time algorithm for learning read-thrice DNF formulas (boolean formulas in disjunctive normal form where each variable appears at most three times). This result contrasts with our polynomial time algorithm for learning read-twice DNF, and adds evidence to the conjecture that DNF is hard to learn in the membership and equivalence query model. Nonlearnability results are also obtained for the class of read-k decision trees. It is shown that this class is hard to learn in the membership and equivalence query model, provided that the equivalence queries are also required to be read-k decision trees. It is also shown that read-k decision trees are hard to learn in the PAC model (without membership queries).

A different type of nonlearnability result is obtained for the class of arbitrary DNF formulas. A natural approach for learning DNF formulas (suggested by Valiant in a seminal paper of learning theory) is to greedily collect the prime implicants of the hidden function.
We show that no algorithm using such an approach can learn DNF in polynomial time. Results which suggest that DNF formulas are hard to learn rely on the construction of rare hard-to-learn formulas. This raises the question of whether most DNF formulas are learnable. For certain natural definitions of most DNF formulas, this question is answered affirmatively.
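As an illustration of the "read-k" restriction central to these results, the helper below (a hypothetical name and literal encoding, not from the thesis) counts how often each variable occurs across the terms of a DNF formula:

```python
from collections import Counter

def read_k(dnf):
    """Return the smallest k such that the DNF formula is read-k:
    every variable appears (negated or not) in at most k terms.
    A formula is a list of terms; each term is a list of literals,
    where literal i > 0 means x_i and i < 0 means NOT x_i."""
    counts = Counter(abs(lit) for term in dnf for lit in term)
    return max(counts.values(), default=0)

# (x1 AND x2) OR (NOT x1 AND x3): x1 appears twice, so this is read-twice.
formula = [[1, 2], [-1, 3]]
print(read_k(formula))  # 2
```

Under this encoding, the thesis's positive results cover formulas with read_k <= 2 (and read-k disjoint formulas), while read_k = 3 already suffices for the hardness result.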