1,123 research outputs found
Learning Unions of $\omega(1)$-Dimensional Rectangles
We consider the problem of learning unions of rectangles over the domain
$[b]^n$, in the uniform distribution membership query learning setting, where
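both $b$ and $n$ are "large". We obtain poly$(n, \log b)$-time algorithms for the
following classes:
- poly$(n \log b)$-way Majority of $O(\frac{\log(n \log b)}{\log \log(n \log b)})$-dimensional rectangles.
- Union of poly$(\log(n \log b))$ many $O(\frac{\log^2(n \log b)}{(\log \log(n \log b) \log \log \log(n \log b))^2})$-dimensional rectangles.
- poly$(n \log b)$-way Majority of poly$(n \log b)$-Or of disjoint
$O(\frac{\log(n \log b)}{\log \log(n \log b)})$-dimensional rectangles.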
Our main algorithmic tool is an extension of Jackson's boosting- and
Fourier-based Harmonic Sieve algorithm [Jackson 1997] to the domain $[b]^n$,
building on work of [Akavia, Goldwasser, Safra 2003]. Other ingredients used to
obtain the results stated above are techniques from exact learning [Beimel,
Kushilevitz 1998] and ideas from recent work on learning augmented $AC^0$
circuits [Jackson, Klivans, Servedio 2002] and on representing Boolean
functions as thresholds of parities [Klivans, Servedio 2001].
Comment: 25 pages. Some corrections. Recipient of E. M. Gold award ALT 2006.
To appear in Journal of Theoretical Computer Science.
Complexity of Equivalence and Learning for Multiplicity Tree Automata
We consider the complexity of equivalence and learning for multiplicity tree
automata, i.e., weighted tree automata over a field. We first show that the
equivalence problem is logspace equivalent to polynomial identity testing, the
complexity of which is a longstanding open problem. Secondly, we derive lower
bounds on the number of queries needed to learn multiplicity tree automata in
Angluin's exact learning model, over both arbitrary and fixed fields.
Habrard and Oncina (2006) give an exact learning algorithm for multiplicity
tree automata, in which the number of queries is proportional to the size of
the target automaton and the size of a largest counterexample, represented as a
tree, that is returned by the Teacher. However, the smallest
tree-counterexample may be exponential in the size of the target automaton.
Thus the above algorithm does not run in time polynomial in the size of the
target automaton, and has query complexity exponential in the lower bound.
Assuming a Teacher that returns minimal DAG representations of
counterexamples, we give a new exact learning algorithm whose query complexity
is quadratic in the target automaton size, almost matching the lower bound, and
improving the best previously-known algorithm by an exponential factor.
Inferring Symbolic Automata
We study the learnability of symbolic finite state automata (SFAs), a model shown to be useful in many applications in software verification. The state-of-the-art literature on this topic follows the query learning paradigm, and so far all obtained results are positive. We provide a necessary condition for efficient learnability of SFAs in this paradigm, from which we obtain the first negative result. The main focus of our work lies in the learnability of SFAs under the paradigm of identification in the limit using polynomial time and data. We provide a necessary condition and a sufficient condition for efficient learnability of SFAs in this paradigm, from which we derive a positive and a negative result.
Active Learning with Multiple Views
Active learners alleviate the burden of labeling large amounts of data by
detecting and asking the user to label only the most informative examples in
the domain. We focus here on active learning for multi-view domains, in which
there are several disjoint subsets of features (views), each of which is
sufficient to learn the target concept. In this paper we make several
contributions. First, we introduce Co-Testing, which is the first approach to
multi-view active learning. Second, we extend the multi-view learning framework
by also exploiting weak views, which are adequate only for learning a concept
that is more general/specific than the target concept. Finally, we empirically
show that Co-Testing outperforms existing active learners on a variety of
real-world domains such as wrapper induction, Web page classification,
advertisement removal, and discourse tree parsing.
Four Lessons in Versatility or How Query Languages Adapt to the Web
Exposing not only human-centered information, but also machine-processable data on the Web is one of the commonalities of recent Web trends. It has enabled new kinds of applications and businesses where the data is used in ways not foreseen by the data providers. Yet this exposition has fractured the Web into islands of data, each in different Web formats: Some providers choose XML, others RDF, still others JSON or OWL, for their data, even in similar domains. This fracturing stifles innovation as application builders have to cope not only with one Web stack (e.g., XML technology) but with several, each of considerable complexity. With Xcerpt we have developed a rule- and pattern-based query language that aims to shield application builders from much of this complexity: In a single query language XML and RDF data can be accessed, processed, combined, and re-published. Though the need for combined access to XML and RDF data has been recognized in previous work (including the W3C’s GRDDL), our approach differs in four main aspects: (1) We provide a single language (rather than two separate or embedded languages), thus minimizing the conceptual overhead of dealing with disparate data formats. (2) Both the declarative (logic-based) and the operational semantics are unified in that they apply for querying XML and RDF in the same way. (3) We show that the resulting query language can be implemented reusing traditional database technology, if desirable. Nevertheless, we also give a unified evaluation approach based on interval labelings of graphs that is at least as fast as existing approaches for tree-shaped XML data, yet provides linear time and space querying also for many RDF graphs. We believe that Web query languages are the right tool for declarative data access in Web applications and that Xcerpt is a significant step towards a more convenient, yet highly efficient data access in a “Web of Data”.
Characterising Modal Formulas with Examples
We study the existence of finite characterisations for modal formulas. A
finite characterisation of a modal formula $\varphi$ is a finite collection of
positive and negative examples that distinguishes $\varphi$ from every other,
non-equivalent modal formula, where an example is a finite pointed Kripke
structure. This definition can be restricted to specific frame classes and to
fragments of the modal language: a modal fragment $L$ admits finite
characterisations with respect to a frame class $F$ if every formula
$\varphi \in L$ has a finite characterisation with respect to $L$ consisting of
examples that are based on frames in $F$. Finite characterisations are useful
for illustration, interactive specification, and debugging of formal
specifications, and their existence is a precondition for exact learnability
with membership queries. We show that the full modal language admits finite
characterisations with respect to a frame class $F$ only when the modal logic
of $F$ is locally tabular. We then study which modal fragments, freely
generated by some set of connectives, admit finite characterisations. Our main
result is that the positive modal language without the truth-constants $\top$
and $\bot$ admits finite characterisations w.r.t. the class of all frames. This
result is essentially optimal: finite characterizability fails when the
language is extended with the truth constant $\top$ or $\bot$ or with all but very
limited forms of negation.
Comment: Expanded version of material from Raoul Koudijs's MSc thesis (2022)