143 research outputs found

    Computational Learning Theory Lecture Notes for CS 582 Spring Semester, 1991

    This manuscript is a compilation of lecture notes from the graduate-level course CS 582, Computational Learning Theory, which I taught at Washington University in the spring of 1991. Students taking the course were assumed to have a background in the design and analysis of algorithms as well as a good mathematical background. Given that there is no text available on this subject, the course material was drawn from recent research papers. I selected the first twelve topics, and the students selected the remainder from a list of provided topics. This list of topics is given at the end of these notes. These notes were mostly written by the students in the class and then reviewed by me. However, there are likely to be errors and omissions, particularly with regard to the references. Readers finding errors in the notes are encouraged to notify me by electronic mail ([email protected]) so that later versions may be corrected.

    Learning k-term DNF Formulas with an Incomplete Oracle

    We consider the problem of learning k-term DNF formulas using equivalence queries and incomplete membership queries as defined by Angluin and Slonim. We demonstrate that this model can be applied to non-monotone classes. Namely, we describe a polynomial-time algorithm that exactly identifies a k-term DNF formula with a k-term DNF hypothesis using incomplete membership queries and equivalence queries from the class of DNF formulas.

    Can PAC Learning Algorithms Tolerate Random Attribute Noise?

    This paper studies the robustness of PAC learning algorithms when the instance space is {0,1}^n and the examples are corrupted by purely random noise affecting only the instances (and not the labels). In the past, conflicting results on this subject have been obtained: the best agreement rule can only tolerate small amounts of noise, yet in some cases large amounts of noise can be tolerated. We show that the truth lies somewhere in between these two alternatives. For uniform attribute noise, in which each attribute is flipped independently at random with the same probability, we present an algorithm that PAC-learns monomials for any (unknown) noise rate less than 1/2. Contrasting this positive result, we show that product random attribute noise, where each attribute i is flipped randomly and independently with its own probability p_i, is nearly as harmful as malicious noise: no algorithm can tolerate more than a very small amount of such noise.
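The two noise models named in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the tuple encoding of instances, and the use of Python's `random.Random` are assumptions made for illustration. Only the instance bits are corrupted; the label is left untouched, as the abstract specifies.

```python
import random


def product_attribute_noise(instance, rates, rng):
    """Flip attribute i independently with its own probability rates[i].

    `instance` is a tuple of 0/1 bits; `rates` holds one flip probability
    per attribute. Both encodings are illustrative assumptions.
    """
    return tuple(bit ^ (rng.random() < p) for bit, p in zip(instance, rates))


def uniform_attribute_noise(instance, rate, rng):
    """Uniform noise is the special case where every attribute shares one rate."""
    return product_attribute_noise(instance, [rate] * len(instance), rng)


if __name__ == "__main__":
    rng = random.Random(0)
    x = (1, 0, 1, 1, 0)
    print(uniform_attribute_noise(x, 0.25, rng))
    print(product_attribute_noise(x, [0.0, 0.1, 0.4, 0.0, 0.3], rng))
```

Writing uniform noise as a special case of product noise mirrors the abstract's contrast: the only difference between the two models is whether the flip probability may vary per attribute.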

    PAC Learning of One-Dimensional Patterns

    Developing the ability to recognize a landmark from a visual image of a robot's current location is a fundamental problem in robotics. We consider the problem of PAC-learning the concept class of geometric patterns where the target geometric pattern is a configuration of k points on the real line. Each instance is a configuration of n points on the real line, labeled according to whether or not it visually resembles the target pattern. To capture the notion of visual resemblance we use the Hausdorff metric. Informally, two geometric patterns P and Q resemble each other under the Hausdorff metric if every point on one pattern is close to some point on the other pattern. We relate the concept class of geometric patterns to the landmark recognition problem and then present a polynomial-time algorithm that PAC-learns the class of one-dimensional geometric patterns. We also present some experimental results on how our algorithm performs.
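The informal description of the Hausdorff metric above can be made concrete for point sets on the real line. The following is a minimal sketch, not the paper's learning algorithm: the function names and the `threshold` parameter (the abstract only says "close") are assumptions for illustration.

```python
def directed_hausdorff(p, q):
    """Largest distance from any point of p to its nearest point in q."""
    return max(min(abs(a - b) for b in q) for a in p)


def hausdorff(p, q):
    """Symmetric Hausdorff distance between two non-empty sets of reals."""
    return max(directed_hausdorff(p, q), directed_hausdorff(q, p))


def resembles(p, q, threshold=1.0):
    """Hypothetical resemblance test: every point of each pattern lies
    within `threshold` of some point of the other pattern."""
    return hausdorff(p, q) <= threshold


if __name__ == "__main__":
    target = [0.0, 5.0]
    print(hausdorff(target, [0.5, 5.0, 9.0]))  # the stray point at 9.0 dominates
    print(resembles(target, [0.5, 4.5]))
```

Note that the two patterns need not have the same number of points, which matches the abstract's setting of a k-point target and n-point instances.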

    Learning from Examples with Unspecified Attribute Values

    We introduce the UAV learning model in which some of the attributes in the examples are unspecified. In our model, an example x is classified positive (resp., negative) if all possible assignments for the unspecified attributes result in a positive (resp., negative) classification. Otherwise the classification given to x is ? (for unknown). Given an example x in which some attributes are unspecified, the oracle UAV-MQ responds with the classification of x. Given a hypothesis h, the oracle UAV-EQ returns an example x (that could have unspecified attributes) for which h(x) is incorrect. We show that any class learnable in the exact model using the MQ and EQ oracles is also learnable in the UAV model using the MQ and UAV-EQ oracles as long as the counterexamples provided by the UAV-EQ oracle have a logarithmic number of unspecified attributes. We also show that any class learnable in the exact model using the MQ and EQ oracles is also learnable in the UAV model using the UAV-MQ and UAV-EQ oracles as well as an oracle to evaluate a given boolean formula on an example with unspecified attributes. (For some hypothesis classes such as decision trees and unate formulas the evaluation can be done in polynomial time without an oracle.) We also study the learnability of a universal class of decision trees under the UAV model and of DNF formulas under a representation-dependent variation of the UAV model.
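The three-valued classification rule defined in this abstract can be sketched by brute force. This is an illustration of the definition only, not an algorithm from the paper: the name `uav_classify`, the use of `None` for an unspecified attribute, and the `"+"`/`"-"`/`"?"` return values are assumptions. The enumeration is exponential in the number of unspecified attributes, which is consistent with the abstract's remark that evaluation may require an oracle in general.

```python
from itertools import product


def uav_classify(concept, example):
    """Classify `example` under the UAV rule.

    `concept` maps a fully specified tuple of 0/1 bits to a truthy/falsy
    label; `example` is a tuple whose entries are 0, 1, or None
    (unspecified). Returns "+", "-", or "?".
    """
    unspecified = [i for i, v in enumerate(example) if v is None]
    labels = set()
    for fill in product((0, 1), repeat=len(unspecified)):
        x = list(example)
        for i, v in zip(unspecified, fill):
            x[i] = v
        labels.add(bool(concept(tuple(x))))
        if len(labels) == 2:  # completions disagree, so the label is unknown
            return "?"
    return "+" if labels == {True} else "-"


if __name__ == "__main__":
    # Hypothetical target concept: x0 AND NOT x2.
    target = lambda x: x[0] == 1 and x[2] == 0
    for ex in [(1, None, 0), (0, None, None), (1, 0, None)]:
        print(ex, uav_classify(target, ex))
```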

    Noise-Tolerant Parallel Learning of Geometric Concepts

    We present several efficient parallel algorithms for PAC-learning geometric concepts in a constant-dimensional space. The algorithms are robust even against malicious classification noise of any rate less than 1/2. We first give an efficient noise-tolerant parallel algorithm to PAC-learn the class of geometric concepts defined by a polynomial number of (d-1)-dimensional hyperplanes against an arbitrary distribution where each hyperplane has a slope from a set of known slopes. We then describe how boosting techniques can be used so that our algorithms' dependence on ε and δ does not depend on d. Next we give an efficient noise-tolerant parallel algorithm to PAC-learn the class of geometric concepts defined by a polynomial number of (d-1)-dimensional hyperplanes (of unrestricted slopes) against a uniform distribution. We then show how to extend our algorithm to learn this class against any (unknown) product distribution. Next we define a complexity measure of any set S of (d-1)-dimensional surfaces that we call the variant of S and prove that the class of geometric concepts defined by surfaces of polynomial variant can be efficiently learned in parallel under a product distribution (even under malicious classification noise). Furthermore, we show that the VC-dimension of the class of geometric concepts defined by a set of surfaces S of variant v is at least v. Finally, we give an efficient, parallel, noise-tolerant algorithm to PAC-learn any geometric concept defined by a set S of (d-1)-dimensional surfaces of polynomial area under a uniform distribution.

    Agnostic Learning of Geometric Patterns

    Goldberg, Goldman, and Scott demonstrated how the problem of recognizing a landmark from a one-dimensional visual image can be mapped to that of learning a one-dimensional geometric pattern and gave a PAC algorithm to learn that class. In this paper, we present an efficient on-line agnostic learning algorithm for learning the class of constant-dimensional geometric patterns. Our algorithm can tolerate both classification and attribute noise. By working in higher-dimensional spaces we can represent more features from the visual image in the geometric pattern. Our mapping of the data to a geometric pattern, and hence our learning algorithm, is applicable to any data representable as a constant-dimensional array of values, e.g., sonar data, temporal difference information, or amplitudes of a waveform. To our knowledge, these classes of patterns are more complex than any class of geometric patterns previously studied. Also, our results are easily adapted to learn the union of fixed-dimensional boxes from multiple-instance examples. Finally, our algorithms are tolerant of concept shift.

    Real-valued multiple-instance learning with queries

    While there has been a significant amount of theoretical and empirical research on the multiple-instance learning model, most of this research is for concept learning. However, for the important application area of drug discovery, a real-valued classification is preferable. In this paper we initiate a theoretical study of real-valued multiple-instance learning. We prove that the problem of finding a target point consistent with a set of labeled multiple-instance examples (or bags) is NP-complete, and that the problem of learning from real-valued multiple-instance examples is as hard as learning DNF. Another contribution of our work is in defining and studying a multiple-instance membership query (MI-MQ). We give a positive result on exactly learning the target point for a multiple-instance problem in which the learner is provided with an MI-MQ oracle and a single adversarially selected bag.

    Smartacking: Improving TCP Performance from the Receiving End

    We present smartacking, a technique that improves the performance of the Transmission Control Protocol (TCP) via adaptive generation of acknowledgments (ACKs) at the receiver. When the bottleneck link is underutilized, the receiver transmits an ACK for each delivered data segment and thereby allows the connection to acquire the available capacity promptly. When the bottleneck link is at its capacity, the smartacking receiver sends ACKs at a lower frequency, reducing the control traffic overhead and slowing down the congestion window growth to utilize the network capacity more effectively. To promote quick deployment of the technique, our primary implementation of smartacking modifies only the receiver. This implementation estimates the sender's congestion window using a novel algorithm of independent interest. We also consider different implementations of smartacking where the receiver relies on explicit assistance from the sender or network. Our experiments for a wide variety of settings show that TCP performance can substantially benefit from smartacking, especially in environments with low levels of connection multiplexing on bottleneck links. Since our extensive evaluation reveals no scenarios where the technique undermines the overall performance, we believe that smartacking represents a promising direction for enhancing TCP.