
    Learning Unions of Rectangles with Queries

    We investigate the efficient learnability of unions of k rectangles in the discrete plane {1,...,n}^2 with equivalence and membership queries. We exhibit a learning algorithm that learns any union of k rectangles with O(k^3 log n) queries, while the time complexity of this algorithm is bounded by O(k^5 log n). We design our learning algorithm by finding "corners" and "edges" of rectangles contained in the target concept and then constructing the target concept from those "corners" and "edges". Our result provides a first approach to on-line learning of nontrivial subclasses of unions of intersections of halfspaces with equivalence and membership queries. NSF (CCR91-9103055), Boston University Presidential Graduate Fellowship
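
    As a rough illustration of the query model involved (a minimal sketch under simplifying assumptions, not the algorithm from the paper), the snippet below binary-searches for the right edge of a single axis-aligned rectangle over {1,...,n}^2 using O(log n) membership queries; the oracle `member` and the helper `find_right_edge` are hypothetical names introduced here.

```python
# Minimal sketch (not the paper's algorithm): given a membership oracle for a
# single axis-aligned rectangle over {1,...,n}^2 and a point known to lie
# inside it, binary search finds the rectangle's right edge with O(log n)
# membership queries. `member` and `find_right_edge` are hypothetical names.

def find_right_edge(member, x, y, n):
    """Largest column x' >= x such that (x', y) is still inside, assuming the
    row y meets only the one rectangle containing (x, y)."""
    lo, hi = x, n
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if member(mid, y):       # membership query
            lo = mid             # still inside: the edge is at mid or further right
        else:
            hi = mid - 1         # outside: the edge is strictly to the left
    return lo

# Example with the single rectangle [3,7] x [2,9] on a 16 x 16 grid:
inside = lambda px, py: 3 <= px <= 7 and 2 <= py <= 9
print(find_right_edge(inside, 4, 5, 16))   # -> 7
```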

    Learning Unions of ω(1)-Dimensional Rectangles

    We consider the problem of learning unions of rectangles over the domain [b]^n, in the uniform distribution membership query learning setting, where both b and n are "large". We obtain poly(n, log b)-time algorithms for the following classes:
    - poly(n log b)-way Majority of O(log(n log b) / log log(n log b))-dimensional rectangles.
    - Union of poly(log(n log b)) many O(log^2(n log b) / (log log(n log b) log log log(n log b))^2)-dimensional rectangles.
    - poly(n log b)-way Majority of poly(n log b)-Or of disjoint O(log(n log b) / log log(n log b))-dimensional rectangles.
    Our main algorithmic tool is an extension of Jackson's boosting- and Fourier-based Harmonic Sieve algorithm [Jackson 1997] to the domain [b]^n, building on work of [Akavia, Goldwasser, Safra 2003]. Other ingredients used to obtain the results stated above are techniques from exact learning [Beimel, Kushilevitz 1998] and ideas from recent work on learning augmented AC^0 circuits [Jackson, Klivans, Servedio 2002] and on representing Boolean functions as thresholds of parities [Klivans, Servedio 2001]. Comment: 25 pages. Some corrections. Recipient of E. M. Gold award ALT 2006. To appear in Journal of Theoretical Computer Science
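
    The sketch below illustrates just one low-level ingredient behind Harmonic-Sieve-style algorithms over [b]^n: estimating a single Fourier coefficient of a function f : [b]^n -> {-1,+1} by random sampling. It is not the paper's algorithm, which locates heavy coefficients using membership queries and boosting; all names here are assumptions made for the example.

```python
# Hedged sketch: estimate the Fourier coefficient
#   f_hat(alpha) = E_x[ f(x) * omega^{-<alpha, x>} ]
# of f : [b]^n -> {-1, +1} over the uniform distribution by sampling, where
# omega is a primitive b-th root of unity. Only an illustration of the
# Fourier machinery mentioned above; the names are hypothetical.
import cmath
import random

def estimate_fourier_coefficient(f, alpha, b, n, samples=20000):
    omega = cmath.exp(2j * cmath.pi / b)            # primitive b-th root of unity
    total = 0j
    for _ in range(samples):
        x = [random.randrange(b) for _ in range(n)]
        phase = sum(a * xi for a, xi in zip(alpha, x)) % b
        total += f(x) * omega ** (-phase)           # conjugated character chi_alpha(x)
    return total / samples

# Example: f depends only on whether x[0] == 0, over the domain [3]^4.
f = lambda x: 1 if x[0] == 0 else -1
print(abs(estimate_fourier_coefficient(f, (1, 0, 0, 0), b=3, n=4)))   # ~ 2/3
```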

    Learning in Parallel

    In this paper, we extend Valiant's sequential model of concept learning from examples [Valiant 1984] and introduce models for the efficient learning of concept classes from examples in parallel. We say that a concept class is NC-learnable if it can be learned in polylog time with a polynomial number of processors. We show that several concept classes which are polynomial-time learnable are NC-learnable in constant time. Some other classes can be shown to be NC-learnable in logarithmic time, but not in constant time. Our main result shows that other classes, such as s-fold unions of geometrical objects in Euclidean space, which are polynomial-time learnable by a greedy set cover technique, are NC-learnable using a non-greedy technique. We also show that (unless P ⊆ RNC) several polynomial-time learnable concept classes related to linear programming are not NC-learnable. Equivalence of various parallel learning models and issues of fault-tolerance are also discussed.

    Understanding the Role of Adaptivity in Machine Teaching: The Case of Version Space Learners

    In real-world applications of education, an effective teacher adaptively chooses the next example to teach based on the learner's current state. However, most existing work in algorithmic machine teaching focuses on the batch setting, where adaptivity plays no role. In this paper, we study the case of teaching consistent, version space learners in an interactive setting. At any time step, the teacher provides an example, the learner performs an update, and the teacher observes the learner's new state. We highlight that adaptivity does not speed up the teaching process when considering existing models of version space learners, such as the "worst-case" model (the learner picks the next hypothesis randomly from the version space) and the "preference-based" model (the learner picks hypotheses according to some global preference). Inspired by human teaching, we propose a new model where the learner picks hypotheses according to some local preference defined by the current hypothesis. We show that our model exhibits several desirable properties, e.g., adaptivity plays a key role, and the learner's transitions over hypotheses are smooth/interpretable. We develop efficient teaching algorithms and demonstrate our results via simulation and user studies. Comment: NeurIPS 2018 (extended version)
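
    The toy sketch below mimics the interactive protocol described above for one-dimensional threshold hypotheses: an adaptive teacher shows an example the learner currently mislabels, the version space shrinks, and a learner with a local preference moves to the consistent hypothesis nearest to its current one. It only illustrates the setting, not the paper's constructions or bounds; all names are assumptions.

```python
# Toy interactive teaching loop with a version-space learner that has a
# *local* preference (jump to the nearest consistent hypothesis). Hypotheses
# are thresholds h on {0,...,9} labelling x as 1 iff x >= h. Illustrative only.

def label(h, x):
    return int(x >= h)

def teach(target, hypotheses, examples, learner_h):
    version_space = set(hypotheses)
    rounds = 0
    while learner_h != target:
        # Adaptive teacher: pick an example the current learner gets wrong.
        x, y = next((x, y) for (x, y) in examples if label(learner_h, x) != y)
        version_space = {h for h in version_space if label(h, x) == y}
        # Local preference: nearest consistent hypothesis to the current one.
        learner_h = min(version_space, key=lambda h: abs(h - learner_h))
        rounds += 1
    return rounds

target = 7
examples = [(x, label(target, x)) for x in range(10)]
print(teach(target, range(11), examples, learner_h=0))   # number of teaching rounds
```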

    On the implementation of LIR: the case of simple linear regression with interval data

    This paper considers the problem of simple linear regression with interval-censored data. That is, n pairs of intervals are observed instead of the n pairs of precise values for the two variables (dependent and independent). Each of these intervals is closed but possibly unbounded, and contains the corresponding (unobserved) value of the dependent or independent variable. The goal of the regression is to describe the relationship between (the precise values of) these two variables by means of a linear function. Likelihood-based Imprecise Regression (LIR) is a recently introduced, very general approach to regression for imprecisely observed quantities. The result of a LIR analysis is in general set-valued: it consists of all regression functions that cannot be excluded on the basis of likelihood inference. These regression functions are said to be undominated. Since the interval data can be unbounded, a robust regression method is necessary. Hence, we consider the robust LIR method based on the minimization of the residuals' quantiles. For this method, we prove that the set of all the intercept-slope pairs corresponding to the undominated regression functions is the union of finitely many polygons. We give an exact algorithm for determining this set (i.e., for determining the set-valued result of the robust LIR analysis), and show that it has worst-case time complexity O(n^3 log n). We have implemented this exact algorithm as part of the R package linLIR.
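
    As a hedged illustration of one building block of the robust LIR analysis described above (not the paper's exact polygon-enumeration algorithm), the sketch below computes, for a candidate intercept-slope pair, the interval of possible absolute residuals of each interval-valued observation and then a quantile of those bounds. Bounded intervals are assumed, and all function names are introduced here for the example.

```python
# For a candidate line y = a + b*x and an interval observation
# ([xl, xu], [yl, yu]), the signed residual y - (a + b*x) ranges over an
# interval; from it we get lower/upper bounds on the absolute residual.
# Quantiles of these bounds are the quantities the robust LIR method compares.
# Illustrative sketch only; bounded intervals assumed.

def residual_interval(a, b, xl, xu, yl, yu):
    bx_lo, bx_hi = min(b * xl, b * xu), max(b * xl, b * xu)
    d_lo = yl - a - bx_hi              # smallest possible y - (a + b*x)
    d_hi = yu - a - bx_lo              # largest possible y - (a + b*x)
    if d_lo <= 0.0 <= d_hi:            # the line can pass through the data rectangle
        r_lo = 0.0
    else:
        r_lo = min(abs(d_lo), abs(d_hi))
    return r_lo, max(abs(d_lo), abs(d_hi))

def residual_quantile(a, b, data, p=0.5, upper=True):
    """p-quantile of the upper (or lower) absolute-residual bounds."""
    bounds = sorted(residual_interval(a, b, *obs)[1 if upper else 0] for obs in data)
    return bounds[int(p * (len(bounds) - 1))]

data = [(0, 1, 0.5, 1.5), (1, 2, 1.0, 2.5), (2, 3, 2.0, 3.0)]   # (xl, xu, yl, yu)
print(residual_quantile(1.0, 0.5, data))   # median upper residual bound
```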

    Box Drawings for Learning with Imbalanced Data

    The vast majority of real-world classification problems are imbalanced, meaning there are far fewer data from the class of interest (the positive class) than from other classes. We propose two machine learning algorithms to handle highly imbalanced classification problems. The classifiers constructed by both methods are created as unions of axis-parallel rectangles around the positive examples, and thus have the benefit of being interpretable. The first algorithm uses mixed integer programming to optimize a weighted balance between positive and negative class accuracies. Regularization is introduced to improve generalization performance. The second method uses an approximation in order to assist with scalability. Specifically, it follows a "characterize then discriminate" approach, where the positive class is characterized first by boxes, and then each box boundary becomes a separate discriminative classifier. This method has the computational advantages that it can be easily parallelized and that it considers only the relevant regions of feature space.
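
    The sketch below is a hedged illustration of the "characterize" idea described above, not the paper's mixed integer programming formulation or its per-box discriminative step: it clusters the positive examples and wraps each cluster in an axis-parallel box, yielding an interpretable union-of-boxes predictor. The use of k-means and all names here are assumptions made for the example.

```python
# Characterize the positive class with axis-parallel boxes: cluster the
# positive examples, then take each cluster's bounding box. A point is
# predicted positive iff it falls inside at least one box. Illustrative only.
import numpy as np
from sklearn.cluster import KMeans

def characterize_with_boxes(X_pos, n_boxes=3, seed=0):
    labels = KMeans(n_clusters=n_boxes, random_state=seed, n_init=10).fit_predict(X_pos)
    boxes = []
    for c in range(n_boxes):
        pts = X_pos[labels == c]
        boxes.append((pts.min(axis=0), pts.max(axis=0)))    # lower and upper corners
    return boxes

def predict(boxes, X):
    return np.array([any(np.all(lo <= x) and np.all(x <= hi) for lo, hi in boxes)
                     for x in X])

rng = np.random.default_rng(0)
X_pos = np.vstack([rng.normal(c, 0.1, size=(20, 2)) for c in ([0, 0], [2, 2], [4, 0])])
boxes = characterize_with_boxes(X_pos, n_boxes=3)
print(predict(boxes, np.array([[0.0, 0.0], [3.0, 3.0]])))   # ~ [ True False]
```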

    NP-hardness of circuit minimization for multi-output functions

    Can we design efficient algorithms for finding fast algorithms? This question is captured by various circuit minimization problems, and algorithms for the corresponding tasks have significant practical applications. Following the work of Cook and Levin in the early 1970s, a central question is whether minimizing the circuit size of an explicitly given function is NP-complete. While this is known to hold in restricted models such as DNFs, making progress with respect to more expressive classes of circuits has been elusive. In this work, we establish the first NP-hardness result for circuit minimization of total functions in the setting of general (unrestricted) Boolean circuits. More precisely, we show that computing the minimum circuit size of a given multi-output Boolean function f : {0,1}^n → {0,1}^m is NP-hard under many-one polynomial-time randomized reductions. Our argument builds on a simpler NP-hardness proof for the circuit minimization problem for (single-output) Boolean functions under an extended set of generators. Complementing these results, we investigate the computational hardness of minimizing communication. We establish that several variants of this problem are NP-hard under deterministic reductions. In particular, unless P = NP, no polynomial-time computable function can approximate the deterministic two-party communication complexity of a partial Boolean function up to a polynomial. This has consequences for the class of structural results that one might hope to show about the communication complexity of partial functions.