7 research outputs found

    SQ Lower Bounds for Learning Mixtures of Linear Classifiers

    We study the problem of learning mixtures of linear classifiers under Gaussian covariates. Given sample access to a mixture of $r$ distributions on $\mathbb{R}^n$ of the form $(\mathbf{x},y_{\ell})$, $\ell\in [r]$, where $\mathbf{x}\sim\mathcal{N}(0,\mathbf{I}_n)$ and $y_\ell=\mathrm{sign}(\langle\mathbf{v}_\ell,\mathbf{x}\rangle)$ for an unknown unit vector $\mathbf{v}_\ell$, the goal is to learn the underlying distribution in total variation distance. Our main result is a Statistical Query (SQ) lower bound suggesting that known algorithms for this problem are essentially best possible, even for the special case of uniform mixtures. In particular, we show that the complexity of any SQ algorithm for the problem is $n^{\mathrm{poly}(1/\Delta) \log(r)}$, where $\Delta$ is a lower bound on the pairwise $\ell_2$-separation between the $\mathbf{v}_\ell$'s. The key technical ingredient underlying our result is a new construction of spherical designs that may be of independent interest. Comment: To appear in NeurIPS 202
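The data model in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: it only draws labeled examples from a uniform mixture of $r$ linear classifiers with standard Gaussian covariates and reports the smallest pairwise $\ell_2$ distance between the hidden unit vectors, which is the largest value that $\Delta$ could take. The function names, the dimension `n = 5`, the number of components `r = 3`, and the convention `sign(0) = +1` are all assumptions made for illustration.

```python
import math
import random


def random_unit_vector(n, rng):
    """Draw a direction uniformly from the unit sphere in R^n."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def sample_mixture(vs, m, rng):
    """Draw m examples (x, y) from the uniform mixture defined by vs.

    Each example picks a component uniformly at random, draws
    x ~ N(0, I_n), and labels it y = sign(<v, x>) with that
    component's vector v (sign(0) is taken as +1 here, an assumption).
    """
    n = len(vs[0])
    samples = []
    for _ in range(m):
        v = rng.choice(vs)
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        dot = sum(a * b for a, b in zip(v, x))
        samples.append((x, 1 if dot >= 0 else -1))
    return samples


def min_pairwise_separation(vs):
    """Smallest l2 distance between two distinct hidden vectors."""
    return min(
        math.dist(u, w)
        for i, u in enumerate(vs)
        for w in vs[i + 1:]
    )


rng = random.Random(0)  # fixed seed so the sketch is reproducible
vs = [random_unit_vector(5, rng) for _ in range(3)]  # r = 3, n = 5
samples = sample_mixture(vs, 1000, rng)
delta = min_pairwise_separation(vs)
```

A learner sees only `samples`, never the component that produced each label. The lower bound in the abstract concerns how hard it is to recover the distribution from such samples when `delta` is small.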

    Concept drift learning and its application to adaptive information filtering

    Tracking the evolution of user interests is an instance of concept drift learning. Keeping track of multiple interest categories is a natural phenomenon as well as an interesting tracking problem, because interests can emerge and diminish over different time frames. The first part of this dissertation presents the Multiple Three-Descriptor Representation (MTDR) algorithm, a novel algorithm for learning concept drift, built specifically for tracking the dynamics of multiple target concepts in the information filtering domain. The learning process of the algorithm combines long-term and short-term interest (concept) models in an attempt to benefit from the strengths of both. The MTDR algorithm improves over existing concept drift learning algorithms in the domain. Tracking multiple target concepts from only a few examples poses an even more important and challenging problem, because casual users tend to be reluctant to provide the examples needed, and learning from few labeled examples is generally difficult. The second part presents a computational Framework for Extending Incomplete Labeled Data Stream (FEILDS). The system modularly extends the capability of an existing concept drift learner to deal with an incompletely labeled data stream. It expands the learner's original input stream with relevant unlabeled data; the process generates a new stream with improved learnability. FEILDS employs a concept formation system to organize its input stream into a concept (cluster) hierarchy. The system uses this hierarchy to identify an instance's concept and the unlabeled data relevant to a concept. It also adopts the persistence assumption from temporal reasoning to infer the relevance of concepts. Empirical evaluation indicates that FEILDS improves the performance of existing learners, particularly when learning from a stream with few labeled examples.
Lastly, a new concept formation algorithm, one of the key components of the FEILDS architecture, is presented. The main idea is to discover intrinsic hierarchical structures regardless of the class distribution and the shape of the input stream. Experimental evaluation shows that the algorithm is relatively robust to input ordering and consistently produces a high-quality hierarchy.

    Margins and combined classifiers

    Randomness versus non-determinism in distributed computing

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Mathematics, 1995. Includes bibliographical references (p. 209-214) and index. By Alain Isaac Saias. Ph.D.

    Learning Switching Concepts

    We consider learning in situations where the function used to classify examples may switch back and forth between a small number of different concepts during the course of learning. We examine several models for such situations: oblivious models in which switches are made independent of the selection of examples, and more adversarial models in which a single adversary controls both the concept switches and example selection. We show relationships between the more benign models and the p-concepts of Kearns and Schapire, and present polynomial-time algorithms for learning switches between two k-DNF formulas. For the most adversarial model, we present a model of success patterned after the popular competitive analysis used in studying on-line algorithms. We describe a randomized query algorithm for such adversarial switches between two monotone disjunctions that is "1-competitive" in that the total number of mistakes plus queries is with high probability bounded by the number of switches plus some fixed polynomial in n (the number of variables). We also use notions described here to provide sufficient conditions under which learning a p-concept class "with a decision rule" implies being able to learn the class "with a model of probability."