
    Multiple Random Oracles Are Better Than One

    We study the problem of learning k-juntas given access to examples drawn from a number of different product distributions. Thus we wish to learn a function f: {−1, 1}^n → {−1, 1} that depends on k (unknown) coordinates. While the best-known algorithms for the general problem of learning a k-junta require running times of n^k · poly(n, 2^k), we show that, given access to k different product distributions with biases separated by γ > 0, the functions may be learned in time poly(n, 2^k, γ^{−k}). More generally, given access to t ≤ k different product distributions, the functions may be learned in time n^{k/t} · poly(n, 2^k, γ^{−k}). Our techniques involve novel results in Fourier analysis, relating Fourier expansions with respect to different biases, and a generalization of Russo's formula.
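    For orientation, the n^k · poly(n, 2^k) baseline the abstract improves on is the exhaustive search over candidate sets of relevant coordinates. The sketch below is my own toy illustration of that baseline (hypothetical names such as learn_junta_bruteforce, uniform examples rather than the biased product distributions the paper actually exploits):

    ```python
    import itertools
    import numpy as np

    def learn_junta_bruteforce(X, y, k):
        """Exhaustive search over candidate sets of k relevant coordinates; keep the
        first set whose induced truth table is consistent with every example. This is
        the n^k * poly(n, 2^k) baseline referred to in the abstract."""
        n = X.shape[1]
        for S in itertools.combinations(range(n), k):
            table, consistent = {}, True
            for row, label in zip(X, y):
                key = tuple(row[list(S)])
                if table.setdefault(key, label) != label:
                    consistent = False
                    break
            if consistent:
                return S, table  # candidate relevant coordinates and their truth table
        return None

    # Toy usage: f(x) = x_2 * x_5 is a 2-junta on {-1, +1}^10.
    rng = np.random.default_rng(0)
    X = rng.choice([-1, 1], size=(200, 10))
    y = X[:, 2] * X[:, 5]
    print(learn_junta_bruteforce(X, y, 2)[0])  # (2, 5) with high probability
    ```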

    Active classification with comparison queries

    We study an extension of active learning in which the learning algorithm may ask the annotator to compare the distances of two examples from the boundary of their label-class. For example, in a recommendation system application (say, for restaurants), the annotator may be asked whether she liked or disliked a specific restaurant (a label query), or which of two restaurants she liked more (a comparison query). We focus on the class of half spaces, and show that under natural assumptions, such as large margin or bounded bit-description of the input examples, it is possible to reveal all the labels of a sample of size n using approximately O(log n) queries. This implies an exponential improvement over classical active learning, where only label queries are allowed. We complement these results by showing that if any of these assumptions is removed then, in the worst case, Ω(n) queries are required. Our results follow from a new general framework of active learning with additional queries. We identify a combinatorial dimension, called the inference dimension, that captures the query complexity when each additional query is determined by O(1) examples (such as comparison queries, each of which is determined by the two compared examples). Our results for half spaces follow by bounding the inference dimension in the cases discussed above. Comment: 23 pages (not including references), 1 figure. The new version contains a minor fix in the proof of Lemma 4.
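    As a toy illustration of why comparison information helps (this is my own sketch, not the paper's inference-dimension algorithm, and it spends O(n log n) comparison queries on sorting): once the sample is ordered along the normal direction of the halfspace, a single binary search with O(log n) label queries determines every remaining label.

    ```python
    import functools
    import numpy as np

    def reveal_labels(points, w, b):
        """Toy sketch: use comparison queries to order the sample along the (unknown)
        normal w, then binary-search with label queries for where the sign flips."""
        # Oracles the annotator answers; the learner never reads w or b directly.
        compare = lambda i, j: float(np.dot(w, points[i]) - np.dot(w, points[j]))
        label = lambda i: 1 if np.dot(w, points[i]) >= b else -1

        order = sorted(range(len(points)), key=functools.cmp_to_key(compare))
        lo, hi, label_queries = 0, len(points), 0
        while lo < hi:                      # binary search for the first positive point
            mid = (lo + hi) // 2
            label_queries += 1
            if label(order[mid]) == 1:
                hi = mid
            else:
                lo = mid + 1
        labels = {order[i]: (-1 if i < lo else 1) for i in range(len(points))}
        return labels, label_queries

    rng = np.random.default_rng(1)
    X = rng.normal(size=(1000, 5))
    labels, q = reveal_labels(X, rng.normal(size=5), 0.1)
    print(q)  # roughly log2(1000) ~ 10 label queries for 1000 points
    ```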

    Auto-encoders: reconstruction versus compression

    We discuss the similarities and differences between training an auto-encoder to minimize the reconstruction error, and training the same auto-encoder to compress the data via a generative model. Minimizing a codelength for the data using an auto-encoder is equivalent to minimizing the reconstruction error plus some correcting terms, which have an interpretation as either a denoising or contractive property of the decoding function. These terms are related but not identical to those used in denoising or contractive auto-encoders [Vincent et al. 2010, Rifai et al. 2011]. In particular, the codelength viewpoint fully determines an optimal noise level for the denoising criterion.
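    For orientation only, a minimal sketch (my own assumptions, not the paper's codelength derivation) of the two training criteria being contrasted: the plain reconstruction error, and a denoising criterion where the input is corrupted with noise level sigma before encoding. The abstract's point is that the compression view adds correcting terms to the former and pins down a particular sigma for the latter.

    ```python
    import torch
    import torch.nn as nn

    class AutoEncoder(nn.Module):
        def __init__(self, d_in, d_hid):
            super().__init__()
            self.enc = nn.Linear(d_in, d_hid)
            self.dec = nn.Linear(d_hid, d_in)

        def forward(self, x):
            return self.dec(torch.tanh(self.enc(x)))

    def objective(model, x, sigma=0.0):
        # sigma = 0.0 gives the plain reconstruction error; sigma > 0 corrupts the
        # input before encoding, i.e. a denoising criterion in the spirit of
        # Vincent et al. 2010.
        x_in = x + sigma * torch.randn_like(x)
        return ((model(x_in) - x) ** 2).mean()

    # Usage: compare the two criteria on a random batch.
    x = torch.randn(32, 64)
    model = AutoEncoder(64, 16)
    plain = objective(model, x, sigma=0.0)
    denoise = objective(model, x, sigma=0.1)
    ```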

    Non-Vacuous Generalisation Bounds for Shallow Neural Networks

    25 pages, 12 figures. We focus on a specific class of shallow neural networks with a single hidden layer, namely those with L2-normalised data and either a sigmoid-shaped Gaussian error function ("erf") activation or a Gaussian Error Linear Unit (GELU) activation. For these networks, we derive new generalisation bounds through the PAC-Bayesian theory; unlike most existing such bounds they apply to neural networks with deterministic rather than randomised parameters. Our bounds are empirically non-vacuous when the network is trained with vanilla stochastic gradient descent on MNIST and Fashion-MNIST.
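    To make the setting concrete, here is a sketch of the network class only (the exact erf scaling and the parameter shapes are my assumptions; the PAC-Bayesian bounds themselves are not reproduced here): one hidden layer, inputs normalised to unit L2 norm, and either an erf-shaped or a GELU activation.

    ```python
    import numpy as np
    from scipy.special import erf

    def erf_act(z):
        # sigmoid-shaped "erf" activation; the scaling inside erf is an assumption here
        return erf(z / np.sqrt(2))

    def gelu(z):
        # GELU: z * Phi(z), with Phi the standard normal CDF
        return z * 0.5 * (1.0 + erf(z / np.sqrt(2)))

    def shallow_net(x, W1, b1, w2, b2, act=gelu):
        x = x / np.linalg.norm(x)           # L2-normalised input, as in the abstract
        return act(W1 @ x + b1) @ w2 + b2   # single hidden layer, scalar output

    # Usage on random parameters.
    rng = np.random.default_rng(0)
    x = rng.normal(size=20)
    out = shallow_net(x, rng.normal(size=(50, 20)), rng.normal(size=50),
                      rng.normal(size=50), 0.0)
    ```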

    Learning categorial grammars

    In 1967 E. M. Gold published a paper in which the language classes from the Chomsky hierarchy were analyzed in terms of learnability, in the technical sense of identification in the limit. His results were mostly negative, and perhaps because of this his work had little impact on linguistics. In the early eighties there was renewed interest in the paradigm, mainly because of work by Angluin and Wright. Around the same time, Arikawa and his co-workers refined the paradigm by applying it to so-called Elementary Formal Systems. By making use of this approach Takeshi Shinohara was able to come up with an impressive result: any class of context-sensitive grammars with a bound on its number of rules is learnable. Some linguistically motivated work on learnability also appeared from this point on, most notably Wexler & Culicover 1980 and Kanazawa 1994. The latter investigates the learnability of various classes of categorial grammar, inspired by work by Buszkowski and Penn, and raises some interesting questions. We follow up on this work by exploring complexity issues relevant to learning these classes, answering an open question from Kanazawa 1994, and applying the same kind of approach to obtain (non)learnable classes of Combinatory Categorial Grammars, Tree Adjoining Grammars, Minimalist grammars, Generalized Quantifiers, and some variants of Lambek Grammars. We also discuss work on learning tree languages and its application to learning Dependency Grammars. Our main conclusions are:
    - formal learning theory is relevant to linguistics;
    - identification in the limit is feasible for non-trivial classes;
    - the `Shinohara approach' (i.e., placing a numerical bound on the complexity of a grammar) can lead to a learnable class, but this completely depends on the specific nature of the formalism and the notion of complexity, and we give examples of natural classes of commonly used linguistic formalisms that resist this kind of approach;
    - learning is hard work: our results indicate that learning even `simple' classes of languages requires a lot of computational effort;
    - dealing with structure (derivation-, dependency-) languages instead of string languages offers a useful and promising approach to learnability in a linguistic context.
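    Since the abstract leans on the technical notion of identification in the limit, here is a minimal sketch of the paradigm (my own toy, using membership predicates rather than any grammar formalism from the thesis): a learner that conjectures, after each example, the first consistent grammar in a fixed enumeration.

    ```python
    def identify_in_the_limit(grammar_enumeration, text):
        """Enumerative Gold-style learner: after each new example it conjectures the
        first grammar in a fixed enumeration consistent with everything seen so far."""
        seen = []
        for sentence in text:
            seen.append(sentence)
            for g in grammar_enumeration:
                if all(g(s) for s in seen):
                    yield g   # current conjecture; success means it eventually stabilises
                    break

    # Toy usage: each "grammar" is a membership predicate over strings of a's.
    grammars = [
        lambda s: set(s) <= {"a"} and len(s) % 3 == 0,
        lambda s: set(s) <= {"a"} and len(s) % 2 == 0,
        lambda s: set(s) <= {"a"},
    ]
    conjectures = list(identify_in_the_limit(grammars, ["aa", "aaaa", "aaaaaa"]))
    ```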

    Annual Report of the University, 1994-1995, Volumes 1-4

    DEMONSTRATING THE STRENGTH OF DIVERSITY: A walk around the UNM campus as students change classes demonstrates UNM's commitment to diversity. Students and professors from a variety of ethnic backgrounds crowd the sidewalks and fill classrooms. Over the past year UNM moved forward with existing and new programs to interest more minority students, faculty and staff in the University and to aid in their success while here. Hispanic Outlook in Higher Education recently recognized the University's endeavors, ranking UNM as one of the best colleges in the nation at graduating Hispanic students. Provost Mary Sue Coleman says diversity contributes to a stimulating environment where faculty and students have different points of view and experiences. The campus becomes a more intellectually alive place, she says. The efforts to build a diverse campus go hand in hand with the University's goals of achieving academic excellence and attracting the best and brightest. MINORITY ENROLLMENT: In the fall of 1994 a total of 32 percent of the student body came from underrepresented groups. The UNM School of Law had the largest number of Native Americans enrolled in any law school in the country.