53 research outputs found
Multiple Random Oracles Are Better Than One
We study the problem of learning k-juntas given access to examples drawn from a number of different product distributions. Thus we wish to learn a function f: {−1, 1}^n → {−1, 1} that depends on k (unknown) coordinates. While the best-known algorithms for the general problem of learning a k-junta require running times of n^k poly(n, 2^k), we show that, given access to k different product distributions with biases separated by γ > 0, the functions may be learned in time poly(n, 2^k, γ^(−k)). More generally, given access to t ≤ k different product distributions, the functions may be learned in time n^(k/t) poly(n, 2^k, γ^(−k)). Our techniques involve novel results in Fourier analysis, relating Fourier expansions with respect to different biases, and a generalization of Russo's formula.
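As a small illustration of why biased product distributions can help (a sketch only, not the algorithm from the paper), the following Python snippet computes exact degree-1 Fourier coefficients of a toy junta under a product distribution with P[x_i = 1] = p. The function names, the choice of a 2-parity on 5 coordinates, and the bias 0.75 are all assumptions made for the example. Under the uniform distribution every degree-1 coefficient of the parity vanishes, while under a biased distribution the two relevant coordinates stand out.

```python
import itertools
import math


def biased_degree_one_coefficients(f, n, p):
    """Exact degree-1 Fourier coefficients of f: {-1, 1}^n -> {-1, 1}
    under the product distribution with P[x_i = 1] = p for every i.

    Uses the p-biased basis phi_i(x) = (x_i - mu) / sigma, where
    mu = 2p - 1 and sigma = sqrt(1 - mu^2). Requires 0 < p < 1.
    """
    mu = 2 * p - 1
    sigma = math.sqrt(1 - mu * mu)
    coeffs = [0.0] * n
    # Enumerate all 2^n inputs; fine for a toy n, not for real learning.
    for x in itertools.product((-1, 1), repeat=n):
        ones = sum(1 for xi in x if xi == 1)
        weight = p ** ones * (1 - p) ** (n - ones)
        fx = f(x)
        for i in range(n):
            coeffs[i] += weight * fx * (x[i] - mu) / sigma
    return coeffs


def parity_junta(x):
    """Hypothetical 2-junta: depends only on coordinates 0 and 1."""
    return x[0] * x[1]


uniform = biased_degree_one_coefficients(parity_junta, 5, 0.5)
biased = biased_degree_one_coefficients(parity_junta, 5, 0.75)
```

For this toy function the biased coefficient of a relevant coordinate works out to mu * sigma, while irrelevant coordinates stay at zero under any bias.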
Active classification with comparison queries
We study an extension of active learning in which the learning algorithm may
ask the annotator to compare the distances of two examples from the boundary of
their label-class. For example, in a recommendation system application (say for
restaurants), the annotator may be asked whether she liked or disliked a
specific restaurant (a label query); or which one of two restaurants did she
like more (a comparison query).
We focus on the class of half spaces, and show that under natural
assumptions, such as large margin or bounded bit-description of the input
examples, it is possible to reveal all the labels of a sample of size n using
approximately O(log n) queries. This implies an exponential improvement over
classical active learning, where only label queries are allowed. We complement
these results by showing that if any of these assumptions is removed then, in
the worst case, Ω(n) queries are required.
Our results follow from a new general framework of active learning with
additional queries. We identify a combinatorial dimension, called the
inference dimension, that captures the query complexity when each
additional query is determined by O(1) examples (such as comparison queries,
each of which is determined by the two compared examples). Our results for half
spaces follow by bounding the inference dimension in the cases discussed above.
Comment: 23 pages (not including references), 1 figure. The new version
contains a minor fix in the proof of Lemma 4.
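To make the two query types concrete, here is a minimal Python sketch, assuming a hidden half space w·x + b and modelling a comparison query as a comparison of the two signed scores. The `Annotator` class, `infer_labels`, and the sample points are hypothetical names introduced for illustration. The sketch sorts the sample with comparison queries and then binary-searches for the boundary with label queries; it is not the algorithm from the paper and does not attempt to match the query bound stated in the abstract.

```python
from functools import cmp_to_key


class Annotator:
    """Hypothetical annotator holding a hidden half space w.x + b."""

    def __init__(self, w, b):
        self.w, self.b = w, b
        self.label_queries = 0
        self.comparison_queries = 0

    def _score(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def label(self, x):
        """Label query: which side of the boundary is x on?"""
        self.label_queries += 1
        return 1 if self._score(x) >= 0 else -1

    def compare(self, x1, x2):
        """Comparison query: -1, 0 or 1 as x1 scores below, equal to,
        or above x2 (an assumed model of 'liked more')."""
        self.comparison_queries += 1
        s1, s2 = self._score(x1), self._score(x2)
        return (s1 > s2) - (s1 < s2)


def infer_labels(points, annotator):
    """Infer every label using comparisons to sort, then a binary
    search with label queries for the first positive point."""
    ordered = sorted(points, key=cmp_to_key(annotator.compare))
    lo, hi = 0, len(ordered)
    while lo < hi:
        mid = (lo + hi) // 2
        if annotator.label(ordered[mid]) == 1:
            hi = mid
        else:
            lo = mid + 1
    return {p: (1 if i >= lo else -1) for i, p in enumerate(ordered)}


w, b = (1.0, -2.0), 0.5
points = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (-1, 0), (3, 2), (0, 2)]
annotator = Annotator(w, b)
labels = infer_labels(points, annotator)
```

The design point is that once the sample is ordered by score, the labels are monotone along the order, so only a logarithmic number of label queries is needed; the sorting step here still spends many comparison queries.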
Auto-encoders: reconstruction versus compression
We discuss the similarities and differences between training an auto-encoder
to minimize the reconstruction error, and training the same auto-encoder to
compress the data via a generative model. Minimizing a codelength for the data
using an auto-encoder is equivalent to minimizing the reconstruction error plus
some correcting terms which have an interpretation as either a denoising or
contractive property of the decoding function. These terms are related but not
identical to those used in denoising or contractive auto-encoders [Vincent et
al. 2010, Rifai et al. 2011]. In particular, the codelength viewpoint fully
determines an optimal noise level for the denoising criterion.
Non-Vacuous Generalisation Bounds for Shallow Neural Networks
25 pages, 12 figures. We focus on a specific class of shallow neural networks with a single hidden layer, namely those with L2-normalised data and either a sigmoid-shaped Gaussian error function ("erf") activation or a Gaussian Error Linear Unit (GELU) activation. For these networks, we derive new generalisation bounds through the PAC-Bayesian theory; unlike most existing such bounds, they apply to neural networks with deterministic rather than randomised parameters. Our bounds are empirically non-vacuous when the network is trained with vanilla stochastic gradient descent on MNIST and Fashion-MNIST.
Learning categorial grammars
In 1967 E. M. Gold published a paper in which the language classes from the Chomsky hierarchy were analyzed in terms of learnability, in the technical sense of identification in the limit. His results were mostly negative, and perhaps because of this his work had little impact on linguistics.
In the early eighties there was renewed interest in the paradigm, mainly because of work by Angluin and Wright. Around the same time, Arikawa and his co-workers refined the paradigm by applying it to so-called Elementary Formal Systems. Using this approach, Takeshi Shinohara obtained an impressive result: any class of context-sensitive grammars with a bound on its number of rules is learnable.
Some linguistically motivated work on learnability also appeared from this point on, most notably Wexler & Culicover 1980 and Kanazawa 1994. The latter investigates the learnability of various classes of categorial grammar, inspired by work by Buszkowski and Penn, and raises some interesting questions.
We follow up on this work by exploring complexity issues relevant to learning these classes, answering an open question from Kanazawa 1994, and applying the same kind of approach to obtain (non)learnable classes of Combinatory Categorial Grammars, Tree Adjoining Grammars, Minimalist grammars, Generalized Quantifiers, and some variants of Lambek Grammars. We also discuss work on learning tree languages and its application to learning Dependency Grammars.
Our main conclusions are:
- formal learning theory is relevant to linguistics,
- identification in the limit is feasible for non-trivial classes,
- the 'Shinohara approach' (i.e., placing a numerical bound on the complexity of a grammar) can lead to a learnable class, but this depends entirely on the specific nature of the formalism and the notion of complexity. We give examples of natural classes of commonly used linguistic formalisms that resist this kind of approach,
- learning is hard work. Our results indicate that learning even 'simple' classes of languages requires a lot of computational effort,
- dealing with structure (derivation, dependency) languages instead of string languages offers a useful and promising approach to learnability in a linguistic context.
Annual Report of the University, 1994-1995, Volumes 1-4
DEMONSTRATING THE STRENGTH OF DIVERSITY
A walk around the UNM campus as students change classes demonstrates UNM's commitment to diversity. Students and professors from a variety of ethnic backgrounds crowd the sidewalks and fill classrooms. Over the past year UNM moved forward with existing and new programs to interest more minority students, faculty and staff in the University and to aid in their success while here. Hispanic Outlook in Higher Education recently recognized the University's endeavors, ranking UNM as one of the best colleges in the nation at graduating Hispanic students. Provost Mary Sue Coleman says diversity contributes to a stimulating environment where faculty and students have different points of view and experiences. The campus becomes a more intellectually alive place, she says. The efforts to build a diverse campus go hand in hand with the University's goals of achieving academic excellence and attracting the best and brightest.
MINORITY ENROLLMENT
In the fall of 1994 a total of 32 percent of the student body came from underrepresented groups. The UNM School of Law had the largest number of Native Americans enrolled in any law school in the country
- …