Search CORE

89 research outputs found

An algorithm for learning from hints

Author: Abu-Mostafa Y. S.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/1993
Field of study

To take advantage of prior knowledge (hints) about the function one wants to learn, we introduce a method that generalizes learning from examples to learning from hints. A canonical representation of hints is defined and illustrated. All hints are represented to the learning process by examples, and examples of the function are treated on equal footing with the rest of the hints. During learning, examples from different hints are selected for processing according to a given schedule. We present two types of schedules; fixed schedules that specify the relative emphasis of each hint, and adaptive schedules that are based on how well each hint has been learned so far. Our learning method is compatible with any descent technique

Caltech Authors

Hints and the VC Dimension

Author: Abu-Mostafa Yaser S.
Publication venue: 'MIT Press - Journals'
Publication date: 01/03/1993
Field of study

Learning from hints is a generalization of learning from examples that allows for a variety of information about the unknown function to be used in the learning process. In this paper, we use the VC dimension, an established tool for analyzing learning from examples, to analyze learning from hints. In particular, we show how the VC dimension is affected by the introduction of a hint. We also derive a new quantity that defines a VC dimension for the hint itself. This quantity is used to estimate the number of examples needed to "absorb" the hint. We carry out the analysis for two types of hints, invariances and catalysts. We also describe how the same method can be applied to other types of hints

Caltech Authors

Dropout Training as Adaptive Regularization

Author: Liang Percy
Wager Stefan
Wang Sida
Publication venue
Publication date: 01/11/2013
Field of study

Dropout and other feature noising schemes control overfitting by artificially corrupting the training data. For generalized linear models, dropout performs a form of adaptive regularization. Using this viewpoint, we show that the dropout regularizer is first-order equivalent to an L2 regularizer applied after scaling the features by an estimate of the inverse diagonal Fisher information matrix. We also establish a connection to AdaGrad, an online learning algorithm, and find that a close relative of AdaGrad operates by repeatedly solving linear dropout-regularized problems. By casting dropout as regularization, we develop a natural semi-supervised algorithm that uses unlabeled data to create a better adaptive regularizer. We apply this idea to document classification tasks, and show that it consistently boosts the performance of dropout training, improving on state-of-the-art results on the IMDB reviews dataset.Comment: 11 pages. Advances in Neural Information Processing Systems (NIPS), 201

arXiv.org e-Print Archive

CiteSeerX

Adapting Visual Question Answering Models for Enhancing Multimodal Community Q&A Platforms

Author: Ma Lin
Malinowski Mateusz
Roberts Kirk
Stanley Clayton
Publication venue
Publication date: 25/05/2019
Field of study

Question categorization and expert retrieval methods have been crucial for information organization and accessibility in community question & answering (CQA) platforms. Research in this area, however, has dealt with only the text modality. With the increasing multimodal nature of web content, we focus on extending these methods for CQA questions accompanied by images. Specifically, we leverage the success of representation learning for text and images in the visual question answering (VQA) domain, and adapt the underlying concept and architecture for automated category classification and expert retrieval on image-based questions posted on Yahoo! Chiebukuro, the Japanese counterpart of Yahoo! Answers. To the best of our knowledge, this is the first work to tackle the multimodality challenge in CQA, and to adapt VQA models for tasks on a more ecologically valid source of visual questions. Our analysis of the differences between visual QA and community QA data drives our proposal of novel augmentations of an attention method tailored for CQA, and use of auxiliary tasks for learning better grounding features. Our final model markedly outperforms the text-only and VQA model baselines for both tasks of classification and expert retrieval on real-world multimodal CQA data.Comment: Submitted for review at CIKM 201

arXiv.org e-Print Archive

Crossref

A practical Bayesian framework for backpropagation networks

Author: MacKay David J. C.
Publication venue: 'MIT Press - Journals'
Publication date: 01/05/1992
Field of study

A quantitative and practical Bayesian framework is described for learning of mappings in feedforward networks. The framework makes possible (1) objective comparisons between solutions using alternative network architectures, (2) objective stopping rules for network pruning or growing procedures, (3) objective choice of magnitude and type of weight decay terms or additive regularizers (for penalizing large weights, etc.), (4) a measure of the effective number of well-determined parameters in a model, (5) quantified estimates of the error bars on network parameters and on network output, and (6) objective comparisons with alternative learning and interpolation models such as splines and radial basis functions. The Bayesian "evidence" automatically embodies "Occam's razor," penalizing overflexible and overcomplex models. The Bayesian approach helps detect poor underlying assumptions in learning models. For learning models well matched to a problem, a good correlation between generalization ability and the Bayesian evidence is obtained

Caltech Authors

Financial model calibration using consistency hints

Author: Abu-Mostafa Yaser S.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2001
Field of study

We introduce a technique for forcing the calibration of a financial model to produce valid parameters. The technique is based on learning from hints. It converts simple curve fitting into genuine calibration, where broad conclusions can be inferred from parameter values. The technique augments the error function of curve fitting with consistency hint error functions based on the Kullback-Leibler distance. We introduce an efficient EM-type optimization algorithm tailored to this technique. We also introduce other consistency hints, and balance their weights using canonical errors. We calibrate the correlated multifactor Vasicek model of interest rates, and apply it successfully to Japanese Yen swaps market and US dollar yield market

CiteSeerX

Caltech Authors