
    Hints

    The systematic use of hints in the learning-from-examples paradigm is the subject of this review. Hints are the properties of the target function that are known to us independently of the training examples. The use of hints is tantamount to combining rules and data in learning, and is compatible with different learning models, optimization techniques, and regularization techniques. The hints are represented to the learning process by virtual examples, and the training examples of the target function are treated on equal footing with the rest of the hints. A balance is achieved between the information provided by the different hints through the choice of objective functions and learning schedules. The Adaptive Minimization algorithm achieves this balance by relating the performance on each hint to the overall performance. The application of hints in forecasting the very noisy foreign-exchange markets is illustrated. On the theoretical side, the information value of hints is contrasted to the complexity value and related to the VC dimension.
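
    Below is a minimal sketch, in Python, of the virtual-example idea described above. The helper names (invariance_virtual_examples, hint_error, example_error) are illustrative assumptions rather than the paper's notation; the point is only that an invariance hint becomes pairs of inputs whose outputs must agree, and that the ordinary training set contributes an error term of the same form as any other hint.

    import numpy as np

    # Illustrative sketch: hints represented by virtual examples.
    # "model" is any callable mapping a batch of inputs to outputs.

    def invariance_virtual_examples(x_batch, transform):
        # For an invariance hint f(x) = f(x'), pair each input with its
        # transformed version; the hint is satisfied when the outputs agree.
        return x_batch, transform(x_batch)

    def hint_error(model, x, x_prime):
        # Disagreement of the model on the paired virtual examples.
        return np.mean((model(x) - model(x_prime)) ** 2)

    def example_error(model, x, y):
        # Ordinary training error; the training set is treated as one more hint.
        return np.mean((model(x) - y) ** 2)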

    Financial model calibration using consistency hints

    We introduce a technique for forcing the calibration of a financial model to produce valid parameters. The technique is based on learning from hints. It converts simple curve fitting into genuine calibration, where broad conclusions can be inferred from parameter values. The technique augments the error function of curve fitting with consistency hint error functions based on the Kullback-Leibler distance. We introduce an efficient EM-type optimization algorithm tailored to this technique. We also introduce other consistency hints, and balance their weights using canonical errors. We calibrate the correlated multifactor Vasicek model of interest rates, and apply it successfully to the Japanese Yen swaps market and the US dollar yield market.
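
    A brief sketch of the augmented objective described above, in Python. The function names and the choice of a single scalar weight are assumptions for illustration; the EM-type optimization and the canonical-error weighting mentioned in the abstract are not reproduced here.

    import numpy as np

    def kl_divergence(p, q, eps=1e-12):
        # Kullback-Leibler distance between two discretized distributions.
        p = np.asarray(p, dtype=float) + eps
        q = np.asarray(q, dtype=float) + eps
        p, q = p / p.sum(), q / q.sum()
        return float(np.sum(p * np.log(p / q)))

    def augmented_error(fit_error, implied_dist, assumed_dist, weight=1.0):
        # Curve-fitting error plus a consistency-hint penalty that grows when
        # the distribution implied by the fitted parameters disagrees with the
        # distribution the model itself assumes.
        return fit_error + weight * kl_divergence(implied_dist, assumed_dist)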

    Hints and the VC Dimension

    Learning from hints is a generalization of learning from examples that allows for a variety of information about the unknown function to be used in the learning process. In this paper, we use the VC dimension, an established tool for analyzing learning from examples, to analyze learning from hints. In particular, we show how the VC dimension is affected by the introduction of a hint. We also derive a new quantity that defines a VC dimension for the hint itself. This quantity is used to estimate the number of examples needed to "absorb" the hint. We carry out the analysis for two types of hints, invariances and catalysts. We also describe how the same method can be applied to other types of hints.

    The Vapnik-Chervonenkis Dimension: Information versus Complexity in Learning

    When feasible, learning is a very attractive alternative to explicit programming. This is particularly true in areas where the problems do not lend themselves to systematic programming, such as pattern recognition in natural environments. The feasibility of learning an unknown function from examples depends on two questions: 1. Do the examples convey enough information to determine the function? 2. Is there a speedy way of constructing the function from the examples? These questions contrast the roles of information and complexity in learning. While the two roles share some ground, they are conceptually and technically different. In the common language of learning, the information question is that of generalization and the complexity question is that of scaling. The work of Vapnik and Chervonenkis (1971) provides the key tools for dealing with the information issue. In this review, we develop the main ideas of this framework and discuss how complexity fits in.
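
    For reference, a representative form of the bound this framework provides (constants differ across statements, and this is not necessarily the exact expression used in the review): with probability at least 1 - \delta over N training examples,

    E_{\mathrm{out}}(h) \le E_{\mathrm{in}}(h)
      + \sqrt{\tfrac{8}{N}\,\ln\!\left(\tfrac{4\,m_{\mathcal{H}}(2N)}{\delta}\right)},
    \qquad
    m_{\mathcal{H}}(N) \le \sum_{i=0}^{d_{\mathrm{VC}}} \binom{N}{i}

    Here m_{\mathcal{H}} is the growth function and d_{\mathrm{VC}} the Vapnik-Chervonenkis dimension of the hypothesis set; a finite d_{\mathrm{VC}} is what allows generalization from a finite sample, separately from the computational question of actually finding a good hypothesis.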

    Information theory, complexity and neural networks

    Some of the main results in the mathematical evaluation of neural networks as information processing systems are discussed. The basic operation of feedback and feed-forward neural networks is described. Their memory capacity and computing power are considered. The concept of learning by example as it applies to neural networks is examined.

    An algorithm for learning from hints

    To take advantage of prior knowledge (hints) about the function one wants to learn, we introduce a method that generalizes learning from examples to learning from hints. A canonical representation of hints is defined and illustrated. All hints are represented to the learning process by examples, and examples of the function are treated on equal footing with the rest of the hints. During learning, examples from different hints are selected for processing according to a given schedule. We present two types of schedules: fixed schedules, which specify the relative emphasis of each hint, and adaptive schedules, which are based on how well each hint has been learned so far. Our learning method is compatible with any descent technique.
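
    A minimal sketch of the two schedule types, under assumed interfaces (the batch selection and descent details from the paper are not reproduced): at each step a hint is chosen, and a batch of its examples is handed to whatever descent technique is in use.

    import numpy as np

    def fixed_schedule(emphasis, rng):
        # Fixed schedule: pick hint i with probability proportional to a
        # user-chosen emphasis weight.
        p = np.asarray(emphasis, dtype=float)
        return int(rng.choice(len(p), p=p / p.sum()))

    def adaptive_schedule(estimated_errors):
        # Adaptive schedule: pick the hint that is currently worst learned,
        # i.e. the one with the largest estimated error.
        return int(np.argmax(estimated_errors))

    # e.g. rng = np.random.default_rng(0); fixed_schedule([2.0, 1.0, 1.0], rng)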

    The capacity of multilevel threshold functions

    Lower and upper bounds for the capacity of multilevel threshold elements are estimated, using two essentially different enumeration techniques. It is demonstrated that the exact number of multilevel threshold functions depends strongly on the relative topology of the input set. The results correct a previously published estimate and indicate that adding threshold levels enhances the capacity more than adding variables.
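
    As a point of reference, one common way to formalize a multilevel threshold element (an assumption for illustration; the paper's exact definition and its counting arguments are not reproduced here) is to quantize the weighted sum of the inputs against k ordered thresholds:

    import numpy as np

    def multilevel_threshold(x, w, thresholds):
        # Output is the index of the interval that the weighted sum w.x falls
        # into, giving len(thresholds) + 1 output levels; a single threshold
        # recovers the ordinary two-level threshold function.
        s = float(np.dot(w, x))
        return int(np.searchsorted(np.sort(thresholds), s))

    # Example: two thresholds give a 3-level function of the weighted sum.
    # multilevel_threshold([1, 0, 1], [0.5, -0.2, 0.8], [0.0, 1.0])  -> 2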

    An analog feedback associative memory

    A method for the storage of analog vectors, i.e., vectors whose components are real-valued, is developed for the Hopfield continuous-time network. An important requirement is that each memory vector has to be an asymptotically stable (i.e., attractive) equilibrium of the network. Some of the limitations imposed by the continuous Hopfield model on the set of vectors that can be stored are pointed out. These limitations can be relieved by choosing a network containing visible as well as hidden units. An architecture consisting of several hidden layers and a visible layer, connected in a circular fashion, is considered. It is proved that the two-layer case is guaranteed to store any set of given analog vectors, provided their number does not exceed one plus the number of neurons in the hidden layer. A learning algorithm that correctly adjusts the locations of the equilibria and guarantees their asymptotic stability is developed. Simulation results confirm the effectiveness of the approach.
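
    A short sketch of the continuous-time Hopfield dynamics the abstract builds on, integrated with Euler steps (an illustration only, not the paper's learning algorithm; in the visible/hidden circular architecture the weight matrix would contain only layer-to-layer, off-diagonal blocks).

    import numpy as np

    def hopfield_step(u, W, bias, dt=0.01, tau=1.0):
        # One Euler step of du/dt = -u/tau + W @ tanh(u) + bias.
        return u + dt * (-u / tau + W @ np.tanh(u) + bias)

    def settle(u0, W, bias, steps=5000):
        # Iterate toward an equilibrium; the stored analog vector is read off
        # the visible components of tanh(u).
        u = np.array(u0, dtype=float)
        for _ in range(steps):
            u = hopfield_step(u, W, bias)
        return np.tanh(u)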

    Data complexity in machine learning

    We investigate the role of data complexity in the context of binary classification problems. The universal data complexity is defined for a data set as the Kolmogorov complexity of the mapping enforced by the data set. It is closely related to several existing principles used in machine learning such as Occam's razor, the minimum description length, and the Bayesian approach. The data complexity can also be defined based on a learning model, which is more realistic for applications. We demonstrate the application of the data complexity in two learning problems, data decomposition and data pruning. In data decomposition, we illustrate that a data set is best approximated by its principal subsets, which are Pareto optimal with respect to the complexity and the set size. In data pruning, we show that outliers usually have high complexity contributions, and propose methods for estimating the complexity contribution. Since in practice we have to approximate the ideal data complexity measures, we also discuss the impact of such approximations.
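
    A rough sketch of a model-based proxy for the complexity contribution of a single point, in the spirit of the data-pruning discussion above (an illustrative approximation under assumed interfaces; the ideal Kolmogorov-based measure is uncomputable, and the paper's own estimators are not reproduced here).

    import numpy as np

    def complexity_contributions(fit_cost, X, y):
        # fit_cost(X, y) returns a scalar cost (e.g. training error plus a
        # model-size penalty) of the best model found for the NumPy arrays
        # (X, y). A point whose removal makes the remaining data much cheaper
        # to fit gets a large contribution and is a candidate outlier to prune.
        full = fit_cost(X, y)
        contrib = np.empty(len(y))
        for i in range(len(y)):
            keep = np.arange(len(y)) != i
            contrib[i] = full - fit_cost(X[keep], y[keep])
        return contrib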