
5 research outputs found

Practical algorithms for on-line sampling

Author: WATANABE OSAMU (渡辺治)
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/05/2008

Institutional Repositories DataBase (IRDB)

Practical Algorithms for On-line Sampling

Authors: Carlos Domingo, Osamu Watanabe, Ricard Gavaldà
Publication date: 01/01/1998

One of the core applications of machine learning to knowledge discovery consists of building a function (a hypothesis), for instance a decision tree or a neural network, from a given amount of data, so that it can be used afterwards to predict new instances of the data. In this paper, we focus on a particular situation where we assume that the hypothesis we want to use for prediction is very simple, and thus the hypothesis class is of feasible size. We study the problem of determining which of the hypotheses in the class is almost the best one. We present two online sampling algorithms for selecting hypotheses, give theoretical bounds on the number of examples needed, and analyze them experimentally. We compare them with the commonly used simple batch sampling approach and show that in most situations our algorithms use far fewer examples. 1 Introduction and Motivation The ubiquity of computers in business and commerce has led to generation of huge quantitie..
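The contrast drawn in the abstract between batch sampling and sampling one example at a time can be sketched as follows. This is only an illustration under stated assumptions, not the two algorithms of the paper: `batch_sample_size` uses the standard Hoeffding bound with a union bound over the hypothesis class, and `adaptive_select` is a generic sequential stopping rule. The function names, the `draw_example` callback, and the way the confidence parameter is split over steps are all hypothetical choices made for this sketch.

```python
import math


def batch_sample_size(n_hypotheses, epsilon, delta):
    """Examples needed so that every empirical accuracy is within epsilon/2
    of its true value with probability >= 1 - delta.

    Standard Hoeffding bound plus a union bound over the class; the empirical
    best hypothesis is then within epsilon of the true best.
    """
    return math.ceil(2.0 / epsilon**2 * math.log(2.0 * n_hypotheses / delta))


def adaptive_select(draw_example, hypotheses, epsilon, delta):
    """Generic sequential rule (an assumption, not the paper's method):
    draw examples one at a time and stop as soon as the empirical leader is
    within epsilon of every other hypothesis with confidence 1 - delta.

    draw_example() returns a pair (x, y); each hypothesis maps x to a label.
    Returns (index of the chosen hypothesis, number of examples used).
    """
    n = len(hypotheses)
    correct = [0] * n
    t = 0
    while True:
        t += 1
        x, y = draw_example()
        for i, h in enumerate(hypotheses):
            correct[i] += (h(x) == y)
        # Hoeffding radius at step t. The failure probability is split over
        # hypotheses and steps; the sum over t of 1 / (t * (t + 1)) equals 1.
        radius = math.sqrt(math.log(2.0 * n * t * (t + 1) / delta) / (2.0 * t))
        acc = [c / t for c in correct]
        leader = max(range(n), key=acc.__getitem__)
        runner_up = max((acc[i] for i in range(n) if i != leader), default=0.0)
        # Leader's lower bound vs. best competitor's upper bound, minus epsilon.
        if acc[leader] - radius >= runner_up + radius - epsilon:
            return leader, t
```

When one hypothesis is clearly better than the rest, the sequential rule can stop well before the batch sample size is reached; when all hypotheses are close, it stops once the radius has shrunk to about epsilon/2, which needs more examples than the batch bound because of the extra split of delta over steps.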

arXiv.org e-Print Archive

CiteSeerX

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Practical Algorithms for On-Line Sampling

Authors: D. Haussler, L. Breiman, R. C. Holte, S.M. Weiss, Y. Freund
Publication venue: Springer Science and Business Media LLC

Crossref

Practical algorithms for on-line sampling

Authors: Domingo Soriano Carlos, Gavaldà Mestre Ricard, Watanabe O

One of the core applications of machine learning to knowledge discovery consists of building a function (a hypothesis), for instance a decision tree or a neural network, from a given amount of data, so that it can be used afterwards to predict new instances of the data. In this paper, we focus on a particular situation where we assume that the hypothesis we want to use for prediction is very simple, and thus the hypothesis class is of feasible size. We study the problem of determining which of the hypotheses in the class is almost the best one. We present two on-line sampling algorithms for selecting hypotheses, give theoretical bounds on the number of examples needed, and analyze them experimentally. We compare them with the commonly used simple batch sampling approach and show that in most situations our algorithms use far fewer examples.

RECERCAT