Learning from Infinite Data in Finite Time

By Pedro Domingos and Geoff Hulten

Abstract

We propose the following general method for scaling learning algorithms to arbitrarily large data sets. Consider the model M_ñ learned by the algorithm using n_i examples in step i (ñ = (n_1, ..., n_m)), and the model M_∞ that would be learned using infinite examples. Upper-bound the loss L(M_ñ, M_∞) between them as a function of ñ, and then minimize the algorithm's time complexity f(ñ) subject to the constraint that L(M_∞, M_ñ) be at most ε with probability at least 1 − δ. We apply this method to the EM algorithm for mixtures of Gaussians. Preliminary experiments on a series of large data sets provide evidence of the potential of this approach.
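A minimal sketch of the constrained-minimization step in Python may help fix ideas. It is not the paper's method: it assumes a generic Hoeffding-style bound c * sqrt(ln(2/δ_i) / (2 n_i)) on each step's contribution to the loss, and splits the budgets ε and δ evenly across the m steps via a union bound, whereas the paper derives a tighter, step-specific bound for EM on mixtures of Gaussians. The function name min_samples_per_step and the constant c are hypothetical.

import math

def min_samples_per_step(m, epsilon, delta, c=1.0):
    # Hypothetical helper (not from the paper): choose the smallest
    # per-step sample sizes n_i such that an assumed Hoeffding-style
    # bound c * sqrt(ln(2/delta_i) / (2*n_i)) on step i's contribution
    # to L(M_inf, M_n) is at most eps_i, where the loss budget epsilon
    # and the failure probability delta are split evenly across the
    # m steps (union bound over steps).
    eps_i = epsilon / m      # per-step loss budget
    delta_i = delta / m      # per-step failure probability
    # Invert c * sqrt(ln(2/delta_i) / (2*n)) <= eps_i for n:
    n = math.ceil(c ** 2 * math.log(2.0 / delta_i) / (2.0 * eps_i ** 2))
    return [n] * m

# Example: 10 EM-style steps, total loss budget 0.1, failure prob. 0.05.
print(min_samples_per_step(m=10, epsilon=0.1, delta=0.05))

Because the assumed bound is decreasing in n_i and the running time is increasing in each n_i, the minimum of f(ñ) is attained at the smallest n_i satisfying each per-step constraint, which the closed-form inversion returns; with a sharper, step-dependent bound the same recipe would yield non-uniform n_i.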

Year: 2001
OAI identifier: oai:CiteSeerX.psu:10.1.1.19.8856
Provided by: CiteSeerX
Download PDF:
Sorry, we are unable to provide the full text, but you may find it at the following location(s):
  • http://citeseerx.ist.psu.edu/v...
  • http://www-2.cs.cmu.edu/Groups...