Multiple pass streaming algorithms for learning mixtures of distributions in R^d
We present a multiple pass streaming algorithm
for learning the density function of
a mixture of uniform distributions
over rectangles (cells) in R^d, for any dimension d.
Our learning model is: samples drawn according to
the mixture are placed in arbitrary order in a
data stream that may only be accessed sequentially by an
algorithm with a very limited random access memory space.
Our algorithm makes passes, for any , and
requires memory at most .
This exhibits a
strong memory-space tradeoff: a few more passes significantly
lower its memory requirements, thus trading one of the two most important
resources in streaming computation for the other.
Chang and Kannan \cite{chang06}
first considered this problem for .
Our learning algorithm is
especially appropriate for situations where massive data sets of
samples are available, but practical computation with such
large inputs requires very restricted models of computation.
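
The learning model described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it only generates samples from a mixture of uniform distributions over axis-aligned rectangles, places them in arbitrary order, and shows what one sequential pass with constant working memory looks like. The names `sample_mixture` and `count_in_cell`, the example rectangles, and the mixing weights are all hypothetical and chosen for illustration.

```python
import random


def sample_mixture(weights, rects, n, seed=0):
    """Draw n points from a mixture of uniform distributions over rectangles.

    Each rectangle is a pair (lo, hi) of corner tuples in R^d (an assumed
    representation, not taken from the paper).
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        # Pick a mixture component according to the mixing weights.
        lo, hi = rng.choices(rects, weights=weights, k=1)[0]
        # Draw a point uniformly from the chosen rectangle.
        points.append(tuple(rng.uniform(a, b) for a, b in zip(lo, hi)))
    # The stream order is arbitrary; a shuffle stands in for that here.
    rng.shuffle(points)
    return points


def count_in_cell(stream, lo, hi):
    """One sequential pass over the stream, keeping only a single counter."""
    count = 0
    for x in stream:
        if all(a <= xi <= b for xi, a, b in zip(x, lo, hi)):
            count += 1
    return count


if __name__ == "__main__":
    rects = [((0.0, 0.0), (1.0, 1.0)), ((2.0, 2.0), (3.0, 3.0))]
    stream = sample_mixture([0.3, 0.7], rects, n=1000, seed=1)
    # Each call below is one full pass over the data.
    for lo, hi in rects:
        print(lo, hi, count_in_cell(stream, lo, hi) / len(stream))
```

Each call to `count_in_cell` reads the data once, front to back, so dividing the count by the stream length gives an empirical estimate of the probability mass in that cell. A multiple pass algorithm in this model may repeat such scans, using what it learned in earlier passes to decide which cells to examine next.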