Random projections: data perturbation for classification problems
Random projections offer an appealing and flexible approach to a wide range
of large-scale statistical problems. They are particularly useful in
high-dimensional settings, where we have many covariates recorded for each
observation. In classification problems there are two general techniques using
random projections. The first involves many projections in an ensemble -- the
idea here is to aggregate the results after applying different random
projections, with the aim of achieving superior statistical accuracy. The
second class of methods includes hashing and sketching techniques, which are
straightforward ways to reduce the complexity of a problem, potentially
with a large computational saving, while approximately preserving the
statistical efficiency.
Comment: 24 pages, 4 figures
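
The ensemble idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the Gaussian projection matrices, the nearest-centroid base classifier, the majority vote, and all names (`random_projection_ensemble`, `d`, `n_projections`) are assumptions chosen only to keep the example short.

```python
import numpy as np


def fit_centroids(X, y):
    """Fit a nearest-centroid classifier: one mean vector per class."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids


def predict_centroids(model, X):
    """Assign each row of X to the class with the closest centroid."""
    classes, centroids = model
    # Squared Euclidean distance from every point to every centroid.
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]


def random_projection_ensemble(X_train, y_train, X_test,
                               d=5, n_projections=50, seed=0):
    """Majority vote over base classifiers, each fitted on a different
    random d-dimensional projection of the data (illustrative only)."""
    rng = np.random.default_rng(seed)
    p = X_train.shape[1]
    classes = np.unique(y_train)
    votes = np.zeros((X_test.shape[0], classes.size), dtype=int)
    for _ in range(n_projections):
        # Gaussian random projection from p dimensions down to d (assumed choice).
        A = rng.standard_normal((p, d)) / np.sqrt(d)
        model = fit_centroids(X_train @ A, y_train)
        pred = predict_centroids(model, X_test @ A)
        # Add one vote per test point for the class this projection predicts.
        votes += pred[:, None] == classes[None, :]
    return classes[votes.argmax(axis=1)]
```

Setting `n_projections=1` reduces the sketch to a single projection followed by one classifier, which is closer in spirit to the second, sketching-style class of methods: the classifier only ever sees `d` columns rather than `p`.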