Fitting linear regression models can be computationally expensive in
large-scale data analysis when both the sample size and the number of
variables are large. Random projections are extensively used as a dimension
reduction tool in machine learning and statistics. We discuss applications
of random projections to linear regression that were developed to reduce
computational cost, and give an overview of theoretical guarantees on the
generalization error. It can be shown that combining random projections with
least squares regression yields recovery comparable to that of ridge
regression and principal component regression. We also discuss possible
improvements when averaging over multiple random projections, an approach that
lends itself easily to parallel implementation.
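To make the setup concrete, the following is a minimal sketch of averaged compressed least squares in Python/NumPy. It is not the paper's exact algorithm: the function name, the choice of Gaussian projections, and all parameter values are illustrative assumptions. Each pass draws a random projection S, fits ordinary least squares in the compressed m-dimensional space, maps the coefficients back to the original p dimensions, and the results are averaged over independent projections.

```python
import numpy as np

def compressed_least_squares(X, y, m, n_proj=20, seed=0):
    """Average of compressed least-squares estimates over several
    independent Gaussian random projections (illustrative sketch).

    X : (n, p) design matrix;  y : (n,) response;
    m : projected dimension (m < p);  n_proj : number of projections.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_proj):
        # Gaussian projection scaled so that E[S @ S.T] = I_p
        S = rng.standard_normal((p, m)) / np.sqrt(m)
        # Ordinary least squares in the m-dimensional compressed space
        gamma, *_ = np.linalg.lstsq(X @ S, y, rcond=None)
        # Map the compressed coefficients back to R^p and accumulate
        beta += S @ gamma
    return beta / n_proj

# Small synthetic check (all numbers are arbitrary choices): with p large
# relative to m, the averaged estimator behaves like a ridge-type fit.
rng = np.random.default_rng(1)
n, p, m = 200, 500, 50
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:10] = 1.0
y = X @ beta_true + 0.1 * rng.standard_normal(n)
beta_hat = compressed_least_squares(X, y, m)
print("prediction RMSE:", np.sqrt(np.mean((X @ beta_hat - y) ** 2)))
```

Since the projections are drawn independently, the loop parallelizes trivially: each worker can fit one compressed regression and only the back-projected coefficient vectors need to be averaged.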