We present a novel technique that overcomes the limited applicability of
Principal Component Analysis to typical real-life data sets, especially
astronomical spectra. Our new approach addresses the issues of outliers,
missing information, large number of dimensions and the vast amount of data by
combining elements of robust statistics and recursive algorithms that provide
improved eigensystem estimates step-by-step. We develop a generic mechanism for
deriving reliable eigenspectra without manual data censoring, while utilising
all the information contained in the observations. We demonstrate the power of
the methodology on the VIMOS VLT Deep Survey spectra, a collection that
exhibits most of these challenges, and highlight the
improvements over previous workarounds, as well as the scalability of our
approach to collections the size of the Sloan Digital Sky Survey and beyond.

Comment: 7 pages, 3 figures, accepted to MNRA
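
As an illustration of the general idea of combining robust weighting with a recursive eigensystem update, the following minimal sketch processes one observation at a time. It is not the algorithm of this paper: the function name `streaming_robust_pca`, the forgetting factor `alpha`, the Cauchy-type weight, the fixed residual scale taken from an initial batch, and the least-squares gap filling are all assumptions chosen for brevity. A full eigendecomposition at every step is also used only for clarity and would not scale to large numbers of dimensions.

```python
import numpy as np

def streaming_robust_pca(X, n_components=1, alpha=0.99, n_init=20):
    """Hypothetical sketch: recursive PCA with down-weighted outliers.

    X may contain NaN for missing pixels (after the first n_init rows here).
    """
    X = np.asarray(X, dtype=float)

    # Initialise mean, covariance and basis from a small first batch.
    init = X[:n_init]
    mean = np.nanmean(init, axis=0)
    filled = np.where(np.isnan(init), mean, init)
    cov = np.cov(filled, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)            # ascending eigenvalues
    basis = evecs[:, ::-1][:, :n_components]      # leading eigenvectors

    # Residual scale (assumed fixed) from the initial batch.
    centred = filled - mean
    off = centred - centred @ basis @ basis.T
    scale = np.median(np.sum(off**2, axis=1)) + 1e-12

    for x in X[n_init:]:
        gaps = np.isnan(x)
        y = x - mean
        if gaps.any():
            # Fit expansion coefficients on observed pixels only,
            # then fill the gaps with the reconstruction.
            coeffs, *_ = np.linalg.lstsq(basis[~gaps], y[~gaps], rcond=None)
            y = np.where(gaps, basis @ coeffs, y)

        # Squared residual outside the current eigenspace.
        residual = y - basis @ (basis.T @ y)
        r2 = residual @ residual

        # Cauchy-type weight: near 1 for typical data, near 0 for outliers.
        w = 1.0 / (1.0 + r2 / scale)
        g = (1.0 - alpha) * w                     # effective learning rate

        # Recursive updates of the mean and covariance.
        mean = mean + g * y
        cov = (1.0 - g) * cov + g * np.outer(y, y)

        evals, evecs = np.linalg.eigh(cov)
        basis = evecs[:, ::-1][:, :n_components]

    return mean, basis, evals[::-1][:n_components]

# Synthetic demonstration: one dominant direction, plus outliers and gaps.
rng = np.random.default_rng(0)
true_dir = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)
X = 3.0 * rng.normal(size=(500, 1)) * true_dir + 0.1 * rng.normal(size=(500, 5))
outlier_rows = np.arange(30, 500, 50)
X[outlier_rows] = 50.0 * rng.normal(size=(outlier_rows.size, 5))  # gross outliers
X[40::50, 0] = np.nan                                             # missing pixels

mean, basis, evals = streaming_robust_pca(X, n_components=1)
alignment = abs(float(basis[:, 0] @ true_dir))
```

In this toy setting `alignment` measures how closely the recovered leading eigenvector matches the direction used to generate the data. The weight function and the forgetting factor are the natural places to experiment.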