Learning optimal sampling policies for sketching of huge data matrices

Abstract

This study presents two methods, TSGPR-SketchyCoreSVD and SAC-SketchyCoreSVD, that improve how data fibers are subsampled when building random sketches and a low-rank SVD of a data matrix, by formulating the subsampling as a sequential decision-making problem. An agent progressively decides which data fibers to subsample next so as to maximize the accuracy of the low-rank SVD under limited computational resources. Both methods learn the optimal policy from the partially observed data matrix formed by the fibers subsampled so far. TSGPR-SketchyCoreSVD uses Thompson sampling with Gaussian process regression, and SAC-SketchyCoreSVD uses Soft Actor-Critic. Experiments show that TSGPR-SketchyCoreSVD actively learns the subsampling policy and achieves higher accuracy than the original SketchyCoreSVD. SAC-SketchyCoreSVD is still under development, but its intermediate results are promising. The approach can be readily extended to higher-order data tensors.

Computational Science, Engineering, and Mathematics
