Targeted matrix completion
Matrix completion is a problem that arises in many data-analysis settings
where the input consists of a partially-observed matrix (e.g., recommender
systems and traffic-matrix analysis). Classical approaches to matrix
completion assume that the input partially-observed matrix is low rank. The
success of these methods depends on the number of observed entries and the rank
of the matrix; the larger the rank, the more entries need to be observed in
order to accurately complete the matrix. In this paper, we deal with matrices
that are not necessarily low rank themselves, but rather they contain low-rank
submatrices. We propose Targeted, which is a general framework for completing
such matrices. In this framework, we first extract the low-rank submatrices and
then apply a matrix-completion algorithm to these low-rank submatrices as well
as the remainder matrix separately. Although for the completion itself we use
state-of-the-art completion methods, our results demonstrate that Targeted
achieves significantly smaller reconstruction errors than other classical
matrix-completion methods. One of the key technical contributions of the paper
lies in the identification of the low-rank submatrices from the input
partially-observed matrices.
Comment: Proceedings of the 2017 SIAM International Conference on Data Mining
(SDM)
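
The classical low-rank completion step that the abstract builds on can be illustrated with a minimal sketch. This is not the Targeted algorithm and not the paper's code: it assumes a simple iterative truncated-SVD ("hard-impute") scheme, and the function name `complete_low_rank`, the toy matrix, and the 60% sampling rate are all hypothetical choices made for illustration.

```python
import numpy as np

def complete_low_rank(observed, mask, rank, n_iters=200):
    """Illustrative hard-impute: alternate a rank-`rank` SVD truncation
    with re-imposing the observed entries (assumed scheme, not the paper's)."""
    # Start with zeros in the unobserved cells.
    filled = np.where(mask, observed, 0.0)
    for _ in range(n_iters):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        # Best rank-`rank` approximation of the current filled matrix.
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
        # Keep observed entries fixed; update only the missing ones.
        filled = np.where(mask, observed, low_rank)
    return low_rank

# Toy data (hypothetical): a 30x30 rank-2 matrix with about 60% of entries observed.
rng = np.random.default_rng(0)
truth = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 30))
mask = rng.random(truth.shape) < 0.6

estimate = complete_low_rank(truth, mask, rank=2)

# Relative reconstruction error measured on the unobserved entries only.
rel_error = np.linalg.norm((estimate - truth)[~mask]) / np.linalg.norm(truth[~mask])
```

Raising the rank of `truth` while keeping the same `mask` is one way to see the dependence described in the abstract: a higher-rank matrix generally needs more observed entries to reach the same error.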
Recommender Systems in Light of Big Data
The growth in web usage, especially of e-commerce websites, has led to the development of recommender systems (RSs), which aim to personalize web content for each user and to reduce the user's cognitive load. However, as the world enters the Big Data era and lives through the contemporary data explosion, the main goal of an RS becomes providing millions of high-quality recommendations within a few seconds for a growing number of users and items. One of the successful RS techniques is collaborative filtering (CF), which makes recommendations for users based on what other like-minded users have preferred. Despite its success, CF faces several challenges posed by Big Data, such as scalability, sparsity, and cold start. As a consequence, new CF approaches that overcome these problems, such as singular value decomposition (SVD), have been studied. This paper surveys the RS literature and reviews the current state of RSs along with the main concerns surrounding them due to Big Data. Furthermore, it thoroughly investigates SVD, one of the promising approaches expected to perform well in tackling Big Data challenges, and provides an implementation of it using successful Big Data tools (i.e., Apache Hadoop and Spark). This implementation is intended to validate the applicability of existing contributions to the field of SVD-based RSs, as well as the effectiveness of Hadoop and Spark in developing large-scale systems. The implementation was evaluated empirically by measuring mean absolute error, which gave results comparable to those of experiments conducted previously by other researchers on a relatively smaller data set in a non-distributed environment. This demonstrated the scalability of the SVD-based RS and its applicability to Big Data.
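
The SVD-based prediction and mean-absolute-error evaluation described above can be sketched on a single machine. This is only an illustration under stated assumptions: it uses NumPy rather than Hadoop or Spark, fills missing ratings with item means before the decomposition (one common choice, not confirmed by the abstract), and the names `svd_predict` and `mean_absolute_error` and the toy ratings are hypothetical.

```python
import numpy as np

def svd_predict(ratings, mask, rank):
    """Predict ratings from a rank-`rank` SVD of the item-mean-filled matrix
    (assumed preprocessing; the surveyed implementation may differ)."""
    counts = mask.sum(axis=0)
    sums = np.where(mask, ratings, 0.0).sum(axis=0)
    # Item (column) means over observed ratings; 0 for items with no ratings.
    item_means = np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)
    filled = np.where(mask, ratings, item_means)
    u, s, vt = np.linalg.svd(filled, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]

def mean_absolute_error(predicted, actual, mask):
    """Average absolute difference over the cells selected by `mask`."""
    return np.abs(predicted - actual)[mask].mean()

# Toy user-by-item ratings (hypothetical data): two groups of like-minded users.
ratings = np.array([[5, 4, 1, 1],
                    [4, 5, 1, 2],
                    [1, 1, 5, 4],
                    [2, 1, 4, 5]], dtype=float)

# Hold out two ratings for testing; train on the rest.
test_mask = np.zeros(ratings.shape, dtype=bool)
test_mask[0, 1] = True
test_mask[2, 3] = True
train_mask = ~test_mask

predicted = svd_predict(ratings, train_mask, rank=2)
mae = mean_absolute_error(predicted, ratings, test_mask)
```

In a distributed setting the dense `np.linalg.svd` call would be replaced by a scalable factorization, but the evaluation step (comparing held-out ratings with predictions by MAE) keeps the same shape.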