7,013 research outputs found
Expectile Matrix Factorization for Skewed Data Analysis
Matrix factorization is a popular approach to solving matrix estimation
problems based on partial observations. Existing matrix factorization is based
on least squares and aims to yield a low-rank matrix to interpret the
conditional sample means given the observations. However, in many real
applications with skewed and extreme data, least squares cannot explain their
central tendency or tail distributions, yielding undesired estimates. In this
paper, we propose \emph{expectile matrix factorization} by introducing
asymmetric least squares, a key concept in expectile regression analysis, into
the matrix factorization framework. We propose an efficient algorithm to solve
the new problem based on alternating minimization and quadratic programming. We
prove that our algorithm converges to a global optimum and exactly recovers the
true underlying low-rank matrices when noise is zero. For synthetic data with
skewed noise and a real-world dataset containing web service response times,
the proposed scheme achieves lower recovery errors than the existing matrix
factorization method based on least squares in a wide range of settings.Comment: 8 page main text with 5 page supplementary documents, published in
AAAI 201
To Index or Not to Index: Optimizing Exact Maximum Inner Product Search
Exact Maximum Inner Product Search (MIPS) is an important task that is widely
pertinent to recommender systems and high-dimensional similarity search. The
brute-force approach to solving exact MIPS is computationally expensive, thus
spurring recent development of novel indexes and pruning techniques for this
task. In this paper, we show that a hardware-efficient brute-force approach,
blocked matrix multiply (BMM), can outperform the state-of-the-art MIPS solvers
by over an order of magnitude, for some -- but not all -- inputs.
In this paper, we also present a novel MIPS solution, MAXIMUS, that takes
advantage of hardware efficiency and pruning of the search space. Like BMM,
MAXIMUS is faster than other solvers by up to an order of magnitude, but again
only for some inputs. Since no single solution offers the best runtime
performance for all inputs, we introduce a new data-dependent optimizer,
OPTIMUS, that selects online with minimal overhead the best MIPS solver for a
given input. Together, OPTIMUS and MAXIMUS outperform state-of-the-art MIPS
solvers by 3.2 on average, and up to 10.9, on widely studied
MIPS datasets.Comment: 12 pages, 8 figures, 2 table
- …