105,392 research outputs found
Matrix Completion with Side Information using Manifold Optimization
We solve the Matrix Completion (MC) problem via manifold optimization, incorporating the side information that the columns of the intended matrix are drawn from a union of low-dimensional subspaces. We prove that this side information allows us to construct new manifolds, as submanifolds of the manifold of fixed-rank matrices, on which the MC problem can be solved more accurately. We then present the geometrical properties of these manifolds required for matrix completion. Simulation results show that the proposed method outperforms several recent techniques, both with and without side information.
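As a point of reference for the setting, the sketch below is a minimal low-rank matrix completion baseline using plain alternating least squares on a fixed-rank factorization; it is a generic illustration with assumed sizes, not the authors' manifold-based method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative ground-truth rank-2 matrix and a ~50% observation mask.
m, n, r = 30, 20, 2
M = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
mask = rng.random((m, n)) < 0.5

# Alternating least squares on the factorization M ~ U @ V.T:
# fix one factor, solve a tiny ridge-regularized system per row.
U = rng.standard_normal((m, r))
V = rng.standard_normal((n, r))
for _ in range(50):
    for i in range(m):
        A = V[mask[i]]
        U[i] = np.linalg.solve(A.T @ A + 1e-8 * np.eye(r), A.T @ M[i, mask[i]])
    for j in range(n):
        A = U[mask[:, j]]
        V[j] = np.linalg.solve(A.T @ A + 1e-8 * np.eye(r), A.T @ M[mask[:, j], j])

M_hat = U @ V.T
rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
```

With enough observed entries relative to the rank, the relative error drops far below the observation density would suggest; manifold methods exploit the same fixed-rank geometry with stronger tools.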
Interpretable Matrix Completion: A Discrete Optimization Approach
We consider the problem of matrix completion on a partially observed matrix. We introduce the problem of Interpretable Matrix Completion, which aims to provide meaningful insights into the low-rank matrix using side information. We show that the problem can be reformulated as a binary convex optimization problem. We design OptComplete, based on a novel concept of stochastic cutting planes, to enable efficient scaling of the algorithm to large matrices. We report experiments on both synthetic and real-world datasets showing that OptComplete has favorable scaling behavior and accuracy compared with state-of-the-art methods for other types of matrix completion, while providing insight into the factors that affect the matrix.
Comment: Submitted to Operational Research
A Sparse and Low-Rank Optimization Framework for Index Coding via Riemannian Optimization
Side information plays a pivotal role in message delivery in many communication scenarios with increasingly large data sets, e.g., caching networks.
caching networks. Although index coding provides a fundamental modeling
framework to exploit the benefits of side information, the index coding problem
itself still remains open and only a few instances have been solved. In this
paper, we propose a novel sparse and low-rank optimization modeling framework
for the index coding problem to characterize the tradeoff between the amount of
side information and the achievable data rate. Specifically, sparsity of the
model measures the amount of side information, while low-rankness represents
the achievable data rate. The resulting sparse and low-rank optimization
problem has a non-convex sparsity-inducing objective and a non-convex rank constraint. To address the coupled challenges in the objective and the constraint, we
propose a novel Riemannian optimization framework by exploiting the quotient
manifold geometry of fixed-rank matrices, accompanied by a smooth sparsity-inducing surrogate. Simulation results demonstrate the appealing sparsity and
low-rankness tradeoff in the proposed model, thereby revealing the tradeoff
between the amount of side information and the achievable data rate in the
index coding problem.
Comment: Simulation code is available at https://bamdevmishra.com/indexcoding
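The "smooth sparsity inducing surrogate" idea can be illustrated generically; the pseudo-Huber function below is one standard differentiable stand-in for the elementwise L1 norm (the paper's actual surrogate may differ):

```python
import numpy as np

def smooth_l1(X, eps=1e-3):
    # Pseudo-Huber surrogate: smooth everywhere, and it approaches
    # the elementwise L1 norm ||X||_1 as eps -> 0.
    return float(np.sum(np.sqrt(X ** 2 + eps ** 2) - eps))

X = np.array([[0.0, 2.0],
              [-3.0, 0.0]])
val = smooth_l1(X)   # close to ||X||_1 = 5, but differentiable at 0
```

Because the surrogate is smooth, it composes with Riemannian gradient methods on the fixed-rank manifold, where the exact L1 norm would not.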
Fast and Sample Efficient Inductive Matrix Completion via Multi-Phase Procrustes Flow
We revisit the inductive matrix completion problem, which aims to recover a low-rank matrix given features as side prior information. The goal is to use the known features to reduce sample and computational complexities. We present and analyze a new gradient-based non-convex optimization algorithm that converges to the true underlying matrix at a linear rate, with sample complexity depending only linearly on the number of features and logarithmically on the ambient dimension. To the best of our knowledge, all previous algorithms have either a quadratic dependency on the number of features in sample complexity or a sub-linear computational convergence rate. In addition, we provide experiments on both synthetic and real-world data to demonstrate the effectiveness of the proposed algorithm.
Comment: 35 pages, 3 figures and 2 tables
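A toy sketch of why side features shrink the sample complexity: with known feature matrices A and B, each observed entry is linear in a small d1 x d2 core, so far fewer samples pin it down than the full m x n matrix would need. This is a direct least-squares illustration with assumed sizes, not the paper's Procrustes-flow algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed setup: known feature matrices A (m x d1) and B (n x d2),
# a hidden low-rank core Z*, and ~30% of entries of X = A Z* B^T observed.
m, n, d1, d2, r = 40, 30, 6, 5, 2
A = rng.standard_normal((m, d1))
B = rng.standard_normal((n, d2))
Z_star = rng.standard_normal((d1, r)) @ rng.standard_normal((r, d2))
X = A @ Z_star @ B.T
mask = rng.random((m, n)) < 0.3

# Each observed entry X[i, j] = a_i^T Z b_j is linear in the small
# d1 x d2 core Z, so the unknown count is d1*d2, not m*n: this is
# the sample-complexity saving that side features buy.
rows, cols = np.nonzero(mask)
design = np.einsum('kp,kq->kpq', A[rows], B[cols]).reshape(len(rows), -1)
z, *_ = np.linalg.lstsq(design, X[rows, cols], rcond=None)
X_hat = A @ z.reshape(d1, d2) @ B.T
rel_err = np.linalg.norm(X_hat - X) / np.linalg.norm(X)
```

Here 30 unknowns are recovered from a few hundred observations; the gradient-based algorithm in the abstract additionally exploits the low rank of the core to cut both complexities further.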
Tensor Completion Algorithms in Big Data Analytics
Tensor completion is a problem of filling the missing or unobserved entries
of partially observed tensors. Due to the multidimensional character of tensors
in describing complex datasets, tensor completion algorithms and their applications have received wide attention and achieved success in areas such as data mining, computer vision, signal processing, and neuroscience. In this survey,
we provide a modern overview of recent advances in tensor completion algorithms
from the perspective of big data analytics characterized by diverse variety,
large volume, and high velocity. We characterize these advances from four
perspectives: general tensor completion algorithms, tensor completion with
auxiliary information (variety), scalable tensor completion algorithms
(volume), and dynamic tensor completion algorithms (velocity). Further, we
identify several tensor completion applications on real-world data-driven
problems and present some common experimental frameworks popularized in the
literature. Our goal is to summarize these popular methods and introduce them
to researchers and practitioners for promoting future research and
applications. We conclude with a discussion of key challenges and promising research directions for future exploration in this community.
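As a concrete instance of the general tensor-completion problem, here is a toy rank-1 CP completion via masked alternating least squares; the sizes and the rank-1 assumption are illustrative, and this is not any specific algorithm from the survey.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative rank-1 ground truth T = a (x) b (x) c, ~40% observed.
I, J, K = 10, 12, 8
a0, b0, c0 = rng.standard_normal(I), rng.standard_normal(J), rng.standard_normal(K)
T = np.einsum('i,j,k->ijk', a0, b0, c0)
W = (rng.random((I, J, K)) < 0.4).astype(float)  # observation mask

# Masked ALS: update one CP factor at a time with a per-slice
# least-squares fit over the observed entries only.
a = rng.standard_normal(I)
b = rng.standard_normal(J)
c = rng.standard_normal(K)
for _ in range(30):
    D = np.einsum('j,k->jk', b, c)
    a = np.einsum('ijk,jk->i', W * T, D) / np.einsum('ijk,jk->i', W, D * D)
    D = np.einsum('i,k->ik', a, c)
    b = np.einsum('ijk,ik->j', W * T, D) / np.einsum('ijk,ik->j', W, D * D)
    D = np.einsum('i,j->ij', a, b)
    c = np.einsum('ijk,ij->k', W * T, D) / np.einsum('ijk,ij->k', W, D * D)

rel_err = np.linalg.norm(np.einsum('i,j,k->ijk', a, b, c) - T) / np.linalg.norm(T)
```

The scalable and dynamic algorithms the survey covers refine this same alternating scheme for higher ranks, streaming data, and distributed settings.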
Log-Normal Matrix Completion for Large Scale Link Prediction
The ubiquitous proliferation of online social networks has led to the
widescale emergence of relational graphs expressing unique patterns in link
formation and descriptive user node features. Matrix Factorization and
Completion have become popular methods for Link Prediction due to the low rank
nature of mutual node friendship information, and the availability of parallel
computer architectures for rapid matrix processing. Current Link Prediction
literature has demonstrated vast performance improvement through the
utilization of sparsity in addition to the low rank matrix assumption. However,
the majority of research has introduced sparsity through generic L1 or Frobenius norms, rather than considering the more detailed distributions that govern graph formation and relationship evolution. In particular, social networks have been found to exhibit either Pareto or, as more recently discovered, Log-Normal degree distributions. Employing the convexity-inducing Lovász Extension, we
demonstrate how incorporating specific degree distribution information can lead
to large scale improvements in Matrix Completion based Link prediction. We
introduce Log-Normal Matrix Completion (LNMC), and solve the complex
optimization problem by employing Alternating Direction Method of Multipliers.
Using data from three popular social networks, our experiments yield up to 5%
AUC increase over top-performing non-structured sparsity-based methods.
Comment: 6 pages
Sparse Group Inductive Matrix Completion
We consider the problem of matrix completion with side information
(\textit{inductive matrix completion}). In real-world applications many
side-channel features are typically non-informative, making feature selection an important part of the problem. We incorporate feature selection into inductive
matrix completion by proposing a matrix factorization framework with
group-lasso regularization on side-feature parameter matrices. We demonstrate that the theoretical sample complexity of the proposed method is much lower
compared to its competitors in sparse problems, and propose an efficient
optimization algorithm for the resulting low-rank matrix completion problem
with sparsifying regularizers. Experiments on synthetic and real-world datasets
show that the proposed approach outperforms other methods.
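Group-lasso penalties on a side-feature parameter matrix are typically handled with a block soft-thresholding proximal step, which zeroes entire rows and thereby deselects whole features at once. A minimal sketch with illustrative values (not the paper's optimization algorithm):

```python
import numpy as np

def group_soft_threshold(W, lam):
    # Proximal operator of lam * (sum of row-wise L2 norms): each row
    # is shrunk toward zero, and rows with norm below lam are zeroed,
    # which deselects the corresponding side feature entirely.
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return scale * W

# Illustrative parameter matrix: one strong feature row, one weak one.
W = np.array([[3.0, 4.0],    # norm 5.0 -> kept, shrunk by factor 0.8
              [0.3, 0.4]])   # norm 0.5 -> zeroed out
W_new = group_soft_threshold(W, lam=1.0)
```

Interleaving this prox step with gradient updates on the factorization gives a standard proximal-gradient scheme for the sparsifying regularizer.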
Sample Complexity of Power System State Estimation using Matrix Completion
In this paper, we propose an analytical framework to quantify the amount of
data samples needed to obtain accurate state estimation in a power system - a
problem known as sample complexity analysis in computer science. Motivated by
the increasing adoption of distributed energy resources into the
distribution-level grids, it becomes imperative to estimate the state of
distribution grids in order to ensure stable operation. Traditional power
system state estimation techniques mainly focus on the transmission network, which involves solving an overdetermined system and eliminating bad data.
However, distribution networks are typically underdetermined due to the large
number of connection points and high cost of pervasive installation of
measurement devices. In this paper, we consider the recently proposed
state-estimation method for underdetermined systems that is based on matrix
completion. In particular, a constrained matrix completion algorithm was
proposed, wherein the standard matrix completion problem is augmented with
additional equality constraints representing the physics (namely power-flow
constraints). We analyze the sample complexity of this general method by proving an upper bound on the sample complexity that depends directly on the properties of these constraints and can lower the number of needed samples compared to the unconstrained problem. To demonstrate the improvement that the
constraints add to distribution state estimation, we test the method on a
141-bus distribution network case study and compare it to the traditional least
squares minimization state estimation method.
Active Matrix Factorization for Surveys
Amid historically low response rates, survey researchers seek ways to reduce
respondent burden while measuring desired concepts with precision. We propose
to ask fewer questions of respondents and impute missing responses via
probabilistic matrix factorization. A variance-minimizing active learning
criterion chooses the most informative questions per respondent. In simulations
of our matrix sampling procedure on real-world surveys, as well as a Facebook
survey experiment, we find active question selection achieves efficiency gains
over baselines. The reduction in imputation error is heterogeneous across
questions, and depends on the latent concepts they capture. The imputation
procedure can benefit from incorporating respondent side information, modeling
responses as ordered logit rather than Gaussian, and accounting for order
effects. With our method, survey researchers obtain principled suggestions of
questions to retain and, if desired, can automate the design of shorter
instruments.
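The variance-minimizing selection step can be sketched for a single respondent under a Gaussian probabilistic-MF model: with fixed item factors, the posterior over the respondent's latent trait is Gaussian, and the next question asked is the one with the largest predictive variance. Everything below (sizes, noise level, the greedy loop) is an illustrative assumption, not the paper's exact criterion.

```python
import numpy as np

rng = np.random.default_rng(4)

# Assumed setup: item factors V from an already-fitted probabilistic
# matrix factorization, one respondent with a standard-normal prior.
q, r, sigma2 = 15, 3, 0.1
V = rng.standard_normal((q, r))

answered = []
for _ in range(5):
    # Gaussian posterior covariance of the respondent's latent trait
    # given the questions answered so far.
    V_obs = V[answered]
    cov = np.linalg.inv(np.eye(r) + V_obs.T @ V_obs / sigma2)
    # Predictive variance of each question; ask the most uncertain one.
    pred_var = np.einsum('ir,rs,is->i', V, cov, V)
    pred_var[answered] = -np.inf
    answered.append(int(np.argmax(pred_var)))

posterior_trace = float(np.trace(cov))
```

Each answer shrinks the posterior covariance, so the greedy rule keeps targeting whichever latent concept remains least pinned down.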
Analysis of Nuclear Norm Regularization for Full-rank Matrix Completion
In this paper, we provide a theoretical analysis of the nuclear-norm
regularized least squares for full-rank matrix completion. Although similar
formulations have been examined by previous studies, their results are
unsatisfactory because only additive upper bounds are provided. Under the
assumption that the top eigenspaces of the target matrix are incoherent, we
derive a relative upper bound for recovering the best low-rank approximation of
the unknown matrix. Our relative upper bound is tighter than previous additive
bounds of other methods if the mass of the target matrix is concentrated on its
top eigenspaces, and also implies perfect recovery if it is low-rank. The
analysis is built upon the optimality condition of the regularized formulation
and existing guarantees for low-rank matrix completion. To the best of our
knowledge, this is the first time such a relative bound has been proved for the regularized formulation of matrix completion.
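The nuclear-norm regularized least-squares formulation the analysis targets is commonly solved by proximal gradient descent, whose prox step is singular value thresholding. A small self-contained sketch with assumed sizes (a standard solver, not the paper's analysis):

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative rank-2 target with ~60% of entries observed.
m, n, r = 25, 20, 2
M = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
mask = (rng.random((m, n)) < 0.6).astype(float)

def svt(X, tau):
    # Singular value thresholding: the prox of tau * (nuclear norm).
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

# Proximal gradient on 0.5*||mask*(X - M)||_F^2 + lam*||X||_*;
# step size 1 is safe because the masked loss is 1-smooth.
lam, step = 0.1, 1.0
X = np.zeros((m, n))
for _ in range(300):
    X = svt(X - step * mask * (X - M), step * lam)

rel_err = np.linalg.norm(X - M) / np.linalg.norm(M)
```

The residual error here reflects the shrinkage bias of the regularizer, which is exactly the gap the paper's relative (rather than additive) bound quantifies.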