
    Matrix Completion with Side Information using Manifold Optimization

    We solve the Matrix Completion (MC) problem based on manifold optimization by incorporating side information under which the columns of the intended matrix are drawn from a union of low-dimensional subspaces. We prove that this side information allows us to construct new manifolds, as embedded submanifolds of the manifold of constant-rank matrices, on which the MC problem can be solved more accurately. The geometrical properties of these manifolds required for matrix completion are then presented. Simulation results show that the proposed method outperforms several recent techniques, both those that use side information and those that do not.
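
    The union-of-subspaces side information can be illustrated without any manifold machinery: if a basis for each candidate subspace is known, every column can be filled in by ordinary least squares on its observed entries, assigning the column to whichever subspace fits those entries best. The sketch below (plain NumPy, with a hypothetical complete_with_subspaces helper) is only a baseline illustration of how this side information constrains the problem, not the Riemannian algorithm proposed in the paper.

```python
import numpy as np

def complete_with_subspaces(M_obs, mask, subspaces):
    """Fill the missing entries of M_obs (NaN where unobserved), assuming each
    column lies in one of the given subspaces (list of orthonormal bases)."""
    d, n = M_obs.shape
    M_hat = np.zeros((d, n))
    for j in range(n):
        obs = mask[:, j]                       # observed rows of column j
        best_err, best_col = np.inf, None
        for U in subspaces:                    # try every candidate subspace
            # least-squares coefficients fitted on the observed entries only
            coef, *_ = np.linalg.lstsq(U[obs], M_obs[obs, j], rcond=None)
            col = U @ coef
            err = np.linalg.norm(col[obs] - M_obs[obs, j])
            if err < best_err:
                best_err, best_col = err, col
        M_hat[:, j] = best_col
    return M_hat

# toy example: two 2-dimensional subspaces in R^20, 40 columns, ~50% observed
rng = np.random.default_rng(0)
d, n, r = 20, 40, 2
bases = [np.linalg.qr(rng.standard_normal((d, r)))[0] for _ in range(2)]
cols = np.hstack([B @ rng.standard_normal((r, n // 2)) for B in bases])
mask = rng.random((d, n)) < 0.5
M_hat = complete_with_subspaces(np.where(mask, cols, np.nan), mask, bases)
print(np.linalg.norm(M_hat - cols))  # should be near zero
```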

    Interpretable Matrix Completion: A Discrete Optimization Approach

    We consider the problem of matrix completion for an n × m matrix. We introduce the problem of Interpretable Matrix Completion, which aims to provide meaningful insights into the low-rank matrix using side information. We show that the problem can be reformulated as a binary convex optimization problem. We design OptComplete, based on a novel concept of stochastic cutting planes, to enable efficient scaling of the algorithm up to matrices of size n = 10^6 and m = 10^6. We report experiments on both synthetic and real-world datasets showing that OptComplete has favorable scaling behavior and accuracy compared with state-of-the-art methods for other types of matrix completion, while providing insight into the factors that affect the matrix. Comment: Submitted to Operational Research
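
    The discrete flavor of the formulation can be seen at toy scale: with row features X and column features Y, interpretability amounts to choosing a small subset of features, which is a binary decision per feature. The brute-force sketch below (hypothetical names fit_B and brute_force_interpretable_mc, plain NumPy) enumerates the subsets that stochastic cutting planes are designed to search efficiently; it is an illustration of the discrete problem, not of OptComplete itself.

```python
import itertools
import numpy as np

def fit_B(X, Y, obs_idx, vals):
    """Least-squares fit of B in M[i, j] ≈ X[i] @ B @ Y[j] on observed entries."""
    rows = np.stack([np.kron(Y[j], X[i]) for i, j in obs_idx])
    b, *_ = np.linalg.lstsq(rows, vals, rcond=None)
    return b.reshape(Y.shape[1], X.shape[1]).T

def brute_force_interpretable_mc(M_obs, mask, X, Y, k):
    """Try every subset of k row-features and keep the best-fitting one."""
    obs_idx = list(zip(*np.nonzero(mask)))
    vals = np.array([M_obs[i, j] for i, j in obs_idx])
    best = (np.inf, None, None)
    for S in itertools.combinations(range(X.shape[1]), k):
        B = fit_B(X[:, S], Y, obs_idx, vals)
        M_hat = X[:, S] @ B @ Y.T
        err = np.linalg.norm((M_hat - M_obs)[mask])
        if err < best[0]:
            best = (err, S, M_hat)
    return best  # (training error, chosen feature subset, completed matrix)

# toy usage: a 15x12 matrix generated from 2 of 6 row-features
rng = np.random.default_rng(1)
X, Y = rng.standard_normal((15, 6)), rng.standard_normal((12, 3))
true_B = np.zeros((6, 3))
true_B[[1, 4]] = rng.standard_normal((2, 3))
M = X @ true_B @ Y.T
mask = rng.random(M.shape) < 0.6
err, S, M_hat = brute_force_interpretable_mc(np.where(mask, M, np.nan), mask, X, Y, 2)
print(S)  # ideally recovers the informative features (1, 4)
```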

    A Sparse and Low-Rank Optimization Framework for Index Coding via Riemannian Optimization

    Side information plays a pivotal role in message delivery in many communication scenarios that must accommodate increasingly large data sets, e.g., caching networks. Although index coding provides a fundamental modeling framework for exploiting the benefits of side information, the index coding problem itself remains open and only a few instances have been solved. In this paper, we propose a novel sparse and low-rank optimization modeling framework for the index coding problem to characterize the tradeoff between the amount of side information and the achievable data rate. Specifically, the sparsity of the model measures the amount of side information, while the low-rankness represents the achievable data rate. The resulting sparse and low-rank optimization problem has a non-convex sparsity-inducing objective and a non-convex rank constraint. To address the coupled challenges in the objective and the constraint, we propose a novel Riemannian optimization framework that exploits the quotient manifold geometry of fixed-rank matrices, accompanied by a smooth sparsity-inducing surrogate. Simulation results demonstrate the appealing sparsity and low-rankness tradeoff in the proposed model, thereby revealing the tradeoff between the amount of side information and the achievable data rate in the index coding problem. Comment: Simulation code is available at https://bamdevmishra.com/indexcoding
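
    The sparsity/low-rank interplay can be sketched with a far cruder tool than Riemannian optimization: alternate between a truncated-SVD projection onto rank-r matrices and a projection onto matrices whose off-diagonal support respects the side-information pattern (with unit diagonal). The snippet below, with a hypothetical index_coding_heuristic function and an assumed side-information matrix, is only this alternating-projection heuristic; it is not the quotient-manifold algorithm from the paper.

```python
import numpy as np

def index_coding_heuristic(side_info, r, iters=500, seed=0):
    """Alternating projections: look for a rank-r matrix X with unit diagonal
    whose off-diagonal support lies inside the side-information pattern
    (X[i, j] may be nonzero only if receiver i already knows message j)."""
    n = side_info.shape[0]
    allowed = side_info.astype(bool) | np.eye(n, dtype=bool)
    rng = np.random.default_rng(seed)
    X = np.where(allowed, rng.standard_normal((n, n)), 0.0)
    np.fill_diagonal(X, 1.0)
    res = np.inf
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        X_lr = (U[:, :r] * s[:r]) @ Vt[:r]    # project onto rank-r matrices
        X = np.where(allowed, X_lr, 0.0)      # re-impose the support pattern
        np.fill_diagonal(X, 1.0)
        res = np.linalg.norm(X - X_lr)        # gap between the two projections
    return X, res

# 5 receivers on a cycle: receiver i already knows messages i-1 and i+1 (mod 5)
side_info = np.array([[0, 1, 0, 0, 1],
                      [1, 0, 1, 0, 0],
                      [0, 1, 0, 1, 0],
                      [0, 0, 1, 0, 1],
                      [1, 0, 0, 1, 0]])
X, res = index_coding_heuristic(side_info, r=3)
print(res)  # a small residual means a rank-3 fit of the pattern was found
```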

    Fast and Sample Efficient Inductive Matrix Completion via Multi-Phase Procrustes Flow

    We revisit the inductive matrix completion problem, which aims to recover a rank-r matrix of ambient dimension d given n features as side prior information. The goal is to use the n known features to reduce sample and computational complexity. We present and analyze a new gradient-based non-convex optimization algorithm that converges to the true underlying matrix at a linear rate, with sample complexity depending only linearly on n and logarithmically on d. To the best of our knowledge, all previous algorithms either have a quadratic dependency on the number of features in their sample complexity or a sub-linear computational convergence rate. In addition, we provide experiments on both synthetic and real-world data to demonstrate the effectiveness of the proposed algorithm. Comment: 35 pages, 3 figures and 2 tables
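
    For intuition, the inductive model writes the target as X U V^T Y^T with known feature matrices X and Y, so only the small factors U and V have to be learned from the observed entries. The sketch below is a single-phase plain gradient descent on that factored objective (hypothetical name inductive_mc_gd); the paper's multi-phase Procrustes-flow algorithm and its initialization are not reproduced here.

```python
import numpy as np

def inductive_mc_gd(M_obs, mask, X, Y, r, lr=0.01, iters=3000, seed=0):
    """Gradient descent on 0.5*||P_Omega(X U V^T Y^T - M)||_F^2 over U and V."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((X.shape[1], r))
    V = 0.1 * rng.standard_normal((Y.shape[1], r))
    M_filled = np.where(mask, M_obs, 0.0)
    for _ in range(iters):
        R = mask * (X @ U @ V.T @ Y.T - M_filled)      # residual on observed entries
        gU, gV = X.T @ R @ Y @ V, Y.T @ R.T @ X @ U    # gradients w.r.t. U and V
        U, V = U - lr * gU, V - lr * gV
    return X @ U @ V.T @ Y.T

# toy check: 60x50 matrix generated from 8 row-features and 6 column-features
rng = np.random.default_rng(1)
X = np.linalg.qr(rng.standard_normal((60, 8)))[0]
Y = np.linalg.qr(rng.standard_normal((50, 6)))[0]
M = X @ rng.standard_normal((8, 2)) @ rng.standard_normal((2, 6)) @ Y.T
mask = rng.random(M.shape) < 0.5
M_hat = inductive_mc_gd(np.where(mask, M, np.nan), mask, X, Y, r=2)
print(np.linalg.norm(M_hat - M) / np.linalg.norm(M))  # should be small if recovery worked
```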

    Tensor Completion Algorithms in Big Data Analytics

    Tensor completion is the problem of filling in the missing or unobserved entries of partially observed tensors. Because tensors naturally describe multidimensional, complex datasets, tensor completion algorithms and their applications have received wide attention and achieved success in areas such as data mining, computer vision, signal processing, and neuroscience. In this survey, we provide a modern overview of recent advances in tensor completion algorithms from the perspective of big data analytics, characterized by diverse variety, large volume, and high velocity. We characterize these advances from four perspectives: general tensor completion algorithms, tensor completion with auxiliary information (variety), scalable tensor completion algorithms (volume), and dynamic tensor completion algorithms (velocity). Further, we identify several tensor completion applications to real-world data-driven problems and present some common experimental frameworks popularized in the literature. Our goal is to summarize these popular methods and introduce them to researchers and practitioners in order to promote future research and applications. We conclude with a discussion of key challenges and promising research directions for future exploration.
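
    As a minimal, concrete member of the "general tensor completion" family discussed in the survey, the sketch below fits a low-rank CP (CANDECOMP/PARAFAC) model to the observed entries of a third-order tensor by plain gradient descent. The function name cp_completion and all parameter choices are illustrative and not taken from the survey.

```python
import numpy as np

def cp_completion(T_obs, mask, rank, lr=0.01, iters=3000, seed=0):
    """Fit rank-`rank` CP factors A, B, C to the observed entries of a 3-way
    tensor by gradient descent and return the completed tensor."""
    rng = np.random.default_rng(seed)
    I, J, K = T_obs.shape
    A = 0.1 * rng.standard_normal((I, rank))
    B = 0.1 * rng.standard_normal((J, rank))
    C = 0.1 * rng.standard_normal((K, rank))
    T_filled = np.where(mask, T_obs, 0.0)
    for _ in range(iters):
        R = mask * (np.einsum('ir,jr,kr->ijk', A, B, C) - T_filled)  # residual on observed entries
        gA = np.einsum('ijk,jr,kr->ir', R, B, C)
        gB = np.einsum('ijk,ir,kr->jr', R, A, C)
        gC = np.einsum('ijk,ir,jr->kr', R, A, B)
        A, B, C = A - lr * gA, B - lr * gB, C - lr * gC
    return np.einsum('ir,jr,kr->ijk', A, B, C)
```

    Loosely speaking, the survey's other three categories modify a base scheme like this one: auxiliary-information methods add side-feature terms (variety), scalable methods use stochastic or distributed updates over sampled entries (volume), and dynamic methods update the factors incrementally as new slices arrive (velocity).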

    Log-Normal Matrix Completion for Large Scale Link Prediction

    The ubiquitous proliferation of online social networks has led to the wide-scale emergence of relational graphs expressing unique patterns in link formation and descriptive user node features. Matrix Factorization and Completion have become popular methods for Link Prediction due to the low-rank nature of mutual node friendship information and the availability of parallel computer architectures for rapid matrix processing. The Link Prediction literature has demonstrated large performance improvements by exploiting sparsity in addition to the low-rank matrix assumption. However, most research has introduced sparsity through L1 or Frobenius norms alone, rather than considering the more detailed distributions that led to the graph formation and relationship evolution. In particular, social networks have been found to follow either Pareto or, as more recently discovered, Log-Normal distributions. Employing the convexity-inducing Lovász extension, we demonstrate how incorporating specific degree-distribution information can lead to large-scale improvements in Matrix Completion based Link Prediction. We introduce Log-Normal Matrix Completion (LNMC) and solve the resulting complex optimization problem by employing the Alternating Direction Method of Multipliers. Using data from three popular social networks, our experiments yield up to a 5% AUC increase over top-performing non-structured sparsity-based methods. Comment: 6 pages
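
    For orientation, the ADMM skeleton that a method like LNMC builds on can be written down for plain nuclear-norm matrix completion; the log-normal degree penalty via the Lovász extension would enter as an extra proximal step that is not reproduced here. The names below (svt, admm_nuclear_mc) are illustrative, and this is a baseline, not the paper's algorithm.

```python
import numpy as np

def svt(A, tau):
    """Singular value soft-thresholding (prox of tau * nuclear norm)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def admm_nuclear_mc(M_obs, mask, rho=1.0, iters=200):
    """ADMM for min ||X||_* subject to X matching the observed entries.
    A degree-distribution penalty would add another proximal step here."""
    Z = np.where(mask, M_obs, 0.0)
    U = np.zeros_like(Z)
    for _ in range(iters):
        X = svt(Z - U, 1.0 / rho)     # proximal step on the nuclear norm
        Z = X + U
        Z[mask] = M_obs[mask]         # project onto the observed-entry constraint
        U = U + X - Z                 # dual update
    return X
```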

    Sparse Group Inductive Matrix Completion

    We consider the problem of matrix completion with side information (inductive matrix completion). In real-world applications many side-channel features are typically non-informative, making feature selection an important part of the problem. We incorporate feature selection into inductive matrix completion by proposing a matrix factorization framework with group-lasso regularization on the side-feature parameter matrices. We demonstrate that the theoretical sample complexity of the proposed method is much lower than that of its competitors on sparse problems, and we propose an efficient optimization algorithm for the resulting low-rank matrix completion problem with sparsifying regularizers. Experiments on synthetic and real-world datasets show that the proposed approach outperforms other methods.
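
    The group-lasso idea is easy to sketch: penalize the rows of the side-feature weight matrix so that uninformative features are zeroed out as whole groups. The proximal-gradient sketch below (hypothetical names group_soft_threshold and sparse_group_imc) illustrates that mechanism; it is not the paper's exact optimization algorithm.

```python
import numpy as np

def group_soft_threshold(W, t):
    """Row-wise group-lasso prox: shrink each row's norm by t, zeroing small rows."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    return W * np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))

def sparse_group_imc(M_obs, mask, X, Y, r, lam=0.1, lr=0.01, iters=2000, seed=0):
    """Proximal gradient for M ≈ X U V^T Y^T with a group-lasso penalty on the
    rows of U, so uninformative side features can be dropped entirely."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((X.shape[1], r))
    V = 0.1 * rng.standard_normal((Y.shape[1], r))
    M_filled = np.where(mask, M_obs, 0.0)
    for _ in range(iters):
        R = mask * (X @ U @ V.T @ Y.T - M_filled)      # residual on observed entries
        gU, gV = X.T @ R @ Y @ V, Y.T @ R.T @ X @ U
        U = group_soft_threshold(U - lr * gU, lr * lam)  # prox step selects features
        V = V - lr * gV
    return U, V
```

    In this sketch, rows of U that come back exactly zero correspond to side features the model has discarded, which is the feature-selection behavior the group-lasso regularizer is meant to produce.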

    Sample Complexity of Power System State Estimation using Matrix Completion

    In this paper, we propose an analytical framework to quantify the number of data samples needed to obtain accurate state estimation in a power system, a problem known as sample complexity analysis in computer science. Motivated by the increasing adoption of distributed energy resources into distribution-level grids, it has become imperative to estimate the state of distribution grids in order to ensure stable operation. Traditional power system state estimation techniques mainly focus on the transmission network, which involves solving an overdetermined system and eliminating bad data. Distribution networks, however, are typically underdetermined due to the large number of connection points and the high cost of pervasive installation of measurement devices. In this paper, we consider the recently proposed state-estimation method for underdetermined systems that is based on matrix completion. In particular, a constrained matrix completion algorithm was proposed, wherein the standard matrix completion problem is augmented with additional equality constraints representing the physics (namely, power-flow constraints). We analyze the sample complexity of this general method by proving an upper bound on the sample complexity that depends directly on the properties of these constraints, which can lower the number of needed samples compared to the unconstrained problem. To demonstrate the improvement that the constraints bring to distribution state estimation, we test the method on a 141-bus distribution network case study and compare it to the traditional least-squares minimization state estimation method.
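
    To make the "matrix completion with physics constraints" setup concrete, a generic convex-programming version can be written with CVXPY: minimize the nuclear norm subject to agreeing with the observed entries and to additional linear equality constraints standing in for the power-flow equations. This is a schematic of the constrained formulation, not the specific algorithm or constraints analyzed in the paper; constrained_mc, A_list, and b are illustrative names.

```python
import cvxpy as cp
import numpy as np

def constrained_mc(M_obs, mask, A_list, b):
    """Nuclear-norm completion with extra linear equality constraints
    trace(A_k^T X) = b_k playing the role of the physics constraints."""
    n, m = M_obs.shape
    X = cp.Variable((n, m))
    M_filled = np.where(mask, M_obs, 0.0)
    cons = [cp.multiply(mask.astype(float), X) == M_filled]          # observed entries
    cons += [cp.trace(A.T @ X) == bk for A, bk in zip(A_list, b)]    # physics constraints
    cp.Problem(cp.Minimize(cp.normNuc(X)), cons).solve()
    return X.value
```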

    Active Matrix Factorization for Surveys

    Amid historically low response rates, survey researchers seek ways to reduce respondent burden while measuring desired concepts with precision. We propose to ask fewer questions of respondents and to impute the missing responses via probabilistic matrix factorization. A variance-minimizing active learning criterion chooses the most informative questions for each respondent. In simulations of our matrix sampling procedure on real-world surveys, as well as in a Facebook survey experiment, we find that active question selection achieves efficiency gains over baselines. The reduction in imputation error is heterogeneous across questions and depends on the latent concepts they capture. The imputation procedure can benefit from incorporating respondent side information, modeling responses as ordered logit rather than Gaussian, and accounting for order effects. With our method, survey researchers obtain principled suggestions of questions to retain and, if desired, can automate the design of shorter instruments.
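
    The variance-minimizing selection step can be illustrated with a toy Bayesian linear model: treat each response as a noisy inner product between the question's factor vector and the respondent's latent vector, and ask next the unanswered question whose prediction is currently most uncertain. The helper below (next_question, assuming Gaussian responses and already-fitted item factors) is a sketch of that criterion, not the paper's full probabilistic matrix factorization.

```python
import numpy as np

def next_question(V, answered, sigma2=1.0, prior_var=1.0):
    """Pick the unanswered question with the largest predictive variance under
    y_q ~ N(v_q^T u, sigma2) with prior u ~ N(0, prior_var * I). The posterior
    covariance of u depends only on *which* questions were answered, so the
    response values themselves are not needed for this selection rule."""
    k = V.shape[1]
    Va = V[answered]                                   # factors of answered questions
    Sigma = np.linalg.inv(np.eye(k) / prior_var + Va.T @ Va / sigma2)
    unanswered = [q for q in range(V.shape[0]) if q not in answered]
    pred_var = [V[q] @ Sigma @ V[q] + sigma2 for q in unanswered]
    return unanswered[int(np.argmax(pred_var))]

# toy usage: 30 questions with 4 latent dimensions, 5 already answered
rng = np.random.default_rng(0)
V = rng.standard_normal((30, 4))
print(next_question(V, answered=[0, 3, 7, 12, 21]))
```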

    Analysis of Nuclear Norm Regularization for Full-rank Matrix Completion

    In this paper, we provide a theoretical analysis of nuclear-norm regularized least squares for full-rank matrix completion. Although similar formulations have been examined in previous studies, their results are unsatisfactory because only additive upper bounds are provided. Under the assumption that the top eigenspaces of the target matrix are incoherent, we derive a relative upper bound for recovering the best low-rank approximation of the unknown matrix. Our relative upper bound is tighter than previous additive bounds of other methods when the mass of the target matrix is concentrated on its top eigenspaces, and it also implies perfect recovery when the matrix is low-rank. The analysis is built upon the optimality condition of the regularized formulation and existing guarantees for low-rank matrix completion. To the best of our knowledge, this is the first time such a relative bound has been proved for the regularized formulation of matrix completion.
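
    The regularized formulation analyzed here has a standard first-order solver: proximal gradient descent, whose proximal step is singular value soft-thresholding. The snippet below is that textbook iteration (essentially Soft-Impute), included only to make the analyzed objective concrete; a step size of 1 is valid because the sampling operator is a projection. The name nuclear_norm_reg_mc is illustrative.

```python
import numpy as np

def nuclear_norm_reg_mc(M_obs, mask, lam, iters=300):
    """Proximal gradient for min_X 0.5*||P_Omega(X - M)||_F^2 + lam*||X||_*,
    the nuclear-norm regularized least-squares formulation analyzed above."""
    M_filled = np.where(mask, M_obs, 0.0)
    X = np.zeros_like(M_filled)
    for _ in range(iters):
        G = mask * (X - M_filled)                  # gradient of the smooth part
        U, s, Vt = np.linalg.svd(X - G, full_matrices=False)
        X = (U * np.maximum(s - lam, 0.0)) @ Vt    # singular value soft-thresholding
    return X
```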