2,951 research outputs found
Sparse Group Inductive Matrix Completion
We consider the problem of matrix completion with side information
(\textit{inductive matrix completion}). In real-world applications, many
side-channel features are typically non-informative, making feature selection an
important part of the problem. We incorporate feature selection into inductive
matrix completion by proposing a matrix factorization framework with
group-lasso regularization on the side-feature parameter matrices. We demonstrate
that the theoretical sample complexity of the proposed method is much lower
than that of its competitors on sparse problems, and we propose an efficient
optimization algorithm for the resulting low-rank matrix completion problem
with sparsifying regularizers. Experiments on synthetic and real-world datasets
show that the proposed approach outperforms other methods.
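A minimal sketch of the group-lasso building block such a framework relies on: the block soft-thresholding proximal operator applied to the rows (feature groups) of a side-feature parameter matrix. The function name and toy data are illustrative assumptions, not the authors' code:

```python
import numpy as np

def prox_group_lasso(W, lam):
    """Block soft-thresholding: shrinks each row (feature group) of W
    toward zero; rows whose norm falls below lam are zeroed entirely,
    which is how group-lasso performs feature selection."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return scale * W

W = np.array([[3.0, 4.0],    # row norm 5 -> shrunk, feature kept
              [0.3, 0.4]])   # row norm 0.5 < lam -> zeroed, feature dropped
W_new = prox_group_lasso(W, lam=1.0)
```

Applying this operator after each gradient step on the factorization loss yields a proximal-gradient scheme; rows that stay at zero correspond to side features selected out of the model.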
Provable Inductive Robust PCA via Iterative Hard Thresholding
The robust PCA problem, wherein, given an input data matrix that is the
superposition of a low-rank matrix and a sparse matrix, we aim to separate out
the low-rank and sparse components, is a well-studied problem in machine
learning. One natural question that arises is whether, as in the inductive
setting, we can hope to do better if features are provided as input as well.
Answering this in the affirmative, the main goal of this paper is to study the
robust PCA problem while incorporating feature information. In contrast to
previous works in which recovery guarantees are based on the convex relaxation
of the problem, we propose a simple iterative algorithm based on
hard-thresholding of appropriate residuals. Under weaker assumptions than
previous works, we prove the global convergence of our iterative procedure;
moreover, it admits a much faster convergence rate and lower computational
cost per iteration. In practice, through systematic experiments on synthetic
and real data, we confirm our theoretical findings regarding the improvements
obtained by using feature information.
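A toy sketch of the hard-thresholding idea, without the feature/inductive component: alternate a truncated-SVD projection for the low-rank part with hard-thresholding of the residual for the sparse part. All names and parameters below are illustrative; this is not the paper's algorithm:

```python
import numpy as np

def hard_threshold(M, k):
    """Keep the k largest-magnitude entries of M, zero the rest."""
    S = np.zeros_like(M)
    idx = np.unravel_index(np.argsort(np.abs(M), axis=None)[-k:], M.shape)
    S[idx] = M[idx]
    return S

def svd_project(M, r):
    """Project M onto the set of rank-r matrices via truncated SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def alt_proj_rpca(X, r, k, iters=50):
    """Alternate: fit the low-rank part to X - S, the sparse part to X - L."""
    S = np.zeros_like(X)
    for _ in range(iters):
        L = svd_project(X - S, r)
        S = hard_threshold(X - L, k)
    return L, S

# Rank-5 matrix corrupted by two large sparse spikes
rng = np.random.default_rng(0)
L0 = rng.standard_normal((20, 5)) @ rng.standard_normal((5, 20))
S0 = np.zeros((20, 20)); S0[3, 7] = 10.0; S0[11, 2] = -8.0
L, S = alt_proj_rpca(L0 + S0, r=5, k=2)
```

The iterates stay exactly rank-r and exactly k-sparse by construction, which is the non-convex counterpart of the nuclear-norm/ℓ1 relaxations used in earlier convex approaches.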
Interpretable Matrix Completion: A Discrete Optimization Approach
We consider the problem of matrix completion on a partially observed matrix. We
introduce the problem of Interpretable Matrix Completion that aims to provide
meaningful insights for the low-rank matrix using side information. We show
that the problem can be reformulated as a binary convex optimization problem.
We design OptComplete, based on a novel concept of stochastic cutting planes to
enable efficient scaling of the algorithm to large matrices. We report
experiments on both synthetic and real-world datasets that
show that OptComplete has favorable scaling behavior and accuracy when compared
with state-of-the-art methods for other types of matrix completion, while
providing insight into the factors that affect the matrix. Comment: Submitted to Operational Research.
Harmless interpolation of noisy data in regression
A continuing mystery in understanding the empirical success of deep neural
networks is their ability to achieve zero training error and generalize well,
even when the training data is noisy and there are more parameters than data
points. We investigate this overparameterized regime in linear regression,
where all solutions that minimize training error interpolate the data,
including noise. We characterize the fundamental generalization (mean-squared)
error of any interpolating solution in the presence of noise, and show that
this error decays to zero with the number of features. Thus,
overparameterization can be explicitly beneficial in ensuring harmless
interpolation of noise. We discuss two root causes for poor generalization that
are complementary in nature -- signal "bleeding" into a large number of alias
features, and overfitting of noise by parsimonious feature selectors. For the
sparse linear model with noise, we provide a hybrid interpolating scheme that
mitigates both these issues and achieves order-optimal MSE over all possible
interpolating solutions. Comment: 52 pages; expanded version of the paper
presented at ITA in San Diego in February 2019, at ISIT in Paris in July 2019,
at Simons in July 2019, and as a plenary at ITW in Visby in August 2019.
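The interpolating regime the abstract describes can be reproduced in a few lines: with more features than samples, the minimum-ℓ2-norm least-squares solution fits noisy training data exactly. This is a generic illustration of harmless interpolation, not the paper's hybrid scheme:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 20, 100                    # overparameterized: more features than points
X = rng.standard_normal((n, d))
w_true = np.zeros(d); w_true[0] = 2.0
y = X @ w_true + 0.1 * rng.standard_normal(n)   # noisy labels

# Minimum-l2-norm interpolator: the pseudoinverse selects, among all
# solutions that fit the training data exactly, the one of least norm.
w_hat = np.linalg.pinv(X) @ y

train_err = np.max(np.abs(X @ w_hat - y))       # interpolation: ~0
```

Despite fitting the noise exactly, the minimum-norm solution spreads the noise energy over many alias features, which is the "harmless" mechanism the paper quantifies.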
Multi-label Learning with Missing Labels using Mixed Dependency Graphs
This work focuses on the problem of multi-label learning with missing labels
(MLML), which aims to label each test instance with multiple class labels given
training instances that have an incomplete/partial set of these labels. The key
to handling missing labels is to propagate label information from the
provided labels to the missing ones through a dependency graph in which each label
of each instance is treated as a node. We build this graph by utilizing
different types of label dependencies. Specifically, instance-level
similarity serves as undirected edges connecting the label nodes across
different instances, and the semantic label hierarchy serves as directed edges
connecting different classes. This base graph is referred to as the mixed
dependency graph, as it includes both undirected and directed edges.
Furthermore, we present two further types of label dependencies to connect the
label nodes across different classes. One is class co-occurrence, which is
also encoded as undirected edges; combined with the base graph, this yields a
new mixed graph, called MG-CO (mixed graph with co-occurrence). The other is
a sparse and low-rank decomposition of the whole label matrix, which embeds
high-order dependencies over all labels; combined with the base graph, the
resulting mixed graph is called MG-SL (mixed graph with sparse and low-rank
decomposition). Based on MG-CO and MG-SL, we propose two convex transductive
formulations of the MLML problem, denoted as MLMG-CO and MLMG-SL, respectively.
Two important applications, including image annotation and tag based image
retrieval, can be jointly handled using our proposed methods. Experiments on
benchmark datasets show that our methods give significant improvements in
performance and robustness to missing labels over state-of-the-art methods. Comment: Published in the International Journal of Computer Vision.
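The propagation idea over an undirected instance-similarity graph can be sketched as a generic clamped label-propagation scheme; this is a minimal stand-in, not the MLMG formulations themselves, and the names and toy data are hypothetical:

```python
import numpy as np

def propagate_labels(W, Y, mask, alpha=0.9, iters=100):
    """Propagate label scores over an undirected instance-similarity
    graph W. Y holds the label scores; mask marks the observed entries,
    which are clamped to their given values on every iteration."""
    P = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)  # row-normalize
    F = Y.copy()
    for _ in range(iters):
        F = alpha * (P @ F) + (1 - alpha) * Y
        F[mask] = Y[mask]          # clamp the provided labels
    return F

# Chain of 3 instances with one binary label; the middle label is missing
W = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
Y = np.array([[1.], [0.], [1.]])   # missing entry initialized to 0
mask = np.array([[True], [False], [True]])
F = propagate_labels(W, Y, mask)
```

The missing middle label is pulled toward its two labeled neighbours, illustrating how information flows along the undirected edges of the dependency graph.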
Tensor Completion Algorithms in Big Data Analytics
Tensor completion is a problem of filling the missing or unobserved entries
of partially observed tensors. Due to the multidimensional character of tensors
in describing complex datasets, tensor completion algorithms and their
applications have received wide attention and achieved success in areas like data
mining, computer vision, signal processing, and neuroscience. In this survey,
we provide a modern overview of recent advances in tensor completion algorithms
from the perspective of big data analytics characterized by diverse variety,
large volume, and high velocity. We characterize these advances from four
perspectives: general tensor completion algorithms, tensor completion with
auxiliary information (variety), scalable tensor completion algorithms
(volume), and dynamic tensor completion algorithms (velocity). Further, we
identify several tensor completion applications on real-world data-driven
problems and present some common experimental frameworks popularized in the
literature. Our goal is to summarize these popular methods and introduce them
to researchers and practitioners for promoting future research and
applications. We conclude with a discussion of key challenges and promising
research directions in this community for future exploration.
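As a taste of the simplest family of general tensor completion algorithms the survey covers, here is a hard-impute-style sketch: alternate between imputing the missing entries with the current estimate and projecting a mode-1 unfolding onto low-rank matrices. This is illustrative only; practical tensor methods exploit CP/Tucker structure across all modes:

```python
import numpy as np

def tensor_hardimpute(X, mask, rank, iters=300):
    """Fill missing entries by iterating: impute with the current
    estimate, then project the mode-1 unfolding back onto rank-`rank`
    matrices via truncated SVD. Observed entries are kept fixed."""
    T = np.where(mask, X, 0.0)
    for _ in range(iters):
        M = T.reshape(X.shape[0], -1)         # mode-1 unfolding
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        L = ((U[:, :rank] * s[:rank]) @ Vt[:rank]).reshape(X.shape)
        T = np.where(mask, X, L)              # re-impose observed entries
    return T

# Rank-1 ground-truth tensor with two entries hidden
rng = np.random.default_rng(0)
a, b, c = rng.random(4) + 1, rng.random(3) + 1, rng.random(3) + 1
X = np.einsum('i,j,k->ijk', a, b, c)
mask = np.ones_like(X, dtype=bool)
mask[0, 0, 0] = mask[2, 1, 2] = False
T = tensor_hardimpute(X, mask, rank=1)
```

For this benign rank-1 example the two hidden entries are recovered almost exactly; the auxiliary-information, scalable, and dynamic variants surveyed above refine this basic impute-and-project loop.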
Forward - Backward Greedy Algorithms for Atomic Norm Regularization
In many signal processing applications, the aim is to reconstruct a signal
that has a simple representation with respect to a certain basis or frame.
Fundamental elements of the basis known as "atoms" allow us to define "atomic
norms" that can be used to formulate convex regularizations for the
reconstruction problem. Efficient algorithms are available to solve these
formulations in certain special cases, but an approach that works well for
general atomic norms, both in terms of speed and reconstruction accuracy,
remains to be found. This paper describes an optimization algorithm called
CoGEnT that produces solutions with succinct atomic representations for
reconstruction problems, generally formulated with atomic-norm constraints.
CoGEnT combines a greedy selection scheme based on the conditional gradient
approach with a backward (or "truncation") step that exploits the quadratic
nature of the objective to reduce the basis size. We establish convergence
properties and validate the algorithm via extensive numerical experiments on a
suite of signal processing applications. Our algorithm and analysis also allow
for inexact forward steps and for occasional enhancements of the current
representation to be performed. CoGEnT can outperform the basic conditional
gradient method, and indeed many methods that are tailored to specific
applications, when the enhancement and truncation steps are defined
appropriately. We also introduce several novel applications that are enabled by
the atomic-norm framework, including tensor completion, moment problems in
signal processing, and graph deconvolution. Comment: To appear in IEEE Transactions on Signal Processing.
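When the atoms are the signed coordinate vectors, the atomic norm is the ℓ1 norm and the forward (conditional-gradient) step reduces to picking one coordinate. The sketch below shows only the forward step with exact line search, omitting CoGEnT's backward/truncation step; the names and toy problem are illustrative:

```python
import numpy as np

def fw_l1(A, y, tau, iters=200):
    """Conditional-gradient sketch for min ||Ax - y||^2 s.t. ||x||_1 <= tau.
    With signed coordinate vectors as atoms, the linear minimization step
    just picks the coordinate whose gradient entry is largest in magnitude;
    an exact line search keeps the quadratic objective decreasing."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = A.T @ (A @ x - y)            # gradient of the quadratic
        i = np.argmax(np.abs(g))
        s = np.zeros_like(x)
        s[i] = -tau * np.sign(g[i])      # best atom, scaled to the boundary
        d = A @ (s - x)
        denom = d @ d
        if denom == 0:
            break
        gamma = np.clip(-(d @ (A @ x - y)) / denom, 0.0, 1.0)
        x = x + gamma * (s - x)
    return x

A = np.eye(3)
y = np.array([2.0, -1.0, 0.5])           # ||y||_1 = 3.5 > tau: constraint active
x = fw_l1(A, y, tau=3.0)
```

Every iterate is a convex combination of at most `iters` atoms, which is what makes the representation succinct; the backward step would further prune atoms whose removal barely increases the quadratic objective.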
Collaborative Self-Attention for Recommender Systems
Recommender systems (RS), which have become an essential part of a wide range
of applications, can be formulated as a matrix completion (MC) problem. To
boost the performance of MC, matrix completion with side information, called
inductive matrix completion (IMC), was further proposed. In real applications,
the factorized version of IMC is more favored due to its efficiency of
optimization and implementation. In the factorized version, the traditional
IMC method can be interpreted as learning an individual representation for each
feature, independent of the others. Moreover, the representations of the same
features are shared across all users/items. However, this independence across
features and sharing across users/items may limit the expressiveness of the
model. The
limitation also exists in variants of IMC, such as deep learning based IMC
models. To break this limitation, we generalize recent advances in the
self-attention mechanism to IMC and propose a context-aware model called
collaborative self-attention (CSA), which jointly learns context-aware
representations for features and performs the inductive matrix completion process.
Extensive experiments on three large-scale datasets from real RS applications
demonstrate the effectiveness of CSA. Comment: There are large modifications.
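The core self-attention operation such a model would apply to feature embeddings can be sketched as scaled dot-product attention in plain NumPy; the shapes and random weights are toy assumptions, not CSA itself:

```python
import numpy as np

def self_attention(E, Wq, Wk, Wv):
    """Scaled dot-product self-attention over feature embeddings E
    (n_features x d): each feature's representation is recomputed as an
    attention-weighted mixture of all features, making it context-aware
    rather than independent."""
    Q, K, V = E @ Wq, E @ Wk, E @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # row-wise softmax
    return weights @ V, weights

rng = np.random.default_rng(0)
E = rng.standard_normal((5, 8))                     # 5 features, dim 8
Wq, Wk, Wv = (rng.standard_normal((8, 8)) for _ in range(3))
out, attn = self_attention(E, Wq, Wk, Wv)
```

Because the output representation of each feature depends on the whole feature set through the attention weights, the same feature can be represented differently in different user/item contexts, which is exactly the limitation of factorized IMC that the abstract targets.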
Crowd Labeling: a survey
Recently, there has been a burst in the number of research projects on human
computation via crowdsourcing. Multiple-choice (or labeling) questions are a
common type of problem solved by this approach. As an
application, crowd labeling is applied to find true labels for large machine
learning datasets. Since crowds are not necessarily experts, the labels they
provide are rather noisy and erroneous. This challenge is usually resolved by
collecting multiple labels for each sample, and then aggregating them to
estimate the true label. Although this mechanism yields high-quality labels,
it is not cost-effective. As a result, efforts are currently made to
maximize the accuracy in estimating true labels, while fixing the number of
acquired labels.
This paper surveys methods to aggregate redundant crowd labels in order to
estimate unknown true labels. It presents a unified statistical latent model
where the differences among popular methods in the field correspond to
different choices for the parameters of the model. Afterwards, algorithms to
make inference on these models will be surveyed. Moreover, adaptive methods
which iteratively collect labels based on the previously collected labels and
estimated models will be discussed. In addition, this paper compares the
distinguished methods, and provides guidelines for future work required to
address the current open issues. Comment: Under consideration for publication in Knowledge and Information Systems.
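The aggregation step the survey focuses on can be illustrated with majority voting plus one crude EM-style refinement that reweights workers by their estimated accuracy; this is a toy stand-in for latent-variable methods such as Dawid-Skene, and the data are hypothetical:

```python
import numpy as np
from collections import defaultdict, Counter

def aggregate(votes):
    """votes: list of (item, worker, label). Returns estimated true labels.
    Step 1: unweighted majority vote per item.
    Step 2 (one EM-style pass): score each worker by agreement with the
    majority estimate, then re-vote with accuracy weights."""
    by_item = defaultdict(list)
    for item, worker, label in votes:
        by_item[item].append((worker, label))
    # initial estimate: unweighted majority
    est = {i: Counter(l for _, l in wl).most_common(1)[0][0]
           for i, wl in by_item.items()}
    # worker accuracy against the current estimate
    hits = defaultdict(list)
    for item, worker, label in votes:
        hits[worker].append(label == est[item])
    acc = {w: float(np.mean(h)) for w, h in hits.items()}
    # accuracy-weighted re-vote
    for item, wl in by_item.items():
        scores = defaultdict(float)
        for worker, label in wl:
            scores[label] += acc[worker]
        est[item] = max(scores, key=scores.get)
    return est

# Workers 'a' and 'b' are reliable; worker 'c' is always wrong
votes = [(0, 'a', 1), (0, 'b', 1), (0, 'c', 0),
         (1, 'a', 0), (1, 'b', 0), (1, 'c', 1),
         (2, 'a', 1), (2, 'b', 1), (2, 'c', 0)]
est = aggregate(votes)
```

In the unified latent-model view of the survey, the worker accuracies play the role of the model parameters and the re-vote is one maximization step; full EM would iterate the two steps to convergence.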
Decomposition into Low-rank plus Additive Matrices for Background/Foreground Separation: A Review for a Comparative Evaluation with a Large-Scale Dataset
Recent research on problem formulations based on decomposition into low-rank
plus sparse matrices shows a suitable framework to separate moving objects from
the background. The most representative problem formulation is the Robust
Principal Component Analysis (RPCA) solved via Principal Component Pursuit
(PCP), which decomposes a data matrix into a low-rank matrix and a sparse matrix.
However, similar robust implicit or explicit decompositions can be made in the
following problem formulations: Robust Non-negative Matrix Factorization
(RNMF), Robust Matrix Completion (RMC), Robust Subspace Recovery (RSR), Robust
Subspace Tracking (RST) and Robust Low-Rank Minimization (RLRM). The main goal
of these similar problem formulations is to obtain, explicitly or implicitly, a
decomposition into a low-rank matrix plus additive matrices. In this context,
this work aims to initiate a rigorous and comprehensive review of the similar
problem formulations in robust subspace learning and tracking based on
decomposition into low-rank plus additive matrices for testing and ranking
existing algorithms for background/foreground separation. For this, we first
provide a preliminary review of the recent developments in the different
problem formulations, which allows us to define a unified view that we call
Decomposition into Low-rank plus Additive Matrices (DLAM). Then, we examine
carefully each method in each robust subspace learning/tracking framework,
examining its decomposition, loss function, optimization problem, and
solvers. Furthermore, we investigate whether incremental algorithms and real-time
implementations can be achieved for background/foreground separation. Finally,
experimental results on a large-scale dataset called Background Models
Challenge (BMC 2012) show the comparative performance of 32 different robust
subspace learning/tracking methods. Comment: 121 pages, 5 figures, submitted to Computer Science Review.
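The PCP formulation at the center of this review rests on two proximal operators: singular value thresholding for the nuclear norm and entrywise soft-thresholding for the ℓ1 norm. Below is a crude alternating-prox sketch built from those two operators (not a faithful PCP solver such as inexact ALM; the parameters are illustrative):

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: the prox of the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def soft(M, lam):
    """Entrywise soft-thresholding: the prox of the l1 norm."""
    return np.sign(M) * np.maximum(np.abs(M) - lam, 0.0)

def pcp_alternate(X, tau, lam, iters=100):
    """Crude alternating-prox sketch of low-rank + sparse separation:
    alternately shrink the singular values of X - S (low-rank part L)
    and the entries of X - L (sparse part S)."""
    S = np.zeros_like(X)
    for _ in range(iters):
        L = svt(X - S, tau)
        S = soft(X - L, lam)
    return L, S

# Low-rank "background" plus one sparse "foreground" spike
rng = np.random.default_rng(0)
X = rng.standard_normal((15, 4)) @ rng.standard_normal((4, 15))
X[2, 9] += 10.0
L, S = pcp_alternate(X, tau=1.0, lam=0.5)
```

In the background/foreground setting reviewed above, L plays the role of the static background and S the moving objects; the DLAM variants differ mainly in which loss and which additive terms replace these two proxes.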