Robust Sparse Coding via Self-Paced Learning
Sparse coding (SC) is attracting increasing attention due to its comprehensive theoretical foundations and its excellent performance in many signal processing applications. However, most existing sparse coding algorithms are nonconvex and are thus prone to getting stuck in bad local minima, especially in the presence of outliers and noisy data. To enhance learning robustness, in this paper we propose a unified framework named Self-Paced Sparse Coding (SPSC), which gradually includes matrix elements in SC learning, from easy to complex. We also generalize the self-paced learning scheme to different levels of dynamic selection: on samples, features, and elements, respectively. Experimental results on real-world data demonstrate the efficacy of the proposed algorithms.
Comment: submitted to AAAI201
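To make the element-wise self-paced idea concrete, here is a minimal sketch (not the authors' code; all names, update rules, and parameter choices are illustrative assumptions): elements whose current reconstruction loss is below a pace threshold are treated as easy and included, and the threshold is relaxed every round so harder elements gradually enter.

```python
import numpy as np

def spsc_sketch(X, n_atoms=20, sparsity=0.1, n_outer=10, pace_growth=1.3, seed=0):
    """Illustrative element-wise self-paced sparse coding (hard weighting)."""
    rng = np.random.default_rng(seed)
    d, n = X.shape
    D = rng.standard_normal((d, n_atoms))
    D /= np.linalg.norm(D, axis=0)
    A = np.zeros((n_atoms, n))
    tau = np.quantile(X ** 2, 0.5)              # initial pace threshold
    for _ in range(n_outer):
        R = X - D @ A
        W = ((R ** 2) <= tau).astype(float)     # keep only "easy" elements
        # one proximal-gradient (ISTA) step on the codes, masked by W
        step = 1.0 / (np.linalg.norm(D, 2) ** 2 + 1e-12)
        A = A + step * (D.T @ (W * R))
        A = np.sign(A) * np.maximum(np.abs(A) - step * sparsity, 0.0)
        # least-squares dictionary refit on the current codes
        D = X @ np.linalg.pinv(A)
        D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
        tau *= pace_growth                      # admit harder elements
    return D, A
```

The hard 0/1 weighting corresponds to the simplest self-paced regularizer; softer weighting schemes fit the same alternating template.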
Combinatorial Rigidity of Incidence Systems and Application to Dictionary Learning
Given a hypergraph $H$ with $m$ hyperedges and a set $X$ of $m$ \emph{pinning subspaces}, i.e.\ globally fixed subspaces in Euclidean space $\mathbb{R}^d$, a \emph{pinned subspace-incidence system} is the pair $(H, X)$, with the constraint that each pinning subspace in $X$ is contained in the subspace spanned by the point realizations in $\mathbb{R}^d$ of the vertices of the corresponding hyperedge of $H$. This paper provides a combinatorial characterization of pinned subspace-incidence systems that are \emph{minimally rigid}, i.e.\ those systems that are guaranteed to generically yield a locally unique realization.
Pinned subspace-incidence systems have applications in the \emph{Dictionary Learning (aka sparse coding)} problem, i.e.\ the problem of obtaining a sparse representation of a given set of data vectors by learning \emph{dictionary vectors} upon which the data vectors can be written as sparse linear combinations. Viewing the dictionary vectors from a geometric perspective as the spanning set of a subspace arrangement, the result gives a tight bound on the number of dictionary vectors for sufficiently randomly chosen data vectors, and gives a way of constructing a dictionary that meets the bound. For less stringent restrictions on the data, but a natural modification of the dictionary learning problem, a further dictionary learning algorithm is provided. Although there are recent rigidity-based approaches for low-rank matrix completion, we are unaware of prior application of combinatorial rigidity techniques in the setting of Dictionary Learning. We also provide a systematic classification of problems related to dictionary learning, together with various algorithms, their assumptions, and performance.
Comment: arXiv admin note: text overlap with arXiv:1503.01837, arXiv:1402.734
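Since the abstract defines dictionary learning as writing data vectors as sparse linear combinations of learned dictionary vectors, a small reference point is the standard Orthogonal Matching Pursuit routine for the sparse-coding half of that problem (the generic textbook method, not this paper's rigidity-based construction):

```python
import numpy as np

def omp(D, y, k):
    """Greedy k-sparse coding: approximate y by D @ x with at most k
    nonzeros, repeatedly picking the atom most correlated with the
    current residual."""
    residual, support = y.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # least-squares refit of the coefficients on the chosen support
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x
```

Alternating such a coding step with a dictionary update is the usual heuristic pipeline; the paper instead analyzes when the underlying incidence system pins the dictionary down generically.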
SPRIGHT: A Fast and Robust Framework for Sparse Walsh-Hadamard Transform
We consider the problem of computing the Walsh-Hadamard Transform (WHT) of some $N$-length input vector in the presence of noise, where the $N$-point Walsh spectrum is $K$-sparse with $K = O(N^{\delta})$ scaling sub-linearly in the input dimension $N$ for some $0 < \delta < 1$. Over the past decade, there has been a resurgence in research related to the computation of the Discrete Fourier Transform (DFT) for some length-$N$ input signal that has a $K$-sparse Fourier spectrum. In particular, through a sparse-graph code design, our earlier work on the Fast Fourier Aliasing-based Sparse Transform (FFAST) algorithm computes the $K$-sparse DFT in time $O(K\log K)$ by taking $O(K)$ noiseless samples. Inspired by the coding-theoretic design framework, Scheibler et al. proposed the Sparse Fast Hadamard Transform (SparseFHT) algorithm that elegantly computes the $K$-sparse WHT in the absence of noise using $O(K\log(N/K))$ samples in time $O(K\log K\cdot\log(N/K))$. However, the SparseFHT algorithm explicitly exploits the noiseless nature of the problem, and is not equipped to deal with scenarios where the observations are corrupted by noise. Therefore, a question of critical interest is whether this coding-theoretic framework can be made robust to noise. Further, if the answer is yes, what is the extra price that needs to be paid for being robust to noise? In this paper, we show, quite interestingly, that there is {\it no extra price} that needs to be paid for being robust to noise other than a constant factor. In other words, we can maintain the same sample complexity and computational complexity as those of the noiseless case, using our SParse Robust Iterative Graph-based Hadamard Transform (SPRIGHT) algorithm.
Comment: Part of our results was reported in ISIT 2014, titled "The SPRIGHT algorithm for robust sparse Hadamard Transforms".
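For context, the dense baseline that both SparseFHT and SPRIGHT improve upon is the classic $O(N\log N)$ fast Walsh-Hadamard butterfly over all $N$ samples, sketched below (the textbook unnormalized transform, not the SPRIGHT decoder itself):

```python
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform of a length-2^m vector via
    log2(N) rounds of +/- butterflies: O(N log N) time."""
    x = np.asarray(x, dtype=float).copy()
    n = len(x)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b
            x[i + h:i + 2 * h] = a - b
        h *= 2
    return x

print(fwht([1, 0, 1, 0]))  # [2. 2. 0. 0.]
```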
Tight Hardness for Shortest Cycles and Paths in Sparse Graphs
Fine-grained reductions have established equivalences between many core problems with $\tilde{O}(n^3)$-time algorithms on $n$-node weighted graphs, such as Shortest Cycle, All-Pairs Shortest Paths (APSP), Radius, Replacement Paths, Second Shortest Paths, and so on. These problems also have $\tilde{O}(mn)$-time algorithms on $m$-edge $n$-node weighted graphs, and such algorithms have wider applicability. Are these bounds optimal when $m = o(n^2)$?
Starting from the hypothesis that the minimum weight $(2\ell+1)$-Clique problem in edge-weighted graphs requires $n^{2\ell+1-o(1)}$ time, we prove that for all sparsities of the form $m = \Theta(n^{1+1/\ell})$, there is no $(mn)^{1-\epsilon}$-time algorithm for any $\epsilon > 0$ for \emph{any} of the below problems:
Minimum Weight $(2\ell+1)$-Cycle in a directed weighted graph,
Shortest Cycle in a directed weighted graph,
APSP in a directed or undirected weighted graph,
Radius (or Eccentricities) in a directed or undirected weighted graph,
Wiener index of a directed or undirected weighted graph,
Replacement Paths in a directed weighted graph,
Second Shortest Path in a directed weighted graph,
Betweenness Centrality of a given node in a directed weighted graph.
That is, we prove hardness for a variety of sparse graph problems from the hardness of a dense graph problem. Our results also lead to new conditional lower bounds from several related hypotheses for unweighted sparse graph problems including $k$-cycle, shortest cycle, Radius, Wiener index and APSP.
Comment: Updated the [AR16] citation.
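For reference, the $\tilde{O}(mn)$ upper bound mentioned above is achieved, for non-negative weights, simply by running Dijkstra from every node, as in this sketch ($O(mn\log n)$ with a binary heap; negative weights would additionally need Johnson's reweighting; the adjacency-list format is an illustrative assumption):

```python
import heapq

def apsp(adj):
    """All-pairs shortest paths on a sparse digraph given as
    adj = {u: [(v, w), ...]} with w >= 0, via n runs of Dijkstra."""
    def dijkstra(src):
        dist = {src: 0}
        heap = [(0, src)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue  # stale heap entry
            for v, w in adj.get(u, []):
                nd = d + w
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        return dist
    return {u: dijkstra(u) for u in adj}

print(apsp({0: [(1, 2)], 1: [(2, 3)], 2: [(0, 7)]})[0])  # {0: 0, 1: 2, 2: 5}
```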
FAQ: Questions Asked Frequently
We define and study the Functional Aggregate Query (FAQ) problem, which
encompasses many frequently asked questions in constraint satisfaction,
databases, matrix operations, probabilistic graphical models and logic. This is
our main conceptual contribution.
We then present a simple algorithm called "InsideOut" to solve this general
problem. InsideOut is a variation of the traditional dynamic programming
approach for constraint programming based on variable elimination. Our
variation adds a couple of simple twists to basic variable elimination in order
to deal with the generality of FAQ, to take full advantage of Grohe and Marx's
fractional edge cover framework, and of the analysis of recent worst-case
optimal relational join algorithms.
As is the case with constraint programming and graphical model inference, to
make InsideOut run efficiently we need to solve an optimization problem to
compute an appropriate 'variable ordering'. The main technical contribution of
this work is a precise characterization of when a variable ordering is
'semantically equivalent' to the variable ordering given by the input FAQ
expression. Then, we design an approximation algorithm to find an equivalent
variable ordering that has the best 'fractional FAQ-width'. Our results imply a
host of known and a few new results in graphical model inference, matrix
operations, relational joins, and logic.
We also briefly explain how recent algorithms on beyond-worst-case analysis for joins and those for solving SAT and #SAT can be viewed as variable elimination to solve FAQ over compactly represented input functions.
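To make the variable-elimination core concrete, here is a tiny sum-product eliminator over explicit factor tables (an illustrative sketch, not the paper's InsideOut or its data structures; swapping in another semiring, e.g. max/+, would answer other FAQ instances):

```python
from itertools import product

def eliminate(factors, order, domains):
    """factors: list of (vars, table) pairs, where table maps assignment
    tuples to numbers. Eliminates each variable in `order` by summing it
    out of the product of the factors that mention it."""
    for z in order:
        touching = [f for f in factors if z in f[0]]
        rest = [f for f in factors if z not in f[0]]
        new_vars = tuple(sorted({v for vs, _ in touching for v in vs} - {z}))
        new_table = {}
        for assign in product(*(domains[v] for v in new_vars)):
            env = dict(zip(new_vars, assign))
            total = 0
            for zval in domains[z]:
                env[z] = zval
                term = 1
                for vs, tab in touching:
                    term *= tab[tuple(env[v] for v in vs)]
                total += term
            new_table[assign] = total
        factors = rest + [(new_vars, new_table)]
    result = 1
    for _, tab in factors:          # only empty-scope scalars remain
        result *= tab[()]
    return result

# FAQ-style example: count independent sets in the path a - b - c
dom = {v: (0, 1) for v in "abc"}
edge = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 0}
print(eliminate([(("a", "b"), edge), (("b", "c"), edge)], "abc", dom))  # 5
```

The cost of this loop is governed by the sizes of the intermediate tables, which is exactly why the choice of variable ordering (and the fractional FAQ-width that controls it) matters.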
Improved Constructions for Non-adaptive Threshold Group Testing
The basic goal in combinatorial group testing is to identify a set of up to $d$ defective items within a large population of size $n \gg d$ using a pooling strategy. Namely, the items can be grouped together in pools, and a single measurement would reveal whether there are one or more defectives in the pool. The threshold model is a generalization of this idea where a measurement returns positive if the number of defectives in the pool reaches a fixed threshold $u > 0$, negative if this number is no more than a fixed lower threshold $\ell < u$, and may behave arbitrarily otherwise. We study
non-adaptive threshold group testing (in a possibly noisy setting) and show that, for this problem, $O(d^{g+2}(\log d)\cdot\log(n/d))$ measurements (where $g := u-\ell-1$ and $u$ is any fixed constant) suffice to identify the defectives, and also present almost matching lower bounds. This significantly improves the previously known (non-constructive) upper bound $O(d^{u+1}\cdot\log(n/d))$.
Moreover, we obtain a framework for explicit construction of measurement schemes using lossless condensers. The number of measurements resulting from this scheme is ideally bounded by $O(d^{g+3}(\log d)\cdot\log n)$. Using state-of-the-art constructions of lossless condensers, however, we obtain explicit testing schemes with $O(d^{g+3}(\log d)\cdot\mathrm{quasipoly}(\log n))$ and $O(d^{g+3+\beta}\cdot\mathrm{poly}(\log n))$ measurements, for arbitrary constant $\beta > 0$.
Comment: Revised draft of the full version. Contains various edits and a new lower bounds section. Preliminary version appeared in Proceedings of the 37th International Colloquium on Automata, Languages and Programming (ICALP), 2010.
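A minimal simulation of the threshold measurement model just described (the gap behavior is adversarial in the model; the coin flip below is only an illustrative stand-in):

```python
import random

def threshold_measure(pool, defectives, u, ell, rng=random):
    """Returns 1 if the pool contains >= u defectives, 0 if it contains
    at most ell, and an arbitrary answer for counts in between."""
    hits = len(set(pool) & set(defectives))
    if hits >= u:
        return 1
    if hits <= ell:
        return 0
    return rng.randint(0, 1)  # gap region: the model allows any behavior

print(threshold_measure(range(6), {1, 4, 5}, u=3, ell=1))  # 1 (three hits)
```

Classical group testing is the special case $u = 1$, $\ell = 0$, where every measurement is reliable.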
Exact Learning from an Honest Teacher That Answers Membership Queries
A teacher holds a function $f$ from some class of functions $C$. The teacher can receive from the learner an element $x$ of the domain (a query) and returns the value of the function at $x$, namely $f(x)$. The learner's goal is to find $f$ with a minimum number of queries, optimal time complexity, and optimal resources.
In this survey, we present some of the results known from the literature, different techniques used, some new problems, and open problems.
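As a tiny concrete instance of this setting (a standard textbook example, not necessarily drawn from the survey): a monotone conjunction over $n$ Boolean variables can be learned exactly with $n$ membership queries.

```python
def learn_monotone_conjunction(n, teacher):
    """Identify which variables appear in a hidden AND-of-variables:
    flip one bit of the all-ones input at a time; the answer drops
    to 0 exactly when the flipped variable is in the conjunction."""
    relevant = []
    for i in range(n):
        probe = [1] * n
        probe[i] = 0
        if teacher(probe) == 0:
            relevant.append(i)
    return relevant

# hypothetical teacher holding f(x) = x1 AND x3
f = lambda x: x[1] & x[3]
print(learn_monotone_conjunction(5, f))  # [1, 3]
```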
Provable Bounds for Learning Some Deep Representations
We give algorithms with provable guarantees that learn a class of deep nets
in the generative model view popularized by Hinton and others. Our generative
model is an $n$-node multilayer neural net that has degree at most $n^{\gamma}$ for some $\gamma < 1$, and each edge has a random edge weight in $[-1,1]$. Our
algorithm learns {\em almost all} networks in this class with polynomial
running time. The sample complexity is quadratic or cubic depending upon the
details of the model.
The algorithm uses layerwise learning. It is based upon a novel idea of
observing correlations among features and using these to infer the underlying
edge structure via a global graph recovery procedure. The analysis of the
algorithm reveals interesting structure of neural networks with random edge
weights.
Comment: The first 18 pages serve as an extended abstract, and a 36-page technical appendix follows.
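The correlation idea can be illustrated with a toy one-layer simulation (an assumption-laden sketch, not the authors' algorithm: iid Bernoulli hidden units and a single linear layer with sparse random $\pm 1$ weights): two observed units correlate noticeably only when they share a hidden parent, although sign cancellations can occasionally mask a shared parent.

```python
import numpy as np

rng = np.random.default_rng(1)
n_hidden, n_visible, density, p, samples = 10, 30, 0.1, 0.2, 20000

# sparse random +/-1 edge weights from hidden to visible units
W = rng.choice([-1.0, 1.0], (n_visible, n_hidden)) * (
    rng.random((n_visible, n_hidden)) < density)
for i in range(n_visible):              # ensure every unit has a parent
    if not np.abs(W[i]).any():
        W[i, rng.integers(n_hidden)] = 1.0

H = (rng.random((samples, n_hidden)) < p).astype(float)  # hidden causes
X = H @ W.T                                              # observed layer

C = np.corrcoef(X, rowvar=False)
np.fill_diagonal(C, 0.0)
recovered = np.abs(C) > 0.05             # edges of the correlation graph
truth = (np.abs(W) @ np.abs(W).T) > 0    # pairs sharing a hidden parent
np.fill_diagonal(truth, False)
print("agreement:", (recovered == truth).mean())
```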
A Survey on Learning to Hash
Nearest neighbor search is the problem of finding the data points in a database whose distances to the query point are smallest. Learning to hash is one of the major solutions to this problem and has been widely studied recently. In this paper, we present a comprehensive survey of learning-to-hash algorithms and categorize them according to the manner in which they preserve similarities: pairwise similarity preserving, multiwise similarity preserving, and implicit similarity preserving, as well as quantization, and we discuss their relations. We separate quantization from pairwise similarity preserving because its objective function is very different, though quantization, as we show, can be derived from preserving pairwise similarities. In addition, we present the evaluation protocols and a general performance analysis, and point out that the quantization algorithms perform superiorly in terms of
search accuracy, search time cost, and space cost. Finally, we introduce a few
emerging topics.
Comment: To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI).
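As a contrast to the learned hash functions surveyed above, the simplest data-independent baseline is random-hyperplane hashing with Hamming ranking plus a small exact re-ranking step (all sizes below are arbitrary illustrative choices):

```python
import numpy as np

def hash_codes(X, R):
    """One bit per random hyperplane: the sign of the projection."""
    return (X @ R > 0).astype(np.uint8)

rng = np.random.default_rng(0)
d, n_bits = 64, 16
base = rng.standard_normal((1000, d))        # database vectors
query = rng.standard_normal(d)
R = rng.standard_normal((d, n_bits))         # shared random hyperplanes

codes = hash_codes(base, R)
qcode = hash_codes(query[None, :], R)[0]
hamming = (codes != qcode).sum(axis=1)       # cheap Hamming ranking
cand = np.argsort(hamming)[:10]              # shortlist, then exact re-rank
best = cand[np.argmin(np.linalg.norm(base[cand] - query, axis=1))]
print("approximate nearest neighbor:", int(best))
```

Learning-to-hash methods replace the random matrix R with codes trained to preserve the similarities discussed above.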
Multi-view Vector-valued Manifold Regularization for Multi-label Image Classification
In computer vision, image datasets used for classification are naturally
associated with multiple labels and comprised of multiple views, because each
image may contain several objects (e.g. pedestrian, bicycle and tree) and is
properly characterized by multiple visual features (e.g. color, texture and
shape). Currently available tools ignore either the label relationship or the
view complementary. Motivated by the success of the vector-valued function that
constructs matrix-valued kernels to explore the multi-label structure in the
output space, we introduce multi-view vector-valued manifold regularization
(MVMR) to integrate multiple features. MVMR exploits
the complementary property of different features and discovers the intrinsic
local geometry of the compact support shared by different features under the
theme of manifold regularization. We conducted extensive experiments on two
challenging but popular datasets, PASCAL VOC'07 (VOC) and MIR Flickr (MIR), and validated the effectiveness of the proposed MVMR for image classification.
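A rough scalar-label sketch of the manifold-regularization ingredient (averaged per-view graph Laplacians and least squares, rather than the paper's vector-valued matrix kernels; all names are illustrative):

```python
import numpy as np

def knn_laplacian(X, k=5):
    """Unnormalized Laplacian of a symmetrized kNN graph for one view."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    W = np.zeros_like(d2)
    idx = np.argsort(d2, axis=1)[:, :k]
    W[np.repeat(np.arange(len(X)), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)
    return np.diag(W.sum(axis=1)) - W

def multiview_mr(views, y, labeled, lam=1e-2, gamma=1e-1):
    """min_f ||J(f - y)||^2 + gamma f'Lf + lam||f||^2, with L the
    average of the per-view Laplacians and J selecting labeled points."""
    n = len(y)
    L = sum(knn_laplacian(V) for V in views) / len(views)
    J = np.zeros((n, n))
    J[labeled, labeled] = 1.0
    return np.linalg.solve(J + gamma * L + lam * np.eye(n), J @ y)

# toy usage: two views of 50 points, first 10 labeled
rng = np.random.default_rng(0)
views = [rng.standard_normal((50, 3)), rng.standard_normal((50, 5))]
y = np.zeros(50)
y[:10] = np.sign(rng.standard_normal(10))
print(multiview_mr(views, y, labeled=np.arange(10))[:5])
```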