
    Robust Sparse Coding via Self-Paced Learning

    Sparse coding (SC) is attracting more and more attention due to its comprehensive theoretical studies and its excellent performance in many signal processing applications. However, most existing sparse coding algorithms are nonconvex and are thus prone to getting stuck in bad local minima, especially when there are outliers and noisy data. To enhance learning robustness, in this paper we propose a unified framework named Self-Paced Sparse Coding (SPSC), which gradually includes matrix elements in SC learning from easy to complex. We also generalize the self-paced learning scheme to different levels of dynamic selection, on samples, features, and elements respectively. Experimental results on real-world data demonstrate the efficacy of the proposed algorithms.
    Comment: submitted to AAAI201
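    A minimal numpy sketch of the element-level self-paced idea described above, under our own assumptions (a fixed dictionary, an ISTA inner solver, and the usual hard 0/1 self-paced weights); it is an illustration of the scheme, not the authors' SPSC implementation:

```python
import numpy as np

def soft_threshold(Z, t):
    """Elementwise soft-thresholding operator used by ISTA."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def self_paced_sparse_coding(X, D, gamma=0.1, lam0=0.5, mu=1.3,
                             outer_iters=5, inner_iters=100):
    """Toy self-paced sparse coding: alternate between (1) selecting 'easy'
    matrix elements whose squared residual is below the pace parameter lam,
    and (2) re-fitting sparse codes A on the selected elements via weighted
    ISTA. lam grows by the factor mu each round so harder elements are
    gradually included."""
    n_atoms = D.shape[1]
    A = np.zeros((n_atoms, X.shape[1]))
    lam = lam0
    step = 1.0 / (np.linalg.norm(D, 2) ** 2 + 1e-12)  # ISTA step size
    for _ in range(outer_iters):
        # (1) self-paced selection: weight 1 for easy elements, 0 otherwise
        R = X - D @ A
        V = (R ** 2 < lam).astype(float)
        # (2) sparse coding on the selected elements (weighted ISTA)
        for _ in range(inner_iters):
            grad = -D.T @ (V * (X - D @ A))
            A = soft_threshold(A - step * grad, step * gamma)
        lam *= mu  # admit harder elements next round
    return A

# Usage: recover sparse codes for noisy data containing a few gross outliers.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 30)); D /= np.linalg.norm(D, axis=0)
A_true = np.where(rng.random((30, 50)) < 0.1, rng.standard_normal((30, 50)), 0)
X = D @ A_true + 0.01 * rng.standard_normal((20, 50))
X[rng.random(X.shape) < 0.02] += 5.0  # outlier elements
A_hat = self_paced_sparse_coding(X, D)
```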

    Combinatorial rigidity of Incidence systems and Application to Dictionary learning

    Given a hypergraph $H$ with $m$ hyperedges and a set $Q$ of $m$ \emph{pinning subspaces}, i.e.\ globally fixed subspaces in Euclidean space $\mathbb{R}^d$, a \emph{pinned subspace-incidence system} is the pair $(H, Q)$, with the constraint that each pinning subspace in $Q$ is contained in the subspace spanned by the point realizations in $\mathbb{R}^d$ of vertices of the corresponding hyperedge of $H$. This paper provides a combinatorial characterization of pinned subspace-incidence systems that are \emph{minimally rigid}, i.e.\ those systems that are guaranteed to generically yield a locally unique realization. Pinned subspace-incidence systems have applications in the \emph{Dictionary Learning (aka sparse coding)} problem, i.e.\ the problem of obtaining a sparse representation of a given set of data vectors by learning \emph{dictionary vectors} upon which the data vectors can be written as sparse linear combinations. Viewing the dictionary vectors from a geometric perspective as the spanning set of a subspace arrangement, the result gives a tight bound on the number of dictionary vectors for sufficiently randomly chosen data vectors, and gives a way of constructing a dictionary that meets the bound. For less stringent restrictions on data, but a natural modification of the dictionary learning problem, a further dictionary learning algorithm is provided. Although there are recent rigidity based approaches for low rank matrix completion, we are unaware of prior application of combinatorial rigidity techniques in the setting of Dictionary Learning. We also provide a systematic classification of problems related to dictionary learning together with various algorithms, their assumptions and performance.
    Comment: arXiv admin note: text overlap with arXiv:1503.01837, arXiv:1402.734
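    For readers unfamiliar with the underlying problem, the short sketch below illustrates plain dictionary learning (sparse coding) with an off-the-shelf solver; it shows only the problem being referenced, not the paper's rigidity-based construction, and all parameter values are arbitrary:

```python
# Illustration of the dictionary learning problem itself: learn dictionary
# vectors such that each data vector is a sparse linear combination of them.
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 8))          # 200 data vectors in R^8
dl = DictionaryLearning(n_components=16,   # number of dictionary vectors
                        transform_algorithm="omp",
                        transform_n_nonzero_coefs=3,  # sparsity per data vector
                        random_state=0)
codes = dl.fit_transform(X)                # sparse coefficients, shape (200, 16)
dictionary = dl.components_                # dictionary vectors, shape (16, 8)
assert np.all((codes != 0).sum(axis=1) <= 3)
```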

    SPRIGHT: A Fast and Robust Framework for Sparse Walsh-Hadamard Transform

    We consider the problem of computing the Walsh-Hadamard Transform (WHT) of some $N$-length input vector in the presence of noise, where the $N$-point Walsh spectrum is $K$-sparse with $K = O(N^{\delta})$ scaling sub-linearly in the input dimension $N$ for some $0<\delta<1$. Over the past decade, there has been a resurgence in research related to the computation of the Discrete Fourier Transform (DFT) for some length-$N$ input signal that has a $K$-sparse Fourier spectrum. In particular, through a sparse-graph code design, our earlier work on the Fast Fourier Aliasing-based Sparse Transform (FFAST) algorithm computes the $K$-sparse DFT in time $O(K\log K)$ by taking $O(K)$ noiseless samples. Inspired by the coding-theoretic design framework, Scheibler et al. proposed the Sparse Fast Hadamard Transform (SparseFHT) algorithm that elegantly computes the $K$-sparse WHT in the absence of noise using $O(K\log N)$ samples in time $O(K\log^2 N)$. However, the SparseFHT algorithm explicitly exploits the noiseless nature of the problem, and is not equipped to deal with scenarios where the observations are corrupted by noise. Therefore, a question of critical interest is whether this coding-theoretic framework can be made robust to noise. Further, if the answer is yes, what is the extra price that needs to be paid for being robust to noise? In this paper, we show, quite interestingly, that there is {\it no extra price} that needs to be paid for being robust to noise other than a constant factor. In other words, we can maintain the same sample complexity $O(K\log N)$ and the computational complexity $O(K\log^2 N)$ as those of the noiseless case, using our SParse Robust Iterative Graph-based Hadamard Transform (SPRIGHT) algorithm.
    Comment: Part of our results was reported in ISIT 2014, titled "The SPRIGHT algorithm for robust sparse Hadamard Transforms."
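    As background, the dense fast Walsh-Hadamard transform below is the $O(N\log N)$ baseline that sparse algorithms such as SparseFHT and SPRIGHT aim to beat when the spectrum is $K$-sparse; it is a standard textbook routine, not the SPRIGHT algorithm itself:

```python
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform (unnormalized), O(N log N) for N a power
    of two, computed with the usual butterfly recursion."""
    x = np.asarray(x, dtype=float).copy()
    n = len(x)
    assert n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    return x

# A spectrum with K = 2 nonzero Walsh coefficients out of N = 8.
spectrum = np.zeros(8); spectrum[[1, 6]] = [3.0, -2.0]
signal = fwht(spectrum) / 8          # inverse WHT (the transform is self-inverse up to 1/N)
recovered = fwht(signal)             # dense recovery of the sparse spectrum
assert np.allclose(recovered, spectrum)
```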

    Tight Hardness for Shortest Cycles and Paths in Sparse Graphs

    Fine-grained reductions have established equivalences between many core problems with $\tilde{O}(n^3)$-time algorithms on $n$-node weighted graphs, such as Shortest Cycle, All-Pairs Shortest Paths (APSP), Radius, Replacement Paths, Second Shortest Paths, and so on. These problems also have $\tilde{O}(mn)$-time algorithms on $m$-edge $n$-node weighted graphs, and such algorithms have wider applicability. Are these $mn$ bounds optimal when $m \ll n^2$? Starting from the hypothesis that the minimum weight $(2\ell+1)$-Clique problem in edge weighted graphs requires $n^{2\ell+1-o(1)}$ time, we prove that for all sparsities of the form $m = \Theta(n^{1+1/\ell})$, there is no $O(n^2 + mn^{1-\epsilon})$ time algorithm for $\epsilon>0$ for \emph{any} of the below problems: Minimum Weight $(2\ell+1)$-Cycle in a directed weighted graph, Shortest Cycle in a directed weighted graph, APSP in a directed or undirected weighted graph, Radius (or Eccentricities) in a directed or undirected weighted graph, Wiener index of a directed or undirected weighted graph, Replacement Paths in a directed weighted graph, Second Shortest Path in a directed weighted graph, and Betweenness Centrality of a given node in a directed weighted graph. That is, we prove hardness for a variety of sparse graph problems from the hardness of a dense graph problem. Our results also lead to new conditional lower bounds from several related hypotheses for unweighted sparse graph problems, including $k$-cycle, shortest cycle, Radius, Wiener index and APSP.
    Comment: Updated the [AR16] citatio
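    A worked instance of the stated parameterization (our illustration of the bound quoted in the abstract, not an additional claim):

```latex
% Take \ell = 2: the hypothesis is that minimum weight 5-Clique needs
% n^{5-o(1)} time, and the targeted sparsity and bound become
\[
  m = \Theta(n^{1+1/\ell}) = \Theta(n^{3/2}), \qquad
  mn^{1-\epsilon} = n^{5/2-\epsilon},
\]
% so none of the listed problems admits an O(n^2 + mn^{1-\epsilon})
% = O(n^{5/2-\epsilon})-time algorithm at this sparsity. Setting \ell = 1
% recovers the dense regime m = \Theta(n^2) under the minimum weight
% triangle hypothesis.
```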

    FAQ: Questions Asked Frequently

    We define and study the Functional Aggregate Query (FAQ) problem, which encompasses many frequently asked questions in constraint satisfaction, databases, matrix operations, probabilistic graphical models and logic. This is our main conceptual contribution. We then present a simple algorithm called "InsideOut" to solve this general problem. InsideOut is a variation of the traditional dynamic programming approach for constraint programming based on variable elimination. Our variation adds a couple of simple twists to basic variable elimination in order to deal with the generality of FAQ, to take full advantage of Grohe and Marx's fractional edge cover framework, and of the analysis of recent worst-case optimal relational join algorithms. As is the case with constraint programming and graphical model inference, to make InsideOut run efficiently we need to solve an optimization problem to compute an appropriate 'variable ordering'. The main technical contribution of this work is a precise characterization of when a variable ordering is 'semantically equivalent' to the variable ordering given by the input FAQ expression. Then, we design an approximation algorithm to find an equivalent variable ordering that has the best 'fractional FAQ-width'. Our results imply a host of known and a few new results in graphical model inference, matrix operations, relational joins, and logic. We also briefly explain how recent algorithms on beyond worst-case analysis for joins and those for solving SAT and #SAT can be viewed as variable elimination to solve FAQ over compactly represented input functions.
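    The sketch below is plain sum-product variable elimination, the classical special case that InsideOut builds on; the helper names and the toy counting query are our own, and this is not the InsideOut algorithm itself:

```python
from itertools import product

def variable_elimination(factors, domains, order):
    """Sum-product variable elimination over a set of factors.
    Each factor is a pair (vars_tuple, function over those variables)."""
    factors = list(factors)
    for var in order:
        touching = [f for f in factors if var in f[0]]
        rest = [f for f in factors if var not in f[0]]
        # variables of the new factor created by eliminating `var`
        new_vars = tuple(sorted({v for vs, _ in touching for v in vs} - {var}))
        def make_new(touching=touching, new_vars=new_vars, var=var):
            def g(*args):
                assign = dict(zip(new_vars, args))
                total = 0
                for val in domains[var]:        # sum the eliminated variable out
                    assign[var] = val
                    prod_val = 1
                    for vs, fn in touching:     # product of the touching factors
                        prod_val *= fn(*(assign[v] for v in vs))
                    total += prod_val
                return total
            return g
        factors = rest + [(new_vars, make_new())]
    # all variables eliminated: multiply the remaining constant factors
    result = 1
    for vs, fn in factors:
        result *= fn()
    return result

# Toy counting query: number of tuples (a, b, c) with R(a, b) and S(b, c).
R = {(0, 1), (1, 1)}
S = {(1, 0), (1, 2)}
doms = {"a": [0, 1], "b": [0, 1], "c": [0, 1, 2]}
factors = [(("a", "b"), lambda a, b: (a, b) in R),
           (("b", "c"), lambda b, c: (b, c) in S)]
print(variable_elimination(factors, doms, order=["a", "c", "b"]))  # -> 4
```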

    Improved Constructions for Non-adaptive Threshold Group Testing

    The basic goal in combinatorial group testing is to identify a set of up to $d$ defective items within a large population of size $n \gg d$ using a pooling strategy. Namely, the items can be grouped together in pools, and a single measurement would reveal whether there are one or more defectives in the pool. The threshold model is a generalization of this idea where a measurement returns positive if the number of defectives in the pool reaches a fixed threshold $u > 0$, negative if this number is no more than a fixed lower threshold $\ell < u$, and may behave arbitrarily otherwise. We study non-adaptive threshold group testing (in a possibly noisy setting) and show that, for this problem, $O(d^{g+2} (\log d) \log(n/d))$ measurements (where $g := u-\ell-1$ and $u$ is any fixed constant) suffice to identify the defectives, and also present almost matching lower bounds. This significantly improves the previously known (non-constructive) upper bound $O(d^{u+1} \log(n/d))$. Moreover, we obtain a framework for explicit construction of measurement schemes using lossless condensers. The number of measurements resulting from this scheme is ideally bounded by $O(d^{g+3} (\log d) \log n)$. Using state-of-the-art constructions of lossless condensers, however, we obtain explicit testing schemes with $O(d^{g+3} (\log d) \mathrm{qpoly}(\log n))$ and $O(d^{g+3+\beta} \mathrm{poly}(\log n))$ measurements, for arbitrary constant $\beta > 0$.
    Comment: Revised draft of the full version. Contains various edits and a new lower bounds section. Preliminary version appeared in Proceedings of the 37th International Colloquium on Automata, Languages and Programming (ICALP), 201
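    A toy simulation of the threshold measurement model (with arbitrary behavior inside the gap) together with a brute-force consistency decoder for tiny instances; this only illustrates the model and is unrelated to the condenser-based constructions of the paper:

```python
import itertools, random

def threshold_measure(pool, defectives, u, ell):
    """Threshold group-testing measurement: positive if the pool contains at
    least u defectives, negative if at most ell, and arbitrary in between
    (modeled here as a coin flip). Noiseless outside the gap."""
    k = len(pool & defectives)
    if k >= u:
        return 1
    if k <= ell:
        return 0
    return random.randint(0, 1)  # arbitrary behavior inside the gap

def consistent_candidates(pools, outcomes, n, d, u, ell):
    """Brute-force decoder for tiny instances: return every set of at most d
    items that could have produced the observed outcomes."""
    out = []
    for size in range(d + 1):
        for cand in itertools.combinations(range(n), size):
            cand = set(cand)
            ok = True
            for pool, y in zip(pools, outcomes):
                k = len(pool & cand)
                if (k >= u and y == 0) or (k <= ell and y == 1):
                    ok = False
                    break
            if ok:
                out.append(cand)
    return out

# Tiny demo: n = 8 items, d = 2 defectives, thresholds ell = 0, u = 2 (gap g = 1).
random.seed(0)
n, d, u, ell = 8, 2, 2, 0
defectives = {2, 5}
pools = [set(random.sample(range(n), 4)) for _ in range(20)]
outcomes = [threshold_measure(p, defectives, u, ell) for p in pools]
# The true defective set is always among the consistent candidates.
print(consistent_candidates(pools, outcomes, n, d, u, ell))
```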

    Exact Learning from an Honest Teacher That Answers Membership Queries

    Consider a teacher that holds a function $f:X\to R$ from some class of functions $C$. The teacher receives from the learner an element $d$ in the domain $X$ (a query) and returns the value of the function at $d$, $f(d)\in R$. The learner's goal is to find $f$ with a minimum number of queries, optimal time complexity, and optimal resources. In this survey, we present some of the results known from the literature, different techniques used, some new problems, and open problems.
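    A small, self-contained instance of this query model (our toy example, not taken from the survey): exactly learning a threshold function on a finite domain from membership queries alone, using binary search:

```python
def learn_threshold_class(teacher, n):
    """Exactly learn a hidden function f(x) = 1 iff x >= t on {0, ..., n-1}
    using only membership queries; binary search needs O(log n) queries."""
    lo, hi = 0, n      # the unknown threshold t lies in [lo, hi]
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if teacher(mid):   # membership query: the value f(mid)
            hi = mid       # f(mid) = 1, so t <= mid
        else:
            lo = mid + 1   # f(mid) = 0, so t > mid
    return lo, queries

# The hidden function is f(x) = 1 iff x >= 37 over the domain {0, ..., 1023}.
t_hidden = 37
teacher = lambda x: int(x >= t_hidden)
t_learned, q = learn_threshold_class(teacher, 1024)
assert t_learned == t_hidden and q <= 11   # about log2(n) membership queries
```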

    Provable Bounds for Learning Some Deep Representations

    We give algorithms with provable guarantees that learn a class of deep nets in the generative model view popularized by Hinton and others. Our generative model is an $n$-node multilayer neural net that has degree at most $n^{\gamma}$ for some $\gamma <1$ and each edge has a random edge weight in $[-1,1]$. Our algorithm learns {\em almost all} networks in this class with polynomial running time. The sample complexity is quadratic or cubic depending upon the details of the model. The algorithm uses layerwise learning. It is based upon a novel idea of observing correlations among features and using these to infer the underlying edge structure via a global graph recovery procedure. The analysis of the algorithm reveals interesting structure of neural networks with random edge weights.
    Comment: The first 18 pages serve as an extended abstract and a 36 pages long technical appendix follow
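    A toy, single-layer illustration of the correlation idea mentioned above (our construction, not the paper's algorithm): observed units that share a hidden parent show markedly larger pairwise covariance, which is the kind of signal a layerwise structure-recovery procedure can exploit:

```python
import numpy as np

rng = np.random.default_rng(0)
m_hidden, n_obs, deg, T = 30, 60, 3, 20000
# Random sparse single-layer generative net: each observed unit has `deg`
# hidden parents with random weights in [-1, 1].
W = np.zeros((n_obs, m_hidden))
for i in range(n_obs):
    parents = rng.choice(m_hidden, size=deg, replace=False)
    W[i, parents] = rng.uniform(-1.0, 1.0, size=deg)

H = (rng.random((m_hidden, T)) < 0.1).astype(float)  # sparse 0/1 hidden layer
X = (W @ H > 0).astype(float)                        # thresholded observed layer

C = np.cov(X)                                        # pairwise covariances
share = (np.abs(W) @ np.abs(W).T) > 0                # True iff a common hidden parent
off = ~np.eye(n_obs, dtype=bool)
print("mean |cov|, shared hidden parent:   ", np.abs(C[share & off]).mean())
print("mean |cov|, no shared hidden parent:", np.abs(C[~share & off]).mean())
```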

    A Survey on Learning to Hash

    Nearest neighbor search is the problem of finding the data points in a database whose distances to a query point are the smallest. Learning to hash is one of the major solutions to this problem and has been widely studied recently. In this paper, we present a comprehensive survey of learning to hash algorithms and categorize them, according to the manner in which similarities are preserved, into pairwise similarity preserving, multiwise similarity preserving, implicit similarity preserving, and quantization, and discuss their relations. We treat quantization separately from pairwise similarity preserving because its objective function is very different, even though quantization, as we show, can be derived from preserving pairwise similarities. In addition, we present the evaluation protocols and a general performance analysis, and point out that the quantization algorithms perform superiorly in terms of search accuracy, search time cost, and space cost. Finally, we introduce a few emerging topics.
    Comment: To appear in IEEE Transactions On Pattern Analysis and Machine Intelligence (TPAMI)
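    For context, the sketch below is the simplest data-independent baseline (sign of random projections followed by Hamming ranking) against which learning-to-hash methods are compared; learned methods fit the projections or codebooks to the data instead. The function names and parameters are illustrative:

```python
import numpy as np

def train_hash(X, n_bits, rng):
    """Data-independent baseline: random projection directions.
    Learning-to-hash methods would fit these to the data instead."""
    return rng.standard_normal((X.shape[1], n_bits))

def encode(X, P):
    return (X @ P > 0).astype(np.uint8)     # n x n_bits binary codes

def hamming_search(query_code, db_codes, k=5):
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists)[:k]            # indices of the k closest codes

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 64))         # database vectors
P = train_hash(X, n_bits=32, rng=rng)
codes = encode(X, P)
q = X[0] + 0.05 * rng.standard_normal(64)   # a query near item 0
neighbors = hamming_search(encode(q[None, :], P)[0], codes)
print(neighbors)                            # item 0 should rank near the top
```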

    Multi-view Vector-valued Manifold Regularization for Multi-label Image Classification

    In computer vision, image datasets used for classification are naturally associated with multiple labels and comprised of multiple views, because each image may contain several objects (e.g. pedestrian, bicycle and tree) and is properly characterized by multiple visual features (e.g. color, texture and shape). Currently available tools ignore either the label relationship or the view complementarity. Motivated by the success of the vector-valued function that constructs matrix-valued kernels to explore the multi-label structure in the output space, we introduce multi-view vector-valued manifold regularization (MV$^3$MR) to integrate multiple features. MV$^3$MR exploits the complementary property of different features and discovers the intrinsic local geometry of the compact support shared by different features under the theme of manifold regularization. We conducted extensive experiments on two challenging but popular datasets, PASCAL VOC'07 (VOC) and MIR Flickr (MIR), and validated the effectiveness of the proposed MV$^3$MR for image classification.
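    A minimal sketch of the manifold-regularization ingredient referenced here: build a k-NN graph Laplacian per feature view and combine the views, so a label-score vector can be penalized for varying non-smoothly over the shared geometry. This is our simplified illustration, not the MV$^3$MR objective:

```python
import numpy as np

def knn_laplacian(X, k=5):
    """Unnormalized graph Laplacian of a symmetrized k-NN graph built from one
    feature view; manifold regularization penalizes f^T L f so that predictions
    vary smoothly over this graph."""
    n = X.shape[0]
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise sq. distances
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(D2[i])[1:k + 1]                  # skip self
        W[i, nbrs] = 1.0
    W = np.maximum(W, W.T)                                  # symmetrize
    return np.diag(W.sum(1)) - W

# Toy combination of per-view Laplacians (the multi-view ingredient only,
# not the full MV^3MR objective): weight and sum the views.
rng = np.random.default_rng(0)
views = [rng.standard_normal((100, 16)), rng.standard_normal((100, 8))]  # e.g. color, texture
weights = [0.6, 0.4]
L = sum(w * knn_laplacian(V) for w, V in zip(weights, views))
f = rng.standard_normal(100)                # a candidate label-score vector
smoothness_penalty = f @ L @ f              # the manifold-regularization term
print(smoothness_penalty)
```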