Sparse Subspace Clustering: Algorithm, Theory, and Applications
In many real-world problems, we are dealing with collections of
high-dimensional data, such as images, videos, text and web documents, DNA
microarray data, and more. Often, high-dimensional data lie close to
low-dimensional structures corresponding to several classes or categories to
which the data belong. In this paper, we propose and study an algorithm, called
Sparse Subspace Clustering (SSC), to cluster data points that lie in a union of
low-dimensional subspaces. The key idea is that, among infinitely many possible
representations of a data point in terms of other points, a sparse
representation corresponds to selecting a few points from the same subspace.
This motivates solving a sparse optimization program whose solution is used in
a spectral clustering framework to infer the clustering of data into subspaces.
Since solving the sparse optimization program is in general NP-hard, we
consider a convex relaxation and show that, under appropriate conditions on the
arrangement of subspaces and the distribution of data, the proposed
minimization program succeeds in recovering the desired sparse representations.
The proposed algorithm can be solved efficiently and can handle data points
near the intersections of subspaces. Another key advantage of the proposed
algorithm with respect to the state of the art is that it can deal with data
nuisances, such as noise, sparse outlying entries, and missing entries,
directly by incorporating the model of the data into the sparse optimization
program. We demonstrate the effectiveness of the proposed algorithm through
experiments on synthetic data as well as the two real-world problems of motion
segmentation and face clustering.
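
The pipeline described in this abstract (sparse self-representation followed by spectral clustering) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: it solves a Lasso-style relaxation with plain proximal gradient descent (ISTA) rather than whatever solver the paper uses, ignores outliers and missing entries, and uses a small hand-written k-means with farthest-first initialisation. The function names and the parameters `lam` and `n_iter` are hypothetical.

```python
import numpy as np

def sparse_self_representation(X, lam=0.05, n_iter=500):
    """Approximately solve
        min_C 0.5 * ||X - X C||_F^2 + lam * ||C||_1   s.t. diag(C) = 0
    by proximal gradient descent (ISTA). X holds one data point per column."""
    n = X.shape[1]
    G = X.T @ X                                # Gram matrix of the data
    step = 1.0 / np.linalg.norm(G, 2)          # 1 / Lipschitz constant of the gradient
    C = np.zeros((n, n))
    for _ in range(n_iter):
        Z = C - step * (G @ C - G)             # gradient step on the quadratic term
        C = np.sign(Z) * np.maximum(np.abs(Z) - step * lam, 0.0)  # soft-threshold
        np.fill_diagonal(C, 0.0)               # a point may not represent itself
    return C

def spectral_clustering(C, k, n_iter=50):
    """Cluster the graph with affinity |C| + |C|^T into k groups."""
    W = np.abs(C) + np.abs(C).T
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)                # eigenvalues in ascending order
    E = vecs[:, :k]                            # spectral embedding
    E = E / np.maximum(np.linalg.norm(E, axis=1, keepdims=True), 1e-12)
    # Farthest-first initialisation keeps this k-means deterministic.
    centers = [E[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((E - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(E[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((E[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = E[labels == j].mean(axis=0)
    return labels

def ssc(X, k, lam=0.05):
    """Hypothetical wrapper: normalise columns, build C, then cluster."""
    Xn = X / np.maximum(np.linalg.norm(X, axis=0, keepdims=True), 1e-12)
    return spectral_clustering(sparse_self_representation(Xn, lam), k)
```

Calling `ssc(X, k)` on a matrix whose columns are data points returns one integer label per column. The zero-diagonal constraint is what rules out the trivial solution in which each point represents itself.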
Block-Sparse Recovery via Convex Optimization
Given a dictionary that consists of multiple blocks and a signal that lives
in the range space of only a few blocks, we study the problem of finding a
block-sparse representation of the signal, i.e., a representation that uses the
minimum number of blocks. Motivated by signal/image processing and computer
vision applications, such as face recognition, we consider the block-sparse
recovery problem in the case where the number of atoms in each block is
arbitrary, possibly much larger than the dimension of the underlying subspace.
To find a block-sparse representation of a signal, we propose two classes of
non-convex optimization programs, which aim to minimize the number of nonzero
coefficient blocks and the number of nonzero reconstructed vectors from the
blocks, respectively. Since both classes of problems are NP-hard, we propose
convex relaxations and derive conditions under which each class of the convex
programs is equivalent to the original non-convex formulation. Our conditions
depend on the notions of mutual and cumulative subspace coherence of a
dictionary, which are natural generalizations of existing notions of mutual and
cumulative coherence. We evaluate the performance of the proposed convex
programs through simulations as well as real experiments on face recognition.
We show that treating the face recognition problem as a block-sparse recovery
problem improves the state-of-the-art results by 10% with only 25% of the
training data. Comment: IEEE Transactions on Signal Processing
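
The two classes of programs mentioned in this abstract can be written out as follows. This is a sketch under assumed notation, not taken from the abstract: $B = [B[1] \cdots B[n]]$ is the dictionary with blocks $B[i]$, $c = [c[1]^\top \cdots c[n]^\top]^\top$ is the coefficient vector partitioned to match, $y$ is the signal, $I(\cdot)$ is the indicator function, and $q \ge 1$ is a norm of choice. The labels $P_{q/0}$, $P'_{q/0}$, $P_{q/1}$, $P'_{q/1}$ are likewise assumptions.

```latex
% Non-convex programs: count nonzero coefficient blocks (P) or
% nonzero reconstructed vectors (P').
\begin{align*}
P_{q/0}:  \quad & \min_{c} \sum_{i=1}^{n} I\big(\|c[i]\|_q > 0\big)
                  && \text{s.t. } y = Bc, \\
P'_{q/0}: \quad & \min_{c} \sum_{i=1}^{n} I\big(\|B[i]\,c[i]\|_q > 0\big)
                  && \text{s.t. } y = Bc.
\end{align*}
% Convex relaxations: replace each indicator by the norm itself.
\begin{align*}
P_{q/1}:  \quad & \min_{c} \sum_{i=1}^{n} \|c[i]\|_q
                  && \text{s.t. } y = Bc, \\
P'_{q/1}: \quad & \min_{c} \sum_{i=1}^{n} \|B[i]\,c[i]\|_q
                  && \text{s.t. } y = Bc.
\end{align*}
```

The two classes differ when a block has more atoms than the dimension of its subspace: many coefficient vectors $c[i]$ then yield the same reconstructed vector $B[i]\,c[i]$, so penalising the reconstruction and penalising the coefficients are no longer interchangeable.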
BIT: Bi-Level Temporal Modeling for Efficient Supervised Action Segmentation
We address the task of supervised action segmentation which aims to partition
a video into non-overlapping segments, each representing a different action.
Recent works apply transformers to perform temporal modeling at the
frame level, which incurs a high computational cost and fails to capture
action dependencies over long temporal horizons. To address these issues, we
propose an efficient BI-level Temporal modeling (BIT) framework that learns
explicit action tokens to represent action segments and performs temporal
modeling on the frame and action levels in parallel, while maintaining a low
computational cost. Our model contains (i) a frame branch that uses convolution
to learn frame-level relationships, (ii) an action branch that uses a transformer
to learn action-level dependencies with a small set of action tokens and (iii)
cross-attentions to allow communication between the two branches. We apply and
extend a set-prediction objective to allow each action token to represent one
or multiple action segments, thus avoiding the need to learn a large number of
tokens over long videos with many segments. Thanks to the design of our action branch,
we can also seamlessly leverage textual transcripts of videos (when available)
to help action segmentation by using them to initialize the action tokens. We
evaluate our model on four video datasets (two egocentric and two third-person)
for action segmentation with and without transcripts, showing that BIT
significantly improves the state-of-the-art accuracy with much lower
computational cost (30 times faster) compared to existing transformer-based
methods. Comment: 9 pages, 6 figures
Towards Effective Multi-Label Recognition Attacks via Knowledge Graph Consistency
Many real-world applications of image recognition require multi-label
learning, whose goal is to find all labels in an image. Thus, robustness of
such systems to adversarial image perturbations is extremely important.
However, despite a large body of recent research on adversarial attacks, the
scope of the existing works is mainly limited to the multi-class setting, where
each image contains a single label. We show that the naive extensions of
multi-class attacks to the multi-label setting lead to violating label
relationships, modeled by a knowledge graph, and can be detected using a
consistency verification scheme. Therefore, we propose a graph-consistent
multi-label attack framework, which searches for small image perturbations that
lead to misclassifying a desired target set while respecting label hierarchies.
By extensive experiments on two datasets and using several multi-label
recognition models, we show that our method generates extremely successful
attacks that, unlike naive multi-label perturbations, can produce model
predictions consistent with the knowledge graph.
Ionic liquid containing high-density polyethylene supported tungstate: a novel, efficient, and highly recoverable catalyst
Synthesis and catalytic application of polymer-based nanocomposites are important subjects among researchers due to their high lipophilicity as well as high chemical and mechanical stability. In the present work, a novel nanocomposite material involving ionic liquid and high-density polyethylene supported tungstate (PE/IL-WO4=) is synthesized and characterized, and its catalytic application is investigated. The coacervation method was used to incorporate 1-methyl-3-octylimidazolium bromide ([MOIm][Br]) ionic liquid in high-density polyethylene, resulting in a PE/IL composite. Subsequently, tungstate was anchored on PE/IL to give the PE/IL-WO4= catalyst. PXRD, FT-IR, EDX, TGA, and SEM analyses were used to characterize the PE/IL-WO4= composite. This material demonstrated high catalytic efficiency in the synthesis of bioactive tetrahydrobenzo[a]xanthen-11-ones under green conditions. Recoverability and leaching tests were performed to investigate the stability and durability of the designed PE/IL-WO4= catalyst under the applied conditions.
Automatic Calibration of Cameras with Special Motions
We consider the problem of auto-calibrating the intrinsic
parameters of a camera moving with a special motion: the
rotation axis of the camera being perpendicular to its translation
direction. Our method for calibrating the camera is
based on Kruppa’s equation, which in general requires solving
a set of nonlinear equations. We prove in a theorem how
to recover the true scale of Kruppa’s equation from the
eigenvalues of a matrix formed using the fundamental matrix
between two views.