Robust Subspace Learning: Robust PCA, Robust Subspace Tracking, and Robust Subspace Recovery
PCA is one of the most widely used dimension reduction techniques. A related,
easier problem is "subspace learning" or "subspace estimation". Given
relatively clean data, both are easily solved via singular value decomposition
(SVD). The problem of subspace learning or PCA in the presence of outliers is
called robust subspace learning or robust PCA (RPCA). For long data sequences,
if one tries to use a single lower dimensional subspace to represent the data,
the required subspace dimension may end up being quite large. For such data, a
better model is to assume that it lies in a low-dimensional subspace that can
change over time, albeit gradually. The problem of tracking such data (and the
subspaces) while being robust to outliers is called robust subspace tracking
(RST). This article provides a magazine-style overview of the entire field of
robust subspace learning and tracking. In particular, solutions for three
problems are discussed in detail: RPCA via sparse+low-rank matrix decomposition
(S+LR), RST via S+LR, and "robust subspace recovery (RSR)". RSR assumes that an
entire data vector is either an outlier or an inlier. The S+LR formulation
instead assumes that outliers occur on only a few data vector indices and hence
are well modeled as sparse corruptions.
Comment: To appear, IEEE Signal Processing Magazine, July 2018
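To make the S+LR formulation concrete: a standard instantiation is Principal Component Pursuit, which seeks min ||L||_* + lambda ||S||_1 subject to M = L + S. Below is a minimal NumPy sketch of an alternating singular-value/soft-thresholding solver for that objective; the step-size heuristic is an assumption, and this is an illustration rather than any of the article's surveyed algorithms.

```python
import numpy as np

def pcp_s_plus_lr(M, lam=None, n_iter=100):
    """Approximate Principal Component Pursuit: split M into a
    low-rank part L and a sparse outlier part S by alternating
    singular-value thresholding (for L) and entrywise soft
    thresholding (for S). Illustrative sketch only."""
    m, n = M.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))    # standard PCP weight
    mu = 0.25 * m * n / np.abs(M).sum()      # step-size heuristic (assumed)
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(n_iter):
        # L-update: shrink singular values of the outlier-free residual
        U, sig, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U * np.maximum(sig - 1.0 / mu, 0)) @ Vt
        # S-update: soft-threshold the low-rank residual entrywise
        R = M - L
        S = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0)
    return L, S
```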
Exploratory search through large video corpora
Activity retrieval is a growing field in electrical engineering that specializes in the search and retrieval of relevant activities and events in video corpora. With the affordability and popularity of cameras for government, personal and retail use, the quantity of available video data is rapidly outpacing our ability to reason over it. To empower users to navigate and interact with the contents of these video corpora, we propose a framework for exploratory search that emphasizes activity structure and search space reduction over complex feature representations.
Exploratory search is a user-driven process wherein a person provides a system with a query describing the activity, event, or object they are interested in finding. Typically, this description takes the implicit form of one or more exemplar videos, but it can also involve an explicit description. The system returns candidate matches, followed by query refinement and iteration. System performance is judged by the run-time of the system and the precision/recall curve of the query matches returned.
Scaling is one of the primary challenges in video search. From vast web-video archives like YouTube (1 billion videos and counting) to the 30 million active surveillance cameras shooting an estimated 4 billion hours of footage every week in the United States, trying to find a set of matches can be like looking for a needle in a haystack. Our goal is to create an efficient archival representation of video corpora that can be computed in real time as video streams in, and that then enables a user to quickly retrieve a set of matching results.
First, we design a system for rapidly identifying simple queries in large-scale video corpora. Instead of focusing on feature design, our system focuses on the spatiotemporal relationships between those features as a means of disambiguating an activity of interest from background. We define a semantic feature vocabulary of concepts that are both readily extracted from video and easily understood by an operator. As data streams in, features are hashed to an inverted index and retrieved in constant time after the system is presented with a user's query.
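A minimal sketch of the inverted-index idea described above, with hypothetical vocabulary tokens standing in for the semantic features (the actual system's vocabulary and hashing are richer):

```python
from collections import defaultdict

# Inverted index: each semantic token maps to the (video, frame)
# locations where it was observed. Token names are hypothetical.
index = defaultdict(list)

def ingest(video_id, frame, tokens):
    """Called as video streams in: hash each extracted token."""
    for tok in tokens:
        index[tok].append((video_id, frame))

def candidates(query_tokens):
    """Amortized constant-time lookup per token; intersecting the
    posting lists yields candidate locations for refinement."""
    postings = [set(index[tok]) for tok in query_tokens]
    return set.intersection(*postings) if postings else set()

ingest("cam1", 17, ["color:red", "size:large", "type:vehicle"])
ingest("cam1", 18, ["color:red", "speed:fast", "type:vehicle"])
print(candidates(["color:red", "type:vehicle"]))  # {('cam1', 17), ('cam1', 18)}
```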
We take a zero-shot approach to exploratory search: the user manually assembles vocabulary elements like color, speed, size and type into a graph. Given that information, we perform an initial downsampling of the archived data, and design a novel dynamic programming approach based on genome-sequencing to search for similar patterns. Experimental results indicate that this approach outperforms other methods for detecting activities in surveillance video datasets.
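The genome-sequencing analogy maps naturally to local sequence alignment. As a sketch of the kind of dynamic program involved (not the dissertation's exact algorithm), a Smith-Waterman-style recurrence scores how well a query token sequence locally aligns against an archived one:

```python
def local_align(query, archive, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman-style local alignment of a query token sequence
    against an archived token sequence; returns the best local
    alignment score (higher = better match)."""
    n, m = len(query), len(archive)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if query[i - 1] == archive[j - 1] else mismatch
            H[i][j] = max(0,                    # restart (local alignment)
                          H[i - 1][j - 1] + s,  # match/mismatch
                          H[i - 1][j] + gap,    # gap in archive
                          H[i][j - 1] + gap)    # gap in query
            best = max(best, H[i][j])
    return best
```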
Second, we address the problem of representing complex activities that take place over long spans of space and time. Subgraph and graph matching methods have seen limited use in exploratory search because both problems are provably NP-hard. In this work, we render these problems computationally tractable by identifying the maximally discriminative spanning tree (MDST), and using dynamic programming to optimally reduce the archive data based on a custom algorithm for tree-matching in attributed relational graphs. We demonstrate the efficacy of this approach on popular surveillance video datasets in several modalities.
Finally, we design an approach for successive search space reduction in subgraph matching problems. Given a query graph and archival data, our algorithm iteratively selects spanning trees from the query graph that optimize the expected search space reduction at each step until the archive converges. We use this approach to efficiently reason over video surveillance datasets, simulated data, as well as large graphs of protein data.
Structured Sub-Nyquist Sampling with Applications in Compressive Toeplitz Covariance Estimation, Super-Resolution and Phase Retrieval
Sub-Nyquist sampling has received a huge amount of interest in the past decade. In classical compressed sensing theory, if the measurement procedure satisfies a particular condition known as the Restricted Isometry Property (RIP), we can achieve stable recovery of signals with low-dimensional intrinsic structure using an order-wise optimal sample size. Such low-dimensional structures include sparsity and low rank in both the vector and matrix cases. The main drawback of conventional compressed sensing theory is that random measurements are required to ensure the RIP. However, in many applications, such as imaging and array signal processing, applying independent random measurements may not be practical because the systems are deterministic. Moreover, compressed sensing based on random measurements relies on convex programs for signal recovery even in the noiseless case, and solving those programs is computationally intensive when the ambient dimension is large, especially in the matrix case.

The main contribution of this dissertation is a deterministic sub-Nyquist sampling framework for compressing structured signals, together with computationally efficient algorithms. Besides the widely studied sparse and low-rank structures, we focus in particular on cases where the signals of interest are stationary or the measurements are of Fourier type. The key difference between our work and classical compressed sensing theory is that we explicitly exploit the second-order statistics of the signals and study the equivalent quadratic measurement model in the correlation domain. The essential observation made in this dissertation is that a difference/sum coarray structure arises from the quadratic model when the measurements are of Fourier type. With these observations, we are able to achieve a better compression rate for covariance estimation, identify more sources in array signal processing, or recover signals of larger sparsity.

We first study the problem of Toeplitz covariance estimation. In particular, we show how to achieve an order-wise optimal compression rate using the idea of sparse arrays in both the general and low-rank cases. Then, an analysis framework for super-resolution with a positivity constraint is established; we present fundamental robustness guarantees, efficient algorithms, and applications in practice. Next, we study the problem of phase retrieval, to which we successfully apply sparse array ideas by fully exploiting the quadratic measurement model; we achieve near-optimal sample complexity for both the sparse and general cases with practical Fourier measurements, and provide efficient, deterministic recovery algorithms. In the end, we further elaborate on the essential role of the non-negativity constraint in underdetermined inverse problems. In particular, we analyze the nonlinear coarray interpolation problem and develop a universal upper bound on the interpolation error. The bilinear problem with a non-negativity constraint is considered next, and an exact characterization of the ambiguous solutions is established for the first time in the literature. At last, we show how to apply the nested array idea to real problems such as Kriging: using spatial correlation information, we obtain a stable estimate of the field of interest with fewer sensors than classic methodologies require. Extensive numerical experiments demonstrate our theoretical claims.
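To illustrate the coarray idea: with a two-level nested array, N = n1 + n2 physical sensors produce a difference set whose contiguous part has O(n1 * n2) lags, so far more Toeplitz correlation lags can be estimated than there are sensors. The sketch below is an assumption-laden illustration (single-snapshot lag averaging), not the dissertation's estimator.

```python
import numpy as np

def nested_array(n1, n2):
    """Positions of a two-level nested array: a dense ULA of n1
    sensors plus a sparse ULA with spacing n1 + 1."""
    return np.concatenate([np.arange(1, n1 + 1),
                           (n1 + 1) * np.arange(1, n2 + 1)])

def coarray_lags(x, pos):
    """Estimate correlation lags r[k] of a wide-sense stationary signal
    observed only at positions `pos`, by averaging x[i] * conj(x[j])
    over all sensor pairs whose position difference equals k."""
    acc, cnt = {}, {}
    for i, p in enumerate(pos):
        for j, q in enumerate(pos):
            k = p - q
            acc[k] = acc.get(k, 0.0) + x[i] * np.conj(x[j])
            cnt[k] = cnt.get(k, 0) + 1
    lags = sorted(acc)
    return lags, np.array([acc[k] / cnt[k] for k in lags])

pos = nested_array(3, 3)   # 6 sensors at positions 1, 2, 3, 4, 8, 12
lags, _ = coarray_lags(np.ones(len(pos)), pos)
print(len(pos), len(lags))  # -> 6 23: 6 sensors yield 23 distinct lags
```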
Dagstuhl News January - December 2008
"Dagstuhl News" is a publication edited especially for the members of the Foundation "Informatikzentrum Schloss Dagstuhl" to thank them for their support. The News give a summary of the scientific work being done in Dagstuhl. Each Dagstuhl Seminar is presented by a small abstract describing the contents and scientific highlights of the seminar as well as the perspectives or challenges of the research topic
Reconstructing Graph Diffusion History from a Single Snapshot
Diffusion on graphs is ubiquitous with numerous high-impact applications. In
these applications, complete diffusion histories play an essential role in
terms of identifying dynamical patterns, reflecting on precaution actions, and
forecasting intervention effects. Despite their importance, complete diffusion
histories are rarely available and are highly challenging to reconstruct due to
ill-posedness, explosive search space, and scarcity of training data. To date,
few methods exist for diffusion history reconstruction. They are exclusively
based on the maximum likelihood estimation (MLE) formulation and require
knowledge of the true diffusion parameters. In this paper, we study an even
harder problem, namely reconstructing Diffusion history from A single SnapsHot
(DASH), where
we seek to reconstruct the history from only the final snapshot without knowing
true diffusion parameters. We start with theoretical analyses that reveal a
fundamental limitation of the MLE formulation. We prove: (a) estimation error
of diffusion parameters is unavoidable due to NP-hardness of diffusion
parameter estimation, and (b) the MLE formulation is sensitive to estimation
error of diffusion parameters. To overcome the inherent limitation of the MLE
formulation, we propose a novel barycenter formulation: finding the barycenter
of the posterior distribution of histories, which is provably stable against
the estimation error of diffusion parameters. We further develop an effective
solver named DIffusion hiTting Times with Optimal proposal (DITTO) by reducing
the problem to estimating posterior expected hitting times via the
Metropolis-Hastings Markov chain Monte Carlo method (M-H MCMC) and employing
an unsupervised graph neural network to learn an optimal proposal to accelerate
the convergence of M-H MCMC. We conduct extensive experiments to demonstrate
the efficacy of the proposed method.
Comment: Full version of the KDD 2023 paper. Our code is available at
https://github.com/q-rz/KDD23-DITT
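To sketch the core estimation step described above: the barycenter formulation needs posterior expectations (of hitting times, in DITTO's case), which Metropolis-Hastings estimates by averaging a function of the chain's states. The generic sketch below assumes a caller-supplied symmetric proposal; DITTO's contribution is learning that proposal with an unsupervised GNN so the chain mixes faster.

```python
import numpy as np

def mh_expectation(log_post, f, x0, proposal, n_steps=20000, burn=2000):
    """Estimate a posterior expectation E[f(x)] by Metropolis-Hastings
    with a symmetric proposal and an unnormalized log-posterior.
    Generic illustration of the M-H MCMC step, not DITTO itself."""
    rng = np.random.default_rng(0)
    x, lp = x0, log_post(x0)
    total, count = 0.0, 0
    for t in range(n_steps):
        y = proposal(x, rng)                # propose a new state
        ly = log_post(y)
        if np.log(rng.random()) < ly - lp:  # accept/reject (symmetric q)
            x, lp = y, ly
        if t >= burn:                       # average after burn-in
            total += f(x)
            count += 1
    return total / count
```

For instance, `mh_expectation(lambda x: -0.5 * x**2, lambda x: x**2, 0.0, lambda x, rng: x + rng.normal())` estimates the second moment of a standard Gaussian (close to 1).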
A Survey of Self-supervised Learning from Multiple Perspectives: Algorithms, Applications and Future Trends
Deep supervised learning algorithms generally require large numbers of
labeled examples to achieve satisfactory performance. However, collecting and
labeling too many examples can be costly and time-consuming. As a subset of
unsupervised learning, self-supervised learning (SSL) aims to learn useful
features from unlabeled examples without any human-annotated labels. SSL has
recently attracted much attention and many related algorithms have been
developed. However, there are few comprehensive studies that explain the
connections and evolution of different SSL variants. In this paper, we provide
a review of various SSL methods from the perspectives of algorithms,
applications, three main trends, and open questions. First, the motivations of
most SSL algorithms are introduced in detail, and their commonalities and
differences are compared. Second, typical applications of SSL in domains such
as image processing and computer vision (CV), as well as natural language
processing (NLP), are discussed. Finally, the three main trends of SSL and the
open research questions are discussed. A collection of useful materials is
available at https://github.com/guijiejie/SSL
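As one concrete instance of the objectives such a survey covers, contrastive SSL trains an encoder so that two augmented views of the same example agree. A minimal NumPy sketch of the NT-Xent (InfoNCE) loss, not tied to any particular method in the paper:

```python
import numpy as np

def info_nce(z1, z2, tau=0.5):
    """NT-Xent / InfoNCE loss for a batch of N embedding pairs
    (z1[i], z2[i]) from two augmented views of the same example:
    positive pairs are pulled together, all other pairs pushed apart."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    z = np.concatenate([z1, z2])            # (2N, d) stacked views
    sim = z @ z.T / tau                     # temperature-scaled cosines
    np.fill_diagonal(sim, -np.inf)          # exclude self-similarity
    n = len(z1)
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])  # positives
    logprob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -logprob[np.arange(2 * n), pos].mean()
```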
Shape Representations Using Nested Descriptors
The problem of shape representation is a core problem in computer vision. It can be argued that shape representation is the most central representational problem for computer vision, since unlike texture or color, shape alone can be used for perceptual tasks such as image matching, object detection and object categorization.
This dissertation introduces a new shape representation called the nested descriptor. A nested descriptor represents shape both globally and locally by pooling salient scaled and oriented complex gradients in a large nested support set. We show that this nesting property introduces a nested correlation structure that enables a new local distance function called the nesting distance, which provides a provably robust similarity function for image matching. Furthermore, the nesting property suggests an elegant flower-like normalization strategy called a log-spiral difference. We show that this normalization enables a compact binary representation and is equivalent to a form of bottom-up saliency. This suggests that the nested descriptor's representational power is due to representing salient edges, which makes a fundamental connection between the saliency and local feature descriptor literatures. In this dissertation, we introduce three examples of shape representation using nested descriptors: nested shape descriptors for imagery, nested motion descriptors for video, and nested pooling for activities. We show evaluation results for these representations that demonstrate state-of-the-art performance on image matching, wide-baseline stereo, and activity recognition tasks.
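A schematic of the nesting idea only, under the assumption that pooling amounts to averaging gradient magnitude over nested circular supports; the dissertation's descriptor additionally pools scaled and oriented complex gradients and applies the log-spiral difference normalization, which this sketch merely echoes by binarizing differences of adjacent supports.

```python
import numpy as np

def nested_code(grad_mag, center, radii):
    """Average gradient magnitude over nested circular supports around
    `center`, then binarize differences between adjacent supports.
    Schematic only: the actual nested descriptor pools scaled/oriented
    complex gradients and normalizes via log-spiral differences."""
    cy, cx = center
    ys, xs = np.indices(grad_mag.shape)
    dist = np.hypot(ys - cy, xs - cx)
    pooled = np.array([grad_mag[dist <= r].mean() for r in radii])
    return (np.diff(pooled) > 0).astype(np.uint8)  # compact binary code
```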
Spectral Embedding Norm: Looking Deep into the Spectrum of the Graph Laplacian
The extraction of clusters from a dataset which includes multiple clusters
and a significant background component is a non-trivial task of practical
importance. In image analysis this manifests for example in anomaly detection
and target detection. The traditional spectral clustering algorithm, which
relies on the leading K eigenvectors to detect the K clusters, fails in such
cases. In this paper we propose the "spectral embedding norm", which sums
the squared values of the first I normalized eigenvectors, where I can be
significantly larger than K. We prove that this quantity can be used to
separate clusters from the background in unbalanced settings, including extreme
cases such as outlier detection. The performance of the algorithm is not
sensitive to the choice of I, and we demonstrate its application on synthetic
and real-world remote sensing and neuroimaging datasets.
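The quantity itself is straightforward to compute. A minimal sketch, assuming a dense affinity matrix W and the symmetric normalized Laplacian (the paper's exact normalization may differ):

```python
import numpy as np
from scipy.linalg import eigh
from scipy.sparse.csgraph import laplacian

def spectral_embedding_norm(W, I):
    """Per-node sum of squared entries of the first I eigenvectors of
    the normalized graph Laplacian built from affinity matrix W.
    Large values indicate cluster membership, small values background."""
    L = laplacian(W, normed=True)
    _, vecs = eigh(L)          # eigenvalues in ascending order
    V = vecs[:, :I]            # I lowest-frequency eigenvectors
    return (V ** 2).sum(axis=1)
```

Thresholding this per-node norm then separates cluster points from background; per the abstract, the outcome is not sensitive to the exact choice of I.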