A Multiple Hypothesis Testing Approach to Low-Complexity Subspace Unmixing
Subspace-based signal processing traditionally focuses on problems involving
a few subspaces. Recently, a number of problems in different application areas
have emerged that involve a significantly larger number of subspaces relative
to the ambient dimension. It becomes imperative in such settings to first
identify a smaller set of active subspaces that contribute to the observation
before further processing can be carried out. This problem of identification of
a small set of active subspaces among a huge collection of subspaces from a
single (noisy) observation in the ambient space is termed subspace unmixing.
This paper formally poses the subspace unmixing problem under the parsimonious
subspace-sum (PS3) model, discusses connections of the PS3 model to problems in
wireless communications, hyperspectral imaging, high-dimensional statistics and
compressed sensing, and proposes a low-complexity algorithm, termed marginal
subspace detection (MSD), for subspace unmixing. The MSD algorithm turns the
subspace unmixing problem for the PS3 model into a multiple hypothesis testing
(MHT) problem, and its analysis in the paper shows how to control the family-wise
error rate of this MHT problem at any prescribed level under two random
signal generation models. Some other highlights of the analysis of the MSD
algorithm include: (i) it is applicable to an arbitrary collection of subspaces
on the Grassmann manifold; (ii) it relies on properties of the collection of
subspaces that are computable in polynomial time; and (iii) it allows for
linear scaling of the number of active subspaces as a function of the ambient
dimension. Finally, numerical results are presented in the paper to better
understand the performance of the MSD algorithm.
Comment: Submitted for journal publication; 33 pages, 14 figures
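The marginal-detection idea described in the abstract can be illustrated with a minimal sketch: test each candidate subspace separately via the energy of the observation projected onto it, and control the family-wise error rate across the tests with a Bonferroni correction. The chi-squared statistic, the threshold, and the `noise_var` parameter below are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np
from scipy.stats import chi2

def marginal_subspace_detection(y, bases, noise_var, alpha=0.05):
    """Sketch of marginal subspace detection (illustrative, not the
    paper's exact statistic): declare subspace i active when the
    projection energy of y onto it exceeds a Bonferroni-corrected
    chi-squared threshold."""
    N = len(bases)
    active = []
    for i, U in enumerate(bases):       # U: n x d_i orthonormal basis
        # Projection energy of y onto span(U), normalized by noise variance.
        stat = np.sum((U.T @ y) ** 2) / noise_var
        # Under pure noise, stat is chi-squared with d_i degrees of freedom;
        # dividing alpha by N controls the family-wise error rate.
        thresh = chi2.ppf(1 - alpha / N, df=U.shape[1])
        if stat > thresh:
            active.append(i)
    return active
```

A signal lying in one subspace of the collection should then be flagged by that subspace's marginal test while the remaining tests stay below threshold with high probability.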
Exploring Bit-Difference for Approximate KNN Search in High-dimensional Databases
In this paper, we develop a novel index structure to support efficient approximate k-nearest neighbor (KNN) queries in high-dimensional databases. In high-dimensional spaces, the computation of distances (e.g., Euclidean distance) between points dominates the overall query response time for in-memory processing. To reduce distance computation, we first propose a structure (BID) that uses BIt-Difference to answer approximate KNN queries. BID uses one bit to represent each dimension of a point's feature vector, and the number of bit differences is used to prune distant points. To accommodate real datasets, which are typically skewed, we enhance the BID mechanism with clustering, a cluster-adapted bitcoder, and dimensional weights, yielding BID⁺. Extensive experiments show that our proposed method yields significant performance advantages over existing index structures on both real-life and synthetic high-dimensional datasets.
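The bit-difference pruning idea can be sketched as follows. This is an illustrative simplification, not the paper's BID structure: the per-dimension mean threshold used as the bitcoder and the `keep` pruning fraction are assumptions made for the example.

```python
import numpy as np

def bit_encode(X, thresholds):
    # One bit per dimension: 1 if the coordinate exceeds the
    # per-dimension threshold (here, the dataset mean), else 0.
    return (X > thresholds).astype(np.uint8)

def approx_knn_bid(X, q, k, keep=0.2):
    """Approximate KNN via bit-difference pruning (illustrative sketch):
    rank points by Hamming distance between their bit signatures and the
    query's, keep only the closest fraction, and compute exact Euclidean
    distances on the survivors."""
    thresholds = X.mean(axis=0)          # assumed bitcoder: per-dim mean
    bits_X = bit_encode(X, thresholds)
    bits_q = bit_encode(q[None, :], thresholds)[0]
    # Cheap pruning step: count differing bits per point.
    hamming = np.count_nonzero(bits_X != bits_q, axis=1)
    n_keep = max(k, int(keep * len(X)))
    cand = np.argsort(hamming)[:n_keep]
    # Exact distances only on the surviving candidates.
    d = np.linalg.norm(X[cand] - q, axis=1)
    return cand[np.argsort(d)[:k]]
```

Setting `keep=1.0` disables pruning and recovers exact KNN, which makes the accuracy/speed trade-off of smaller `keep` values easy to measure.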