    Scalable Image Retrieval by Sparse Product Quantization

    Fast Approximate Nearest Neighbor (ANN) search for high-dimensional feature indexing and retrieval is the crux of large-scale image retrieval. A recent promising technique is Product Quantization, which indexes high-dimensional image features by decomposing the feature space into a Cartesian product of low-dimensional subspaces and quantizing each of them separately. Despite the promising results reported, this quantization approach follows the typical hard assignment of traditional quantization methods, which may result in large quantization errors and thus inferior search performance. Unlike existing approaches, this paper proposes a novel approach called Sparse Product Quantization (SPQ), which encodes high-dimensional feature vectors into a sparse representation. We optimize the sparse representations of the feature vectors by minimizing their quantization errors, so that the resulting representation stays essentially close to the original data in practice. Experiments show that the proposed SPQ technique is not only able to compress data but is also an effective encoding technique. We obtain state-of-the-art results for ANN search on four public image datasets, and the promising results on content-based image retrieval further validate the efficacy of the proposed method.
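    To make the hard-assignment baseline concrete, the following is a minimal sketch of classical Product Quantization, assuming numpy/scipy; the subspace count and codebook size are illustrative choices, not values from the paper. SPQ's departure is to replace the single nearest-centroid assignment below with a sparse combination of centroids chosen to minimize the quantization error.

```python
# Sketch of hard-assignment Product Quantization (the baseline SPQ improves on).
# n_subspaces and n_centroids are illustrative; 256 centroids = 1 byte/subspace.
import numpy as np
from scipy.cluster.vq import kmeans2

def pq_train(X, n_subspaces=4, n_centroids=256):
    """Learn one k-means codebook per low-dimensional subspace."""
    d = X.shape[1] // n_subspaces
    codebooks = []
    for i in range(n_subspaces):
        sub = X[:, i * d:(i + 1) * d]
        centroids, _ = kmeans2(sub, n_centroids, minit='points')
        codebooks.append(centroids)
    return codebooks

def pq_encode(X, codebooks):
    """Hard-assign each subvector to its nearest centroid (one index each)."""
    d = codebooks[0].shape[1]
    codes = np.empty((X.shape[0], len(codebooks)), dtype=np.int32)
    for i, C in enumerate(codebooks):
        sub = X[:, i * d:(i + 1) * d]
        codes[:, i] = ((sub[:, None, :] - C[None]) ** 2).sum(-1).argmin(1)
    return codes
```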

    Compressed sensing using sparse binary measurements: a rateless coding perspective

    Compressed Sensing (CS) methods using sparse binary measurement matrices and iterative message-passing recovery procedures have been recently investigated due to their low computational complexity and excellent performance. Drawing much of their inspiration from sparse-graph codes such as Low-Density Parity-Check (LDPC) codes, these studies use analytical tools from modern coding theory to analyze CS solutions. In this paper, we consider and systematically analyze a CS setup inspired by a class of efficient, popular and flexible sparse-graph codes called rateless codes. The proposed rateless CS setup is asymptotically analyzed using tools such as Density Evolution and EXIT charts and fine-tuned using degree-distribution optimization techniques.
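    As an illustration of the measurement model (though not the paper's rateless, message-passing decoder), the sketch below builds a sparse binary measurement matrix with a fixed number of ones per column and recovers a sparse signal with plain iterative hard thresholding; the problem sizes and the decoder choice are our own assumptions.

```python
# Sparse binary measurements + iterative hard thresholding (IHT), as a simple
# stand-in for the message-passing decoders analyzed in the paper.
import numpy as np

rng = np.random.default_rng(0)
n, m, k, d = 256, 100, 8, 3               # signal dim, measurements, sparsity, ones/column

A = np.zeros((m, n))
for j in range(n):                        # d ones at random rows per column
    A[rng.choice(m, size=d, replace=False), j] = 1.0

x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.normal(size=k)
y = A @ x                                 # noiseless measurements

mu = 1.0 / np.linalg.norm(A, 2) ** 2      # step size from the spectral norm
x_hat = np.zeros(n)
for _ in range(300):                      # gradient step, then keep k largest
    z = x_hat + mu * A.T @ (y - A @ x_hat)
    support = np.argsort(np.abs(z))[-k:]
    x_hat = np.zeros(n)
    x_hat[support] = z[support]

print("recovery error:", np.linalg.norm(x - x_hat))
```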

    Sparse p-Adic Data Coding for Computationally Efficient and Effective Big Data Analytics

    We develop the theory and practical implementation of p-adic sparse coding of data. Rather than the standard sparsifying criterion that uses the L_0 pseudo-norm, we use the p-adic norm. We require that the hierarchy or tree be node-ranked, as is standard practice in agglomerative and other hierarchical clustering, but not necessarily with decision trees. In order to structure the data, all computational processing operations are direct readings of the data, or are bounded by a constant number of direct readings of the data, implying linear computational time. Through p-adic sparse data coding, efficient storage results, and for data stored with bounded p-adic norm, search and retrieval are constant-time operations. Examples show the effectiveness of this new approach to content-driven encoding and displaying of data.
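    The key primitive is the p-adic norm itself: |x|_p = p^(-v), where p^v is the largest power of p dividing x. The sketch below computes it and shows why it suits hierarchies: two leaves whose digit expansions (root-to-leaf paths) share a long prefix are p-adically close. The digit encoding here is our own illustration, not the paper's exact scheme.

```python
# p-adic norm and a toy tree-path encoding: a long shared prefix (deep shared
# ancestor) gives a small p-adic distance between the encoded leaves.
def p_adic_norm(x, p=2):
    """|x|_p = p**(-v) with p**v the largest power of p dividing x; |0|_p = 0."""
    if x == 0:
        return 0.0
    v = 0
    while x % p == 0:
        x //= p
        v += 1
    return float(p) ** -v

def encode(digits, p=2):
    """Encode root-to-leaf digits d0, d1, ... as the integer sum(d_i * p**i)."""
    return sum(d * p ** i for i, d in enumerate(digits))

a = encode([1, 0, 1, 1])      # two leaves that agree down to depth 3 ...
b = encode([1, 0, 1, 0])
print(p_adic_norm(a - b))     # ... are close: 2**-3 = 0.125
```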

    A location-aware embedding technique for accurate landmark recognition

    The current state of research in landmark recognition highlights the good accuracy that can be achieved by embedding techniques such as Fisher vectors and VLAD. All these techniques ignore spatial information, i.e. they consider all the features and the corresponding descriptors without embedding their location in the image. This paper presents a new variant of the well-known VLAD (Vector of Locally Aggregated Descriptors) embedding technique which accounts, to a certain degree, for the location of features. The driving motivation comes from the observation that the most interesting part of an image (e.g., the landmark to be recognized) usually lies near the center of the image, while the features at the borders are irrelevant features which do not depend on the landmark. The proposed variant, called locVLAD (location-aware VLAD), computes the mean of two global descriptors: the VLAD computed on the entire original image, and the one computed on a cropped image which removes a certain percentage of the image borders. This simple variant achieves an accuracy greater than that of the existing state-of-the-art approaches. Experiments are conducted on two public datasets (ZuBuD and Holidays) which are used for both training and testing. Moreover, a more balanced version of ZuBuD is proposed.
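    A hedged sketch of the idea follows: a plain VLAD aggregator plus a locVLAD function that averages the full-image VLAD with the VLAD of the descriptors surviving a border crop. Descriptor extraction (e.g. SIFT) and the crop fraction are assumptions on our part; the paper defines the crop as a percentage of the image borders.

```python
# VLAD aggregation and the locVLAD variant (mean of full-image VLAD and
# cropped-image VLAD). Local descriptors and keypoint positions are assumed
# to come from an upstream extractor such as SIFT.
import numpy as np

def vlad(descriptors, codebook):
    """Sum residuals to the nearest visual word, then L2-normalize."""
    k, d = codebook.shape
    assign = ((descriptors[:, None, :] - codebook[None]) ** 2).sum(-1).argmin(1)
    v = np.zeros((k, d))
    for desc, c in zip(descriptors, assign):
        v[c] += desc - codebook[c]
    v = v.ravel()
    return v / (np.linalg.norm(v) + 1e-12)

def loc_vlad(descriptors, keypoints, codebook, img_w, img_h, border=0.1):
    """Mean of the full-image VLAD and the VLAD of descriptors away from borders."""
    x, y = keypoints[:, 0], keypoints[:, 1]
    inside = ((x > border * img_w) & (x < (1 - border) * img_w) &
              (y > border * img_h) & (y < (1 - border) * img_h))
    return 0.5 * (vlad(descriptors, codebook) + vlad(descriptors[inside], codebook))
```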

    Learning Word Representations with Hierarchical Sparse Coding

    We propose a new method for learning word representations using hierarchical regularization in sparse coding, inspired by the linguistic study of word meanings. We show an efficient learning algorithm based on stochastic proximal methods that is significantly faster than previous approaches, making it possible to perform hierarchical sparse coding on a corpus of billions of word tokens. Experiments on various benchmark tasks (word similarity ranking, analogies, sentence completion, and sentiment analysis) demonstrate that the method outperforms or is competitive with state-of-the-art methods. Our word representations are available at http://www.ark.cs.cmu.edu/dyogatam/wordvecs/.
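    The inner sparse-coding step can be sketched with a standard proximal-gradient (ISTA) solver; note that the paper's regularizer is hierarchical (tree-structured groups), so the l1 soft-threshold below would be replaced by the corresponding group proximal operator, and at scale the optimization is run stochastically.

```python
# Proximal-gradient sparse coding (ISTA) for one column x against dictionary D:
# min_a 0.5*||x - D a||^2 + lam*||a||_1. A stand-in for the paper's
# hierarchical-group proximal step.
import numpy as np

def ista(D, x, lam=0.1, n_iter=200):
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = D.T @ (D @ a - x)              # gradient of the smooth part
        z = a - g / L                      # gradient step ...
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # ... then prox
    return a
```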

    On joint detection and decoding of linear block codes on Gaussian vector channels

    Optimal receivers recovering signals transmitted across noisy communication channels employ a maximum-likelihood (ML) criterion to minimize the probability of error. The problem of finding the most likely transmitted symbol is often equivalent to finding the closest lattice point to a given point and is known to be NP-hard. In systems that employ error-correcting coding for data protection, the symbol space forms a sparse lattice, where the sparsity structure is determined by the code. In such systems, ML data recovery may be geometrically interpreted as a search for the closest point in the sparse lattice. In this paper, motivated by the "sphere decoding" algorithm of Fincke and Pohst, we propose an algorithm that finds the closest point in the sparse lattice to a given vector. This vector is not arbitrary; rather, it is an unknown sparse lattice point perturbed by an additive noise vector whose statistical properties are known. The complexity of the proposed algorithm is thus a random variable. We study its expected value, averaged over the noise and over the lattice. For binary linear block codes, we find the expected complexity in closed form. Simulation results indicate significant performance gains over systems employing separate detection and decoding, obtained at a complexity that is practically feasible over a wide range of system parameters.
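    For concreteness, below is a small Fincke-Pohst-style sphere decoder over a BPSK alphabet: a depth-first search with radius pruning after a QR decomposition. It searches an unconstrained symbol lattice; the paper's contribution is to restrict this search to the sparse lattice defined by the code, which this sketch does not do.

```python
# Depth-first sphere decoding: find s over a finite alphabet minimizing
# ||y - H s||, pruning branches whose partial distance exceeds the best found.
import numpy as np

def sphere_decode(H, y, alphabet=(-1.0, 1.0)):
    n = H.shape[1]
    Q, R = np.linalg.qr(H)                   # reduced QR: argmin ||y - Hs|| = argmin ||Q'y - Rs||
    y_t = Q.T @ y
    best = {"dist": np.inf, "s": None}

    def search(level, s, dist):
        if dist >= best["dist"]:             # prune: already outside the sphere
            return
        if level < 0:
            best["dist"], best["s"] = dist, s.copy()
            return
        for sym in alphabet:                 # try each symbol at this level
            s[level] = sym
            r = y_t[level] - R[level, level:] @ s[level:]
            search(level - 1, s, dist + r * r)

    search(n - 1, np.zeros(n), 0.0)
    return best["s"], best["dist"]
```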