
    Streaming Binary Sketching based on Subspace Tracking and Diagonal Uniformization

    In this paper, we address the problem of learning compact similarity-preserving embeddings for massive high-dimensional streams of data in order to perform efficient similarity search. We present a new online method for computing binary compressed representations (sketches) of high-dimensional real feature vectors. Given an expected code length c and high-dimensional input data points, our algorithm provides a c-bit binary code that preserves the distances between the points in the original high-dimensional space. Our algorithm requires storing neither the whole dataset nor a chunk of it, so it is fully adaptable to the streaming setting. It also provides low time complexity and convergence guarantees. We demonstrate the quality of our binary sketches through experiments on real data for the nearest neighbor search task in the online setting.
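To illustrate what a c-bit binary sketch of a real vector looks like, here is a minimal sketch that uses sign-of-random-projection hashing. This is a stand-in technique, not the subspace tracking and diagonal uniformization method of the abstract; the function names `make_projection`, `sketch`, and `hamming` are hypothetical.

```python
import random

def make_projection(dim, c, seed=0):
    """Draw c random Gaussian directions in R^dim (assumed stand-in for a learned projection)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(c)]

def sketch(x, projection):
    """Return a c-bit code: one bit per direction, set when the projection is non-negative."""
    return [1 if sum(w * xi for w, xi in zip(row, x)) >= 0 else 0
            for row in projection]

def hamming(a, b):
    """Number of positions where two codes differ."""
    return sum(ai != bi for ai, bi in zip(a, b))
```

Each point is processed on its own, so nothing but the projection has to be kept in memory. With this particular stand-in, the Hamming distance between two codes grows with the angle between the original vectors.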

    Online Product Quantization

    Approximate nearest neighbor (ANN) search has achieved great success in many tasks. However, existing popular methods for ANN search, such as hashing and quantization methods, are designed for static databases only. They cannot cope well with a database whose data distribution evolves dynamically, because retraining the model on the new database requires high computational effort. In this paper, we address the problem by developing an online product quantization (online PQ) model that incrementally updates the quantization codebook to accommodate the incoming streaming data. Moreover, to further alleviate the issue of large-scale computation for the online PQ update, we design two budget constraints that let the model update only part of the PQ codebook instead of all of it. We derive a loss bound which guarantees the performance of our online PQ model. Furthermore, we develop an online PQ model over a sliding window with both data insertion and deletion supported, to reflect the real-time behaviour of the data. The experiments demonstrate that our online PQ model is both time-efficient and effective for ANN search in dynamic large-scale databases compared with baseline methods, and that the idea of partial PQ codebook update further reduces the update cost. Comment: To appear in IEEE Transactions on Knowledge and Data Engineering (DOI: 10.1109/TKDE.2018.2817526)
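The incremental codebook update described above can be pictured with a minimal sketch. It assumes a running-mean update of the nearest codeword in each subspace; the budget constraints and the sliding-window variant are not shown, and the class and helper names (`OnlinePQ`, `split`, `nearest`) are hypothetical rather than taken from the paper.

```python
def split(x, m):
    """Split x into m equal-length subvectors (len(x) is assumed divisible by m)."""
    d = len(x) // m
    return [x[i * d:(i + 1) * d] for i in range(m)]

def nearest(sub, codebook):
    """Index of the codeword closest to sub in squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(sub, codebook[k])))

class OnlinePQ:
    def __init__(self, codebooks):
        # codebooks[j] is the list of codewords for subspace j.
        self.codebooks = [[list(cw) for cw in cb] for cb in codebooks]
        # Assumption: each initial codeword counts as one observation.
        self.counts = [[1] * len(cb) for cb in codebooks]

    def encode(self, x):
        """Return one codeword index per subspace."""
        subs = split(x, len(self.codebooks))
        return [nearest(s, cb) for s, cb in zip(subs, self.codebooks)]

    def update(self, x):
        """Encode x, then move each winning codeword toward its subvector."""
        code = self.encode(x)
        subs = split(x, len(self.codebooks))
        for j, (s, k) in enumerate(zip(subs, code)):
            self.counts[j][k] += 1
            n = self.counts[j][k]
            cw = self.codebooks[j][k]
            # Running mean: new = old + (sample - old) / n.
            self.codebooks[j][k] = [c + (a - c) / n for c, a in zip(cw, s)]
        return code
```

Only the winning codeword in each subspace changes per incoming point, so the cost of an update does not depend on how many points have already been seen.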

    Blind adaptive constrained reduced-rank parameter estimation based on constant modulus design for CDMA interference suppression

    This paper proposes a multistage decomposition for blind adaptive parameter estimation in the Krylov subspace with the code-constrained constant modulus (CCM) design criterion. Based on constrained optimization of the constant modulus cost function and utilizing the Lanczos algorithm and Arnoldi-like iterations, a multistage decomposition is developed for blind parameter estimation. A family of computationally efficient blind adaptive reduced-rank stochastic gradient (SG) and recursive least squares (RLS) type algorithms, along with an automatic rank selection procedure, is also devised and evaluated against existing methods. An analysis of the convergence properties of the method is carried out and convergence conditions for the reduced-rank adaptive algorithms are established. Simulation results consider the application of the proposed techniques to the suppression of multiaccess and intersymbol interference in DS-CDMA systems.
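For readers unfamiliar with the constant modulus cost function, the following is a minimal sketch of one stochastic gradient step of the plain, full-rank, real-valued constant modulus algorithm. It is not the code-constrained reduced-rank scheme of the abstract: there is no signature-code constraint, no Krylov subspace projection, and no rank selection. The names `cma_step`, `cm_cost`, `mu`, and `target` are hypothetical.

```python
def cm_cost(w, x, target=1.0):
    """Constant modulus cost (y^2 - target)^2 for the filter output y = w . x."""
    y = sum(wi * xi for wi, xi in zip(w, x))
    return (y * y - target) ** 2

def cma_step(w, x, mu, target=1.0):
    """One stochastic gradient step on the constant modulus cost.

    The gradient with respect to w is 4 * (y^2 - target) * y * x;
    the constant factor 4 is absorbed into the step size mu.
    """
    y = sum(wi * xi for wi, xi in zip(w, x))
    err = y * y - target
    new_w = [wi - mu * err * y * xi for wi, xi in zip(w, x)]
    return new_w, y
```

The update is blind in the sense that it uses only the received vector and the known target modulus, not a training sequence.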