48 research outputs found
Hashing for Similarity Search: A Survey
Similarity search (nearest neighbor search) is a problem of pursuing the data
items whose distances to a query item are the smallest from a large database.
Various methods have been developed to address this problem, and recently a lot
of efforts have been devoted to approximate search. In this paper, we present a
survey on one of the main solutions, hashing, which has been widely studied
since the pioneering work locality sensitive hashing. We divide the hashing
algorithms two main categories: locality sensitive hashing, which designs hash
functions without exploring the data distribution and learning to hash, which
learns hash functions according the data distribution, and review them from
various aspects, including hash function design and distance measure and search
scheme in the hash coding space
Robust Quantization for General Similarity Search
The recent years have witnessed the emerging of vector quantization (VQ) techniques for efficient similarity search. VQ partitions the feature space into a set of codewords and encodes data points as integer indices using the codewords. Then the distance between data points can be efficiently approximated by simple memory lookup operations. By the compact quantization, the storage cost and searching complexity are significantly reduced, thereby facilitating efficient large-scale similarity search. However, the performance of several celebrated VQ approaches degrades significantly when dealing with noisy data. Additionally, it can barely facilitate a wide range of applications as the distortion measurement only limits to ℓ2 norm. To address the shortcomings of the squared Euclidean (ℓ2,2 norm) loss function employed by the VQ approaches, in this paper, we propose a novel robust and general VQ framework, named RGVQ, to enhance both robustness and generalization of VQ approaches. Specifically, a ℓp,q-norm loss function is proposed to conduct the ℓp-norm similarity search, rather than the ℓ2 norm search, and the q-th order loss is used to enhance the robustness. Despite the fact that changing the loss function to ℓp,q norm makes VQ approaches more robust and generic, it brings us a challenge that a non-smooth and non-convex orthogonality constrained ℓp,q- norm function has to be minimized. To solve this problem, we propose a novel and efficient optimization scheme and specify it to VQ approaches and theoretically prove its convergence. Extensive experiments on benchmark datasets demonstrate that the proposed RGVQ is better than the original VQ for several approaches, especially when searching similarity in noisy data
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Large language models (LLMs) with hundreds of billions of parameters have
sparked a new wave of exciting AI applications. However, they are
computationally expensive at inference time. Sparsity is a natural approach to
reduce this cost, but existing methods either require costly retraining, have
to forgo LLM's in-context learning ability, or do not yield wall-clock time
speedup on modern hardware. We hypothesize that contextual sparsity, which are
small, input-dependent sets of attention heads and MLP parameters that yield
approximately the same output as the dense model for a given input, can address
these issues. We show that contextual sparsity exists, that it can be
accurately predicted, and that we can exploit it to speed up LLM inference in
wall-clock time without compromising LLM's quality or in-context learning
ability. Based on these insights, we propose DejaVu, a system that uses a
low-cost algorithm to predict contextual sparsity on the fly given inputs to
each layer, along with an asynchronous and hardware-aware implementation that
speeds up LLM inference. We validate that DejaVu can reduce the inference
latency of OPT-175B by over 2X compared to the state-of-the-art
FasterTransformer, and over 6X compared to the widely used Hugging Face
implementation, without compromising model quality. The code is available at
https://github.com/FMInference/DejaVu
On the Application of Dictionary Learning to Image Compression
Signal models are a cornerstone of contemporary signal and image-processing methodology. In this chapter, a particular signal modelling method, called synthesis sparse representation, is studied which has been proven to be effective for many signals, such as natural images, and successfully used in a wide range of applications. In this kind of signal modelling, the signal is represented with respect to dictionary. The dictionary choice plays an important role on the success of the entire model. One main discipline of dictionary designing is based on a machine learning methodology which provides a simple and expressive structure for designing adaptable and efficient dictionaries. This chapter focuses on direct application of the sparse representation, i.e. image compression. Two image codec based on adaptive sparse representation over a trained dictionary are introduced. Experimental results show that the presented methods outperform the existing image coding standards, such as JPEG and JPEG2000
Image restoration with group sparse representation and low‐rank group residual learning
Image restoration, as a fundamental research topic of image processing, is to reconstruct the original image from degraded signal using the prior knowledge of image. Group sparse representation (GSR) is powerful for image restoration; it however often leads to undesirable sparse solutions in practice. In order to improve the quality of image restoration based on GSR, the sparsity residual model expects the representation learned from degraded images to be as close as possible to the true representation. In this article, a group residual learning based on low-rank self-representation is proposed to automatically estimate the true group sparse representation. It makes full use of the relation among patches and explores the subgroup structures within the same group, which makes the sparse residual model have better interpretation furthermore, results in high-quality restored images. Extensive experimental results on two typical image restoration tasks (image denoising and deblocking) demonstrate that the proposed algorithm outperforms many other popular or state-of-the-art image restoration methods
Boosting Adversarial Robustness via Neural Architecture Search and Design
Adversarial robustness in Deep Neural Networks (DNNs) is a critical and emerging field of research that addresses the vulnerability of DNNs to subtle, intentionally crafted perturbations in their input data. These perturbations, often imperceptible to the human eye, can lead to significant error increment in the network's predictions, while they can be easily derived via adversarial attacks in various data formats, such as image, text, and audio. This susceptibility poses serious security and trustworthy concerns in real-world applications such as autonomous driving, healthcare diagnostics, and cybersecurity. To enhance the trustworthiness of DNNs, lots of research efforts have been put into developing techniques that aim to improve DNNs ability to defend against such adversarial attacks, ensuring that trustworthy results can be provided in real-world scenarios. The main stream of adversarial robustness lies in the adversarial training strategies and regularizations. However, less attention has been paid to the DNN itself. Little is known about the influence of different neural network architectures or designs on adversarial robustness. To fulfill this knowledge gap, we propose to advance adversarial robustness via investigating neural architecture search and design in this thesis
Dynamic match kernel with deep convolutional features for image retrieval
For image retrieval methods based on bag of visual words, much attention has been paid to enhancing the discriminative powers of the local features. Although retrieved images are usually similar to a query in minutiae, they may be significantly different from a semantic perspective, which can be effectively distinguished by convolutional neural networks (CNN). Such images should not be considered as relevant pairs. To tackle this problem, we propose to construct a dynamic match kernel by adaptively calculating the matching thresholds between query and candidate images based on the pairwise distance among deep CNN features. In contrast to the typical static match kernel which is independent to the global appearance of retrieved images, the dynamic one leverages the semantical similarity as a constraint for determining the matches. Accordingly, we propose a semantic-constrained retrieval framework by incorporating the dynamic match kernel, which focuses on matched patches between relevant images and filters out the ones for irrelevant pairs. Furthermore, we demonstrate that the proposed kernel complements recent methods, such as hamming embedding, multiple assignment, local descriptors aggregation, and graph-based re-ranking, while it outperforms the static one under various settings on off-the-shelf evaluation metrics. We also propose to evaluate the matched patches both quantitatively and qualitatively. Extensive experiments on five benchmark data sets and large-scale distractors validate the merits of the proposed method against the state-of-the-art methods for image retrieval