7 research outputs found

    Sign-Full Random Projections

    The method of 1-bit ("sign-sign") random projections has been a popular tool for efficient search and machine learning on large datasets. Given two $D$-dim data vectors $u, v\in\mathbb{R}^D$, one can generate $x = \sum_{i=1}^D u_i r_i$ and $y = \sum_{i=1}^D v_i r_i$, where $r_i\sim N(0,1)$ iid. The "collision probability" is $\Pr\left(\mathrm{sgn}(x)=\mathrm{sgn}(y)\right) = 1-\frac{\cos^{-1}\rho}{\pi}$, where $\rho = \rho(u,v)$ is the cosine similarity. We develop "sign-full" random projections by estimating $\rho$ from (e.g.,) the expectation $E(\mathrm{sgn}(x)y)=\sqrt{\frac{2}{\pi}}\,\rho$, which can be further substantially improved by normalizing $y$. For nonnegative data, we recommend an interesting estimator based on $E\left(y_- 1_{x\geq 0} + y_+ 1_{x<0}\right)$ and its normalized version. The recommended estimator almost matches the accuracy of the (computationally expensive) maximum likelihood estimator. At high similarity ($\rho\rightarrow 1$), the asymptotic variance of the recommended estimator is only $\frac{4}{3\pi} \approx 0.4$ of that of the estimator for sign-sign projections. At small $k$ and high similarity, the improvement would be even more substantial.
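
    The two identities quoted in this abstract can be checked numerically. The sketch below is only an illustration, not the paper's recommended estimator: it assumes that u and v are scaled to unit norm (so that x and y have unit variance), uses k to denote the number of projections, and implements only the basic, unnormalized sign-full estimate. All function names are hypothetical.

```python
import numpy as np

def project(u, v, k, rng):
    """Project u and v (scaled to unit norm) onto k shared Gaussian directions."""
    u = u / np.linalg.norm(u)
    v = v / np.linalg.norm(v)
    R = rng.standard_normal((u.shape[0], k))  # r_i ~ N(0, 1), iid
    return u @ R, v @ R

def rho_sign_sign(x, y):
    """Invert the collision probability Pr(sgn x = sgn y) = 1 - arccos(rho)/pi."""
    p_hat = np.mean(np.sign(x) == np.sign(y))
    return np.cos(np.pi * (1.0 - p_hat))

def rho_sign_full(x, y):
    """Invert E[sgn(x) y] = sqrt(2/pi) * rho (basic, unnormalized version)."""
    return np.sqrt(np.pi / 2.0) * np.mean(np.sign(x) * y)

rng = np.random.default_rng(0)
D, k = 64, 20000
u = rng.random(D)                # nonnegative toy data
v = u + 0.5 * rng.random(D)      # a vector highly similar to u
rho = float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

x, y = project(u, v, k, rng)
est_ss = rho_sign_sign(x, y)
est_sf = rho_sign_full(x, y)
print(f"true rho={rho:.4f}  sign-sign={est_ss:.4f}  sign-full={est_sf:.4f}")
```

    Both estimates should land close to the true cosine similarity at this k. The variance gains described in the abstract refer to the normalized and recommended estimators, which this sketch does not implement.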

    Tree-based Text-Vision BERT for Video Search in Baidu Video Advertising

    Advances in communication technology and the popularity of smartphones have fostered a boom in video ads. Baidu, as one of the leading search engine companies in the world, receives billions of search queries per day. Pairing video ads with user searches is the core task of Baidu video advertising. Due to the modality gap, query-to-video retrieval is much more challenging than traditional query-to-document retrieval and image-to-image search. Traditionally, query-to-video retrieval is tackled by query-to-title retrieval, which is not reliable when the quality of the titles is not high. With the rapid progress achieved in computer vision and natural language processing in recent years, content-based search methods have become promising for query-to-video retrieval. Benefiting from pretraining on large-scale datasets, some visionBERT methods based on cross-modal attention have achieved excellent performance in many vision-language tasks, not only in academia but also in industry. Nevertheless, the expensive computation cost of cross-modal attention makes it impractical for large-scale search in industrial applications. In this work, we present a tree-based combo-attention network (TCAN), which has recently been launched in Baidu's dynamic video advertising platform. It provides a practical solution for deploying heavy cross-modal attention in large-scale query-to-video search. After launching the tree-based combo-attention network, the click-through rate improved by 2.29% and the conversion rate improved by 2.63%. Comment: This revision is based on a manuscript submitted in October 2020 to ICDE 2021. We thank the Program Committee for their valuable comments.

    CoopHash: Cooperative Learning of Multipurpose Descriptor and Contrastive Pair Generator via Variational MCMC Teaching for Supervised Image Hashing

    Leveraging supervised information can lead to superior retrieval performance in the image hashing domain, but the performance degrades significantly without enough labeled data. One effective solution to boost the performance is to employ generative models, such as Generative Adversarial Networks (GANs), to generate synthetic data in an image hashing model. However, GAN-based methods are difficult to train and suffer from mode collapse, which prevents the hashing approaches from jointly training the generative models and the hash functions. This limitation results in sub-optimal retrieval performance. To overcome this limitation, we propose a novel framework, the generative cooperative hashing network (CoopHash), which is based on energy-based cooperative learning. CoopHash jointly learns a powerful generative representation of the data and a robust hash function. CoopHash has two components: a top-down contrastive pair generator that synthesizes contrastive images and a bottom-up multipurpose descriptor that simultaneously represents the images from multiple perspectives, including probability density, hash code, latent code, and category. The two components are jointly learned via a novel likelihood-based cooperative learning scheme. We conduct experiments on several real-world datasets and show that the proposed method outperforms the competing supervised hashing methods, achieving up to 10% relative improvement over the current state-of-the-art supervised hashing methods, and exhibits significantly better performance in out-of-distribution retrieval.

    Constrained Approximate Similarity Search on Proximity Graph

    Search engines and recommendation systems are built to efficiently display relevant information from massive amounts of candidates. Typically, a three-stage mechanism is employed in those systems: (i) a small collection of items is first retrieved by (e.g.,) approximate near neighbor search algorithms; (ii) a collection of constraints is then applied to the retrieved items; (iii) a fine-grained ranking neural network is employed to determine the final recommendation. We observe a major defect of the original three-stage pipeline: although we only aim to retrieve $k$ vectors in the final recommendation, we have to preset a sufficiently large $s$ ($s > k$) for each query and "hope" that the number of vectors surviving the filtering is not smaller than $k$. That is, at least $k$ vectors among the $s$ similar candidates must satisfy the query constraints. In this paper, we investigate this constrained similarity search problem and attempt to merge the similarity search stage and the filtering stage into one single search operation. We introduce AIRSHIP, a system that integrates user-defined function filtering into the similarity search framework. The proposed system does not need to build extra indices, nor does it require prior knowledge of the query constraints. We propose three optimization strategies: (1) starting point selection, (2) multi-direction search, and (3) biased priority queue selection. Experimental evaluations on both synthetic and real data confirm the effectiveness of the proposed AIRSHIP algorithm. We focus on constrained graph-based approximate near neighbor (ANN) search in this study, in part because graph-based ANN is known to achieve excellent performance. We believe it is also possible to develop constrained hashing-based ANN or constrained quantization-based ANN.
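
    The defect described in this abstract can be made concrete with a toy example. The sketch below is not AIRSHIP and does not use a proximity graph: it swaps in a brute-force scan purely to contrast "retrieve s, then filter" with "filter while searching". The function names, the label-based constraint, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def post_filter_topk(data, labels, query, allowed, s, k):
    """Retrieve the s nearest items first, then filter; may return fewer than k."""
    dist = np.linalg.norm(data - query, axis=1)
    candidates = np.argsort(dist)[:s]
    survivors = [int(i) for i in candidates if labels[i] in allowed]
    return survivors[:k]

def filter_during_search_topk(data, labels, query, allowed, k):
    """Check the constraint while scanning by distance, stopping at k valid items."""
    dist = np.linalg.norm(data - query, axis=1)
    result = []
    for i in np.argsort(dist):
        if labels[i] in allowed:
            result.append(int(i))
            if len(result) == k:
                break
    return result

# Six points on a line; only the three farthest from the query satisfy the constraint.
data = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
labels = [0, 0, 0, 1, 1, 1]
query = np.array([0.0])

two_stage = post_filter_topk(data, labels, query, allowed={1}, s=4, k=2)
merged = filter_during_search_topk(data, labels, query, allowed={1}, k=2)
print(two_stage)  # only one of the s=4 candidates survives the filter
print(merged)     # the merged search still finds k=2 valid items
```

    With s = 4 the two-stage pipeline returns a single item even though two valid items exist, which is the situation where one would otherwise have to guess a larger s in advance.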

    Breaking the waves: asymmetric random periodic features for low-bitrate kernel machines

    Many signal processing and machine learning applications are built from evaluating a kernel on pairs of signals, e.g., to assess the similarity of an incoming query to a database of known signals. This nonlinear evaluation can be simplified to a linear inner product of the random Fourier features of those signals: random projections followed by a periodic map, the complex exponential. It is known that a simple quantization of those features (corresponding to replacing the complex exponential by a different periodic map that takes binary values, which is appealing for their transmission and storage) distorts the approximated kernel, which may be undesirable in practice. Our take-home message is that when the features of only one of the two signals are quantized, the original kernel is recovered without distortion; its practical interest appears in several cases where the kernel evaluations are asymmetric by nature, such as a client-server scheme. Concretely, we introduce the general framework of asymmetric random periodic features, where the two signals of interest are observed through random periodic features: random projections followed by a general periodic map, which is allowed to be different for both signals. We derive the influence of those periodic maps on the approximated kernel, and prove uniform probabilistic error bounds holding for all signal pairs from an infinite low-complexity set. Interestingly, our results allow the periodic maps to be discontinuous, thanks to a new mathematical tool, i.e., the mean Lipschitz smoothness. We then apply this generic framework to semi-quantized kernel machines (where only one signal has quantized features and the other has classical random Fourier features), for which we show theoretically that the approximated kernel remains unchanged (with the associated error bound), and confirm the power of the approach with numerical simulations.
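
    The take-home message of this abstract can be illustrated with a small simulation. The sketch below makes several assumptions that are not taken from the paper: a Gaussian kernel, real-valued cosine features with a uniform random dither, and a one-bit map sign(cos(.)). Under these assumptions, averaging over the dither leaves only the first harmonic of the square wave, which gives the rescaling factor pi/2 used for the asymmetric estimate. All names are hypothetical.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Target kernel exp(-||x - y||^2 / (2 sigma^2))."""
    return float(np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2)))

def draw_projections(dim, m, sigma, rng):
    """Random frequencies omega ~ N(0, I / sigma^2) and dithers xi ~ U[0, 2 pi)."""
    omega = rng.standard_normal((m, dim)) / sigma
    xi = rng.uniform(0.0, 2.0 * np.pi, size=m)
    return omega, xi

def full_features(x, omega, xi):
    """Classical real-valued random Fourier features."""
    return np.cos(omega @ x + xi)

def one_bit_features(x, omega, xi):
    """Quantized features: the cosine is replaced by a square wave in {-1, +1}."""
    return np.sign(np.cos(omega @ x + xi))

rng = np.random.default_rng(0)
dim, m, sigma = 5, 200000, 1.0
x = 0.3 * rng.standard_normal(dim)
y = 0.3 * rng.standard_normal(dim)
omega, xi = draw_projections(dim, m, sigma, rng)

exact = gaussian_kernel(x, y, sigma)
# Neither signal quantized: E[2 cos(a + xi) cos(b + xi)] = E[cos(a - b)].
classical = 2.0 * np.mean(full_features(x, omega, xi) * full_features(y, omega, xi))
# Only y quantized: the same kernel, up to the constant factor 2/pi.
asymmetric = (np.pi / 2.0) * np.mean(
    full_features(x, omega, xi) * one_bit_features(y, omega, xi))
# Both quantized: a different (distorted) kernel, shown for comparison only.
symmetric = np.mean(one_bit_features(x, omega, xi) * one_bit_features(y, omega, xi))
print(f"exact={exact:.4f} classical={classical:.4f} "
      f"asymmetric={asymmetric:.4f} symmetric={symmetric:.4f}")
```

    In this setup the classical and semi-quantized estimates should both approach the exact kernel value as m grows, while the fully quantized product converges to a different function of x - y.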

    Differential Privacy with Random Projections and Sign Random Projections

    In this paper, we develop a series of differential privacy (DP) algorithms from a family of random projections (RP), for general applications in machine learning, data mining, and information retrieval. Among the presented algorithms, iDP-SignRP is remarkably effective under the setting of "individual differential privacy" (iDP), based on sign random projections (SignRP). Also, DP-SignOPORP considerably improves existing algorithms in the literature under the standard DP setting, using "one permutation + one random projection" (OPORP), where OPORP is a variant of the celebrated count-sketch method with fixed-length binning and normalization. Without taking signs, among the DP-RP family, DP-OPORP achieves the best performance. The concept of iDP (individual differential privacy) is defined only on a particular dataset of interest. While iDP is not strictly DP, iDP might be useful in certain applications, such as releasing a dataset (including sharing embeddings across companies or countries). In our study, we find that iDP-SignRP is remarkably effective for search and machine learning applications, in that the utilities are exceptionally good even at a very small privacy parameter $\epsilon$ (e.g., $\epsilon<0.5$).

    Sign-Full Random Projections

    No full text