Efficient Document Re-Ranking for Transformers by Precomputing Term Representations
Deep pretrained transformer networks are effective at a variety of ranking tasks,
such as question answering and ad-hoc document ranking. However, their
computational cost makes them prohibitively expensive to run in practice. Our proposed
approach, called PreTTR (Precomputing Transformer Term Representations),
considerably reduces the query-time latency of deep transformer networks (up to
a 42x speedup on web document ranking), making these networks more practical to
use in real-time ranking scenarios. Specifically, we precompute part of the
document term representations at indexing time (without a query), and merge
them with the query representation at query time to compute the final ranking
score. Because the token representations are large, we also propose an
effective approach to reduce the storage requirement by training a compression
layer to match attention scores. Our compression technique reduces the
required storage by up to 95% and can be applied without a substantial degradation in
ranking performance.
Comment: Accepted at SIGIR 2020 (long paper).
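As a rough illustration of the split-computation idea described above, the sketch below precomputes the first few transformer layers over document tokens alone (offline, no query) and merges the query at scoring time. This is not the authors' code: the layer count, the split point, the toy dimensions, and the final scoring head are all illustrative assumptions.

    # Hedged sketch of the PreTTR-style split; all sizes and the ranking head are illustrative.
    import torch
    import torch.nn as nn

    D_MODEL, N_HEADS, N_LAYERS, SPLIT = 64, 4, 6, 3  # SPLIT = layers run query-independently

    layers = nn.ModuleList(
        nn.TransformerEncoderLayer(D_MODEL, N_HEADS, batch_first=True)
        for _ in range(N_LAYERS)
    )

    def precompute_doc(doc_emb):
        """Indexing time: run document tokens through the first SPLIT layers, no query attached."""
        h = doc_emb
        for layer in layers[:SPLIT]:
            h = layer(h)
        return h  # stored in the index (and optionally compressed)

    def score(query_emb, doc_cache):
        """Query time: process the query alone, then let query and document attend jointly."""
        q = query_emb
        for layer in layers[:SPLIT]:
            q = layer(q)
        h = torch.cat([q, doc_cache], dim=1)  # merge query with precomputed doc representations
        for layer in layers[SPLIT:]:
            h = layer(h)
        return h[:, 0].sum()  # stand-in for the final ranking head

    doc_cache = precompute_doc(torch.randn(1, 128, D_MODEL))  # offline, per document
    s = score(torch.randn(1, 8, D_MODEL), doc_cache)          # online, per query

The savings come from the fact that the first SPLIT layers over (typically long) document text run once at indexing time, so only the short query and the joint top layers are computed per request.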
Discrete Factorization Machines for Fast Feature-based Recommendation
User and item features, i.e., side information, are crucial for accurate
recommendation. However, the large number of feature dimensions, e.g., usually
larger than 10^7, results in expensive storage and computational costs. This
prohibits fast recommendation, especially on mobile applications where
computational resources are very limited. In this paper, we develop a generic
feature-based recommendation model, called Discrete Factorization Machine
(DFM), for fast and accurate recommendation. DFM binarizes the real-valued
model parameters (e.g., float32) of every feature embedding into binary codes
(e.g., boolean), and thus supports efficient storage and fast user-item score
computation. To avoid the severe quantization loss of binarization, we
propose a convergent updating rule that solves the challenging discrete
optimization problem posed by DFM. Through extensive experiments on two real-world datasets,
we show that 1) DFM consistently outperforms state-of-the-art binarized
recommendation models, and 2) DFM achieves performance very competitive with
its real-valued counterpart (FM), demonstrating minimal quantization loss.
Comment: Appeared in IJCAI 2018.
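To make the binary-scoring idea concrete, here is a minimal sketch of how {-1,+1} codes turn embedding inner products into XOR-and-popcount operations. The details are assumptions for illustration: a sign-based quantizer and a 64-bit code, whereas DFM learns its binary codes directly with its own convergent updating rule rather than quantizing after the fact.

    # Hedged sketch of binary-code scoring; the quantizer and code length are illustrative.
    import numpy as np

    K = 64  # code length

    def binarize(v):
        """Quantize a real-valued embedding to a {-1,+1} code (sign function, for illustration)."""
        return np.where(v >= 0, 1, -1).astype(np.int8)

    def pack(code):
        """Pack a {-1,+1} code into one 64-bit word so inner products become XOR + popcount."""
        bits = (code > 0).astype(np.uint8)
        return int(np.packbits(bits).view('>u8')[0])

    def binary_inner(a, b):
        """<a, b> for {-1,+1}^K codes equals K - 2 * Hamming(a, b): one XOR plus a popcount."""
        return K - 2 * bin(a ^ b).count('1')

    user_code = pack(binarize(np.random.randn(K)))
    item_code = pack(binarize(np.random.randn(K)))
    score = binary_inner(user_code, item_code)  # fast user-item match score

Storing one 64-bit word per embedding instead of K float32 values is what yields the storage savings the abstract describes, and the XOR/popcount trick is why scoring is cheap on resource-limited devices.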
DESSERT: An Efficient Algorithm for Vector Set Search with Vector Set Queries
We study the problem of vector set search with vector set queries. This task is analogous to traditional near-neighbor search, with the
exception that both the query and each element in the collection are
sets of vectors. We identify this problem as a core subroutine for
semantic search applications and find that existing solutions are unacceptably
slow. To this end, we present a new approximate search algorithm, DESSERT
(DESSERT Efficiently Searches Sets of Embeddings via Retrieval Tables). DESSERT is a general tool
with strong theoretical guarantees and excellent empirical performance. When we
integrate DESSERT into ColBERT, a state-of-the-art semantic search model, we
find a 2-5x speedup on the MS MARCO and LoTTE retrieval benchmarks with minimal
loss in recall, underscoring the effectiveness and practical applicability of
our proposal.
Comment: Code available, https://github.com/ThirdAIResearch/Desser
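The toy sketch below illustrates the general retrieval-table idea with a single SimHash table: every vector of every document set is hashed at indexing time, and each query vector then votes for the documents whose vectors collide with it. This is a simplification, not the DESSERT algorithm itself, which uses many tables and a principled similarity estimator; every name and parameter here is illustrative.

    # Hedged single-table sketch of vector set search via hashing; all sizes are toy values.
    import numpy as np

    rng = np.random.default_rng(0)
    D, N_BITS = 16, 8                           # vector dimension and hash width
    planes = rng.standard_normal((N_BITS, D))   # random hyperplanes for SimHash

    def simhash(v):
        return tuple((planes @ v > 0).astype(int))

    def index_sets(doc_sets):
        """Indexing: hash every vector of every document set into a table keyed by its sketch."""
        table = {}
        for doc_id, vecs in enumerate(doc_sets):
            for v in vecs:
                table.setdefault(simhash(v), set()).add(doc_id)
        return table

    def search(table, query_set, n_docs):
        """Query: collision counts across query vectors approximate set-to-set similarity."""
        scores = np.zeros(n_docs)
        for q in query_set:
            for doc_id in table.get(simhash(q), ()):
                scores[doc_id] += 1.0
        return scores.argsort()[::-1]  # doc ids ranked by estimated similarity

    docs = [rng.standard_normal((5, D)) for _ in range(100)]  # 100 sets of 5 vectors each
    ranking = search(index_sets(docs), rng.standard_normal((3, D)), len(docs))

The appeal of this table-based formulation is that query cost scales with the number of query vectors and their collisions rather than with a brute-force comparison against every vector of every set in the collection.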