38 research outputs found

When Hashing Met Matching: Efficient Spatio-Temporal Search for Ridesharing

Author: Dutta Chinmoy
Publication date: 19/02/2020

Carpooling, or sharing a ride with other passengers, holds immense potential for urban transportation. Ridesharing platforms enable such sharing of rides using real-time data. Finding ride matches in real-time at urban scale is a difficult combinatorial optimization task and mostly heuristic approaches are applied. In this work, we mathematically model the problem as that of finding near-neighbors and devise a novel efficient spatio-temporal search algorithm based on the theory of locality sensitive hashing for Maximum Inner Product Search (MIPS). The proposed algorithm can find $k$ near-optimal potential matches for every ride from a pool of $n$ rides in time $O(n^{1 + \rho} (k + \log n) \log k)$ and space $O(n^{1 + \rho} \log k)$ for a small $\rho < 1$. Our algorithm can be extended in several useful and interesting ways increasing its practical appeal. Experiments with large NY yellow taxi trip datasets show that our algorithm consistently outperforms state-of-the-art heuristic methods thereby proving its practical applicability.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications
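The abstract above builds on locality sensitive hashing for MIPS. As a rough illustration of that general idea only, the following sketch uses a well-known reduction from inner product search to angular similarity (append one coordinate so every data vector has unit norm, then hash with random hyperplanes). It is not the paper's spatio-temporal algorithm; the function names, the number of hash bits, and the single-table layout are assumptions made for this example, and it assumes at least one non-zero data vector and a non-zero query.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def augment_data(points):
    """Scale all points by the largest norm, then append sqrt(1 - ||x||^2).

    Every augmented vector has unit norm, so ranking by angle to the
    augmented query matches ranking by the original inner product.
    """
    max_norm = max(math.sqrt(dot(p, p)) for p in points)
    augmented = []
    for p in points:
        scaled = [x / max_norm for x in p]
        # max(0, ...) guards against tiny negative values from rounding.
        extra = math.sqrt(max(0.0, 1.0 - dot(scaled, scaled)))
        augmented.append(scaled + [extra])
    return augmented


def augment_query(query):
    """Normalise the query and append a zero coordinate."""
    norm = math.sqrt(dot(query, query))
    return [x / norm for x in query] + [0.0]


def make_hyperplanes(dim, bits, seed=0):
    """Draw `bits` random Gaussian hyperplanes in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def signature(vec, planes):
    """Sign-random-projection hash: one bit per hyperplane."""
    return tuple(1 if dot(vec, h) >= 0 else 0 for h in planes)


def build_table(points, planes):
    """Single hash table mapping a signature to the indices stored under it."""
    table = {}
    for idx, vec in enumerate(augment_data(points)):
        table.setdefault(signature(vec, planes), []).append(idx)
    return table


def candidates(table, planes, query):
    """Indices whose signature collides with the query's signature."""
    return table.get(signature(augment_query(query), planes), [])
```

A practical index would use several tables and re-rank the colliding candidates by exact inner product; a single table is kept here only to keep the sketch short.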

Cardinality Estimation in Inner Product Space

Author: Amagata Daichi
Hara Takahiro
Hirata Kohei
Publication venue: Institute of Electrical and Electronics Engineers Inc.

This article addresses the problem of cardinality estimation in inner product spaces. Given a set of high-dimensional vectors, a query, and a threshold, this problem estimates the number of vectors such that their inner products with the query are not less than the threshold. This is an important problem for recent machine-learning applications that maintain objects, such as users and items, by using matrices. The important requirements for solutions of this problem are high efficiency and accuracy. To satisfy these requirements, we propose a sampling-based algorithm. We build trees of vectors via transformation to a Euclidean space and dimensionality reduction in a pre-processing phase. Then our algorithm samples vectors existing in the nodes that intersect with a search range on one of the trees. Our algorithm is surprisingly simple, but it is theoretically and practically fast and effective. We conduct extensive experiments on real datasets, and the results demonstrate that our algorithm shows superior performance compared with existing techniques.

Hirata K., Amagata D., Hara T. Cardinality Estimation in Inner Product Space. IEEE Open Journal of the Computer Society 3, 208 (2022); https://doi.org/10.1109/OJCS.2022.3215206

Osaka University Knowledge Archive
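To make the problem statement in the abstract above concrete, the sketch below computes the exact cardinality and a plain uniform-sampling estimate of it. This is only the naive baseline, not the tree-based sampling algorithm the article proposes; the function names and the `sample_size` and `seed` parameters are assumptions introduced for illustration.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def exact_cardinality(vectors, query, threshold):
    """Number of vectors whose inner product with the query is >= threshold."""
    return sum(1 for v in vectors if dot(v, query) >= threshold)


def estimate_cardinality(vectors, query, threshold, sample_size, seed=0):
    """Uniform-sampling estimate: count hits in a sample, scale up to n."""
    m = min(sample_size, len(vectors))
    if m == 0:
        return 0.0
    rng = random.Random(seed)
    sample = rng.sample(vectors, m)  # sampling without replacement
    hits = sum(1 for v in sample if dot(v, query) >= threshold)
    return hits * len(vectors) / m
```

When `sample_size` covers the whole set, the estimate coincides with the exact count; smaller samples trade accuracy for speed, which is the trade-off the article's pre-processing is designed to improve.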

Revisiting Wedge Sampling for Budgeted Maximum Inner Product Search

Author: Lorenzen Stephan S.
Pham Ninh
Publication date: 12/09/2020

Top-k maximum inner product search (MIPS) is a central task in many machine learning applications. This paper extends top-k MIPS with a budgeted setting that asks for the best approximate top-k MIPS given a limit of B computational operations. We investigate recent advanced sampling algorithms, including wedge and diamond sampling, to solve it. Though the design of these sampling schemes naturally supports budgeted top-k MIPS, they suffer from the linear cost of scanning all data points to retrieve top-k results and from performance degradation when handling negative inputs. This paper makes two main contributions. First, we show that diamond sampling is essentially a combination of wedge sampling and basic sampling for top-k MIPS. Our theoretical analysis and empirical evaluation show that wedge is competitive with (and often superior to) diamond on approximating top-k MIPS regarding both efficiency and accuracy. Second, we propose a series of algorithmic engineering techniques to deploy wedge sampling on budgeted top-k MIPS. Our novel deterministic wedge-based algorithm runs significantly faster than the state-of-the-art methods for budgeted and exact top-k MIPS while maintaining top-5 precision of at least 80% on standard recommender system data sets.

Comment: ECML-PKDD 202

arXiv.org e-Print Archive

Copenhagen University Research Information System
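For readers unfamiliar with wedge sampling, the following is a minimal sketch of the classic randomized scheme the abstract above revisits, restricted to non-negative data and queries. A dimension j is drawn with probability proportional to q_j times the column sum of dimension j, then a point i is drawn with probability proportional to x_ij, so a point's counter grows in proportion to its inner product with the query. It is not the deterministic, budget-aware algorithm the paper proposes; the function name and the `num_samples` and `num_candidates` parameters are assumptions for this example.

```python
import random
from collections import Counter


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def wedge_sample_topk(data, query, k, num_samples, num_candidates, seed=0):
    """Approximate top-k MIPS by wedge sampling (non-negative inputs only)."""
    rng = random.Random(seed)
    n, d = len(data), len(query)
    col_sums = [sum(row[j] for row in data) for j in range(d)]
    dim_weights = [query[j] * col_sums[j] for j in range(d)]
    if sum(dim_weights) <= 0:
        return []  # nothing to sample from (e.g. an all-zero query)

    # Step 1: sample dimensions in proportion to q_j * column_sum_j.
    sampled_dims = rng.choices(range(d), weights=dim_weights, k=num_samples)

    # Step 2: within each sampled dimension, sample points in proportion to x_ij.
    counts = Counter()
    for j, times in Counter(sampled_dims).items():
        column = [row[j] for row in data]
        for i in rng.choices(range(n), weights=column, k=times):
            counts[i] += 1

    # Step 3: re-rank the most frequently sampled points by exact inner product.
    shortlist = [i for i, _ in counts.most_common(num_candidates)]
    shortlist.sort(key=lambda i: dot(data[i], query), reverse=True)
    return shortlist[:k]
```

Here `num_samples` and `num_candidates` together play the role of a computational budget: more samples sharpen the counters, and a larger shortlist costs more exact inner products.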

To Index or Not to Index: Optimizing Exact Maximum Inner Product Search

Author: Abuzaid Firas
Bailis Peter
Sethi Geet
Zaharia Matei
Publication date: 14/03/2019

Exact Maximum Inner Product Search (MIPS) is an important task that is widely pertinent to recommender systems and high-dimensional similarity search. The brute-force approach to solving exact MIPS is computationally expensive, thus spurring recent development of novel indexes and pruning techniques for this task. In this paper, we show that a hardware-efficient brute-force approach, blocked matrix multiply (BMM), can outperform the state-of-the-art MIPS solvers by over an order of magnitude, for some -- but not all -- inputs. In this paper, we also present a novel MIPS solution, MAXIMUS, that takes advantage of hardware efficiency and pruning of the search space. Like BMM, MAXIMUS is faster than other solvers by up to an order of magnitude, but again only for some inputs. Since no single solution offers the best runtime performance for all inputs, we introduce a new data-dependent optimizer, OPTIMUS, that selects online with minimal overhead the best MIPS solver for a given input. Together, OPTIMUS and MAXIMUS outperform state-of-the-art MIPS solvers by 3.2$\times$ on average, and up to 10.9$\times$, on widely studied MIPS datasets.

Comment: 12 pages, 8 figures, 2 table

arXiv.org e-Print Archive

Crossref
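The brute-force baseline mentioned in the abstract above can be sketched as follows: users are processed one block at a time, each block is scored against every item, and the top-k items per user are kept. This only shows the control flow. In pure Python the blocking brings no hardware benefit, and the score computation would normally be handed to an optimized matrix-multiply routine. The function name and `block_size` parameter are assumptions for this example, and neither MAXIMUS nor OPTIMUS is reproduced here.

```python
import heapq


def blocked_topk_mips(users, items, k, block_size=2):
    """Exact top-k MIPS by scoring blocks of users against all items."""
    results = []
    for start in range(0, len(users), block_size):
        block = users[start:start + block_size]
        # Score matrix for this block: one row per user, one column per item.
        scores = [
            [sum(u * v for u, v in zip(user, item)) for item in items]
            for user in block
        ]
        # Keep the indices of the k highest-scoring items for each user.
        for row in scores:
            results.append(
                heapq.nlargest(k, range(len(items)), key=row.__getitem__)
            )
    return results
```

For example, `blocked_topk_mips(users, items, k=2)` returns one list of two item indices per user, ordered from highest to lowest inner product.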

Preference learning and similarity learning perspectives on personalized recommendation

Author: LE Duy Dung
Publication venue: Singapore Management University
Publication date: 01/09/2019

Institutional Knowledge at Singapore Management University

Stochastically robust personalized ranking for LSH recommendation retrieval

Author: LAUW Hady W.
LE Dung D.
Publication venue: 'Association for the Advancement of Artificial Intelligence (AAAI)'
Publication date: 01/02/2020

National Research Foundation (NRF) Singapore under NRF Fellowship Programm

Institutional Knowledge at Singapore Management University

Association for the Advancement of Artificial Intelligence: AAAI Publications