29 research outputs found

On the Cost of Negation for Dynamic Pruning

Author: Culpepper J. Shane
Macdonald Craig
Mackenzie Joel
Scholer Falk
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Negated query terms allow documents containing such terms to be filtered out of a search results list, supporting disambiguation. In this work, the effect of negation on the efficiency of disjunctive, top-k retrieval is examined. First, we show how negation can be integrated efficiently into two popular dynamic pruning algorithms. Then, we explore the efficiency of our approach, and show that while often efficient, negation can negatively impact the dynamic pruning effectiveness for certain queries.
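To make the query semantics in the abstract above concrete, here is a minimal sketch of disjunctive top-k retrieval with negated terms. It scores exhaustively and performs no dynamic pruning, so it is not the paper's method. The in-memory `index` layout (term to `{doc_id: score}`), the function name, and the sample data are all assumptions made for illustration.

```python
import heapq

def top_k_with_negation(index, positive_terms, negated_terms, k):
    """Exhaustive disjunctive top-k scoring; documents matching a negated term are dropped."""
    # Collect every document that contains at least one negated term.
    excluded = set()
    for term in negated_terms:
        excluded.update(index.get(term, {}))

    # Sum per-term scores over the union of the positive postings lists.
    scores = {}
    for term in positive_terms:
        for doc_id, score in index.get(term, {}).items():
            if doc_id not in excluded:
                scores[doc_id] = scores.get(doc_id, 0.0) + score

    # Highest score first; ties are broken by the smaller document id.
    return heapq.nsmallest(k, scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical toy index: term -> {doc_id: precomputed score contribution}.
index = {
    "jaguar": {1: 2.0, 2: 1.5, 3: 1.0},
    "speed": {1: 0.5, 3: 2.5},
    "car": {1: 1.0, 2: 0.3},
}
with_negation = top_k_with_negation(index, ["jaguar", "speed"], ["car"], k=2)
without_negation = top_k_with_negation(index, ["jaguar", "speed"], [], k=2)
```

In this toy example the negated term `car` removes documents 1 and 2, so only document 3 survives. A pruning algorithm would additionally have to account for such exclusions when deciding which postings it can skip.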

Crossref

RMIT Research Repository

Enlighten

Reverse k Nearest Neighbor Search over Trajectories

Author: Bao Zhifeng
Cong Gao
Culpepper J. Shane
Sellis Timos
Wang Sheng
Publication date: 01/01/2017

GPS enables mobile devices to continuously provide new opportunities to improve our daily lives. For example, the data collected in applications created by Uber or Public Transport Authorities can be used to plan transportation routes, estimate capacities, and proactively identify low coverage areas. In this paper, we study a new kind of query, Reverse k Nearest Neighbor Search over Trajectories (RkNNT), which can be used for route planning and capacity estimation. Given a set of existing routes DR, a set of passenger transitions DT, and a query route Q, an RkNNT query returns all transitions that take Q as one of their k nearest travel routes. To solve the problem, we first develop an index to handle dynamic trajectory updates, so that the most up-to-date transition data are available for answering an RkNNT query. Then we introduce a filter refinement framework for processing RkNNT queries using the proposed indexes. Next, we show how to use RkNNT to solve the optimal route planning problem MaxRkNNT (MinRkNNT), which is to search for the optimal route from a start location to an end location that could attract the maximum (or minimum) number of passengers based on a pre-defined travel distance threshold. Experiments on real datasets demonstrate the efficiency and scalability of our approaches. To the best of our knowledge, this is the first work to study the RkNNT problem for route planning. Comment: 12 pages
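The RkNNT query definition in the abstract above can be illustrated with a brute-force sketch. It uses no index and no filter refinement, so it is not the paper's approach. The distance function is an assumption made for illustration: a transition is taken to be an (origin, destination) pair, a route is a list of stops, and the distance from a transition to a route is the sum of the two endpoint-to-nearest-stop distances. All names and data are hypothetical.

```python
import math

def point_to_route(point, route):
    """Distance from a point to the closest stop on a route."""
    return min(math.dist(point, stop) for stop in route)

def transition_to_route(transition, route):
    """Assumed distance: origin-to-route plus destination-to-route."""
    origin, destination = transition
    return point_to_route(origin, route) + point_to_route(destination, route)

def rknnt_brute_force(routes, transitions, query, k):
    """Return the transitions that have the query among their k nearest routes."""
    result = []
    for transition in transitions:
        d_query = transition_to_route(transition, query)
        # Count existing routes that are strictly closer than the query route.
        closer = sum(1 for route in routes
                     if transition_to_route(transition, route) < d_query)
        if closer < k:
            result.append(transition)
    return result

# Hypothetical data: two existing routes, three passenger transitions.
routes = [[(0, 0), (1, 0), (2, 0)], [(0, 5), (1, 5), (2, 5)]]
transitions = [((0, 1.5), (2, 1.5)), ((0, 4), (2, 4)), ((0, 9), (2, 9))]
query = [(0, 2), (1, 2), (2, 2)]

nearest_only = rknnt_brute_force(routes, transitions, query, k=1)
two_nearest = rknnt_brute_force(routes, transitions, query, k=2)
```

With k=1 only the first transition selects the query route, because each of the other two has one existing route that is closer. With k=2 all three transitions qualify. The cost is one distance computation per transition and route, which is the kind of work an index-based method aims to avoid.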

arXiv.org e-Print Archive

RMIT Research Repository

DR-NTU (Digital Repository of NTU)

Updatable Learned Indexes Meet Disk-Resident DBMS -- From Evaluations to Design Choices

Author: Bao Zhifeng
Borovica-Gajic Renata
Culpepper J. Shane
Lan Hai
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 02/05/2023

Although many updatable learned indexes have been proposed in recent years, whether they can outperform traditional approaches on disk remains unknown. In this study, we revisit and implement four state-of-the-art updatable learned indexes on disk, and compare them against the B+-tree under a wide range of settings. Through our evaluation, we make some key observations: (1) Overall, the B+-tree performs well across a range of workload types and datasets. (2) A learned index could outperform the B+-tree or other learned indexes on disk for a specific workload. For example, PGM achieves the best performance in write-only workloads while LIPP significantly outperforms others in lookup-only workloads. We further conduct a detailed performance analysis to reveal the strengths and weaknesses of these learned indexes on disk. Moreover, we summarize the observed common shortcomings in five categories and propose four design principles to guide future design of on-disk, updatable learned indexes: (1) reducing the index's tree height, (2) better data structures to lower operation overheads, (3) improving the efficiency of scan operations, and (4) more efficient storage layout. Comment: 22 pages
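For readers unfamiliar with the term, the basic idea behind a learned index can be sketched as follows: a model predicts where a key sits in a sorted array, and a bounded local search corrects the prediction. This sketch is static and in memory, so it is neither updatable nor disk-resident, and it is not any of the four indexes evaluated in the paper. The class name and the single linear model are assumptions chosen for brevity.

```python
import bisect

class LinearLearnedIndex:
    """Static, in-memory sketch: one least-squares line maps a key to a position."""

    def __init__(self, keys):
        if not keys:
            raise ValueError("at least one key is required")
        self.keys = sorted(keys)
        n = len(self.keys)
        mean_x = sum(self.keys) / n
        mean_y = (n - 1) / 2
        var_x = sum((x - mean_x) ** 2 for x in self.keys)
        cov = sum((x - mean_x) * (i - mean_y) for i, x in enumerate(self.keys))
        self.slope = cov / var_x if var_x else 0.0
        self.intercept = mean_y - self.slope * mean_x
        # Largest gap between predicted and true position over the indexed keys.
        self.max_error = max(abs(i - self._predict(x))
                             for i, x in enumerate(self.keys))

    def _predict(self, key):
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        """Return the position of key in the sorted array, or None if absent."""
        n = len(self.keys)
        guess = self._predict(key)
        # Only the window guaranteed by max_error needs to be searched.
        lo = min(max(0, guess - self.max_error), n)
        hi = max(lo, min(n, guess + self.max_error + 1))
        pos = bisect.bisect_left(self.keys, key, lo, hi)
        if pos < n and self.keys[pos] == key:
            return pos
        return None

keys = [2, 3, 5, 8, 13, 21, 34, 55]
index = LinearLearnedIndex(keys)
positions = [index.lookup(key) for key in keys]
missing = index.lookup(4)
```

Because the error bound is computed over the indexed keys, every stored key is guaranteed to fall inside the searched window. Supporting inserts and laying the structure out in disk pages are the parts this sketch leaves out, and they are the subject of the evaluation described in the abstract.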

arXiv.org e-Print Archive

A Linear-Time Algorithm for Finding Induced Planar Subgraphs

Author: Bao Zhifeng
Culpepper J. Shane
Huang Shixun
Zhang Bang
Zhang Ping
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 17th International Symposium on Experimental Algorithms (SEA 2018)
Publication date: 01/01/2018

In this paper we study the problem of efficiently and effectively extracting induced planar subgraphs. Edwards and Farr proposed an algorithm with $O(mn)$ time complexity to find an induced planar subgraph of at least $3n/(d+1)$ vertices in a graph of maximum degree $d$. They also proposed an alternative algorithm with $O(mn)$ time complexity to find an induced planar subgraph of at least $3n/(\bar{d}+1)$ vertices, where $\bar{d}$ is the average degree of the graph. These two methods appear to be best known when $d$ and $\bar{d}$ are small. Unfortunately, they sacrifice accuracy for lower time complexity by using indirect indicators of planarity. A limitation of those approaches is that the algorithms do not implicitly test for planarity, and the additional costs of this test can be significant in large graphs. In contrast, we propose a linear-time algorithm that finds an induced planar subgraph of $n-\nu$ vertices in a graph of $n$ vertices, where $\nu$ denotes the total number of vertices shared by the detected Kuratowski subdivisions. An added benefit of our approach is that we are able to detect when a graph is planar, and terminate the reduction. The resulting planar subgraphs also do not have any rigid constraints on the maximum degree of the induced subgraph. The experimental results show that our method achieves better performance than current methods on graphs with small skewness.

Dagstuhl Research Online Publication Server

Spatial Object Recommendation with Hints: When Spatial Granularity Matters

Author: Bao Zhifeng
Culpepper J. Shane
Li Shuangli
Liu Hao
Luo Hui
Xiong Hui
Ying Haochao
Zhou Jingbo
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 08/01/2021

Existing spatial object recommendation algorithms generally treat objects identically when ranking them. However, spatial objects often cover different levels of spatial granularity and thereby are heterogeneous. For example, one user may prefer to be recommended a region (say Manhattan), while another user might prefer a venue (say a restaurant). Even for the same user, preferences can change at different stages of data exploration. In this paper, we study how to support top-k spatial object recommendations at varying levels of spatial granularity, treating spatial objects at varying granularity, such as a city, suburb, or building, as Points of Interest (POIs). To solve this problem, we propose the use of a POI tree, which captures spatial containment relationships between POIs. We design a novel multi-task learning model called MPR (short for Multi-level POI Recommendation), where each task aims to return the top-k POIs at a certain spatial granularity level. Each task consists of two subtasks: (i) attribute-based representation learning; (ii) interaction-based representation learning. The first subtask learns the feature representations for both users and POIs, capturing attributes directly from their profiles. The second subtask incorporates user-POI interactions into the model. Additionally, MPR can provide insights into why certain recommendations are being made to a user based on three types of hints: user-aspect, POI-aspect, and interaction-aspect. We empirically validate our approach using two real-life datasets, and show promising performance improvements over several state-of-the-art methods.
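The POI tree mentioned in the abstract above can be pictured as a containment hierarchy in which each depth corresponds to one granularity level. The sketch below only models that data structure and groups POIs by level. It contains no learning component and is not the MPR model. The class, the function, and the sample places are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class POINode:
    """A POI plus the finer-grained POIs it spatially contains."""
    name: str
    level: int = 0
    children: list = field(default_factory=list)

    def add_child(self, name):
        # A contained POI sits one granularity level below its parent.
        child = POINode(name, self.level + 1)
        self.children.append(child)
        return child

def pois_by_level(root):
    """Group POI names by granularity level using a depth-first walk."""
    levels = {}
    stack = [root]
    while stack:
        node = stack.pop()
        levels.setdefault(node.level, []).append(node.name)
        stack.extend(node.children)
    return {level: sorted(names) for level, names in levels.items()}

# Hypothetical hierarchy: city -> regions -> venues.
city = POINode("New York City")
manhattan = city.add_child("Manhattan")
brooklyn = city.add_child("Brooklyn")
manhattan.add_child("Restaurant A")
brooklyn.add_child("Cafe B")

levels = pois_by_level(city)
```

Each entry of `levels` is the candidate set that a per-level recommendation task would rank, for example regions at level 1 and venues at level 2.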

arXiv.org e-Print Archive

Crossref