29 research outputs found

    On the Cost of Negation for Dynamic Pruning

    Negated query terms allow documents containing such terms to be filtered out of a search results list, supporting disambiguation. In this work, the effect of negation on the efficiency of disjunctive, top-k retrieval is examined. First, we show how negation can be integrated efficiently into two popular dynamic pruning algorithms. Then, we explore the efficiency of our approach, and show that while often efficient, negation can negatively impact the dynamic pruning effectiveness for certain queries.
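
    The filtering behaviour described in this abstract can be illustrated with a minimal sketch. It is an exhaustive disjunctive top-k scorer, not one of the dynamic pruning algorithms the paper studies, and the function name `top_k_with_negation`, the dict-of-dicts postings layout, and the additive scoring are all assumptions made for illustration.

```python
import heapq


def top_k_with_negation(postings, positive_terms, negated_terms, k):
    """Exhaustive disjunctive top-k that drops documents containing a negated term.

    postings maps term -> {doc_id: score contribution} (hypothetical layout).
    """
    # Collect every document that contains at least one negated term.
    excluded = set()
    for term in negated_terms:
        excluded.update(postings.get(term, {}))

    # Disjunctive scoring: a document matches if it contains any positive term.
    scores = {}
    for term in positive_terms:
        for doc_id, weight in postings.get(term, {}).items():
            if doc_id not in excluded:
                scores[doc_id] = scores.get(doc_id, 0.0) + weight

    # Highest score first; ties are broken by the smaller document id.
    return heapq.nsmallest(k, scores.items(), key=lambda item: (-item[1], item[0]))


postings = {
    "jaguar": {1: 2.0, 2: 1.5, 3: 1.0},
    "speed": {2: 1.0, 3: 0.5},
    "car": {2: 0.7},
}
# Query "jaguar speed -car": document 2 is filtered out by the negated term.
results = top_k_with_negation(postings, ["jaguar", "speed"], ["car"], k=2)
```

    A pruning algorithm would additionally use per-term score upper bounds to skip documents that cannot enter the top k; the sketch omits that step entirely.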

    Reverse k Nearest Neighbor Search over Trajectories

    GPS enables mobile devices to continuously provide new opportunities to improve our daily lives. For example, the data collected in applications created by Uber or Public Transport Authorities can be used to plan transportation routes, estimate capacities, and proactively identify low coverage areas. In this paper, we study a new kind of query, Reverse k Nearest Neighbor Search over Trajectories (RkNNT), which can be used for route planning and capacity estimation. Given a set of existing routes DR, a set of passenger transitions DT, and a query route Q, an RkNNT query returns all transitions that take Q as one of their k nearest travel routes. To solve the problem, we first develop an index to handle dynamic trajectory updates, so that the most up-to-date transition data are available for answering an RkNNT query. Then we introduce a filter-refinement framework for processing RkNNT queries using the proposed indexes. Next, we show how to use RkNNT to solve the optimal route planning problem MaxRkNNT (MinRkNNT), which is to search for the optimal route from a start location to an end location that could attract the maximum (or minimum) number of passengers based on a pre-defined travel distance threshold. Experiments on real datasets demonstrate the efficiency and scalability of our approaches. To the best of our knowledge, this is the first work to study the RkNNT problem for route planning. Comment: 12 pages
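
    The query definition in this abstract can be made concrete with a brute-force sketch. It is not the paper's index or filter-refinement framework. It assumes that a route is a list of (x, y) points, that a transition is an (origin, destination) pair, and that the transition-to-route distance is the sum of the two endpoint distances to the nearest route point; the abstract specifies none of these, and all function names are hypothetical.

```python
from math import hypot


def point_to_route(point, route):
    """Distance from a point to the nearest vertex of a route (assumed metric)."""
    return min(hypot(point[0] - x, point[1] - y) for x, y in route)


def transition_to_route(transition, route):
    """Assumed transition-route distance: origin distance plus destination distance."""
    origin, destination = transition
    return point_to_route(origin, route) + point_to_route(destination, route)


def rknnt_brute_force(routes, transitions, query, k):
    """Return the transitions that have `query` among their k nearest routes."""
    result = []
    for transition in transitions:
        query_distance = transition_to_route(transition, query)
        # Count existing routes that are strictly closer than the query route.
        closer = sum(
            1 for route in routes
            if transition_to_route(transition, route) < query_distance
        )
        if closer < k:
            result.append(transition)
    return result


routes = [[(0, 0), (10, 0)]]            # DR: one existing route
query = [(0, 5), (10, 5)]               # Q: candidate route
transitions = [((0, 4), (10, 4)),       # DT: closer to Q
               ((0, 1), (10, 1))]       # DT: closer to the existing route
nearest_1 = rknnt_brute_force(routes, transitions, query, k=1)
nearest_2 = rknnt_brute_force(routes, transitions, query, k=2)
```

    This scan costs one distance evaluation per transition-route pair, which is the cost an index with filtering and refinement is meant to avoid.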

    Updatable Learned Indexes Meet Disk-Resident DBMS -- From Evaluations to Design Choices

    Although many updatable learned indexes have been proposed in recent years, whether they can outperform traditional approaches on disk remains unknown. In this study, we revisit and implement four state-of-the-art updatable learned indexes on disk, and compare them against the B+-tree under a wide range of settings. Through our evaluation, we make some key observations: (1) Overall, the B+-tree performs well across a range of workload types and datasets. (2) A learned index could outperform the B+-tree or other learned indexes on disk for a specific workload. For example, PGM achieves the best performance in write-only workloads while LIPP significantly outperforms others in lookup-only workloads. We further conduct a detailed performance analysis to reveal the strengths and weaknesses of these learned indexes on disk. Moreover, we summarize the observed common shortcomings in five categories and propose four design principles to guide future design of on-disk, updatable learned indexes: (1) reducing the index's tree height, (2) better data structures to lower operation overheads, (3) improving the efficiency of scan operations, and (4) more efficient storage layout. Comment: 22 pages

    A Linear-Time Algorithm for Finding Induced Planar Subgraphs

    In this paper we study the problem of efficiently and effectively extracting induced planar subgraphs. Edwards and Farr proposed an algorithm with $O(mn)$ time complexity to find an induced planar subgraph of at least $3n/(d+1)$ vertices in a graph of maximum degree $d$. They also proposed an alternative algorithm with $O(mn)$ time complexity to find an induced planar subgraph of at least $3n/(\bar{d}+1)$ vertices, where $\bar{d}$ is the average degree of the graph. These two methods appear to be best known when $d$ and $\bar{d}$ are small. Unfortunately, they sacrifice accuracy for lower time complexity by using indirect indicators of planarity. A limitation of those approaches is that the algorithms do not implicitly test for planarity, and the additional costs of this test can be significant in large graphs. In contrast, we propose a linear-time algorithm that finds an induced planar subgraph of $n-\nu$ vertices in a graph of $n$ vertices, where $\nu$ denotes the total number of vertices shared by the detected Kuratowski subdivisions. An added benefit of our approach is that we are able to detect when a graph is planar, and terminate the reduction. The resulting planar subgraphs also do not have any rigid constraints on the maximum degree of the induced subgraph. The experimental results show that our method achieves better performance than current methods on graphs with small skewness.

    Spatial Object Recommendation with Hints: When Spatial Granularity Matters

    Existing spatial object recommendation algorithms generally treat objects identically when ranking them. However, spatial objects often cover different levels of spatial granularity and thereby are heterogeneous. For example, one user may prefer to be recommended a region (say Manhattan), while another user might prefer a venue (say a restaurant). Even for the same user, preferences can change at different stages of data exploration. In this paper, we study how to support top-k spatial object recommendations at varying levels of spatial granularity, allowing a spatial object at any granularity, such as a city, suburb, or building, to serve as a Point of Interest (POI). To solve this problem, we propose the use of a POI tree, which captures spatial containment relationships between POIs. We design a novel multi-task learning model called MPR (short for Multi-level POI Recommendation), where each task aims to return the top-k POIs at a certain spatial granularity level. Each task consists of two subtasks: (i) attribute-based representation learning; (ii) interaction-based representation learning. The first subtask learns the feature representations for both users and POIs, capturing attributes directly from their profiles. The second subtask incorporates user-POI interactions into the model. Additionally, MPR can provide insights into why certain recommendations are being made to a user based on three types of hints: user-aspect, POI-aspect, and interaction-aspect. We empirically validate our approach using two real-life datasets, and show promising performance improvements over several state-of-the-art methods.