5 research outputs found

    The FastMap Algorithm for Shortest Path Computations

    We present a new preprocessing algorithm for embedding the nodes of a given edge-weighted undirected graph into a Euclidean space. The Euclidean distance between any two nodes in this space approximates the length of the shortest path between them in the given graph. Later, at runtime, a shortest path between any two nodes can be computed with A* search using the Euclidean distances as the heuristic. Our preprocessing algorithm, called FastMap, is inspired by the data mining algorithm of the same name and runs in near-linear time. Hence, FastMap is orders of magnitude faster than competing approaches that produce a Euclidean embedding using Semidefinite Programming. FastMap also produces admissible and consistent heuristics and therefore guarantees the generation of shortest paths. Moreover, FastMap applies to general undirected graphs for which many traditional heuristics, such as the Manhattan Distance heuristic, are not well defined. Empirically, we demonstrate that A* search using the FastMap heuristic is competitive with A* search using other state-of-the-art heuristics, such as the Differential heuristic.
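    As a rough illustration of the idea in this abstract, the sketch below embeds the nodes of a small undirected graph and uses the Euclidean distance between embedded points as a lower bound on shortest-path length. It is a simplified reading, not the authors' implementation: the pivot selection, the residual edge reweighting between dimensions, and all names (`dijkstra`, `fastmap_embed`, `heuristic`, `example_graph`) are assumptions made for this example, and the graph is assumed to be connected.

```python
import heapq
import math


def dijkstra(adj, source):
    """Shortest-path distances from source over non-negative edge weights."""
    dist = {v: math.inf for v in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def fastmap_embed(adj, max_dims=3, eps=1e-9):
    """Assign each node up to max_dims coordinates (illustrative sketch)."""
    # Work on a copy so the caller's graph is left untouched.
    residual = {u: dict(nbrs) for u, nbrs in adj.items()}
    coords = {v: [] for v in adj}
    for _ in range(max_dims):
        # Assumed pivot heuristic: pick a far-apart pair with two sweeps.
        start = next(iter(residual))
        d0 = dijkstra(residual, start)
        a = max(d0, key=d0.get)
        da = dijkstra(residual, a)
        b = max(da, key=da.get)
        db = dijkstra(residual, b)
        dab = da[b]
        if dab < eps:
            break  # no residual distance left to embed
        for v in residual:
            coords[v].append((da[v] + dab - db[v]) / 2.0)
        # Assumption: subtract what this coordinate already explains
        # from every edge, so later dimensions see only the remainder.
        for u in residual:
            for v in residual[u]:
                gap = abs(coords[u][-1] - coords[v][-1])
                residual[u][v] = max(0.0, residual[u][v] - gap)
    return coords


def heuristic(coords, u, v):
    """Euclidean distance between the embedded points of u and v."""
    return math.dist(coords[u], coords[v])


# Hypothetical example graph: symmetric adjacency map with edge weights.
example_graph = {
    "a": {"b": 1, "c": 4},
    "b": {"a": 1, "c": 2, "d": 5},
    "c": {"a": 4, "b": 2, "d": 1},
    "d": {"b": 5, "c": 1},
}
example_coords = fastmap_embed(example_graph)
```

    In an A* search toward a goal node `t`, `heuristic(example_coords, n, t)` would play the role of the estimate for node `n`. Under the reweighting assumed here, each edge weight stays non-negative after the per-coordinate differences are subtracted, which is why the embedded distance should not exceed the true shortest-path length.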

    Managing Complex Join Queries in Big Data Management Systems

    No full text
    In addition to storing and managing the data and providing capabilities to query them, a Database Management System (DBMS) tries to achieve performance goals. High resource utilization, high throughput, and low query execution time are a few of the performance goals that are considered for various DBMSs. The system’s success in achieving its performance goals highly depends on the performance of queries and their operators. Many factors can impact a query’s performance, including how much of its resource requirements are satisfied, when it is scheduled for execution, and which other queries will execute concurrently with it. This thesis is an experimental study focusing on resource management and scheduling techniques to assist a database management system in reaching its performance goals. We begin this thesis by exploring the design space for a robust dynamic Hybrid Hash Join operator, one of the main and most common types of memory-intensive database operators. Our variant of this operator is specifically designed to perform well even when the required statistics and information for a Hybrid Hash Join operator are unavailable or inaccurate. Next, we explore various memory management and execution strategies for efficiently executing queries containing multiple join operators. We specifically study variations of Left Deep Trees, Right Deep Trees, and Bushy Trees containing one to eight join operators. We evaluate their performance under different memory availabilities, join and scan selectivities, degrees of parallelism, storage types, and query complexities. Lastly, we study and evaluate the performance of various schedulers designed to schedule queries with highly different memory requirements and execution times in a concurrent environment. Our performance goal is to design a fair scheduler that keeps different classes of queries in admission and resource control queues in proportion to their execution times.