214 research outputs found

    TAPER: query-aware, partition-enhancement for large, heterogenous, graphs

    Full text link
    Graph partitioning has long been seen as a viable approach to address Graph DBMS scalability. A partitioning, however, may introduce extra query processing latency unless it is sensitive to a specific query workload, and optimised to minimise inter-partition traversals for that workload. Additionally, it should also be possible to incrementally adjust the partitioning in reaction to changes in the graph topology, the query workload, or both. Because of their complexity, current partitioning algorithms fall short of one or both of these requirements, as they are designed for offline use and as one-off operations. The TAPER system aims to address both requirements, whilst leveraging existing partitioning algorithms. TAPER takes any given initial partitioning as a starting point, and iteratively adjusts it by swapping chosen vertices across partitions, heuristically reducing the probability of inter-partition traversals for a given pattern matching queries workload. Iterations are inexpensive thanks to time and space optimisations in the underlying support data structures. We evaluate TAPER on two different large test graphs and over realistic query workloads. Our results indicate that, given a hash-based partitioning, TAPER reduces the number of inter-partition traversals by around 80%; given an unweighted METIS partitioning, by around 30%. These reductions are achieved within 8 iterations and with the additional advantage of being workload-aware and usable online.Comment: 12 pages, 11 figures, unpublishe

    Loom: Query-aware Partitioning of Online Graphs

    Full text link
    As with general graph processing systems, partitioning data over a cluster of machines improves the scalability of graph database management systems. However, these systems will incur additional network cost during the execution of a query workload, due to inter-partition traversals. Workload-agnostic partitioning algorithms typically minimise the likelihood of any edge crossing partition boundaries. However, these partitioners are sub-optimal with respect to many workloads, especially queries, which may require more frequent traversal of specific subsets of inter-partition edges. Furthermore, they largely unsuited to operating incrementally on dynamic, growing graphs. We present a new graph partitioning algorithm, Loom, that operates on a stream of graph updates and continuously allocates the new vertices and edges to partitions, taking into account a query workload of graph pattern expressions along with their relative frequencies. First we capture the most common patterns of edge traversals which occur when executing queries. We then compare sub-graphs, which present themselves incrementally in the graph update stream, against these common patterns. Finally we attempt to allocate each match to single partitions, reducing the number of inter-partition edges within frequently traversed sub-graphs and improving average query performance. Loom is extensively evaluated over several large test graphs with realistic query workloads and various orderings of the graph updates. We demonstrate that, given a workload, our prototype produces partitionings of significantly better quality than existing streaming graph partitioning algorithms Fennel and LDG

    Workload-sensitive approaches to improving graph data partitioning online

    Get PDF
    PhD ThesisMany modern applications, from social networks to network security tools, rely upon the graph data model, using it as part of an offline analytics pipeline or, increasingly, for storing and querying data online, e.g. in a graph database management system (GDBMS). Unfortunately, effective horizontal scaling of this graph data reduces to the NP-Hard problem of “k-way balanced graph partitioning”. Owing to the problem’s importance, several practical approaches exist, producing quality graph partitionings. However, these existing systems are unsuitable for partitioning online graphs, either introducing unnecessary network latency during query processing, being unable to efficiently adapt to changing data and query workloads, or both. In this thesis we propose partitioning techniques which are efficient and sensitive to given query workloads, suitable for application to online graphs and query workloads. To incrementally adapt partitionings in response to workload change, we propose TAPER: a graph repartitioner. TAPER uses novel datastructures to compute the probability of expensive inter -partition traversals (ipt) from each vertex, given the current workload of path queries. Subsequently, it iteratively adjusts an initial partitioning by swapping selected vertices amongst partitions, heuristically maintaining low ipt and high partition quality with respect to that workload. Iterations are inexpensive thanks to time and space optimisations in the underlying datastructures. To incrementally create partitionings in response to graph growth, we propose Loom: a streaming graph partitioner. Loom uses another novel datastructure to detect common patterns of edge traversals when executing a given workload of pattern matching queries. Subsequently, it employs a probabilistic graph isomorphism method to incrementally and efficiently compare sub-graphs in the stream of graph updates, to these common patterns. Matches are assigned within individual partitions if possible, thereby also reducing ipt and increasing partitioning quality w.r.t the given workload. - i - Both partitioner and repartitioner are extensively evaluated with real/synthetic graph datasets and query workloads. The headline results include that TAPER can reduce ipt by upto 80% over a naive existing partitioning and can maintain this reduction in the event of workload change, through additional iterations. Meanwhile, Loom reduces ipt by upto 40% over a state of the art streaming graph partitioner

    Efficient Generating And Processing Of Large-Scale Unstructured Meshes

    Get PDF
    Unstructured meshes are used in a variety of disciplines to represent simulations and experimental data. Scientists who want to increase accuracy of simulations by increasing resolution must also increase the size of the resulting dataset. However, generating and processing a extremely large unstructured meshes remains a barrier. Researchers have published many parallel Delaunay triangulation (DT) algorithms, often focusing on partitioning the initial mesh domain, so that each rectangular partition can be triangulated in parallel. However, the comproblems for this method is how to merge all triangulated partitions into a single domain-wide mesh or the significant cost for communication the sub-region borders. We devised a novel algorithm --Triangulation of Independent Partitions in Parallel (TIPP) to deal with very large DT problems without requiring inter-processor communication while still guaranteeing the Delaunay criteria. The core of the algorithm is to find a set of independent} partitions such that the circumcircles of triangles in one partition do not enclose any vertex in other partitions. For this reason, this set of independent partitions can be triangulated in parallel without affecting each other. The results of mesh generation is the large unstructured meshes including vertex index and vertex coordinate files which introduce a new challenge \-- locality. Partitioning unstructured meshes to improve locality is a key part of our own approach. Elements that were widely scattered in the original dataset are grouped together, speeding data access. For further improve unstructured mesh partitioning, we also described our new approach. Direct Load which mitigates the challenges of unstructured meshes by maximizing the proportion of useful data retrieved during each read from disk, which in turn reduces the total number of read operations, boosting performance

    Application Heartbeats for Software Performance and Health

    Get PDF
    Adaptive, or self-aware, computing has been proposed as one method to help application programmers confront the growing complexity of multicore software development. However, existing approaches to adaptive systems are largely ad hoc and often do not manage to incorporate the true performance goals of the applications they are designed to support. This paper presents an enabling technology for adaptive computing systems: Application Heartbeats. The Application Heartbeats framework provides a simple, standard programming interface that applications can use to indicate their performance and system software (and hardware) can use to query an applicationâ s performance. Several experiments demonstrate the simplicity and efficacy of the Application Heartbeat approach. First the PARSEC benchmark suite is instrumented with Application Heartbeats to show the broad applicability of the interface. Then, an adaptive H.264 encoder is developed to show how applications might use Application Heartbeats internally. Next, an external resource scheduler is developed which assigns cores to an application based on its performance as specified with Application Heartbeats. Finally, the adaptive H.264 encoder is used to illustrate how Application Heartbeats can aid fault tolerance

    A User-Level Process Package for Concurrent Computing

    Get PDF
    A lightweight user-level process(ULP) package for parallel computing is described. Each ULP has its own register context, stack, data and heap space and communication with other ULPs is performed using locally synchronous, location transparent, message passing primitives. The aim of the package is to provide support for lightweight over-decomposition, optimized local communication and transparent dynamic migration. The package supports a subset of the Parallel Virtual Machine(PVM) interface[Sun90)

    Cache-and-query for wide area sensor databases

    Get PDF
    • …
    corecore