2,981 research outputs found

    Partitioning Complex Networks via Size-constrained Clustering

    Full text link
    The most commonly used method to tackle the graph partitioning problem in practice is the multilevel approach. During a coarsening phase, a multilevel graph partitioning algorithm reduces the graph size by iteratively contracting nodes and edges until the graph is small enough to be partitioned by some other algorithm. A partition of the input graph is then constructed by successively transferring the solution to the next finer graph and applying a local search algorithm to improve the current solution. In this paper, we describe a novel approach to partition graphs effectively especially if the networks have a highly irregular structure. More precisely, our algorithm provides graph coarsening by iteratively contracting size-constrained clusterings that are computed using a label propagation algorithm. The same algorithm that provides the size-constrained clusterings can also be used during uncoarsening as a fast and simple local search algorithm. Depending on the algorithm's configuration, we are able to compute partitions of very high quality outperforming all competitors, or partitions that are comparable to the best competitor in terms of quality, hMetis, while being nearly an order of magnitude faster on average. The fastest configuration partitions the largest graph available to us with 3.3 billion edges using a single machine in about ten minutes while cutting less than half of the edges than the fastest competitor, kMetis

    Scientific High Performance Computing (HPC) Applications On The Azure Cloud Platform

    Get PDF
    Cloud computing is emerging as a promising platform for compute and data intensive scientific applications. Thanks to the on-demand elastic provisioning capabilities, cloud computing has instigated curiosity among researchers from a wide range of disciplines. However, even though many vendors have rolled out their commercial cloud infrastructures, the service offerings are usually only best-effort based without any performance guarantees. Utilization of these resources will be questionable if it can not meet the performance expectations of deployed applications. Additionally, the lack of the familiar development tools hamper the productivity of eScience developers to write robust scientific high performance computing (HPC) applications. There are no standard frameworks that are currently supported by any large set of vendors offering cloud computing services. Consequently, the application portability among different cloud platforms for scientific applications is hard. Among all clouds, the emerging Azure cloud from Microsoft in particular remains a challenge for HPC program development both due to lack of its support for traditional parallel programming support such as Message Passing Interface (MPI) and map-reduce and due to its evolving application programming interfaces (APIs). We have designed newer frameworks and runtime environments to help HPC application developers by providing them with easy to use tools similar to those known from traditional parallel and distributed computing environment set- ting, such as MPI, for scientific application development on the Azure cloud platform. It is challenging to create an efficient framework for any cloud platform, including the Windows Azure platform, as they are mostly offered to users as a black-box with a set of application programming interfaces (APIs) to access various service components. The primary contributions of this Ph.D. thesis are (i) creating a generic framework for bag-of-tasks HPC applications to serve as the basic building block for application development on the Azure cloud platform, (ii) creating a set of APIs for HPC application development over the Azure cloud platform, which is similar to message passing interface (MPI) from traditional parallel and distributed setting, and (iii) implementing Crayons using the proposed APIs as the first end-to-end parallel scientific application to parallelize the fundamental GIS operations

    OpenACC Based GPU Parallelization of Plane Sweep Algorithm for Geometric Intersection

    Get PDF
    Line segment intersection is one of the elementary operations in computational geometry. Complex problems in Geographic Information Systems (GIS) like finding map overlays or spatial joins using polygonal data require solving segment intersections. Plane sweep paradigm is used for finding geometric intersection in an efficient manner. However, it is difficult to parallelize due to its in-order processing of spatial events. We present a new fine-grained parallel algorithm for geometric intersection and its CPU and GPU implementation using OpenMP and OpenACC. To the best of our knowledge, this is the first work demonstrating an effective parallelization of plane sweep on GPUs. We chose compiler directive based approach for implementation because of its simplicity to parallelize sequential code. Using Nvidia Tesla P100 GPU, our implementation achieves around 40X speedup for line segment intersection problem on 40K and 80K data sets compared to sequential CGAL library

    Decentralized Scheduling for Many-Task Applications in the Hybrid Cloud

    Get PDF
    While Cloud Computing has transformed how we solve many computing tasks, some scientific and many-task applications are not efficiently executed on cloud resources. Decentralized scheduling, as studied in grid computing, can provide a scalable system to organize cloud resources and schedule a variety of work. By measuring simulations of two algorithms, the fully decentralized Organic Grid, and the partially decentralized Air Traffic Controller from IBM, we establish that decentralization is a workable approach, and that there are bottlenecks that can impact partially centralized algorithms. Through measurements in the cloud, we verify that our simulation approach is sound, and assess the variable performance of cloud resources. We propose a scheduler that measures the capabilities of the resources available to execute a task and distributes work dynamically at run time. Our scheduling algorithm is evaluated experimentally, and we show that performance-aware scheduling in a cloud environment can provide improvements in execution time. This provides a framework by which a variety of parameters can be weighed to make job-specific and context-aware scheduling decisions. Our measurements examine the usefulness of benchmarking as a metric used to measure a node\u27s performance, and drive scheduling. Benchmarking provides an advantage over simple queue-based scheduling on distributed systems whose members vary in actual performance, but the NAS benchmark we use does not always correlate perfectly with actual performance. The utilized hardware is examined, as are enforced performance variations, and we observe changes in performance that result in running on a system in which different workers receive different CPU allocations. As we see that performance metrics are useful near the end of the execution of a large job, we create a new metric from historical data of partially completed work, and use that to drive execution time down further. Interdependent task graph work is introduced and described as a next step in improving cloud scheduling. Realistic task graph problems are defined and a scheduling approach is introduced. This dissertation lays the groundwork to expand the types of problems that can be solved efficiently in the cloud environment
    corecore