    ATOP-grid for unified multidimensional adaptation of grid applications.

    Space and time adaptation for parallel applications via data over-partitioning.

    Adaptive resource allocation is a new feature for running parallel applications. It is used to obtain better space and time sharing according to the current workload, to schedule around obstacles through reservation, and to cope with the lack of accurate predictability on heterogeneous resources. Implementing resource adaptation is potentially very expensive if total remapping or partitioning from scratch has to be performed. Popular existing run-time systems include AMPI and Dome. AMPI, which uses huge numbers of threads in each MPI process to implement resource adaptation, suffers from frequent thread switches and loss of cache locality; Dome, an object-based migration environment, suffers from a lack of general language support. When resource adaptation occurs, load balancing techniques are used to allocate the workload fairly across processors, so that each processor takes roughly the same time to execute the processes assigned to it, which yields the best performance and maximizes resource utilization. This thesis proposes a novel approach, Adaptive Time/space sharing via Over-Partitioning (ATOP), to implement resource adaptation with better performance in terms of time overhead. The total workload is represented by a data graph. ATOP performs over-partitioning on the graph to create a certain number of workload pieces, or partitions, while processing the partitions assigned to each processor as one data collection in a single MPI process. Typically, the number of partitions is set equal to the number of processors potentially allocated. This approach is feasible for applications using 2^n processors. In cases where our over-partitioning approach does not perform well, or where non-fitting numbers of resources need to be chosen, ATOP still provides the option to repartition from scratch. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2004 .H36. Source: Masters Abstracts International, Volume: 43-03, page: 0876. Adviser: A. C. Sodan. Thesis (M.Sc.)--University of Windsor (Canada), 2004.

    Time adaptation for parallel applications in unbalanced time sharing environment

    Time adaptation is very significant for parallel jobs running on a centralized or distributed parallel multiprocessor machine. The turnaround time of an individual job depends on the turnaround time of each of its processes. Dynamic load balancing in an unbalanced time-sharing environment helps to distribute the workload equally among the available resources, so that all processes of a single job end at almost the same time, thus minimizing the turnaround time and maximizing resource utilization. In this thesis we propose and implement a library-based approach that lets parallel applications adapt in the time dimension (when running in a time-sharing environment) without changing their space allocation. This approach provides an interface between the application, monitoring information, the job scheduler and a cost model that considers application, system and load-balancing information. This interface allows different adaptation approaches to be bound for synchronous adaptation and semi-static remapping. We also determine the job types for which this approach is suitable, and finally we present results from test runs on a 16-node cluster with synthetic MPI programs and a time adaptation approach, demonstrating the gain from our approach. This work extends the existing ATOP [11] work: we directly use its over-partitioning strategy but, unlike in ATOP, applications can use our adaptation library and adapt dynamically. We also adopt the dynamic directory concept used in SCOJO [8]. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .A74. Source: Masters Abstracts International, Volume: 44-03, page: 1393. Thesis (M.Sc.)--University of Windsor (Canada), 2005.

    BIGhybrid - A Toolkit for Simulating MapReduce on Hybrid Infrastructures

    Cloud computing has increasingly been used as a platform for running large business and data processing applications. Although clouds have become highly popular, when it comes to data processing, the cost of usage is not negligible. Conversely, Desktop Grids have been used by a plethora of projects, taking advantage of the large number of resources provided for free by volunteers. Merging cloud computing and desktop grids into a hybrid infrastructure can provide a feasible low-cost solution for big data analysis. Although frameworks like MapReduce have been conceived to exploit commodity hardware, their use on hybrid infrastructure poses some challenges due to large resource heterogeneity and a high churn rate. This study introduces BIGhybrid, a toolkit to simulate MapReduce on hybrid environments. The main goal is to provide a framework for developers and system designers to address the issues of hybrid MapReduce. In this paper, we describe the framework, which simulates the assembly of two existing middleware systems: BitDew-MapReduce for Desktop Grids and Hadoop-BlobSeer for Cloud Computing. Experimental results included in this work demonstrate the feasibility of our approach.

    Grid-job scheduling with reservations and preemption

    Computational grids make it possible to exploit grid resources across multiple clusters when grid jobs are decomposed into tasks and allocated across clusters. Grid-job tasks are often scheduled in the form of workflows which require synchronization, and advance reservation makes it easy to guarantee predictable resource provisioning for these jobs. However, advance reservation for grid jobs creates roadblocks and fragmentation which adversely affect system utilization and response times for local jobs. We provide a solution which incorporates relaxed reservations and uses a modified version of the standard grid-scheduling algorithm, HEFT, to obtain flexibility in placing reservations for workflow grid jobs. Furthermore, we deploy the relaxed reservation with modified HEFT as an extension of the preemption-based job scheduling framework, the SCOJO-PECT job scheduler. In SCOJO-PECT, relaxed reservations serve the additional purpose of permitting scheduler optimizations which shift the overall schedule forward. In addition, a propagation heuristic is used to alleviate the extension of the workflow job makespan caused by the slack of relaxed reservations. Our solution aims at decreasing the fragmentation caused by grid jobs, so that local jobs and system utilization are not compromised, while grid jobs still have reasonable response times.

    Quality of Service Aware Data Stream Processing for Highly Dynamic and Scalable Applications

    Huge amounts of georeferenced data streams arrive daily at data stream management systems that are deployed to serve highly scalable and dynamic applications. There are innumerable ways in which those loads can be exploited to gain deep insights in various domains. Decision makers require an interactive visualization of such data in the form of maps and dashboards for decision making and strategic planning. Data streams normally exhibit fluctuation and oscillation in arrival rates, as well as skewness. Those are the two predominant factors that greatly impact the overall quality of service. This requires data stream management systems to be attuned to those factors, in addition to the spatial shape of the data, which may exaggerate their negative impact. Current systems do not natively support services with quality guarantees for dynamic scenarios, leaving the handling of those logistics to the user, which is challenging and cumbersome. Three workloads are predominant for any data stream: batch processing, scalable storage and stream processing. In this thesis, we have designed a quality of service aware system, SpatialDSMS, comprising several subsystems that cover those loads and any mixed load that results from intermixing them. Most importantly, we have natively incorporated quality of service optimizations for processing avalanches of georeferenced data streams in highly dynamic application scenarios. This has been achieved transparently on top of the codebases of emerging de facto standard, best-in-class representatives, thus relieving users in the presentation layer from having to reason about those services. Instead, users express their queries with quality goals, and our system optimizer compiles them down into query plans with an embedded quality guarantee and leaves logistic handling to the underlying layers. We have developed standard-compliant prototypes for all the subsystems that constitute SpatialDSMS.

    Introducing distributed dynamic data-intensive (D3) science: Understanding applications and infrastructure

    A common feature across many science and engineering applications is the amount and diversity of data and computation that must be integrated to yield insights. Data sets are growing larger and becoming distributed; and their location, availability and properties are often time-dependent. Collectively, these characteristics give rise to dynamic distributed data-intensive applications. While "static" data applications have received significant attention, the characteristics, requirements, and software systems for the analysis of large volumes of dynamic, distributed data, and data-intensive applications have received relatively less attention. This paper surveys several representative dynamic distributed data-intensive application scenarios, provides a common conceptual framework to understand them, and examines the infrastructure used in support of applications. Comment: 38 pages, 2 figures

    Computational Methods in Science and Engineering : Proceedings of the Workshop SimLabs@KIT, November 29 - 30, 2010, Karlsruhe, Germany

    In this proceedings volume we provide a compilation of article contributions equally covering applications from different research fields and ranging from capacity up to capability computing. Besides classical computing aspects such as parallelization, the focus of these proceedings is on multi-scale approaches and methods for tackling algorithm and data complexity. Practical aspects regarding the usage of the HPC infrastructure and the available tools and software at the SCC are also presented.