835 research outputs found

    Parallelization of implicit finite difference schemes in computational fluid dynamics

    Get PDF
    Implicit finite difference schemes are often the preferred numerical schemes in computational fluid dynamics, requiring less stringent stability bounds than the explicit schemes. Each iteration in an implicit scheme involves global data dependencies in the form of second and higher order recurrences. Efficient parallel implementations of such iterative methods are considerably more difficult and non-intuitive. The parallelization of the implicit schemes that are used for solving the Euler and the thin layer Navier-Stokes equations and that require inversions of large linear systems in the form of block tri-diagonal and/or block penta-diagonal matrices is discussed. Three-dimensional cases are emphasized and schemes that minimize the total execution time are presented. Partitioning and scheduling schemes for alleviating the effects of the global data dependencies are described. An analysis of the communication and the computation aspects of these methods is presented. The effect of the boundary conditions on the parallel schemes is also discussed

    Hybridized Darts Game with Beluga Whale Optimization Strategy for Efficient Task Scheduling with Optimal Load Balancing in Cloud Computing

    Get PDF
    A cloud computing technology permits clients to use hardware and software technology virtually on a subscription basis. The task scheduling process is planned to effectively minimize implementation time and cost while simultaneously increasing resource utilization, and it is one of the most common problems in cloud computing systems. The Nondeterministic Polynomial (NP)-hard optimization problem occurs due to limitations like an insufficient make-span, excessive resource utilization, low implementation costs, and immediate response for scheduling. The task allocation is NP-hard because of the increase in the amount of combinations and computing resources. In this work, a hybrid heuristic optimization technique with load balancing is implemented for optimal task scheduling to increase the performance of service providers in the cloud infrastructure. Thus, the issues that occur in the scheduling process is greatly reduced. The load balancing problem is effectively solved with the help of the proposed task scheduling scheme. The allocation of tasks to the machines based on the workload is done with the help of the proposed Hybridized Darts Game-Based Beluga Whale Optimization Algorithm (HDG-BWOA). The objective functions like higher Cloud Data Center (CDC) resource consumption, increased task assurance ratio, minimized mean reaction time, and reduced energy utilization are considered while allocating the tasks to the virtual machines. This task scheduling approach ensures flexibility among virtual machines, preventing them from overloading or underloading. Also, using this technique, more tasks is efficiently completed within the deadline. The efficacy of the offered arrangement is ensured with the conventional heuristic-based task scheduling approaches in accordance with various evaluation measures

    Assessing general-purpose algorithms to cope with fail-stop and silent errors

    Get PDF
    In this paper, we combine the traditional checkpointing and rollback recovery strategies with verification mechanisms to cope with both fail-stop and silent errors. The objective is to minimize makespan and/or energy consumption.For divisible load applications, we use first-order approximations to find the optimal checkpointing period to minimize execution time, with an additional verification mechanism to detect silent errors before each checkpoint,hence extending the classical formula by Young and Daly for fail-stop errors only. We further extendthe approach to include intermediate verifications, and to consider a bi-criteria problem involving both time and energy(linear combination of execution time and energy consumption). Then, we focus on application workflows whose dependence graph is a linear chain of tasks. Here, we determine the optimal checkpointing and verification locations, with or without intermediate verifications, for the bi-criteria problem. Rather than using a single speed during the whole execution, we further introduce a new execution scenario, which allows for changing the execution speed via dynamic voltage and frequency scaling (DVFS).In this latter scenario, we determine the optimal checkpointing and verification locations, as well as the optimal speed pairs for each task segment between any two consecutive checkpoints.Finally, we conduct an extensive set of simulations to support the theoretical study, and to assess the performanceof each algorithm, showing that the best overall performance is achieved under the most flexible scenariousing intermediate verifications and different speeds

    Revisiting Matrix Product on Master-Worker Platforms

    Get PDF
    This paper is aimed at designing efficient parallel matrix-product algorithms for heterogeneous master-worker platforms. While matrix-product is well-understood for homogeneous 2D-arrays of processors (e.g., Cannon algorithm and ScaLAPACK outer product algorithm), there are three key hypotheses that render our work original and innovative: - Centralized data. We assume that all matrix files originate from, and must be returned to, the master. - Heterogeneous star-shaped platforms. We target fully heterogeneous platforms, where computational resources have different computing powers. - Limited memory. Because we investigate the parallelization of large problems, we cannot assume that full matrix panels can be stored in the worker memories and re-used for subsequent updates (as in ScaLAPACK). We have devised efficient algorithms for resource selection (deciding which workers to enroll) and communication ordering (both for input and result messages), and we report a set of numerical experiments on various platforms at Ecole Normale Superieure de Lyon and the University of Tennessee. However, we point out that in this first version of the report, experiments are limited to homogeneous platforms
    • …
    corecore