Parallelization of implicit finite difference schemes in computational fluid dynamics

Author: Decker Naomi H.
Naik Vijay K.
Nicoules Michel

Implicit finite difference schemes are often the preferred numerical schemes in computational fluid dynamics, requiring less stringent stability bounds than the explicit schemes. Each iteration in an implicit scheme involves global data dependencies in the form of second and higher order recurrences, so efficient parallel implementations of such iterative methods are considerably more difficult and non-intuitive. The parallelization of the implicit schemes that are used for solving the Euler and the thin layer Navier-Stokes equations, and that require inversions of large linear systems in the form of block tri-diagonal and/or block penta-diagonal matrices, is discussed. Three-dimensional cases are emphasized and schemes that minimize the total execution time are presented. Partitioning and scheduling schemes for alleviating the effects of the global data dependencies are described. An analysis of the communication and the computation aspects of these methods is presented. The effect of the boundary conditions on the parallel schemes is also discussed.
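To make the data dependency concrete, here is a minimal sketch of the classical Thomas algorithm for a scalar tridiagonal system. It is a simplification chosen for illustration: the abstract concerns block tri-diagonal and block penta-diagonal systems, and the function name `thomas_solve` and its argument layout are assumptions, not taken from the report. Both sweeps are sequential recurrences in which step i needs the result of the neighbouring step, which is why a naive split of the loop across processors does not work.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a scalar tridiagonal system (illustrative, no pivoting).

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    lower[0] and upper[-1] are ignored. All four lists have length n.
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n

    # Forward elimination: step i depends on step i-1 (a recurrence).
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom

    # Back substitution: step i depends on step i+1 (another recurrence).
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


# Small example whose exact solution is x = [1, 2, 3].
solution = thomas_solve([0.0, -1.0, -1.0], [2.0, 2.0, 2.0],
                        [-1.0, -1.0, 0.0], [0.0, 0.0, 4.0])
```

In a three-dimensional solver many such systems are independent of one another (one per grid line), so partitioning work across lines rather than within a single line is one common way to work around the recurrence.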

NASA Technical Reports Server

Hybridized Darts Game with Beluga Whale Optimization Strategy for Efficient Task Scheduling with Optimal Load Balancing in Cloud Computing

Author: Manish Chhabra et al.
Publication venue: Auricle Global Society of Education and Research
Publication date: 02/11/2023

Cloud computing technology permits clients to use hardware and software technology virtually on a subscription basis. The task scheduling process is planned to effectively minimize implementation time and cost while simultaneously increasing resource utilization, and it is one of the most common problems in cloud computing systems. The Nondeterministic Polynomial (NP)-hard optimization problem arises from constraints such as make-span, resource utilization, implementation cost, and immediate response for scheduling. The task allocation is NP-hard because of the increase in the number of combinations and computing resources. In this work, a hybrid heuristic optimization technique with load balancing is implemented for optimal task scheduling to increase the performance of service providers in the cloud infrastructure. Thus, the issues that occur in the scheduling process are greatly reduced. The load balancing problem is effectively solved with the help of the proposed task scheduling scheme. The allocation of tasks to the machines based on the workload is done with the help of the proposed Hybridized Darts Game-Based Beluga Whale Optimization Algorithm (HDG-BWOA). Objective functions such as higher Cloud Data Center (CDC) resource consumption, increased task assurance ratio, minimized mean reaction time, and reduced energy utilization are considered while allocating the tasks to the virtual machines. This task scheduling approach ensures flexibility among virtual machines, preventing them from overloading or underloading. Also, using this technique, more tasks are efficiently completed within the deadline. The efficacy of the proposed approach is validated against conventional heuristic-based task scheduling approaches in accordance with various evaluation measures.
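For orientation, the sketch below shows a conventional greedy baseline for the same kind of problem: assigning tasks to virtual machines so that the largest load (the make-span) stays small. It is not HDG-BWOA, whose details the abstract does not give. It is the well-known longest-processing-time-first heuristic, and the function name, the task lengths, and the assumption of identical machines are all hypothetical.

```python
import heapq


def greedy_schedule(task_lengths, num_vms):
    """Assign each task to the currently least-loaded VM (LPT heuristic).

    Assumes identical VMs and known task lengths; both are simplifications.
    Returns (assignment, makespan), where assignment[v] lists the task
    lengths placed on VM v.
    """
    # Heap of (current_load, vm_index); the smallest load pops first.
    heap = [(0, vm) for vm in range(num_vms)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_vms)]

    # Longest tasks first, so short tasks can fill the remaining gaps.
    for length in sorted(task_lengths, reverse=True):
        load, vm = heapq.heappop(heap)
        assignment[vm].append(length)
        heapq.heappush(heap, (load + length, vm))

    makespan = max(load for load, _ in heap)
    return assignment, makespan


# Illustrative task lengths (arbitrary units) on three VMs.
assignment, makespan = greedy_schedule([7, 5, 4, 3, 3, 2], 3)
```

A metaheuristic such as the one proposed would search over assignments like these while also scoring the other objectives the abstract lists; this baseline optimizes load balance only.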

International Journal on Recent and Innovation Trends in Computing and Communication

Assessing general-purpose algorithms to cope with fail-stop and silent errors

Author: Benoit Anne
Cavelan Aurélien
Robert Yves
Sun Hongyang
Publication venue: HAL CCSD
Publication date: 01/09/2014

In this paper, we combine the traditional checkpointing and rollback recovery strategies with verification mechanisms to cope with both fail-stop and silent errors. The objective is to minimize makespan and/or energy consumption. For divisible load applications, we use first-order approximations to find the optimal checkpointing period to minimize execution time, with an additional verification mechanism to detect silent errors before each checkpoint, hence extending the classical formula by Young and Daly for fail-stop errors only. We further extend the approach to include intermediate verifications, and to consider a bi-criteria problem involving both time and energy (linear combination of execution time and energy consumption). Then, we focus on application workflows whose dependence graph is a linear chain of tasks. Here, we determine the optimal checkpointing and verification locations, with or without intermediate verifications, for the bi-criteria problem. Rather than using a single speed during the whole execution, we further introduce a new execution scenario, which allows for changing the execution speed via dynamic voltage and frequency scaling (DVFS). In this latter scenario, we determine the optimal checkpointing and verification locations, as well as the optimal speed pairs for each task segment between any two consecutive checkpoints. Finally, we conduct an extensive set of simulations to support the theoretical study, and to assess the performance of each algorithm, showing that the best overall performance is achieved under the most flexible scenario using intermediate verifications and different speeds.
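As background for the classical formula mentioned above, the following sketch computes Young's first-order approximation of the checkpointing period for fail-stop errors only, sqrt(2 · C · μ), where C is the time to take one checkpoint and μ is the mean time between failures. It does not reproduce the paper's extended formula with verification, which the abstract does not state. The function name and the numeric values are assumptions made for illustration.

```python
import math


def young_period(checkpoint_cost, mtbf):
    """First-order optimal checkpoint period (Young's approximation).

    checkpoint_cost: time to write one checkpoint (C), same unit as mtbf.
    mtbf: mean time between fail-stop failures (mu).
    Valid as an approximation when checkpoint_cost is much smaller than mtbf.
    """
    if checkpoint_cost <= 0 or mtbf <= 0:
        raise ValueError("checkpoint_cost and mtbf must be positive")
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


# Hypothetical platform: 60 s checkpoints, one failure per day on average.
period_seconds = young_period(60.0, 86400.0)
```

The square root reflects the trade-off being balanced: checkpointing too often wastes time writing checkpoints, while checkpointing too rarely wastes time re-executing lost work after a failure.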

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

Revisiting Matrix Product on Master-Worker Platforms

Author: Dongarra Jack
Laboratoire de l'informatique du parallélisme
Pineau Jean-François
Robert Yves
Shi Zhiao
Vivien Frédéric
Publication date: 01/01/2006

This paper is aimed at designing efficient parallel matrix-product algorithms for heterogeneous master-worker platforms. While matrix-product is well-understood for homogeneous 2D-arrays of processors (e.g., Cannon algorithm and ScaLAPACK outer product algorithm), there are three key hypotheses that render our work original and innovative:
- Centralized data. We assume that all matrix files originate from, and must be returned to, the master.
- Heterogeneous star-shaped platforms. We target fully heterogeneous platforms, where computational resources have different computing powers.
- Limited memory. Because we investigate the parallelization of large problems, we cannot assume that full matrix panels can be stored in the worker memories and re-used for subsequent updates (as in ScaLAPACK).
We have devised efficient algorithms for resource selection (deciding which workers to enroll) and communication ordering (both for input and result messages), and we report a set of numerical experiments on various platforms at Ecole Normale Superieure de Lyon and the University of Tennessee. However, we point out that in this first version of the report, experiments are limited to homogeneous platforms.
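The setting described above can be pictured with a small sequential simulation. This is only a sketch under stated assumptions, not the authors' algorithm: the master owns all data, each task computes one block of the result, and the simulated worker holds a single result block while receiving one pair of input blocks at a time, which mimics the limited-memory hypothesis. Round-robin assignment stands in for the paper's resource selection, and every name below is hypothetical.

```python
def split_blocks(matrix, q):
    """Cut an n x n matrix (n divisible by q) into a grid of q x q blocks."""
    r = len(matrix) // q
    return [[[row[j * q:(j + 1) * q] for row in matrix[i * q:(i + 1) * q]]
             for j in range(r)] for i in range(r)]


def block_multiply_add(c, a, b):
    """In-place update c += a * b for small square blocks."""
    q = len(c)
    for i in range(q):
        for k in range(q):
            a_ik = a[i][k]
            for j in range(q):
                c[i][j] += a_ik * b[k][j]


def master_worker_product(a, b, q, num_workers):
    """Simulate a master handing out one result block per task.

    Returns (product, schedule); schedule lists (worker, block_row, block_col).
    Workers are simulated one after another; nothing runs in parallel here.
    """
    n = len(a)
    r = n // q
    a_blocks, b_blocks = split_blocks(a, q), split_blocks(b, q)
    product = [[0.0] * n for _ in range(n)]
    schedule = []
    for i in range(r):
        for j in range(r):
            # Placeholder policy: round-robin over the enrolled workers.
            worker = len(schedule) % num_workers
            schedule.append((worker, i, j))
            # The worker keeps only this block of the result in memory.
            c_block = [[0.0] * q for _ in range(q)]
            for k in range(r):
                # One "message" from the master: a pair of input blocks.
                block_multiply_add(c_block, a_blocks[i][k], b_blocks[k][j])
            # Result block is returned to the master.
            for row in range(q):
                product[i * q + row][j * q:(j + 1) * q] = c_block[row]
    return product, schedule


product, schedule = master_worker_product(
    [[1, 2], [3, 4]], [[5, 6], [7, 8]], q=1, num_workers=2)
```

On a heterogeneous platform the interesting decisions are exactly the ones this sketch hard-codes: which workers to enroll and in what order to send blocks and collect results.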

arXiv.org e-Print Archive

HAL-ENS-LYON

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Libre Acces aux Rapports Scientifiques et Techniques

The University of Manchester - Institutional Repository

Hal-Diderot