Search CORE

28 research outputs found

Scalable Adaptive Mantle Convection Simulation on Petascale Supercomputers

Author: Burstedde Carsten
Ghattas Omar
Gurnis Michael
Stadler Georg
Tan Eh
Tu Tiankai
Wilcox Lucas C.
Zhong Shijie
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/11/2008
Field of study

Mantle convection is the principal control on the thermal and geological evolution of the Earth. Mantle convection modeling involves solution of the mass, momentum, and energy equations for a viscous, creeping, incompressible non-Newtonian fluid at high Rayleigh and Peclet numbers. Our goal is to conduct global mantle convection simulations that can resolve faulted plate boundaries, down to 1 km scales. However, uniform resolution at these scales would result in meshes with a trillion elements, which would elude even sustained petaflops supercomputers. Thus parallel adaptive mesh refinement and coarsening (AMR) is essential. We present RHEA, a new generation mantle convection code designed to scale to hundreds of thousands of cores. RHEA is built on ALPS, a parallel octree-based adaptive mesh finite element library that provides new distributed data structures and parallel algorithms for dynamic coarsening, refinement, rebalancing, and repartitioning of the mesh. ALPS currently supports low order continuous Lagrange elements, and arbitrary order discontinuous Galerkin spectral elements, on octree meshes. A forest-ofoctrees implementation permits nearly arbitrary geometries to be accommodated. Using TACC’s 579 teraflops Ranger supercomputer, we demonstrate excellent weak and strong scalability of parallel AMR on up to 62,464 cores for problems with up to 12.4 billion elements. With RHEA’s adaptive capabilities, we have been able to reduce the number of elements by over three orders of magnitude, thus enabling us to simulate large-scale mantle convection with finest local resolution of 1.5 km

Crossref

Caltech Authors

Tails in the cloud: a survey and taxonomy of straggler management within large-scale cloud data centres

Author: Garraghan P
Gill SS
Ouyang X
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 12/03/2020
Field of study

Cloud computing systems are splitting compute- and data-intensive jobs into smaller tasks to execute them in a parallel manner using clusters to improve execution time. However, such systems at increasing scale are exposed to stragglers, whereby abnormally slow running tasks executing within a job substantially affect job performance completion. Such stragglers are a direct threat towards attaining fast execution of data-intensive jobs within cloud computing. Researchers have proposed an assortment of different mechanisms, frameworks, and management techniques to detect and mitigate stragglers both proactively and reactively. In this paper, we present a comprehensive review of straggler management techniques within large-scale cloud data centres. We provide a detailed taxonomy of straggler causes, as well as proposed management and mitigation techniques based on straggler characteristics and properties. From this systematic review, we outline several outstanding challenges and potential directions of possible future work for straggler research

Queen Mary Research Online

Lancaster E-Prints

Parallel Programming with Migratable Objects: Charm++ in Practice

Author: Acun Bilge
Gupta Abhishek
Jain Nikhil
Kale Laxmikant
Langer Akhil
Menon Harshitha
Mikida Eric
Ni Xiang
Robson Michael
Sun Yanhua
Totoni Ehsan
Wesolowski Lukasz
Publication venue: Smith ScholarWorks
Publication date: 16/01/2014
Field of study

The advent of petascale computing has introduced new challenges (e.g. Heterogeneity, system failure) for programming scalable parallel applications. Increased complexity and dynamism in science and engineering applications of today have further exacerbated the situation. Addressing these challenges requires more emphasis on concepts that were previously of secondary importance, including migratability, adaptivity, and runtime system introspection. In this paper, we leverage our experience with these concepts to demonstrate their applicability and efficacy for real world applications. Using the CHARM++ parallel programming framework, we present details on how these concepts can lead to development of applications that scale irrespective of the rough landscape of supercomputing technology. Empirical evaluation presented in this paper spans many miniapplications and real applications executed on modern supercomputers including Blue Gene/Q, Cray XE6, and Stampede

Smith College: Smith ScholarWorks

Task Scheduling in Big Data Platforms: A Systematic Literature Review

Author: Khomh Foutse
Soualhia Mbarka
Tahar Sofiène
Publication venue: 'Elsevier BV'
Publication date: 01/01/2017
Field of study

Context: Hadoop, Spark, Storm, and Mesos are very well known frameworks in both research and industrial communities that allow expressing and processing distributed computations on massive amounts of data. Multiple scheduling algorithms have been proposed to ensure that short interactive jobs, large batch jobs, and guaranteed-capacity production jobs running on these frameworks can deliver results quickly while maintaining a high throughput. However, only a few works have examined the effectiveness of these algorithms. Objective: The Evidence-based Software Engineering (EBSE) paradigm and its core tool, i.e., the Systematic Literature Review (SLR), have been introduced to the Software Engineering community in 2004 to help researchers systematically and objectively gather and aggregate research evidences about different topics. In this paper, we conduct a SLR of task scheduling algorithms that have been proposed for big data platforms. Method: We analyse the design decisions of different scheduling models proposed in the literature for Hadoop, Spark, Storm, and Mesos over the period between 2005 and 2016. We provide a research taxonomy for succinct classification of these scheduling models. We also compare the algorithms in terms of performance, resources utilization, and failure recovery mechanisms. Results: Our searches identifies 586 studies from journals, conferences and workshops having the highest quality in this field. This SLR reports about different types of scheduling models (dynamic, constrained, and adaptive) and the main motivations behind them (including data locality, workload balancing, resources utilization, and energy efficiency). A discussion of some open issues and future challenges pertaining to improving the current studies is provided

Crossref

Concordia University Research Repository

PolyPublie

Recommended from our members

Computing resources sensitive parallelization of neural neworks for large scale diabetes data modelling, diagnosis and prediction

Author: Qi Hao
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2011
Field of study

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.Diabetes has become one of the most severe deceases due to an increasing number of diabetes patients globally. A large amount of digital data on diabetes has been collected through various channels. How to utilize these data sets to help doctors to make a decision on diagnosis, treatment and prediction of diabetic patients poses many challenges to the research community. The thesis investigates mathematical models with a focus on neural networks for large scale diabetes data modelling and analysis by utilizing modern computing technologies such as grid computing and cloud computing. These computing technologies provide users with an inexpensive way to have access to extensive computing resources over the Internet for solving data and computationally intensive problems. This thesis evaluates the performance of seven representative machine learning techniques in classification of diabetes data and the results show that neural network produces the best accuracy in classification but incurs high overhead in data training. As a result, the thesis develops MRNN, a parallel neural network model based on the MapReduce programming model which has become an enabling technology in support of data intensive applications in the clouds. By partitioning the diabetic data set into a number of equally sized data blocks, the workload in training is distributed among a number of computing nodes for speedup in data training. MRNN is first evaluated in small scale experimental environments using 12 mappers and subsequently is evaluated in large scale simulated environments using up to 1000 mappers. Both the experimental and simulations results have shown the effectiveness of MRNN in classification, and its high scalability in data training. MapReduce does not have a sophisticated job scheduling scheme for heterogonous computing environments in which the computing nodes may have varied computing capabilities. For this purpose, this thesis develops a load balancing scheme based on genetic algorithms with an aim to balance the training workload among heterogeneous computing nodes. The nodes with more computing capacities will receive more MapReduce jobs for execution. Divisible load theory is employed to guide the evolutionary process of the genetic algorithm with an aim to achieve fast convergence. The proposed load balancing scheme is evaluated in large scale simulated MapReduce environments with varied levels of heterogeneity using different sizes of data sets. All the results show that the genetic algorithm based load balancing scheme significantly reduce the makespan in job execution in comparison with the time consumed without load balancing.This work is funded by the EPSRC and China Market Association

Brunel University Research Archive

Hypersonic flows around complex geometries with adaptive mesh refinement and immersed boundary method

Author: Patel Monal
Publication venue: Mechanical Engineering, Imperial College London
Publication date: 01/03/2023
Field of study

This thesis develops and validates a computational fluid dynamics numerical method for hypersonic flows; and uses it to conduct two novel investigations. The numerical method involves a novel combination of structured adaptive mesh refinement, ghost-point immersed boundary and artificial dissipation shock-stable Euler flux discretisation. The method is high-order, low dissipation and stable up to Mach numbers

M \lesssim 30

with stationary or moving complex geometries; it is shown to be suitable for direct numerical simulations of laminar and turbulent flows. The method's performance is assessed through various test cases. Firstly, heat transfer to proximal cylinders in hypersonic flow is investigated to improve understanding of destructive atmospheric entries of meteors, satellites and spacecraft components. Binary bodies and clusters with five bodies are considered. With binary proximal bodies, the heat load and peak heat transfer are augmented for either or both proximal bodies by

+20\%

-90\%

of an isolated body. Whereas with five bodies, the cluster-averaged heat load varied between

+20\%

-60\%

of an isolated body. Generally, clusters which are thin in the direction perpendicular to free-stream velocity and long in the direction parallel to the free-stream velocity have their heat load reduced. In contrast, clusters which are thick and thin in directions perpendicular and parallel to the free-stream velocity feel an increased heat load. Secondly, hypersonic ablation patterns are investigated. Ablation patterns form on spacecraft thermal protection systems and meteor surfaces, where their development and interactions with the boundary layer are poorly understood. Initially, a simple subliming sphere case without solid conduction in hypersonic laminar flow is used to validate the numerical method. Where the surface recession is artificially sped-up via the wall Damk\"{o}hler number without introducing significant errors in the shape change. Then, a case with transitional inflow over a backward facing step with a subliming boundary is devised. Differential ablation is observed to generate surface roughness and add vorticity to the boundary layer. A maximum surface recession of

\sim 0.8\times

and a maximum surface fluctuation of

\sim 0.2\times

the inflow boundary layer thickness were generated over two flow times.Open Acces

Spiral - Imperial College Digital Repository

Recommended from our members

Hadoop performance modeling and job optimization for big data analytics

Author: Khan Mukhtaj
Publication venue: Brunel University London
Publication date: 01/01/2015
Field of study

This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University LondonBig data has received a momentum from both academia and industry. The MapReduce model has emerged into a major computing model in support of big data analytics. Hadoop, which is an open source implementation of the MapReduce model, has been widely taken up by the community. Cloud service providers such as Amazon EC2 cloud have now supported Hadoop user applications. However, a key challenge is that the cloud service providers do not a have resource provisioning mechanism to satisfy user jobs with deadline requirements. Currently, it is solely the user responsibility to estimate the require amount of resources for their job running in a public cloud. This thesis presents a Hadoop performance model that accurately estimates the execution duration of a job and further provisions the required amount of resources for a job to be completed within a deadline. The proposed model employs Locally Weighted Linear Regression (LWLR) model to estimate execution time of a job and Lagrange Multiplier technique for resource provisioning to satisfy user job with a given deadline. The performance of the propose model is extensively evaluated in both in-house Hadoop cluster and Amazon EC2 Cloud. Experimental results show that the proposed model is highly accurate in job execution estimation and jobs are completed within the required deadlines following on the resource provisioning scheme of the proposed model. In addition, the Hadoop framework has over 190 configuration parameters and some of them have significant effects on the performance of a Hadoop job. Manually setting the optimum values for these parameters is a challenging task and also a time consuming process. This thesis presents optimization works that enhances the performance of Hadoop by automatically tuning its parameter values. It employs Gene Expression Programming (GEP) technique to build an objective function that represents the performance of a job and the correlation among the configuration parameters. For the purpose of optimization, Particle Swarm Optimization (PSO) is employed to find automatically an optimal or a near optimal configuration settings. The performance of the proposed work is intensively evaluated on a Hadoop cluster and the experimental results show that the proposed work enhances the performance of Hadoop significantly compared with the default settings.Abdul Wali Khan University Marda

Brunel University Research Archive

Recommended from our members

A Component Architecture for High-Performance Scientific Computing

Author: Allan B. A.
Armstrong R.
Bernholdt D. E.
Bertrand F.
Chiu K.
Dahlgren T. L.
Damevski K.
Elwasif W. R.
Epperly T. W.
Govindaraju M.
Katz D. S.
Kohl J. A.
Krishnan M.
Kumfert G.
Larson J. W.
Lefantzi S.
Lewis M. J.
Malony A. D.
McInnes L. C.
Nieplocha J.
Norris B.
Parker S. G.
Ray J.
Shende S.
Windus T. L.
Zhou S.
Publication venue: Lawrence Livermore National Laboratory
Publication date: 14/12/2004
Field of study

The Common Component Architecture (CCA) provides a means for software developers to manage the complexity of large-scale scientific simulations and to move toward a plug-and-play environment for high-performance computing. In the scientific computing context, component models also promote collaboration using independently developed software, thereby allowing particular individuals or groups to focus on the aspects of greatest interest to them. The CCA supports parallel and distributed computing as well as local high-performance connections between components in a language-independent manner. The design places minimal requirements on components and thus facilitates the integration of existing code into the CCA environment. The CCA model imposes minimal overhead to minimize the impact on application performance. The focus on high performance distinguishes the CCA from most other component models. The CCA is being applied within an increasing range of disciplines, including combustion research, global climate simulation, and computational chemistry

UNT Digital Library

Computing resources sensitive parallelization of neural neworks for large scale diabetes data modelling, diagnosis and prediction

Author: Li M
Qi Hao
Publication venue
Publication date: 01/01/2011
Field of study

Diabetes has become one of the most severe deceases due to an increasing number of diabetes patients globally. A large amount of digital data on diabetes has been collected through various channels. How to utilize these data sets to help doctors to make a decision on diagnosis, treatment and prediction of diabetic patients poses many challenges to the research community. The thesis investigates mathematical models with a focus on neural networks for large scale diabetes data modelling and analysis by utilizing modern computing technologies such as grid computing and cloud computing. These computing technologies provide users with an inexpensive way to have access to extensive computing resources over the Internet for solving data and computationally intensive problems. This thesis evaluates the performance of seven representative machine learning techniques in classification of diabetes data and the results show that neural network produces the best accuracy in classification but incurs high overhead in data training. As a result, the thesis develops MRNN, a parallel neural network model based on the MapReduce programming model which has become an enabling technology in support of data intensive applications in the clouds. By partitioning the diabetic data set into a number of equally sized data blocks, the workload in training is distributed among a number of computing nodes for speedup in data training. MRNN is first evaluated in small scale experimental environments using 12 mappers and subsequently is evaluated in large scale simulated environments using up to 1000 mappers. Both the experimental and simulations results have shown the effectiveness of MRNN in classification, and its high scalability in data training. MapReduce does not have a sophisticated job scheduling scheme for heterogonous computing environments in which the computing nodes may have varied computing capabilities. For this purpose, this thesis develops a load balancing scheme based on genetic algorithms with an aim to balance the training workload among heterogeneous computing nodes. The nodes with more computing capacities will receive more MapReduce jobs for execution. Divisible load theory is employed to guide the evolutionary process of the genetic algorithm with an aim to achieve fast convergence. The proposed load balancing scheme is evaluated in large scale simulated MapReduce environments with varied levels of heterogeneity using different sizes of data sets. All the results show that the genetic algorithm based load balancing scheme significantly reduce the makespan in job execution in comparison with the time consumed without load balancing.EThOS - Electronic Theses Online ServiceEPSRCChina Market AssociationGBUnited Kingdo

OpenGrey Repository

Heuristics for periodical batch job scheduling in a MapReduce computing framework

Author: Ruiz García Rubén
Tianze Jiang
Xiaoping Li
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016
Field of study

Task scheduling has a significant impact on the performance of the MapReduce computing framework. In this paper, a scheduling problem of periodical batch jobs with makespan minimization is considered. The problem is modeled as a general two-stage hybrid flow shop scheduling problem with schedule-dependent setup times. The new model incorporates the data locality of tasks and is formulated as an integer program. Three heuristics are developed to solve the problem and an improvement policy based on data locality is presented to enhance the methods. A lower bound of the makespan is derived. 150 instances are randomly generated from data distributions drawn from a real cluster. The parameters involved in the methods are set according to different cluster setups. The proposed heuristics are compared over different numbers of jobs and cluster setups. Computational results show that the performance of the methods is highly dependent on both the number of jobs and the cluster setups. The proposed improvement policy is effective and the impact of the input data distribution on the policy is analyzed and tested.This work is supported by the National Natural Science Foundation of China (No. 61272377) and the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20120092110027). Ruben Ruiz is partially supported by the Spanish Ministry of Economy and Competitiveness, under the project "RESULT - Realistic Extended Scheduling Using Light Techniques" (No. DPI2012-36243-C02-01) partially financed with FEDER funds.Xiaoping Li; Tianze Jiang; Ruiz García, R. (2016). Heuristics for periodical batch job scheduling in a MapReduce computing framework. Information Sciences. 326:119-133. https://doi.org/10.1016/j.ins.2015.07.040S11913332

Crossref

RiuNet