Search CORE

10,623 research outputs found

Model Coupling between the Weather Research and Forecasting Model and the DPRI Large Eddy Simulator for Urban Flows on GPU-accelerated Multicore Systems

Author: Vanderbauwhede Wim
Publication venue
Publication date: 01/01/2015
Field of study

In this report we present a novel approach to model coupling for shared-memory multicore systems hosting OpenCL-compliant accelerators, which we call The Glasgow Model Coupling Framework (GMCF). We discuss the implementation of a prototype of GMCF and its application to coupling the Weather Research and Forecasting Model and an OpenCL-accelerated version of the Large Eddy Simulator for Urban Flows (LES) developed at DPRI. The first stage of this work concerned the OpenCL port of the LES. The methodology used for the OpenCL port is a combination of automated analysis and code generation and rule-based manual parallelization. For the evaluation, the non-OpenCL LES code was compiled using gfortran, fort and pgfortran}, in each case with auto-parallelization and auto-vectorization. The OpenCL-accelerated version of the LES achieves a 7 times speed-up on a NVIDIA GeForce GTX 480 GPGPU, compared to the fastest possible compilation of the original code running on a 12-core Intel Xeon E5-2640. In the second stage of this work, we built the Glasgow Model Coupling Framework and successfully used it to couple an OpenMP-parallelized WRF instance with an OpenCL LES instance which runs the LES code on the GPGPI. The system requires only very minimal changes to the original code. The report discusses the rationale, aims, approach and implementation details of this work.Comment: This work was conducted during a research visit at the Disaster Prevention Research Institute of Kyoto University, supported by an EPSRC Overseas Travel Grant, EP/L026201/

arXiv.org e-Print Archive

Enlighten

The finite element machine: An experiment in parallel processing

Author: Adams L.
Crockett T. W.
Knott J. D.
Peebles S. W.
Storaasli O. O.
Publication venue
Publication date
Field of study

The finite element machine is a prototype computer designed to support parallel solutions to structural analysis problems. The hardware architecture and support software for the machine, initial solution algorithms and test applications, and preliminary results are described

NASA Technical Reports Server

Solution of partial differential equations on vector and parallel computers

Author: Ortega J. M.
Voigt R. G.
Publication venue
Publication date
Field of study

The present status of numerical methods for partial differential equations on vector and parallel computers was reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed

NASA Technical Reports Server

Towards developing robust algorithms for solving partial differential equations on MIMD machines

Author: Naik V. K.
Saltz J. H.
Publication venue
Publication date
Field of study

Methods for efficient computation of numerical algorithms on a wide variety of MIMD machines are proposed. These techniques reorganize the data dependency patterns to improve the processor utilization. The model problem finds the time-accurate solution to a parabolic partial differential equation discretized in space and implicitly marched forward in time. The algorithms are extensions of Jacobi and SOR. The extensions consist of iterating over a window of several timesteps, allowing efficient overlap of computation with communication. The methods increase the degree to which work can be performed while data are communicated between processors. The effect of the window size and of domain partitioning on the system performance is examined both by implementing the algorithm on a simulated multiprocessor system

NASA Technical Reports Server

Parallel performance prediction for multigrid codes on distributed memory architectures

Author: Jimack P.K.
Romanazzi G
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007
Field of study

We propose a model for describing the parallel performance of multigrid software on distributed memory architectures. The goal of the model is to allow reliable predictions to be made as to the execution time of a given code on a large number of processors, of a given parallel system, by only benchmarking the code on small numbers of processors. This has potential applications for the scheduling of jobs in a Grid computing environment where reliable predictions as to execution times on different systems will be valuable. The model is tested for two different multigrid codes running on two different parallel architectures and the results obtained are discussed

CiteSeerX

White Rose Research Online

Parallel Image Reconstruction Systems for Magnetic Induction Tomography

Author: Yasheng Maimaitijiang
Publication venue
Publication date: 02/03/2009
Field of study

University of South Wales Research Explorer

An ADMM Based Framework for AutoML Pipeline Configuration

Author: Bouneffouf Djallel
Bramble Gregory
Conn Andrew
Gray Alexander
Liu Sijia
Ram Parikshit
Samulowitz Horst
Vijaykeerthy Deepak
Wang Dakuo
Publication venue
Publication date: 06/12/2019
Field of study

We study the AutoML problem of automatically configuring machine learning pipelines by jointly selecting algorithms and their appropriate hyper-parameters for all steps in supervised learning pipelines. This black-box (gradient-free) optimization with mixed integer & continuous variables is a challenging problem. We propose a novel AutoML scheme by leveraging the alternating direction method of multipliers (ADMM). The proposed framework is able to (i) decompose the optimization problem into easier sub-problems that have a reduced number of variables and circumvent the challenge of mixed variable categories, and (ii) incorporate black-box constraints along-side the black-box optimization objective. We empirically evaluate the flexibility (in utilizing existing AutoML techniques), effectiveness (against open source AutoML toolkits),and unique capability (of executing AutoML with practically motivated black-box constraints) of our proposed scheme on a collection of binary classification data sets from UCI ML& OpenML repositories. We observe that on an average our framework provides significant gains in comparison to other AutoML frameworks (Auto-sklearn & TPOT), highlighting the practical advantages of this framework

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications