1,464 research outputs found

State-of-the-Art in Parallel Computing with R

Author: Eddelbuettel Dirk
Mansmann Ulrich
Morgan Martin
Schmidberger Markus
Tierney Luke
Yu Hao
Publication date: 01/01/2009

R is a mature open-source programming language for statistical computing and graphics. Many areas of statistical research are experiencing rapid growth in the size of data sets. Methodological advances drive increased use of simulations. A common approach is to use parallel computing. This paper presents an overview of techniques for parallel computing with R on computer clusters, on multi-core systems, and in grid computing. It reviews sixteen different packages, comparing them on their state of development, the parallel technology used, as well as on usability, acceptance, and performance. Two packages (snow, Rmpi) stand out as particularly useful for general use on computer clusters. Packages for grid computing are still in development, with only one package currently available to the end user. For multi-core systems four different packages exist, but a number of issues pose challenges to early adopters. The paper concludes with ideas for further developments in high performance computing with R. Example code is available in the appendix.

Crossref

Directory of Open Access Journals

Open Access LMU

Journal of Statistical Software

A First Step Towards Automatically Building Network Representations

Author: A. Legrand
D. Tsafrir
E. Caron
I. Foster
J.W. Byers
M. Burger den
P.K. Chouhan
R. Wolski
T. Kielmann
T. Ng
Publication date: 01/01/2007

To fully harness Grids, users or middleware must have some knowledge of the topology of the platform interconnection network. As such knowledge is usually not available, one must use tools that automatically build a topological network model from measurements. In this article, we define a methodology to assess the quality of these network model building tools, and we apply this methodology to representatives of the main classes of model builders and to two new algorithms. We show that none of the main existing techniques builds models that make it possible to accurately predict the running time of simple application kernels on actual platforms. However, some of the new algorithms we propose give excellent results in a wide range of situations.

arXiv.org e-Print Archive

HAL-ENS-LYON

CiteSeerX

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

HAL-Rennes 1

A Taxonomy of Workflow Management Systems for Grid Computing

Author: Buyya Rajkumar
Yu Jia
Publication date: 01/01/2005

With the advent of Grid and application technologies, scientists and engineers are building more and more complex applications to manage and process large data sets, and execute scientific experiments on distributed resources. Such application scenarios require means for composing and executing complex workflows. Therefore, many efforts have been made towards the development of workflow management systems for Grid computing. In this paper, we propose a taxonomy that characterizes and classifies various approaches for building and executing workflows on Grids. We also survey several representative Grid workflow systems developed by various projects world-wide to demonstrate the comprehensiveness of the taxonomy. The taxonomy not only highlights the design and engineering similarities and differences of the state of the art in Grid workflow systems, but also identifies the areas that need further research. Comment: 29 pages, 15 figures.

arXiv.org e-Print Archive

CiteSeerX

A high resolution coupled hydrologic–hydraulic model (HiResFlood-UCI) for flash flood modeling

Author: AghaKouchak A
Cui Z
Hsu K
Koren V
Nguyen P
Sanders B
Smith M
Sorooshian S
Thorstensen A
Publication venue: eScholarship, University of California
Publication date: 01/10/2016

HiResFlood-UCI was developed by coupling the NWS's hydrologic model (HL-RDHM) with the hydraulic model (BreZo) for flash flood modeling at decameter resolutions. The coupled model uses HL-RDHM as a rainfall-runoff generator and replaces the routing scheme of HL-RDHM with the 2D hydraulic model (BreZo) in order to predict localized flood depths and velocities. A semi-automated technique of unstructured mesh generation was developed to cluster an adequate density of computational cells along river channels such that numerical errors are negligible compared with other sources of error, while ensuring that computational costs of the hydraulic model are kept to a bare minimum. HiResFlood-UCI was implemented for a watershed (ELDO2) in the DMIP2 experiment domain in Oklahoma. Using synthetic precipitation input, the model was tested for various components including HL-RDHM parameters (a priori versus calibrated), channel and floodplain Manning n values, DEM resolution (10 m versus 30 m) and computation mesh resolution (10 m+ versus 30 m+). Simulations with calibrated versus a priori parameters of HL-RDHM show that HiResFlood-UCI produces reasonable results with the a priori parameters from NWS. Sensitivities to hydraulic model resistance parameters, mesh resolution and DEM resolution are also identified, pointing to the importance of model calibration and validation for accurate prediction of localized flood intensities. HiResFlood-UCI performance was examined using 6 measured precipitation events as model input for model calibration and validation of the streamflow at the outlet. The Nash–Sutcliffe Efficiency (NSE) obtained ranges from 0.588 to 0.905. The model was also validated for the flooded map using USGS observed water level at an interior point. The predicted flood stage error is 0.82 m or less, based on a comparison to measured stage. 
Validation of stage and discharge predictions builds confidence in model predictions of flood extent and localized velocities, which are fundamental to reliable flash flood warning.
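
The Nash–Sutcliffe Efficiency reported in this abstract is conventionally defined as one minus the ratio of the squared simulation error to the variance of the observations about their mean, so 1 indicates a perfect match and 0 indicates a model no better than predicting the observed mean. The following is a minimal sketch of that standard definition, not code from the paper; the function name and the sample values are illustrative assumptions.

```python
from statistics import fmean


def nash_sutcliffe_efficiency(observed, simulated):
    """Return NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    Hypothetical helper for illustration; not taken from HiResFlood-UCI.
    """
    if len(observed) == 0 or len(observed) != len(simulated):
        raise ValueError("series must be non-empty and of equal length")

    mean_obs = fmean(observed)
    # Squared error between the simulated and observed series.
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Spread of the observations around their own mean.
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined when all observations are equal")
    return 1.0 - residual / spread


if __name__ == "__main__":
    # Made-up streamflow values, for demonstration only.
    obs = [1.0, 2.0, 3.0]
    print(nash_sutcliffe_efficiency(obs, obs))              # perfect match
    print(nash_sutcliffe_efficiency(obs, [2.0, 2.0, 2.0]))  # mean-only prediction
```

Under this definition, the reported range of 0.588 to 0.905 means the simulated streamflow explained most of the observed variance in each event.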

eScholarship - University of California

Transferring big data across the globe

Author: Villa Adam H
Publication venue: University of New Hampshire Scholars' Repository
Publication date: 01/01/2012

Transmitting data via the Internet is a routine and common task for users today. The amount of data being transmitted by the average user has dramatically increased over the past few years. Transferring a gigabyte of data in an entire day was once normal; however, users are now transmitting multiple gigabytes in a single hour. With the influx of big data and massive scientific data sets that are measured in tens of petabytes, a user has the propensity to transfer even larger amounts of data. When transferring data sets of this magnitude on public or shared networks, the performance of all workloads in the system will be impacted. This dissertation addresses the issues and challenges inherent in transferring big data over shared networks. A survey of current transfer techniques is provided and these techniques are evaluated in simulated, experimental and live environments. The main contribution of this dissertation is the development of a new, nice model for big data transfers, which is based on a store-and-forward methodology instead of an end-to-end approach. This nice model ensures that big data transfers only occur when there is idle bandwidth that can be repurposed for these large transfers. The nice model improves overall performance and significantly reduces the transmission time for big data transfers. The model allows for efficient transfers regardless of time zone differences or variations in bandwidth between sender and receiver. Nice is the first model that addresses the challenges of transferring big data across the globe.

UNH Scholars' Repository

Resource and Application Models for Advanced Grid Schedulers

Author: Lazarevic A.
Sacks L.
Publication date: 01/09/2003

As Grid computing is becoming an inevitable future, managing, scheduling and monitoring dynamic, heterogeneous resources will present new challenges. Solutions will have to be agile and adaptive, support self-organization and autonomous management, while maintaining optimal resource utilisation. Presented in this paper are basic principles and architectural concepts for efficient resource allocation in heterogeneous Grid environments.

UCL Discovery

Analytical Performance Models of Parallel Programs in Clusters

Author: Blanco Vicente
Boullón Marcos
Cabaleiro José Carlos
Martínez Diego R.
Pena Tomás F.
Publication venue: John von Neumann Institute for Computing
Publication date: 01/01/2007

Juelich Shared Electronic Resources