    Developing High Performance Computing Resources for Teaching Cluster and Grid Computing courses

    High-Performance Computing (HPC) and the ability to process large amounts of data are of paramount importance for UK business and the economy, as outlined by Rt Hon David Willetts MP at the HPC and Big Data conference in February 2014. However, there is a shortage of skills and available training in HPC to prepare and expand the workforce for HPC and Big Data research and development. Currently, HPC skills are acquired mainly by students and staff taking part in HPC-related research projects, MSc courses, and dedicated training centres such as Edinburgh University's EPCC. Few UK universities teach HPC, Cluster and Grid Computing courses at the undergraduate level. To address the skills shortage in HPC, it is essential to provide teaching and training as part of both postgraduate and undergraduate courses. The design and development of such courses is challenging, since the technologies and software in the field of large-scale distributed systems, such as Cluster, Cloud and Grid computing, are undergoing continuous change. Students completing HPC courses should be proficient in these evolving technologies and equipped with the practical and theoretical skills needed for future jobs in this fast-developing area. In this paper we present our experience in developing HPC, Cluster and Grid modules, including a review of existing HPC courses offered at UK universities. The topics covered in the modules are described, as well as the coursework projects based on practical laboratory work. We conclude with an evaluation based on our experience over the last ten years of developing and delivering HPC modules on undergraduate courses, together with suggestions for future work.

    State of the Art in Parallel Computing with R

    R is a mature open-source programming language for statistical computing and graphics. Many areas of statistical research are experiencing rapid growth in the size of data sets, and methodological advances drive increased use of simulations; a common approach is to use parallel computing. This paper presents an overview of techniques for parallel computing with R on computer clusters, on multi-core systems, and in grid computing. It reviews sixteen different packages, comparing their state of development, the parallel technology used, usability, acceptance, and performance. Two packages (snow, Rmpi) stand out as particularly suited to general use on computer clusters. Packages for grid computing are still in development, with only one package currently available to the end user. For multi-core systems, five different packages exist, but a number of issues pose challenges to early adopters. The paper concludes with ideas for further developments in high performance computing with R. Example code is available in the appendix.
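
    The snow package named above supports exactly this cluster-style parallelism. The snippet below is a minimal hedged sketch; the four-worker socket setup and the toy simulation task are illustrative assumptions, not the paper's appendix code.

        # Distribute 1,000 toy simulation replicates over four local
        # socket workers; worker count and task are illustrative.
        library(snow)

        cl <- makeCluster(4, type = "SOCK")    # launch four worker R processes
        res <- parLapply(cl, 1:1000,           # farm the replicates out
                         function(i) mean(rnorm(1e4)))
        stopCluster(cl)                        # release the workers

        summary(unlist(res))

    The same parLapply call runs unchanged across multiple machines: snow's makeCluster also accepts a character vector of hostnames for SOCK clusters.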

    East Lancashire Research 2007


    Building a High Performance Cluster through Computer Reuse

    The goal of this MQP (Major Qualifying Project) was to use outdated commodity PCs to build a cluster computer capable of supporting computationally intensive research. Traditionally, PCs are recycled after being taken out of use during a refresh cycle. Using open-source software and minimal hardware modifications, these PCs can instead be configured into a cluster that approaches the performance of state-of-the-art cluster computers. In this paper, we present the system setup, configuration, and validation of the WOPPR cluster computer.
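
    As a sketch of the kind of validation step mentioned above, the snippet below uses the Rmpi package to confirm that every node in a small MPI cluster responds. This is a hypothetical check; the original project may have validated the WOPPR cluster with entirely different tools.

        # Hypothetical health check: spawn one R worker per available MPI
        # slot and ask each worker which host it is running on.
        library(Rmpi)

        n <- mpi.universe.size() - 1     # keep one slot for the master
        mpi.spawn.Rslaves(nslaves = n)

        # A hostname missing from the output points at a misconfigured node.
        print(mpi.remote.exec(Sys.info()[["nodename"]]))

        mpi.close.Rslaves()
        mpi.quit()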

    Parallelizing the Cluster Rank Analysis application

    A wide range of researchers is beginning to use customized statistical methods for analyzing data as hardware and software become cheaper and more widely available. Cluster Rank Analysis (CRA) is a multivariate statistical algorithm that previously existed only as an inefficient service-oriented application. This paper describes how CRA was optimized and parallelized using an available computing cluster together with open-source and custom software, followed by the development of a command-line submission system for CRA jobs and a Web-based retrieval system for the results of analyses. A subsequent timing study revealed a speedup that quickly rose to 15 with 35 processors and is projected to reach a maximum of 19 with over 100 processors. This speedup was found to be limited primarily by the serial portion of the code; the Ethernet communication network was sufficient for this application. With just 10 processors, the average runtime dropped from over 100 minutes to approximately 15 minutes, and it fell to 6 minutes with 80 processors. The locations of the bottlenecks suggest that further performance increases are possible through additional parallelization. This work with CRA illustrates (1) the speed with which high-performance in-house applications can be developed and (2) the speed and efficiency with which statistical analyses of complex data structures can be carried out given commodity hardware and software resources.
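
    The serial-portion limit reported above is the behaviour Amdahl's law describes. The sketch below is a hedged illustration: the serial fraction of roughly 1/19 is inferred from the proposed maximum speedup of 19, not taken from the paper, and this idealized model only approximates the measured figures.

        # Amdahl's law: speedup on p processors when a fraction `serial`
        # of the work cannot be parallelized. serial = 1/19 is inferred
        # from the reported maximum speedup of 19.
        amdahl <- function(p, serial = 1/19) {
          1 / (serial + (1 - serial) / p)
        }

        # Speedup at the processor counts from the abstract, plus the
        # asymptotic limit: 6.8, 12.5, 15.5, 19.0 under this model.
        round(amdahl(c(10, 35, 80, Inf)), 1)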