377 research outputs found

    Assessing the Utility of a Personal Desktop Cluster

    The computer workstation, introduced by Sun Microsystems in 1982, was the tool of choice for scientists and engineers as an interactive computing environment for the development of scientific codes. However, by the mid-1990s, the performance of workstations began to lag behind high-end commodity PCs. This, coupled with the disappearance of BSD-based operating systems in workstations and the emergence of Linux as an open-source operating system for PCs, arguably led to the demise of the workstation as we knew it. Around the same time, computational scientists started to leverage PCs running Linux to create commodity-based (Beowulf) clusters that provided dedicated compute cycles, i.e., supercomputing for the rest of us, as a cost-effective alternative to large supercomputers, i.e., supercomputing for the few. However, as the cluster movement has matured with respect to cluster hardware and open-source software, these clusters have become much more like their large-scale supercomputing brethren: a shared datacenter resource that resides in a machine room. Consequently, the above observations, when coupled with the ever-increasing performance gap between the PC and the cluster supercomputer, provide the motivation for a personal desktop cluster workstation: a turnkey solution that provides an interactive and parallel computing environment with the approximate form factor of a Sun SPARCstation 1 “pizza box” workstation. In this paper, we present the hardware and software architecture of such a solution as well as its prowess as a developmental platform for parallel codes. In short, imagine a 12-node personal desktop cluster that achieves 14 Gflops on Linpack but sips only 150-180 watts of power, resulting in a performance-power ratio that is over 300% better than our test SMP platform.
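The performance-power figure quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated above (14 Gflops on Linpack, 150-180 watts); the function name `gflops_per_watt` is a hypothetical helper, and the SMP baseline is not given in the abstract, so the "over 300%" comparison itself is not reproduced here.

```python
def gflops_per_watt(gflops: float, watts: float) -> float:
    """Performance-power ratio: sustained Gflops divided by power draw in watts."""
    return gflops / watts


# Figures quoted in the abstract.
LINPACK_GFLOPS = 14.0
POWER_LOW_WATTS = 150.0
POWER_HIGH_WATTS = 180.0

# The ratio is worst at the highest power draw and best at the lowest.
worst = gflops_per_watt(LINPACK_GFLOPS, POWER_HIGH_WATTS)
best = gflops_per_watt(LINPACK_GFLOPS, POWER_LOW_WATTS)

print(f"{worst * 1000:.1f} to {best * 1000:.1f} Mflops/W")
```

Under these figures the cluster delivers roughly 78 to 93 Mflops per watt, depending on where in the quoted power range it operates.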

    Parallel cloth simulation using OpenMP and CUDA

    The widespread availability of parallel computing architectures has led to research regarding algorithms and techniques that best exploit available parallelism. In addition to the CPU parallelism available, the GPU has emerged as a parallel computational device. The goal of this study was to explore the combined use of CPU and GPU parallelism by developing a hybrid parallel CPU/GPU cloth simulation application. In order to evaluate the benefits of the hybrid approach, the application was first developed in sequential CPU form, followed by a parallel CPU form. The application uses Backward Euler implicit time integration to solve the differential equations of motion associated with the physical system. The Conjugate Gradient (CG) algorithm is used to determine the solution vector for the system of equations formed by the Backward Euler approach. The matrix/vector, vector/vector, and vector/scalar operations required by CG are handled by calls to BLAS level 1 and level 2 functions. In the sequential CPU and parallel CPU versions, the Intel Math Kernel Library implementation of BLAS is used. In the hybrid parallel CPU/GPU version, the Nvidia CUDA-based BLAS implementation (CUBLAS) is used. In the parallel CPU and hybrid implementations, OpenMP directives are used to parallelize the force application loop that traverses the list of forces acting on the system. Runtimes were collected for each version of the application while simulating cloth meshes with particle resolutions of 20x20, 40x40, and 60x60. The performance of each version was compared at each mesh resolution. The level of performance degradation experienced when transitioning to the larger mesh sizes was also determined. The hybrid parallel CPU/GPU implementation yielded the highest frame rate for the 40x40 and 60x60 meshes. The parallel CPU implementation yielded the highest frame rate for the 20x20 mesh. The performance of the hybrid parallel CPU/GPU implementation degraded the least as it transitioned to the two larger mesh sizes. The results of this study will potentially lead to further research regarding the use of GPUs to perform the matrix/vector operations associated with the CG algorithm under more complex cloth simulation scenarios.
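For readers unfamiliar with the solver named in this abstract, the following is a minimal textbook sketch of the Conjugate Gradient iteration in plain Python. It is not the study's implementation, which calls MKL or CUBLAS; the helpers `dot` and `matvec` are hypothetical stand-ins for the BLAS level 1 and level 2 routines the abstract mentions, and the matrix is assumed to be symmetric positive definite.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def dot(u: Vector, v: Vector) -> float:
    """Vector/vector product (the role of a BLAS level 1 dot routine)."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A: Matrix, x: Vector) -> Vector:
    """Matrix/vector product (the role of a BLAS level 2 routine)."""
    return [dot(row, x) for row in A]


def conjugate_gradient(A: Matrix, b: Vector, tol: float = 1e-10,
                       max_iter: int = 1000) -> Vector:
    """Solve A x = b for a symmetric positive definite A, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, which equals b when x = 0
    p = list(r)            # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:   # stop once the residual norm is small enough
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)                        # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]      # vector/scalar update
        r = [ri - alpha * api for ri, api in zip(r, Ap)]   # new residual
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]       # new search direction
        rs_old = rs_new
    return x


# Small symmetric positive definite example; the exact solution is [1/11, 7/11].
solution = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
print(solution)
```

Each iteration needs one matrix/vector product plus a handful of vector operations, which is why moving those calls to a GPU BLAS can pay off once the mesh, and hence the system, is large enough.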

    Green Supercomputing in a Desktop Box


    Impact of communication times on mixed CPU/GPU applications scheduling using KAAPI

    High Performance Computing machines increasingly use Graphical Processing Units, as they are very efficient for homogeneous computations such as matrix operations. However, before using these accelerators, one has to transfer data from the processor to them. Such a transfer can be slow. In this report, our aim is to study the impact of communication times on the makespan of a schedule. Indeed, with better anticipation of these communications, we could use the GPUs even more efficiently. More precisely, we focus on machines with one or more GPUs and on applications with a low ratio of computation to communication. During this study, we implemented two offline scheduling algorithms within XKAAPI's runtime. We then conducted an experimental study combining these algorithms to highlight the impact of communication times. Finally, our study has shown that, by using communication-aware scheduling algorithms, we can substantially reduce the makespan of an application. Our experiments have shown a reduction of this makespan of up to 64% on a machine with several GPUs executing homogeneous computations.
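Why transfer time matters for the makespan can be illustrated with a toy model. The sketch below is not one of the report's two offline algorithms and does not use any XKAAPI API; it is a hypothetical greedy scheduler for one CPU and one GPU, assuming that a GPU task must first transfer its data and that transfers do not overlap with GPU computation. The `Task` type, the `greedy_makespan` function, and the task timings are all invented for illustration.

```python
from typing import List, NamedTuple


class Task(NamedTuple):
    cpu_time: float       # run time if placed on the CPU
    gpu_time: float       # run time if placed on the GPU, excluding transfer
    transfer_time: float  # time to copy the task's data to the GPU


def greedy_makespan(tasks: List[Task], comm_aware: bool) -> float:
    """Place tasks in order on the resource with the earliest estimated finish.

    A communication-aware scheduler includes the transfer in its GPU estimate;
    an oblivious one ignores it. The transfer is always paid in reality.
    """
    cpu_ready = 0.0
    gpu_ready = 0.0
    for task in tasks:
        cpu_finish = cpu_ready + task.cpu_time
        gpu_finish = gpu_ready + task.transfer_time + task.gpu_time  # real cost
        gpu_estimate = gpu_finish if comm_aware else gpu_ready + task.gpu_time
        if cpu_finish <= gpu_estimate:
            cpu_ready = cpu_finish
        else:
            gpu_ready = gpu_finish
    return max(cpu_ready, gpu_ready)


# Two hypothetical tasks: the first is cheap to compute but costly to transfer,
# the second is costly on the CPU but cheap to transfer.
tasks = [Task(cpu_time=2.0, gpu_time=1.0, transfer_time=10.0),
         Task(cpu_time=10.0, gpu_time=1.0, transfer_time=1.0)]

oblivious = greedy_makespan(tasks, comm_aware=False)
aware = greedy_makespan(tasks, comm_aware=True)
print(oblivious, aware)
```

In this instance the oblivious scheduler sends the transfer-heavy task to the GPU and ends up with a much longer makespan than the aware one. The size of the gap is specific to these made-up timings and says nothing about the 64% figure reported above.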

    INGEN's advanced IT facilities: The least you need to know

    The facilities described in this document were made possible in part through funding from Indiana University, the Indiana University Office of the Vice President for Information Technology, the State of Indiana, Shared University Research Grants from IBM, Inc., and from the Lilly Endowment through their support of the Indiana Genomics Initiative. The Indiana Genomics Initiative (INGEN) of Indiana University is supported in part by Lilly Endowment Inc.

    University Information Technology Services' Advanced IT Facilities: The least every researcher needs to know

    This is an archived document containing instructions for using IU's advanced IT facilities ca. 2003. A version of this document updated in 2011 is available from http://hdl.handle.net/2022/13620. Further versions are forthcoming. This document is designed to be read as a printed document, and to permit anyone at all familiar with computers and the Internet to start at the beginning, get a general overview of UITS' advanced IT facilities and what they offer, and then read the detailed portions of the document that are of interest. In many cases, examples are provided, as well as directions on how to download sample files. In some cases there is information that one is best off not really learning: for example, the process of logging into IU's IBM supercomputer the first time involves setup steps that should be followed, keystroke by keystroke, from the directions presented herein, and then promptly forgotten. This document is intended to be a starting point, not a comprehensive guide. As such it should get any reader off to a good start, and then point the reader toward consulting staff and online resources that provide additional help and information as needed. Most of all, this document is provided for the convenience of researchers, who may peruse this information at their leisure. Our hope and expectation is that consultants in UITS will provide extensive help and programming assistance to IU researchers who wish to make use of these excellent IT facilities. The facilities described in this document were made possible in part through funding from Indiana University, the Indiana University Office of the Vice President for Information Technology, the State of Indiana, Shared University Research Grants from IBM, Inc., the National Science Foundation under Grant No. 0116050 and Grant CDA-9601632, and from the Lilly Endowment through their support of the Indiana Genomics Initiative. The Indiana Genomics Initiative (INGEN) of Indiana University is supported in part by Lilly Endowment Inc.

    The X-Files: Investigating Alien Performance in a Thin-client World

    Many scientific applications use the X11 window environment, an open source windows GUI standard employing a client/server architecture. X11 promotes distributed computing, thin-client functionality, cheap desktop displays, compatibility with heterogeneous servers, remote services and administration, and greater maturity than newer web technologies. This paper details the author's investigations into close encounters with alien performance in X11-based seismic applications running on a 200-node cluster, backed by 2 TB of mass storage. End-users cited two significant UFOs (Unidentified Faulty Operations): (i) long application launch times and (ii) poor interactive response times. The paper is divided into three major sections describing Close Encounters of the 1st Kind: citings of UFO experiences, the 2nd Kind: recording evidence of a UFO, and the 3rd Kind: contact and analysis. UFOs do exist and this investigation presents a real case study for evaluating workload analysis and other diagnostic tools. Comment: 13 pages; Invited Lecture at the High Performance Computing Conference, University of Tromso, Norway, June 27-30, 199

    A Toolkit for Simulation of Desktop Grid Environment

    Peer-to-peer networks, clusters and grids enable a combination of heterogeneous distributed resources to solve problems in different fields such as science, engineering and commerce. Organizations within the worldwide grid environment network offer geographically distributed resources which are administered by schedulers and policies. Studying the behavior of these resources is time consuming because each resource behaves differently. In this type of environment it is nearly impossible to prove the effectiveness of a scheduling algorithm. Hence the main objective of this study is to develop a desktop grid simulator toolkit for measuring and modeling the performance of scheduling algorithms. The application was developed using a prototyping methodology, with the prototypes written in Java combined with a MySQL database. The core functions of the simulator are job generation, volunteer generation, simulating algorithms, generating graphical charts and generating reports. A simulator for desktop grid environments has been developed using Java as the implementation language due to its wide popularity. The final system was developed after the successful delivery of two prototypes. In addition to the core functions of a desktop grid simulator mentioned above, advanced features such as viewing real-time graphical charts, generating PDF reports of the simulation results and exporting the final results as CSV files have also been included.