
    CloVR: A virtual machine for automated and portable sequence analysis from the desktop using cloud computing

    Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom-built virtual machines to distribute pre-packaged, pre-configured software. We describe the Cloud Virtual Resource (CloVR), a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole-genome, and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources, and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition, CloVR supports the use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. The CloVR VM and its associated architecture lower the barrier to entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high-throughput data processing. https://doi.org/10.1186/1471-2105-12-35
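    To make the push-button idea concrete, the sketch below shows how a thin wrapper might dispatch one of the pre-packaged pipelines either to the local VM or to remote cloud resources. The clovr_submit command, its flags, and the pipeline names are hypothetical placeholders, not CloVR's actual interface.

```python
# Hypothetical sketch only: a thin "push-button" wrapper dispatching a pre-packaged
# pipeline either to the local VM or to remote cloud resources. The clovr_submit
# command, its flags, and the pipeline names are placeholders, not CloVR's real CLI.
import subprocess

PIPELINES = {"16S", "whole-genome", "metagenome"}   # analysis types named in the abstract

def run_pipeline(pipeline: str, input_fasta: str, use_cloud: bool = False) -> int:
    """Launch a pre-configured pipeline locally or burst it to the cloud."""
    if pipeline not in PIPELINES:
        raise ValueError(f"unknown pipeline: {pipeline}")
    cmd = ["clovr_submit", "--pipeline", pipeline, "--input", input_fasta]  # hypothetical command
    cmd += ["--target", "cloud" if use_cloud else "local"]
    return subprocess.call(cmd)

if __name__ == "__main__":
    # Process a read set on remote cloud resources instead of the local VM.
    run_pipeline("16S", "reads.fasta", use_cloud=True)
```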

    ASETS: A SDN Empowered Task Scheduling System for HPCaaS on the Cloud

    With increasing demand for High Performance Computing (HPC), new ideas and methods have emerged to utilize computing resources more efficiently. Cloud computing appears to provide benefits such as resource pooling, broad network access, and cost efficiency for HPC applications. However, moving HPC applications to the cloud faces several key challenges, primarily virtualization overhead, multi-tenancy, and network latency. Software-Defined Networking (SDN), an emerging technology, paves the road by providing dynamic manipulation of cloud networking, such as topology, routing, and bandwidth allocation. This paper presents a new scheme called ASETS, which targets dynamic configuration and monitoring of cloud networking using SDN to improve the performance of HPC applications, and in particular task scheduling for HPC as a Service on the cloud (HPCaaS). Further, SETSA (SDN-Empowered Task Scheduler Algorithm) is proposed as a novel task scheduling algorithm for the offered ASETS architecture. SETSA monitors the network bandwidth to take advantage of its changes when submitting tasks to the virtual machines. Empirical analysis of the algorithm in different case scenarios shows that SETSA has significant potential to improve the performance of HPCaaS platforms by increasing bandwidth efficiency and decreasing task turnaround time. In addition, SETSAW (SETSA Window) is proposed as an improvement of the SETSA algorithm.
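    The core idea attributed to SETSA above, submitting each task to the virtual machine whose network path currently offers the most available bandwidth, can be sketched as follows. The SDN-controller query, the VM naming, and the greedy largest-task-first ordering are illustrative assumptions, not the paper's published algorithm.

```python
# Illustrative sketch (not the published SETSA algorithm): assign each task to the
# virtual machine whose link currently reports the most available bandwidth,
# re-querying an SDN controller before every submission.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass(order=True)
class Task:
    size_mb: float                      # data volume to transfer for this task
    name: str = field(compare=False)

def query_bandwidth(controller_url: str, vms: List[str]) -> Dict[str, float]:
    """Placeholder for an SDN-controller query returning available Mb/s per VM link."""
    raise NotImplementedError("replace with a real controller REST call")

def schedule(tasks: List[Task], vms: List[str], controller_url: str) -> Dict[str, List[str]]:
    """Greedy, bandwidth-aware assignment: largest transfers go to the widest links."""
    assignment: Dict[str, List[str]] = {vm: [] for vm in vms}
    for task in sorted(tasks, reverse=True):            # biggest transfers first (assumption)
        bw = query_bandwidth(controller_url, vms)        # link bandwidth changes over time
        best_vm = max(vms, key=lambda vm: bw[vm])        # currently widest link wins
        assignment[best_vm].append(task.name)
    return assignment
```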

    On the Benefits of the Remote GPU Virtualization Mechanism: the rCUDA Case

    Graphics processing units (GPUs) are being adopted in many computing facilities given their extraordinary computing power, which makes it possible to accelerate many general-purpose applications from different domains. However, GPUs also present several side effects, such as increased acquisition costs and larger space requirements. They also require more powerful energy supplies. Furthermore, GPUs still consume some amount of energy while idle, and their utilization is usually low for most workloads. In a similar way to virtual machines, the use of virtual GPUs may address the aforementioned concerns. In this regard, the remote GPU virtualization mechanism allows an application being executed in one node of the cluster to transparently use the GPUs installed at other nodes. Moreover, this technique allows sharing the GPUs present in the computing facility among the applications being executed in the cluster. In this way, several applications being executed in different (or the same) cluster nodes can share one or more GPUs located in other nodes of the cluster. Sharing GPUs should increase overall GPU utilization, thus reducing the negative impact of the side effects mentioned before. Reducing the total number of GPUs installed in the cluster may also be possible. In this paper, we explore some of the benefits that remote GPU virtualization brings to clusters. For instance, this mechanism allows an application to use all the GPUs present in the computing facility. Another benefit of this technique is that cluster throughput, measured as jobs completed per time unit, is noticeably increased; in this regard, cluster throughput can be doubled for some workloads. Furthermore, in addition to increasing overall GPU utilization, total energy consumption can be reduced by up to 40%. This may be key in the context of exascale computing facilities, which present an important energy constraint. Other benefits are related to the cloud computing domain, where a GPU can be easily shared among several virtual machines. Finally, GPU migration (and therefore server consolidation) is one more benefit of this novel technique. This work was supported by Generalitat Valenciana (Grant PROMETEOII/2013/009) and by MINECO and FEDER (Grant TIN2014-53495-R). Silla Jiménez, F.; Iserte Agut, S.; Reaño González, C.; Prades, J. (2017). On the Benefits of the Remote GPU Virtualization Mechanism: the rCUDA Case. Concurrency and Computation: Practice and Experience, 29(13), 1-17. https://doi.org/10.1002/cpe.4072
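    The mechanism described above can be pictured as a local stub that forwards each GPU request over the network to the node that owns the physical GPU, so that a GPU-less node runs GPU-accelerated applications unmodified. The sketch below is purely conceptual; the wire format, class, and method names are invented for illustration, whereas rCUDA itself intercepts the CUDA APIs transparently at the library level.

```python
# Conceptual sketch of remote GPU virtualization: a local stub forwards each GPU
# request over the network to the node holding the physical GPU. Wire format,
# class, and method names are invented for illustration; rCUDA itself intercepts
# the CUDA APIs transparently at the library level, with no application changes.
import json
import socket

class RemoteGPU:
    def __init__(self, host: str, port: int = 8308):      # port number is an assumption
        self.sock = socket.create_connection((host, port))
        self.reader = self.sock.makefile("r")

    def call(self, api: str, **kwargs):
        """Serialize one GPU API request, send it, and wait for the server's reply."""
        request = json.dumps({"api": api, "args": kwargs}) + "\n"
        self.sock.sendall(request.encode())
        return json.loads(self.reader.readline())

# Usage on a node without a local GPU, borrowing a GPU installed in another node:
#   gpu = RemoteGPU("gpu-node-07")
#   gpu.call("memAlloc", nbytes=1 << 20)
#   gpu.call("launchKernel", name="saxpy", grid=128, block=256)
```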

    Improving the User Experience of the rCUDA Remote GPU Virtualization Framework

    Graphics processing units (GPUs) are being increasingly embraced by the high-performance computing community as an effective way to reduce execution time by accelerating parts of their applications. Remote CUDA (rCUDA) was recently introduced as a software solution to address the high acquisition costs and energy consumption of GPUs that constrain further adoption of this technology. Specifically, rCUDA is a middleware that allows a reduced number of GPUs to be transparently shared among the nodes in a cluster. Although the initial prototype versions of rCUDA demonstrated its functionality, they also revealed concerns with respect to usability, performance, and support for new CUDA features. In response, in this paper we present a new rCUDA version that (1) improves usability by including a new component that automatically transforms any CUDA source code so that it conforms to the needs of the rCUDA framework, (2) consistently features low overhead when using remote GPUs thanks to an improved new communication architecture, and (3) supports multithreaded applications and CUDA libraries. As a result, for any CUDA-compatible program, rCUDA now allows the use of remote GPUs within a cluster with low overhead, so that a single application running in one node can use all GPUs available across the cluster, thereby extending the single-node capability of CUDA. Copyright © 2014 John Wiley & Sons, Ltd. This work was funded by the Generalitat Valenciana under Grant PROMETEOII/2013/009 of the PROMETEO program phase II. The author from Argonne National Laboratory was supported by the US Department of Energy, Office of Science, under Contract No. DE-AC02-06CH11357. The authors are also grateful for the generous support provided by Mellanox Technologies. Reaño González, C.; Silla Jiménez, F.; Castello Gimeno, A.; Peña Monferrer, A.J.; Mayo Gual, R.; Quintana Ortí, E.S.; Duato Marín, J.F. (2015). Improving the User Experience of the rCUDA Remote GPU Virtualization Framework. Concurrency and Computation: Practice and Experience, 27(14), 3746-3770. https://doi.org/10.1002/cpe.3409
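    Point (1) above, the automatic source transformation, essentially rewrites the CUDA C extensions that only nvcc understands (most visibly the kernel<<<grid, block>>>(args) launch syntax) into plain calls that a standard compiler and the rCUDA library can handle. The toy rewriter below illustrates the flavor of that step; the rcudaLaunch helper name and the regex approach are assumptions for illustration only, whereas the real tool performs a full AST-based translation.

```python
# Toy illustration of a CUDA-to-rCUDA source rewrite: turn the triple-chevron
# kernel-launch extension into an ordinary function call. The rcudaLaunch helper
# is hypothetical; the real transformation is AST-based, not a regex pass.
import re

LAUNCH = re.compile(r"(\w+)\s*<<<\s*([^,>]+)\s*,\s*([^>]+?)\s*>>>\s*\(([^)]*)\)")

def transform(source: str) -> str:
    """Replace each kernel<<<grid, block>>>(args) launch with a plain call."""
    def repl(m: re.Match) -> str:
        kernel, grid, block, args = m.group(1, 2, 3, 4)
        return f"rcudaLaunch({kernel}, {grid}, {block}, {args})"
    return LAUNCH.sub(repl, source)

print(transform("saxpy<<<blocks, threads>>>(n, a, x, y);"))
# -> rcudaLaunch(saxpy, blocks, threads, n, a, x, y);
```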

    Dynamic Virtualized Deployment of Particle Physics Environments on a High Performance Computing Cluster

    The NEMO High Performance Computing Cluster at the University of Freiburg has been made available to researchers of the ATLAS and CMS experiments. Users access the cluster from external machines connected to the Worldwide LHC Computing Grid (WLCG). This paper describes how the full software environment of the WLCG is provided in a virtual machine image. The interplay between the schedulers for NEMO and for the external clusters is coordinated through the ROCED service. A cloud computing infrastructure is deployed at NEMO to orchestrate the simultaneous use of the cluster by bare-metal and virtualized jobs. Through this setup, resources are provided to users in a transparent, automated, and on-demand way. The performance of the virtualized environment has been evaluated for particle physics applications.
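    The on-demand provisioning pattern described above can be sketched as a simple control loop that watches the external experiments' queue and boots or retires virtualized workers on the HPC cluster. The function names, packing factor, and polling interval below are illustrative assumptions; at NEMO this coordination is carried out by the ROCED service.

```python
# Illustrative sketch of an on-demand orchestration loop: watch the external batch
# queue and start or retire virtualized workers on the HPC cluster accordingly.
# Function names, the packing factor, and the polling interval are assumptions;
# at NEMO this coordination is performed by the ROCED service.
import time

JOBS_PER_VM = 4                     # assumed number of job slots per virtualized worker

def pending_jobs() -> int:
    """Placeholder: number of jobs waiting in the external experiment scheduler."""
    raise NotImplementedError

def running_vms() -> int:
    """Placeholder: number of worker VMs currently booted on the HPC cluster."""
    raise NotImplementedError

def start_vm() -> None:
    """Placeholder: request one virtualized worker from the cluster scheduler."""

def stop_idle_vm() -> None:
    """Placeholder: shut down one idle worker and return its resources."""

def orchestrate(poll_seconds: int = 60) -> None:
    while True:
        demand = -(-pending_jobs() // JOBS_PER_VM)   # ceiling division: VMs needed for the queue
        supply = running_vms()
        for _ in range(max(0, demand - supply)):
            start_vm()                               # scale up towards the demand
        if supply > demand:
            stop_idle_vm()                           # scale down, freeing bare-metal capacity
        time.sleep(poll_seconds)
```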