
    Improving the User Experience of the rCUDA Remote GPU Virtualization Framework

    Graphics processing units (GPUs) are being increasingly embraced by the high-performance computing community as an effective way to reduce execution time by accelerating parts of their applications. remote CUDA (rCUDA) was recently introduced as a software solution to address the high acquisition costs and energy consumption of GPUs that constrain further adoption of this technology. Specifically, rCUDA is a middleware that allows a reduced number of GPUs to be transparently shared among the nodes in a cluster. Although the initial prototype versions of rCUDA demonstrated its functionality, they also revealed concerns with respect to usability, performance, and support for new CUDA features. In response, in this paper, we present a new rCUDA version that (1) improves usability by including a new component that automatically transforms any CUDA source code so that it conforms to the needs of the rCUDA framework, (2) consistently features low overhead when using remote GPUs thanks to a new, improved communication architecture, and (3) supports multithreaded applications and CUDA libraries. As a result, for any CUDA-compatible program, rCUDA now allows the use of remote GPUs within a cluster with low overhead, so that a single application running in one node can use all GPUs available across the cluster, thereby extending the single-node capability of CUDA. Copyright © 2014 John Wiley & Sons, Ltd.

    This work was funded by the Generalitat Valenciana under Grant PROMETEOII/2013/009 of the PROMETEO program, phase II. The author from Argonne National Laboratory was supported by the US Department of Energy, Office of Science, under Contract No. DE-AC02-06CH11357. The authors are also grateful for the generous support provided by Mellanox Technologies.

    Reaño González, C.; Silla Jiménez, F.; Castello Gimeno, A.; Peña Monferrer, AJ.; Mayo Gual, R.; Quintana Ortí, ES.; Duato Marín, JF. (2015). Improving the User Experience of the rCUDA Remote GPU Virtualization Framework. Concurrency and Computation: Practice and Experience, 27(14), 3746-3770. https://doi.org/10.1002/cpe.3409
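
    The abstract above describes rCUDA as middleware that lets unmodified CUDA applications use GPUs installed in other cluster nodes. As a hedged illustration (not taken from the paper), the sketch below is the kind of ordinary CUDA program rCUDA targets: it contains no rCUDA-specific calls, and redirecting it to a remote GPU happens outside the code, via the rCUDA client library and environment variables whose exact names and values (RCUDA_DEVICE_COUNT, RCUDA_DEVICE_0, the server name) are assumptions here.

        // Plain CUDA vector addition: no rCUDA-specific API calls.
        // Assumed (unverified) environment setup when linking against the
        // rCUDA client library instead of the local CUDA runtime:
        //   export RCUDA_DEVICE_COUNT=1
        //   export RCUDA_DEVICE_0=gpu-server   # placeholder host name
        #include <cuda_runtime.h>
        #include <stdio.h>
        #include <stdlib.h>

        __global__ void vadd(const float *a, const float *b, float *c, int n) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n) c[i] = a[i] + b[i];
        }

        int main(void) {
            const int n = 1 << 20;
            const size_t bytes = n * sizeof(float);
            float *ha = (float *)malloc(bytes), *hb = (float *)malloc(bytes), *hc = (float *)malloc(bytes);
            for (int i = 0; i < n; ++i) { ha[i] = (float)i; hb[i] = 2.0f * i; }

            float *da, *db, *dc;
            cudaMalloc(&da, bytes); cudaMalloc(&db, bytes); cudaMalloc(&dc, bytes);
            cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
            cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

            vadd<<<(n + 255) / 256, 256>>>(da, db, dc, n);   // runs on the (possibly remote) GPU
            cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);

            printf("c[123] = %.1f\n", hc[123]);              // expected: 369.0
            cudaFree(da); cudaFree(db); cudaFree(dc);
            free(ha); free(hb); free(hc);
            return 0;
        }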

    Boosting the performance of remote GPU virtualization using InfiniBand Connect-IB and PCIe 3.0

    © 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

    A clear trend has emerged involving the acceleration of scientific applications by using GPUs. However, the capabilities of these devices are still generally underutilized. Remote GPU virtualization techniques can help increase GPU utilization rates, while reducing acquisition and maintenance costs. The overhead of using a remote GPU instead of a local one is introduced mainly by the difference in performance between the internode network and the intranode PCIe link. In this paper we show how using the new InfiniBand Connect-IB network adapters (attaining throughput similar to that of the most recently emerged GPUs) boosts the performance of remote GPU virtualization, reducing the overhead to a mere 0.19% in the application tested.

    This work was funded by the Generalitat Valenciana under Grant PROMETEOII/2013/009 of the PROMETEO program, phase II. This material is based upon work supported by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research (SC-21), under Contract No. DE-AC02-06CH11357. Authors from the Universitat Politècnica de València and Universitat Jaume I are grateful for the generous support provided by Mellanox Technologies.

    Reaño González, C.; Silla Jiménez, F.; Peña Monferrer, AJ.; Shainer, G.; Schultz, S.; Castelló Gimeno, A.; Quintana Orti, ES.; et al. (2014). Boosting the performance of remote GPU virtualization using InfiniBand Connect-IB and PCIe 3.0. In 2014 IEEE International Conference on Cluster Computing (CLUSTER). IEEE. 266-267. https://doi.org/10.1109/CLUSTER.2014.6968737
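
    The key claim above is that remote-GPU overhead nearly vanishes once the internode network moves data about as fast as the intranode PCIe 3.0 link feeding the GPU. As a hedged, illustrative sketch (not from the paper), the micro-benchmark below measures that intranode baseline: host-to-device copy bandwidth from pinned memory, timed with CUDA events. The payload size and iteration count are arbitrary choices.

        // Measure host-to-device copy bandwidth from pinned memory using CUDA events.
        // This is the intranode transfer rate that an internode fabric such as
        // InfiniBand Connect-IB must approach for remote GPU use to add little overhead.
        #include <cuda_runtime.h>
        #include <stdio.h>

        int main(void) {
            const size_t bytes = 256UL << 20;          // 256 MiB payload (arbitrary)
            const int reps = 10;
            void *host, *dev;
            cudaMallocHost(&host, bytes);              // pinned host buffer
            cudaMalloc(&dev, bytes);

            cudaEvent_t start, stop;
            cudaEventCreate(&start);
            cudaEventCreate(&stop);

            cudaEventRecord(start);
            for (int i = 0; i < reps; ++i)
                cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);
            cudaEventRecord(stop);
            cudaEventSynchronize(stop);

            float ms = 0.0f;
            cudaEventElapsedTime(&ms, start, stop);
            double gb_per_s = (double)reps * bytes / 1e9 / (ms / 1e3);
            printf("Host-to-device bandwidth: %.2f GB/s\n", gb_per_s);

            cudaFree(dev);
            cudaFreeHost(host);
            return 0;
        }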

    SLURM Support for Remote GPU Virtualization: Implementation and Performance Study

    © 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

    SLURM is a resource manager that can be leveraged to share a collection of heterogeneous resources among the jobs in execution in a cluster. However, SLURM is not designed to handle resources such as graphics processing units (GPUs). Concretely, although SLURM can use a generic resource plugin (GRes) to manage GPUs, with this solution the hardware accelerators can only be accessed by the job that is in execution on the node to which the GPU is attached. This is a serious constraint for remote GPU virtualization technologies, which aim at providing user-transparent access to all GPUs in the cluster, independently of the specific location of the node where the application is running with respect to the GPU node. In this work we introduce a new type of device in SLURM, "rgpu", in order to gain access from any application node to any GPU node in the cluster using rCUDA as the remote GPU virtualization solution. With this new scheduling mechanism, a user can access any number of GPUs, as SLURM schedules the tasks taking into account all the graphics accelerators available in the complete cluster. We present experimental results that show the benefits of this new approach in terms of increased flexibility for the job scheduler.

    The researchers at UPV were supported by the Generalitat Valenciana under Grant PROMETEOII/2013/009 of the PROMETEO program, phase II. Researchers at UJI were supported by MINECO, by FEDER funds under Grant TIN2011-23283, and by the Fundacion Caixa-Castelló Bancaixa (Grant P11B2013-21).

    Iserte Agut, S.; Castello Gimeno, A.; Mayo Gual, R.; Quintana Ortí, ES.; Silla Jiménez, F.; Duato Marín, JF.; Reaño González, C.; et al. (2014). SLURM Support for Remote GPU Virtualization: Implementation and Performance Study. In Computer Architecture and High Performance Computing (SBAC-PAD), 2014 IEEE 26th International Symposium on. IEEE. 318-325. https://doi.org/10.1109/SBAC-PAD.2014.49
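
    The "rgpu" resource described above lets a job be granted GPUs that sit on other nodes, with rCUDA making them appear local to the application. As a hedged illustration (the submission syntax of the modified SLURM is not given in the abstract, so no srun/sbatch flags are shown), the sketch below is the kind of device enumeration such a job could perform: every granted GPU, local or remote, shows up through the standard CUDA runtime calls.

        // Enumerate the GPUs visible to this job. Under rCUDA, devices assigned by
        // the scheduler from any node in the cluster appear as ordinary local devices.
        #include <cuda_runtime.h>
        #include <stdio.h>

        int main(void) {
            int count = 0;
            cudaError_t err = cudaGetDeviceCount(&count);
            if (err != cudaSuccess) {
                fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
                return 1;
            }
            printf("GPUs visible to this job (local or remote): %d\n", count);
            for (int d = 0; d < count; ++d) {
                cudaDeviceProp prop;
                cudaGetDeviceProperties(&prop, d);
                printf("  device %d: %s, %zu MiB of global memory\n",
                       d, prop.name, prop.totalGlobalMem >> 20);
            }
            return 0;
        }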

    Calcineurin inhibitors cyclosporine A and tacrolimus induce vascular inflammation and endothelial activation through TLR4 signaling

    The introduction of the calcineurin inhibitors (CNIs) cyclosporine and tacrolimus greatly reduced the rate of allograft rejection, although their chronic use is marred by a range of side effects, among them vascular toxicity. In transplant patients, innate immunity has been shown to promote vascular injury triggered by ischemia-reperfusion damage, atherosclerosis, and hypertension. We hypothesized that activation of innate immunity and inflammation may contribute to CNI toxicity; we therefore investigated whether TLR4 mediates the toxic responses of CNIs in the vasculature. Cyclosporine and tacrolimus increased the production of proinflammatory cytokines and endothelial activation markers in cultured murine endothelial and vascular smooth muscle cells (VSMCs) as well as in ex vivo cultures of murine aortas. CNI-induced proinflammatory events were prevented by pharmacological inhibition of TLR4. Moreover, CNIs were unable to induce inflammation and endothelial activation in aortas from TLR4−/− mice. CNI-induced cytokine and adhesion molecule synthesis in endothelial cells occurred even in the absence of calcineurin, although its expression was required for the maximal effect through upregulation of TLR4 signaling. CNI-induced TLR4 activity increased O2−/ROS production and NF-κB-regulated synthesis of proinflammatory factors in cultured as well as aortic endothelial cells and VSMCs. These data provide new insight into the mechanisms associated with CNI-induced vascular inflammation.

    This work was supported by grants from the Instituto de Salud Carlos III (Ministerio de Economía y Competitividad, Gobierno de España): FEDER funds ISCIII RETIC REDINREN RD12/0021, PI11/02242, PI13/00047, PI14/0041, PI14/00386, PI15/01460; Comunidad de Madrid (CIFRA S2010/BMD-2378); Sociedad Española de Nefrología. Salary support: RR-D: CIFRA; CO-S: Fundación Conchita Rábago de Jiménez Díaz; CG-G and RRR-D: REDINREN; AO: Programa Intensificación Actividad Investigadora (ISCIII/Agencia Laín-Entralgo/CM); JE and MRO: Universidad Autónoma de Madrid; AMR: Contrato Miguel Servet (ISCIII

    Estandarización de una bebida deslactosada a base de suero dulce de leche saborizado con pulpa de mora

    Whey is a dairy by-product obtained from the precipitation of casein during cheese production and contains around 50% of the solids in milk. For many years it was regarded as waste, used mainly for animal fattening or discharged into watercourses. This view has changed, however, because the by-product is a rich source of carbohydrates, proteins, vitamins, minerals, and biologically active compounds, each of which can be exploited agro-industrially (Poveda Elpidia, 2013). The objective of this research was therefore to standardize a whey-based beverage flavored with blackberry pulp and rendered lactose-free through enzymatic hydrolysis, using two formulations (F1 and F2). A sensory evaluation of each formulation was carried out to establish the best whey/pulp ratio. Once the preferred formulation (F1) was selected, its flavor profile was determined, together with its physicochemical characterization (°Brix by refractometry, titratable acidity expressed as lactic acid, pH by potentiometry, and freezing point) and its microbiological characterization according to national regulations. Formulation F1 showed a preference level of 94%, and its physicochemical and microbiological values on day 7 after preparation met the requirements established in NTC 1419/2004 and Resolution 3929/2013; that is, the standardized lactose-free beverage based on sweet whey and flavored with blackberry pulp is fit for consumption and poses no risk to public health, making the use of whey in the preparation of agro-industrial derivatives viable.

    What can we learn from the innovative school for initial and ongoing teacher training? Research proposal

    Initial and ongoing teacher training is identified as one of the fundamental keys to improving the education system. For this to happen, however, teacher training must be linked to high-quality educational practice. This article argues for qualitative research on educational innovation, guided by the principles of Grounded Theory, as an effective way to build a theory of educational innovation that is valid and useful for the positive transformation of the education system.