123 research outputs found

    Design of scalable Java message-passing communications over InfiniBand

    Get PDF
    This is a post-peer-review, pre-copyedit version of an article published in The Journal of Supercomputing. The final authenticated version is available online at: https://doi.org/10.1007/s11227-011-0654-9
    [Abstract] This paper presents ibvdev, a scalable and efficient low-level Java message-passing communication device over InfiniBand. The continuous increase in the number of cores per processor underscores the need for efficient communication support for parallel solutions. Moreover, current system deployments are aggregating a significant number of cores through advanced network technologies, such as InfiniBand, increasing the complexity of communication protocols, especially when dealing with hybrid shared/distributed memory architectures such as clusters. In this context, Java represents an attractive choice for the development of communication middleware for these systems, as it provides built-in networking and multithreading support. As the performance gap between Java and compiled languages has been narrowing in recent years, Java is an emerging option for High Performance Computing (HPC). The developed communication middleware, ibvdev, increases the performance of Java applications on clusters of multicore processors interconnected via InfiniBand by: (1) providing Java with direct access to InfiniBand through the InfiniBand Verbs API, so far largely restricted to MPI libraries; (2) implementing an efficient and scalable communication protocol that obtains start-up latencies and bandwidths similar to MPI performance results; and (3) allowing its integration into any parallel and distributed Java application. In fact, it has been successfully integrated into the Java messaging library MPJ Express. The experimental evaluation of this middleware on an InfiniBand cluster of multicore processors has shown significant point-to-point performance benefits: up to an 85% reduction in start-up latency and twice the bandwidth compared to previous Java middleware on InfiniBand. Additionally, the impact of ibvdev on message-passing collective operations is significant, achieving performance increases of up to one order of magnitude compared to previous Java solutions, especially when combined with multithreading. Finally, the efficiency of this middleware, which is even competitive with MPI in terms of performance, increases the scalability of communication-intensive Java HPC applications.
    Ministerio de Ciencia e Innovación; TIN2010-1673
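
    Because ibvdev plugs in beneath the messaging layer, applications are written against the regular MPJ API and the InfiniBand device is selected at launch time rather than in code. As a minimal sketch of the kind of point-to-point exchange whose start-up latency is reported above, assuming the mpiJava 1.2-style bindings that MPJ Express implements (the class and method names belong to that API, not to ibvdev itself):

    ```java
    import mpi.*;  // mpiJava 1.2-style bindings, as implemented by MPJ Express (assumed API)

    public class PingExample {
        public static void main(String[] args) throws Exception {
            MPI.Init(args);
            int rank = MPI.COMM_WORLD.Rank();

            int[] buf = new int[1];
            if (rank == 0) {
                buf[0] = 42;
                // Rank 0 sends one int to rank 1 with tag 0
                MPI.COMM_WORLD.Send(buf, 0, 1, MPI.INT, 1, 0);
            } else if (rank == 1) {
                MPI.COMM_WORLD.Recv(buf, 0, 1, MPI.INT, 0, 0);
                System.out.println("rank 1 received " + buf[0]);
            }
            MPI.Finalize();
        }
    }
    ```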

    HPC Cloud for Scientific and Business Applications: Taxonomy, Vision, and Research Challenges

    Full text link
    High Performance Computing (HPC) clouds are becoming an alternative to on-premise clusters for executing scientific applications and business analytics services. Most research efforts in HPC cloud aim to understand the cost-benefit of moving resource-intensive applications from on-premise environments to public cloud platforms. Industry trends show that hybrid environments are the natural path to get the best of on-premise and cloud resources: steady (and sensitive) workloads can run on on-premise resources, while peak demand can leverage remote resources in a pay-as-you-go manner. Nevertheless, there are plenty of questions to be answered in HPC cloud, ranging from how to extract the best performance from an unknown underlying platform to what services are essential to make its usage easier. Moreover, the discussion on the right pricing and contractual models to fit small and large users is relevant for the sustainability of HPC clouds. This paper presents a survey and taxonomy of efforts in HPC cloud and a vision of what we believe lies ahead, including a set of research challenges that, once tackled, can help advance businesses and scientific discoveries. This becomes particularly relevant due to the fast-increasing wave of new HPC applications coming from big data and artificial intelligence.
    Comment: 29 pages, 5 figures, published in ACM Computing Surveys (CSUR)

    FastMPJ: a scalable and efficient Java message-passing library

    Get PDF
    This is a post-peer-review, pre-copyedit version of an article published in Cluster Computing. The final authenticated version is available online at: https://doi.org/10.1007/s10586-014-0345-4
    [Abstract] The performance and scalability of communications are key for High Performance Computing (HPC) applications in the current multi-core era. Despite the significant benefits (e.g., productivity, portability, multithreading) of Java for parallel programming, its poor communications support has hindered its adoption in the HPC community. This paper presents FastMPJ, an efficient Message-Passing in Java (MPJ) library, boosting Java for HPC by: (1) providing high-performance shared memory communications using Java threads; (2) taking full advantage of high-speed cluster networks (e.g., InfiniBand) to provide low-latency and high-bandwidth communications; (3) including a scalable collective library with topology-aware primitives, automatically selected at runtime; (4) avoiding Java data buffering overheads through zero-copy protocols; and (5) implementing the most widely extended MPI-like Java bindings for highly productive development. The comprehensive performance evaluation on representative testbeds (InfiniBand, 10 Gigabit Ethernet, Myrinet, and shared memory systems) has shown that FastMPJ communication primitives rival native MPI implementations, significantly improving the efficiency and scalability of Java HPC parallel applications.
    Ministerio de Educación y Ciencia; AP2010-4348
    Ministerio de Economía y Competitividad; TIN2010-16735
    Xunta de Galicia; CN2012/211
    Xunta de Galicia; GRC2013/05
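
    Since FastMPJ exposes the same MPI-like bindings, its topology-aware collectives are reached through the usual collective calls and the concrete algorithm is chosen by the library at runtime. A minimal sketch, again assuming mpiJava 1.2-style bindings (names from that API; nothing here is specific to FastMPJ's internals):

    ```java
    import mpi.*;  // mpiJava 1.2-style bindings (assumed API)

    public class CollectiveExample {
        public static void main(String[] args) throws Exception {
            MPI.Init(args);
            int rank = MPI.COMM_WORLD.Rank();

            // Root broadcasts a parameter; the library selects the algorithm internally.
            int[] param = new int[1];
            if (rank == 0) param[0] = 1024;
            MPI.COMM_WORLD.Bcast(param, 0, 1, MPI.INT, 0);

            // Every process contributes its rank; all obtain the global sum.
            int[] local = { rank };
            int[] global = new int[1];
            MPI.COMM_WORLD.Allreduce(local, 0, global, 0, 1, MPI.INT, MPI.SUM);

            if (rank == 0)
                System.out.println("param=" + param[0] + ", sum of ranks=" + global[0]);
            MPI.Finalize();
        }
    }
    ```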

    Low‐latency Java communication devices on RDMA‐enabled networks

    Get PDF
    This is the peer reviewed version of the following article: Expósito, R. R., Taboada, G. L., Ramos, S., Touriño, J., & Doallo, R. (2015). Low-latency Java communication devices on RDMA-enabled networks. Concurrency and Computation: Practice and Experience, 27(17), 4852-4879, which has been published in final form at https://doi.org/10.1002/cpe.3473. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions.
    [Abstract] Providing high-performance inter-node communication is a key capability for running high performance computing applications efficiently on parallel architectures. In fact, current system deployments are aggregating a significant number of cores interconnected via advanced networking hardware with Remote Direct Memory Access (RDMA) mechanisms that enable zero-copy and kernel-bypass features. The use of Java for parallel programming is becoming more promising thanks to some useful characteristics of this language, particularly its built-in multithreading support, portability, easy-to-learn properties, and high productivity, along with the continuous increase in the performance of the Java virtual machine. However, current parallel Java applications generally suffer from inefficient communication middleware, mainly based on protocols with high communication overhead that do not take full advantage of RDMA-enabled networks. This paper presents efficient low-level Java communication devices that overcome these constraints by fully exploiting the underlying RDMA hardware, providing low-latency and high-bandwidth communications for parallel Java applications. The performance evaluation conducted on representative RDMA networks and parallel systems has shown significant point-to-point performance increases compared with previous Java communication middleware, achieving up to a 40% improvement in application-level performance on 4096 cores of a Cray XE6 supercomputer.
    Ministerio de Economía y Competitividad; TIN2013-42148-P
    Xunta de Galicia; GRC2013/055
    Ministerio de Educación y Ciencia; AP2010-434
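
    The devices described here sit below the messaging library and expose only a small set of non-blocking point-to-point primitives that map closely onto RDMA operations. The following Java sketch is purely illustrative of that layering; the interface and names (CommDevice, Request, isend, irecv) are hypothetical and do not reproduce the paper's actual device API:

    ```java
    import java.nio.ByteBuffer;

    // Hypothetical shape of a low-level communication device that a Java
    // message-passing library could bind to an RDMA-enabled network.
    interface CommDevice {
        void init(String[] args);                        // set up endpoints / queue pairs
        int rank();                                      // id of this process within the job
        int size();                                      // number of processes
        Request isend(ByteBuffer buf, int dst, int tag); // non-blocking send (direct buffer, zero-copy friendly)
        Request irecv(ByteBuffer buf, int src, int tag); // non-blocking receive
        void finish();                                   // tear down connections
    }

    // Handle used to complete a pending non-blocking operation.
    interface Request {
        void waitFor();  // block until the transfer completes
        boolean test();  // poll for completion without blocking
    }
    ```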

    Checkpointing as a Service in Heterogeneous Cloud Environments

    Get PDF
    A non-invasive, cloud-agnostic approach is demonstrated for extending existing cloud platforms to include checkpoint-restart capability. Most cloud platforms currently rely on each application to provide its own fault tolerance. A uniform mechanism within the cloud itself serves two purposes: (a) direct support for long-running jobs, which would otherwise require a custom fault-tolerant mechanism for each application; and (b) the administrative capability to manage an over-subscribed cloud by temporarily swapping out jobs when higher priority jobs arrive. An advantage of this uniform approach is that it also supports parallel and distributed computations, over both TCP and InfiniBand, thus allowing traditional HPC applications to take advantage of an existing cloud infrastructure. Additionally, an integrated health-monitoring mechanism detects when long-running jobs either fail or incur exceptionally low performance, perhaps due to resource starvation, and proactively suspends the job. The cloud-agnostic feature is demonstrated by applying the implementation to two very different cloud platforms: Snooze and OpenStack. The use of a cloud-agnostic architecture also enables, for the first time, migration of applications from one cloud platform to another.
    Comment: 20 pages, 11 figures, appears in CCGrid, 201
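
    As a purely illustrative sketch of the health-monitoring idea described above (all names are hypothetical; the paper's actual service and cloud APIs are not reproduced), a monitor can periodically compare each job's reported progress rate against a threshold and checkpoint, then suspend, any job that stalls:

    ```java
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical back end that writes a restartable image and releases resources.
    interface CheckpointService {
        void checkpoint(String jobId);
        void suspend(String jobId);
    }

    // One monitoring sweep; schedule it periodically (e.g., with a ScheduledExecutorService).
    class HealthMonitor implements Runnable {
        private final CheckpointService service;
        private final Map<String, Double> progressRate = new ConcurrentHashMap<>();
        private final double minRate; // below this rate a job is considered starved or failed

        HealthMonitor(CheckpointService service, double minRate) {
            this.service = service;
            this.minRate = minRate;
        }

        // Jobs (or their agents) report a recent progress rate, e.g., iterations per second.
        void report(String jobId, double rate) {
            progressRate.put(jobId, rate);
        }

        @Override
        public void run() {
            for (Map.Entry<String, Double> entry : progressRate.entrySet()) {
                if (entry.getValue() < minRate) {
                    service.checkpoint(entry.getKey()); // preserve the job's progress first
                    service.suspend(entry.getKey());    // then free its resources
                }
            }
        }
    }
    ```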

    The Sicilian Grid Infrastructure for High Performance Computing

    Get PDF
    Combining High Performance Computing (HPC) and the Grid paradigm with applications based on commercial software is one of the major challenges of today's e-Infrastructures. Several research communities, from both industry and academia, need to run highly parallel applications based on licensed software over hundreds of CPU cores; a satisfactory fulfillment of such requests is one of the keys to the penetration of this computing paradigm into the industrial world and to the sustainability of Grid infrastructures. This problem has been tackled in the context of the PI2S2 project, which created a regional e-Infrastructure in Sicily, the first in Italy over a regional area. This article describes the features added in order to integrate an HPC facility into the PI2S2 Grid infrastructure: the adoption of the InfiniBand low-latency interconnect, the gLite middleware extended to support MPI/MPI2 jobs, the newly developed license server, and the specific scheduling policy adopted. Moreover, it shows the results of some relevant use cases belonging to Computational Fluid Dynamics (Fluent, OpenFOAM), Chemistry (GAMESS), Astrophysics (Flash), and Bioinformatics (ClustalW).

    Design and Evaluation of Low-Latency Communication Middleware on High Performance Computing Systems

    Get PDF
    [Abstract] The use of Java for parallel computing is becoming more promising owing to its appealing features, particularly its multithreading support, portability, easy-to-learn properties, high programming productivity and the noticeable improvement in its computational performance. However, parallel Java applications generally suffer from inefficient communication middleware, most of which use socket-based protocols that are unable to take full advantage of high-speed networks, hindering the adoption of Java in the High Performance Computing (HPC) area. This PhD Thesis presents the design, development and evaluation of scalable Java communication solutions that overcome these constraints. Hence, we have implemented several low-level message-passing devices that fully exploit the underlying network hardware while taking advantage of Remote Direct Memory Access (RDMA) operations to provide low-latency communications. Moreover, we have developed a production-quality Java message-passing middleware, FastMPJ, in which the devices have been integrated seamlessly, thus allowing the productive development of Message-Passing in Java (MPJ) applications. The performance evaluation has shown that FastMPJ communication primitives are competitive with native message-passing libraries, significantly improving the scalability of MPJ applications. Furthermore, this Thesis has analyzed the potential of cloud computing towards spreading the outreach of HPC, where Infrastructure as a Service (IaaS) offerings have emerged as a feasible alternative to traditional HPC systems. Several cloud resources from the leading IaaS provider, Amazon EC2, which specifically target HPC workloads, have been thoroughly assessed. The experimental results have shown the significant impact that virtualized environments still have on network performance, which hampers porting communication-intensive codes to the cloud. The key is the availability of the proper virtualization support, such as direct access to the network hardware, along with the guidelines for performance optimization suggested in this Thesis.