
    High-Performance Cloud Computing: A View of Scientific Applications

    Scientific computing often requires a massive number of computers for performing large-scale experiments. Traditionally, these needs have been addressed with high-performance computing solutions and installed facilities such as clusters and supercomputers, which are difficult to set up, maintain, and operate. Cloud computing provides scientists with a completely new model for utilizing computing infrastructure: compute resources, storage resources, and applications can be dynamically provisioned (and integrated within the existing infrastructure) on a pay-per-use basis, and released when they are no longer needed. Such services are often offered within the context of a Service Level Agreement (SLA), which ensures the desired Quality of Service (QoS). Aneka, an enterprise Cloud computing solution, harnesses the power of compute resources from private and public Clouds and delivers the desired QoS to users. Its flexible, service-based infrastructure supports multiple programming paradigms, allowing Aneka to address a variety of scenarios, from finance applications to computational science. As examples of scientific computing in the Cloud, we present a preliminary case study on using Aneka for the classification of gene expression data and the execution of an fMRI brain imaging workflow.
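
    To make the pay-per-use model concrete, here is a minimal sketch (illustrative Python; all names are hypothetical and this is not Aneka's actual API) of the SLA-driven decision a dynamic provisioner makes: lease only as many cloud nodes as the deadline requires, and release them as the backlog drains.

        # Illustrative sketch of SLA-driven dynamic provisioning (hypothetical
        # names, not Aneka's actual interface). Given a task backlog and an SLA
        # deadline, lease just enough pay-per-use cloud nodes, release the rest.
        import math

        def nodes_needed(pending_tasks: int, secs_per_task: float,
                         deadline_secs: float) -> int:
            """Smallest node count that finishes the backlog within the deadline."""
            return math.ceil(pending_tasks * secs_per_task / deadline_secs)

        class Provisioner:
            def __init__(self, private_capacity: int):
                self.private_capacity = private_capacity  # always-on local nodes
                self.leased = 0                           # pay-per-use cloud nodes

            def reconcile(self, pending_tasks, secs_per_task, deadline_secs):
                needed = nodes_needed(pending_tasks, secs_per_task, deadline_secs)
                extra = max(0, needed - self.private_capacity)
                if extra > self.leased:
                    print(f"leasing {extra - self.leased} cloud node(s)")
                elif extra < self.leased:
                    print(f"releasing {self.leased - extra} idle cloud node(s)")
                self.leased = extra

        p = Provisioner(private_capacity=4)
        p.reconcile(pending_tasks=120, secs_per_task=30, deadline_secs=600)  # peak
        p.reconcile(pending_tasks=10, secs_per_task=30, deadline_secs=600)   # drain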

    funcX: A Federated Function Serving Fabric for Science

    Exploding data volumes and velocities, new computational methods and platforms, and ubiquitous connectivity demand new approaches to computation in the sciences. These new approaches must enable computation to be mobile, so that, for example, it can occur near data, be triggered by events (e.g., the arrival of new data), be offloaded to specialized accelerators, or run remotely where resources are available. They also require new design approaches in which monolithic applications can be decomposed into smaller components that may in turn be executed separately and on the most suitable resources. To address these needs we present funcX, a distributed function-as-a-service (FaaS) platform that enables flexible, scalable, and high-performance remote function execution. funcX's endpoint software can transform existing clouds, clusters, and supercomputers into function-serving systems, while funcX's cloud-hosted service provides transparent, secure, and reliable function execution across a federated ecosystem of endpoints. We motivate the need for funcX with several scientific case studies, present our prototype design and implementation, show optimizations that deliver throughput in excess of 1 million functions per second, and demonstrate, via experiments on two supercomputers, that funcX can scale to more than 130,000 concurrent workers. Accepted to the ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC 2020).
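
    A short sketch of the register/run/result workflow, in the style of the funcX Python SDK from around the HPDC 2020 paper (module paths and method names may differ in later releases; the endpoint UUID is a placeholder):

        # Remote function execution in the style of the funcX Python SDK
        # (pip install funcx). Names follow the SDK as described around the
        # HPDC 2020 paper; later releases may differ.
        import time
        from funcx.sdk.client import FuncXClient

        def mc_pi(n: int) -> float:
            """Monte Carlo estimate of pi; executes remotely on an endpoint."""
            import random
            hits = sum(1 for _ in range(n)
                       if random.random() ** 2 + random.random() ** 2 <= 1.0)
            return 4.0 * hits / n

        fxc = FuncXClient()                     # authenticates via Globus Auth
        func_id = fxc.register_function(mc_pi)  # serialize + register the function

        # Placeholder UUID: substitute the ID of an endpoint you have deployed
        # (e.g., on a cluster login node) with `funcx-endpoint start`.
        endpoint_id = "00000000-0000-0000-0000-000000000000"

        task_id = fxc.run(1_000_000, endpoint_id=endpoint_id, function_id=func_id)
        while True:
            try:
                print(fxc.get_result(task_id))  # raises while still pending
                break
            except Exception:
                time.sleep(2)                   # result not ready yet; poll again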

    An Efficient Network API for in-Kernel Applications in Clusters

    Running parallel applications on clusters with high-speed local networks requires fast communication between computing nodes, but also low-latency and high-bandwidth file access. However, the application programming interfaces of high-speed local networks were designed for MPI communication and do not always meet the requirements of other applications such as distributed file systems. In this paper, we explore several solutions to improve the use of high-speed networks for in-kernel applications. Distributed file systems implemented on top of the GM interface of Myrinet are first examined to demonstrate how hard it is to achieve efficient interaction between such applications and the network. We then propose solutions to simplify and improve this interaction, and integrate them into the kernel interface of MX, the new Myrinet message-passing layer. Performance comparisons between MX and GM, and their usage in both a distributed file system and a zero-copy protocol, show significant improvements. Moreover, we are able to improve the performance of the flexible kernel API we designed in MX, which makes it possible to remove some intermediate copies.
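
    The payoff of removing intermediate copies can be shown outside the kernel as well. The sketch below is a conceptual analogy in plain Python, not the MX kernel API: it contrasts a copy-based path, which stages non-contiguous pages into one buffer, with a zero-copy path that hands the network layer scatter/gather views of the original pages.

        # Conceptual analogy (plain Python, not the MX kernel API): a file
        # system sends data living in several non-contiguous pages. The copy
        # path stages everything in one contiguous buffer first; the zero-copy
        # path passes scatter/gather views of the original pages instead.
        PAGE = 4096
        pages = [bytes([i]) * PAGE for i in range(4)]  # stand-ins for page-cache pages

        def send_with_copy(pages):
            staging = bytearray()
            for p in pages:
                staging += p                # intermediate copy of every byte
            return [memoryview(staging)]

        def send_zero_copy(pages):
            # No bytes move: each segment references the original page,
            # like an iovec list handed to the NIC for scatter/gather DMA.
            return [memoryview(p) for p in pages]

        segs_copy = send_with_copy(pages)
        segs_zc = send_zero_copy(pages)
        assert b"".join(segs_copy) == b"".join(segs_zc)   # same wire data
        print(f"copy path staged {sum(len(s) for s in segs_copy)} bytes; "
              f"zero-copy path staged 0")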

    HPC Cloud for Scientific and Business Applications: Taxonomy, Vision, and Research Challenges

    High Performance Computing (HPC) clouds are becoming an alternative to on-premise clusters for executing scientific applications and business analytics services. Most research efforts in HPC cloud aim to understand the cost-benefit of moving resource-intensive applications from on-premise environments to public cloud platforms. Industry trends show that hybrid environments are the natural path to getting the best of on-premise and cloud resources: steady (and sensitive) workloads can run on on-premise resources, while peak demand can leverage remote resources in a pay-as-you-go manner. Nevertheless, there are plenty of questions to be answered in HPC cloud, ranging from how to extract the best performance from an unknown underlying platform to what services are essential to make its usage easier. Moreover, the discussion of the right pricing and contractual models to fit both small and large users is relevant for the sustainability of HPC clouds. This paper presents a survey and taxonomy of efforts in HPC cloud and a vision of what we believe lies ahead, including a set of research challenges that, once tackled, can help advance businesses and scientific discoveries. This becomes particularly relevant due to the fast-increasing wave of new HPC applications coming from big data and artificial intelligence. Published in ACM Computing Surveys (CSUR).
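
    A minimal sketch of the hybrid placement rule the survey describes (illustrative Python with hypothetical names, not a scheduler from the paper): sensitive jobs are pinned on-premise, other work stays local while capacity lasts, and only the overflow during peak demand bursts to pay-as-you-go cloud nodes.

        # Illustrative hybrid placement (hypothetical names): sensitive jobs
        # stay on-premise; overflow beyond local capacity bursts to the cloud.
        from dataclasses import dataclass

        @dataclass
        class Job:
            name: str
            cores: int
            sensitive: bool = False  # e.g., regulated data that must stay local

        def place(jobs, onprem_cores):
            free = onprem_cores
            placement = {}
            for job in sorted(jobs, key=lambda j: not j.sensitive):  # sensitive first
                if job.sensitive or job.cores <= free:
                    placement[job.name] = "on-premise"
                    free -= job.cores
                else:
                    placement[job.name] = "cloud (pay-as-you-go)"
            return placement

        jobs = [Job("nightly-etl", 32),
                Job("patient-records", 16, sensitive=True),
                Job("quarterly-risk-burst", 128)]
        print(place(jobs, onprem_cores=64))
        # {'patient-records': 'on-premise', 'nightly-etl': 'on-premise',
        #  'quarterly-risk-burst': 'cloud (pay-as-you-go)'}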

    Using offline routing to implement a low latency 3D FFT in a multinode FPGA system

    Thesis (M.S.), Boston University. Applications that require highly parallel computing along with low-latency communication due to strong scaling, such as calculating a 3D FFT for Molecular Dynamics simulations, can be problematic for traditional high-performance computing (HPC) clusters. A multinode FPGA array is a good solution for these types of problems due to the direct high-speed connections and flexible internal fabric inherent in FPGAs. Offline routing uses precomputed routing information to direct packets and can avoid much of the switching and congestion overhead of communication. Two architectures are explored here which show the feasibility of using offline routing techniques to reduce communication latencies in FPGA systems. The first architecture targets a single FPGA; it was built for initial exploration and to show how powerful and flexible a single FPGA can be. It attained a maximum clock frequency of 102 MHz and latencies of 64 µs and 250 µs for 3D FFT calculations of 32^3 and 64^3 data points, respectively. The second architecture targets an FPGA intended to be the model for each node in the array. The best multinode version is based on a multilevel switching architecture and has a maximum clock frequency of 134 MHz. When scaled to a cluster, latencies project to 2.4 µs and 5.5 µs for 3D FFT calculations of 32^3 and 64^3 data points, respectively. The two designs show the potential of using a single FPGA and multi-FPGA arrays for HPC applications where communication latency is critical.
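
    Offline routing amounts to precomputing, at build time, the full hop-by-hop path for every source/destination pair, so no switching decisions remain at run time. The sketch below (illustrative Python, independent of the thesis's actual FPGA implementation) precomputes dimension-ordered routes for a small 3D grid; a table like this is what would be baked into each node's routing ROM.

        # Illustrative offline routing (not the thesis's FPGA design):
        # precompute dimension-ordered hop sequences for every (src, dst)
        # pair in a 3D grid. The table is fully static, so it can be baked
        # into per-node ROMs; no run-time switching decisions are needed.
        from itertools import product

        def route(src, dst):
            """Hop list from src to dst, correcting X, then Y, then Z."""
            hops, cur = [], list(src)
            for dim in range(3):              # dimension order: X, Y, Z
                step = 1 if dst[dim] > cur[dim] else -1
                while cur[dim] != dst[dim]:
                    cur[dim] += step
                    hops.append(tuple(cur))
            return hops

        N = 4                                 # 4x4x4 node grid
        nodes = list(product(range(N), repeat=3))
        table = {(s, d): route(s, d) for s in nodes for d in nodes if s != d}

        print(len(table), "precomputed routes")   # 64 * 63 entries
        print(table[((0, 0, 0), (1, 2, 0))])      # [(1,0,0), (1,1,0), (1,2,0)]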