
    PROCESS Data Infrastructure and Data Services

    Due to energy limitations and high operational costs, it is likely that exascale computing will not be achieved by one or two datacentres but will require many more. A simple calculation aggregating the computational power of the 2017 Top500 supercomputers reaches only 418 petaflops. Rescale, for example, claims 1.4 exaflops of peak computing power and describes its infrastructure as composed of 8 million servers spread across 30 datacentres. Any proposed solution to the exascale computing challenge must take these facts into account and should, by design, support the use of geographically distributed and likely independent datacentres. It should also co-allocate storage with computation whenever possible, as it would take roughly 3 years to transfer 1 exabyte over a dedicated 100 Gb Ethernet connection. This means that data which is increasingly geographically dispersed and spread across different administrative domains must be managed intelligently. As the natural setting of the PROCESS project is to operate within the European Research Infrastructure and serve European research communities facing exascale challenges, it is important that the PROCESS architecture and solutions are well positioned within the European computing and data management landscape, namely PRACE, EGI, and EUDAT. In this paper we propose a scalable and programmable data infrastructure that is easy to deploy and can be tuned to support various data-intensive scientific applications.
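
    The 3-year transfer figure is simple arithmetic to check. A minimal sketch, assuming an ideal, fully saturated link with no protocol overhead:

```python
# Back-of-the-envelope check: moving 1 exabyte over a dedicated
# 100 Gb Ethernet link, assuming the link runs flat out with no
# protocol or storage overhead.
EXABYTE_BITS = 1e18 * 8          # 1 EB expressed in bits
LINK_BITS_PER_S = 100e9          # 100 Gb/s
SECONDS_PER_YEAR = 365 * 24 * 3600

transfer_s = EXABYTE_BITS / LINK_BITS_PER_S
print(f"{transfer_s / SECONDS_PER_YEAR:.2f} years")  # -> 2.54 years
```

    At ideal line rate this comes to about 2.5 years; with realistic protocol and storage overheads, the 3-year figure quoted in the abstract is plausible.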

    ECHOFS: a scheduler-guided temporary filesystem to leverage node-local NVMs

    The growth in data-intensive scientific applications poses strong demands on the HPC storage subsystem, as data needs to be copied between compute nodes and I/O nodes for jobs to run. The emerging trend of adding denser, NVM-based burst buffers to compute nodes, however, offers the possibility of using these resources to build temporary file systems with specific I/O optimizations for a batch job. In this work, we present echofs, a temporary filesystem that coordinates with the job scheduler to preload a job's input files into node-local burst buffers. We present results measured with NVM emulation and different FS backends with DAX/FUSE on a local node to show the benefits of our proposal and of such coordination. This work was partially supported by the Spanish Ministry of Science and Innovation under the TIN2015-65316 grant, the Generalitat de Catalunya under contract 2014-SGR-1051, and the European Union's Horizon 2020 Research and Innovation Programme under Grant Agreement no. 671951 (NEXTGenIO). Source code is available at https://github.com/bsc-ssrg/echofs.
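
    echofs's actual interface lives in the repository linked above; the following is only a sketch of the coordination pattern the abstract describes, in which a scheduler prolog stages a job's declared input files into a node-local NVM mount before the job starts. The paths and file names are illustrative assumptions, not echofs's real API.

```python
# Illustrative sketch (not echofs's real API): stage a job's input
# files from the parallel filesystem into a node-local NVM-backed
# mount so the job reads them locally. Paths are hypothetical.
import shutil
from pathlib import Path

PFS_ROOT = Path("/gpfs/projects/sim42")   # hypothetical parallel FS location
NVM_ROOT = Path("/mnt/nvm/job-1234")      # hypothetical node-local NVM mount

def preload(input_files):
    """Copy each declared input file into the node-local burst buffer."""
    NVM_ROOT.mkdir(parents=True, exist_ok=True)
    for name in input_files:
        shutil.copy2(PFS_ROOT / name, NVM_ROOT / name)

# A scheduler prolog would call, e.g., preload(["mesh.h5", "restart.chk"])
# and then point the job's I/O at NVM_ROOT for the duration of the job.
```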

    Enabling Shared Memory Communication in Networks of MPSoCs

    Ongoing transistor scaling and the growing complexity of embedded system designs have led to the rise of MPSoCs (Multi-Processor System-on-Chip), combining multiple hard-core CPUs and accelerators (FPGA, GPU) on the same physical die. These devices are of great interest to the supercomputing community, which is increasingly reliant on heterogeneity to achieve power and performance goals in these closing stages of the race to exascale. In this paper, we present a network interface architecture and networking infrastructure designed to sit inside the FPGA fabric of a cutting-edge MPSoC device, enabling networks of these devices to communicate in both distributed and shared memory contexts, with a reduced need for costly software networking system calls. We present our implementation and prototype system and discuss the main design decisions relevant to the use of the Xilinx Zynq UltraScale+, a state-of-the-art MPSoC, and the challenges to be overcome given the device's limitations and constraints. We demonstrate the working prototype system connecting two MPSoCs, with communication between a processor and a remote memory region and accelerator. We then discuss the limitations of the current implementation and highlight areas of improvement needed to make this solution production-ready.
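
    The shared-memory path the paper targets follows the familiar memory-mapped I/O pattern: once the network interface exposes a remote memory window in the local address space, plain loads and stores replace per-message system calls. A minimal sketch, assuming a hypothetical character device through which the FPGA network interface exports such a window; the device path, window size, and offsets are invented for illustration and are not the paper's actual interface.

```python
# Hypothetical illustration of syscall-free shared-memory communication:
# user space maps a window exposed by the FPGA network interface and
# accesses it with plain loads/stores on the data path. Device path,
# size, and offsets are assumptions.
import mmap
import os

DEV_PATH = "/dev/fpga_shmem0"   # hypothetical device exported by the NI
WINDOW_SIZE = 1 << 20           # assume a 1 MiB shared window

fd = os.open(DEV_PATH, os.O_RDWR | os.O_SYNC)
win = mmap.mmap(fd, WINDOW_SIZE)

win[0:5] = b"hello"             # a plain store becomes a remote write
reply = bytes(win[4096:4101])   # a plain load reads remotely written data

win.close()
os.close(fd)
```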

    Solutions for the optimization of the software interface on an FPGA-based NIC

    The theme of this research is the study of solutions for the optimization of the software interface on FPGA-based Network Interface Cards. The research activity was carried out in the APE group at INFN (Istituto Nazionale di Fisica Nucleare), which has historically been active in the design of high-performance scalable networks for clusters of hybrid (CPU/GPU) nodes. The results of the research are validated on two projects the APE group is currently working on, both of which allow fast prototyping of solutions and hardware-software co-design: APEnet (a PCIe FPGA-based 3D torus network controller) and NaNet (an FPGA-based family of NICs mainly dedicated to real-time, low-latency computing systems such as fast control systems or High Energy Physics data acquisition systems). NaNet is also used to validate a GPU-controlled device driver that improves network performance, i.e. further lowers communication latency, when used in combination with existing user-space software. This research is also producing results in the Horizon 2020 FET-HPC ExaNeSt project, which aims to prototype and develop solutions for some of the crucial problems on the way towards production of exascale-level supercomputers, and where the APE group is actively contributing to the development of the network/interconnection infrastructure.

    System-Level Support for Composition of Applications

    Current HPC system software lacks support for emerging application deployment scenarios that combine one or more simulations with in situ analytics, sometimes called multi-component or multi-enclave applications. This paper presents an initial design study, implementation, and evaluation of mechanisms supporting composite multi-enclave applications in the Hobbes exascale operating system. These mechanisms include virtualization techniques that isolate custom application enclaves while retaining the vendor-supplied host operating system, as well as high-performance inter-VM communication mechanisms. Our initial single-node performance evaluation of these mechanisms on multi-enclave science applications, both real and proxy, demonstrates the ability to support multi-enclave HPC job composition with minimal performance overhead.
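
    The abstract does not specify Hobbes's actual inter-VM channels, but the composition pattern it relies on, a simulation component publishing data that an analytics component consumes through shared memory rather than through the filesystem or network, can be illustrated generically. A minimal sketch with invented names and sizes, not Hobbes's mechanism:

```python
# Generic illustration of composing a simulation with in situ analytics
# through a shared memory segment. This is not Hobbes's mechanism;
# the segment name, size, and data are assumptions.
from multiprocessing import shared_memory
import numpy as np

# Producer (the "simulation enclave") publishes a field.
shm = shared_memory.SharedMemory(create=True, size=1024 * 8, name="sim_field")
field = np.ndarray((1024,), dtype=np.float64, buffer=shm.buf)
field[:] = np.linspace(0.0, 1.0, 1024)    # stand-in simulation output

# Consumer (the "analytics enclave") attaches and reads in place.
peer = shared_memory.SharedMemory(name="sim_field")
view = np.ndarray((1024,), dtype=np.float64, buffer=peer.buf)
print(view.mean())                         # analytics without a disk copy

peer.close()
shm.close()
shm.unlink()
```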

    Distributed Parallel Extreme Event Analysis in Next Generation Simulation Architectures

    Numerical simulations present challenges as they reach exascale because they generate petabyte-scale data that cannot be saved without interrupting the simulation due to I/O constraints. Data scientists must be able to reduce, extract, and visualize the data while the simulation is running, which is essential for in transit and post hoc analysis. Next-generation supercomputing architectures include burst buffer technology composed of SSDs, primarily used for checkpointing the simulation in case a restart is required. In the case of turbulence simulations, this checkpoint provides an opportunity to perform analysis on the data without interrupting the simulation. First, we present a method of extracting velocity data in high-vorticity regions. This method requires calculating the vorticity of the entire dataset and identifying regions where the vorticity magnitude exceeds a specified threshold. Next we create a 3D stencil from values above the threshold and dilate the stencil. Finally we use the stencil to extract velocity data from the original dataset. The result is a dataset that is over an order of magnitude smaller and contains all the data required to study extreme events and to visualize vorticity. The second extraction utilizes the zfp lossy compressor to compress the entire velocity dataset. The compressed representation results in a dataset an order of magnitude smaller than the raw simulation data. This provides the researcher with approximate data not captured by the velocity extraction. The error introduced is bounded, and the result is a dataset that is visually indistinguishable from the original. Finally we present a modular distributed parallel extraction system. This system allows a data scientist to run the previously mentioned extraction algorithms on a distributed parallel cluster of burst buffer nodes. The extraction algorithms are built as modules for the system and run in parallel on burst buffer nodes. A feature extraction coordinator synchronizes the simulation with the extraction process. A data scientist only needs to write one module that performs the extraction or visualization on a single subset of data, and the system will execute that module at scale on burst buffers, managing all the communication, synchronization, and parallelism required to perform the analysis.
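
    The extraction pipeline described above (vorticity threshold, dilated stencil, velocity extraction) maps naturally onto array operations. A minimal sketch of that pipeline, assuming a uniform-grid velocity field held in NumPy arrays; the grid spacing, threshold, and dilation radius are illustrative, not values from the work:

```python
# Sketch of the described extraction: compute vorticity magnitude,
# threshold it, dilate the resulting 3D stencil, and keep only the
# velocity data inside the stencil. Parameters are illustrative.
import numpy as np
from scipy.ndimage import binary_dilation

def extract_high_vorticity(u, v, w, dx=1.0, threshold=5.0, dilation=2):
    # Curl components on a uniform grid (array axes: x=0, y=1, z=2).
    wx = np.gradient(w, dx, axis=1) - np.gradient(v, dx, axis=2)
    wy = np.gradient(u, dx, axis=2) - np.gradient(w, dx, axis=0)
    wz = np.gradient(v, dx, axis=0) - np.gradient(u, dx, axis=1)
    vort_mag = np.sqrt(wx**2 + wy**2 + wz**2)

    # 3D stencil of high-vorticity cells, grown to capture surroundings.
    stencil = binary_dilation(vort_mag > threshold, iterations=dilation)

    # Zero velocity outside the stencil; the sparse result is what would
    # be written out, over an order of magnitude smaller than the raw field.
    return tuple(np.where(stencil, f, 0.0) for f in (u, v, w))
```

    The complementary full-field path would hand the raw arrays to the zfp compressor with a fixed error tolerance, which is what provides the bounded-error guarantee the abstract mentions.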

    Pluggable Optical Connector Interfaces for Electro-Optical Circuit Boards

    A study is presented on system-embedded photonic interconnect technologies that address the communications bottleneck in modern exascale data centre systems, a bottleneck driven by exponentially rising consumption of digital information, the associated complexity of intra-data centre network management, and dwindling data storage capacities. It is proposed that this bottleneck be addressed by adopting electro-optical printed circuit boards (OPCBs) within the system, on which conventional electrical layers provide power distribution and static or low-speed signaling, while high-speed signals are conveyed by optical channels on separate embedded optical layers. One crucial prerequisite for adopting OPCBs in modern data storage and switch systems is a reliable method of optically connecting peripheral cards and devices within the system to an OPCB backplane or motherboard in a pluggable manner. However, the large mechanical misalignment tolerances between connecting cards and devices inherent to such systems contrast with the small size of the optical waveguides required to support optical communication at the speeds defined by prevailing communication protocols. An innovative approach is therefore required to decouple the contrasting mechanical tolerances of the electrical and optical domains in the system in order to enable reliable pluggable optical connectivity. This thesis presents the design, development, and characterisation of a suite of new optical waveguide connector interface solutions for OPCBs based on embedded planar polymer waveguides and planar glass waveguides. The technologies described include waveguide receptacles allowing parallel fibre connectors to be connected directly to OPCB-embedded planar waveguides, and board-to-board connectors with embedded parallel optical transceivers allowing daughtercards to be orthogonally connected to an OPCB backplane. For OPCBs based on embedded planar polymer waveguides and embedded planar glass waveguides, a complete demonstration platform was designed and developed to evaluate the connector interfaces and the associated embedded optical interconnect. Furthermore, a large portfolio of intellectual property comprising 19 patents and patent applications was generated during the course of this study, spanning the fields of OPCBs, optical waveguides, optical connectors, optical assembly, and system-embedded optical interconnects.