7 research outputs found

    Exploring Future Storage Options for ATLAS at the BNL/SDCC facility

    The ATLAS experiment is expected to deliver an unprecedented amount of scientific data in the High Luminosity (HL-LHC) era. As the demand for disk storage capacity in ATLAS continues to rise steadily, the BNL Scientific Data and Computing Center (SDCC) faces challenges in terms of the cost of maintaining multiple disk copies and of adapting to the coming ATLAS storage requirements. To address these challenges, the SDCC Storage team has undertaken a thorough analysis of the ATLAS experiment’s requirements, matching them to suitable storage options and strategies, and has explored alternatives to enhance or replace the current storage solution. This paper presents the main challenges encountered while supporting big-data experiments such as ATLAS. We describe the experiment’s specific requirements and priorities, focusing on the storage-system characteristics critical to the high-luminosity run and on how the key storage components provided by the Storage team work together: the dCache disk storage system, its archival back end HPSS, and its OS-level backend storage. Specifically, we investigate a novel approach that integrates Lustre and XRootD, in which Lustre serves as the backend storage and XRootD acts as the access-layer frontend, supporting various grid access protocols. Additionally, we describe the validation and commissioning tests, including a performance comparison between dCache and XRootD. Furthermore, we provide a performance and cost analysis comparing OpenZFS and Linux MD RAID, evaluate different storage software stacks, and present stress tests conducted to validate Third Party Copy (TPC) functionality.
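
    The abstract does not show how the TPC stress tests were driven; as a minimal sketch of the kind of validation described, the snippet below launches concurrent third-party copies with the standard xrdcp client. The endpoints, paths, and file names are placeholders, not the facility's actual configuration.

```python
# Hypothetical TPC stress driver: asks the destination to pull files directly
# from the source (third-party copy) and counts failures. Assumes xrdcp is on
# PATH; endpoint URLs and paths below are invented placeholders.
import subprocess
import concurrent.futures

SRC = "root://xrootd-lustre.example.org//lustre/atlas/testdata"  # assumed source
DST = "root://dcache.example.org//pnfs/atlas/scratch"            # assumed destination

def tpc_copy(filename: str) -> int:
    """Run one third-party copy and return the xrdcp exit code."""
    cmd = ["xrdcp", "--tpc", "only", "--force",
           f"{SRC}/{filename}", f"{DST}/{filename}"]
    return subprocess.run(cmd).returncode

# Launch a batch of concurrent transfers to stress the access-layer frontend.
files = [f"file_{i:04d}.root" for i in range(64)]
with concurrent.futures.ThreadPoolExecutor(max_workers=16) as pool:
    failures = sum(rc != 0 for rc in pool.map(tpc_copy, files))
print(f"{failures} of {len(files)} transfers failed")
```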

    HetFS: A heterogeneous file system for everyone

    Storage devices have become more and more diverse over the last decade. The advent of SSDs made it painfully clear that rotating devices, such as HDDs or magnetic tapes, were lacking with regard to response time. However, SSDs currently have a limited number of write cycles and a significantly higher price per capacity, which has prevented rotational technologies from being abandoned. Additionally, Non-Volatile Memories (NVMs) have lately been gaining traction, offering devices that typically outperform NAND-based SSDs but exhibit a whole new set of idiosyncrasies. Therefore, in order to appropriately support this diversity, intelligent mechanisms will be needed in the near future to balance the benefits and drawbacks of each storage technology available to a system. In this paper, we present a first step towards such a mechanism, called HetFS, an extension to the ZFS file system that is capable of choosing the storage device a file should be kept on according to preprogrammed filters. We introduce the prototype and show some preliminary results of the effects obtained when placing specific files on different devices. The research leading to these results has received funding from the European Community under the BIGStorage ETN (Project 642963 of the H2020-MSCA-ITN-2014), by the Spanish Ministry of Economy and Competitiveness under the TIN2015-65316 grant, and by the Catalan Government under the 2014-SGR-1051 grant. To learn more about the BigStorage project, please visit http://bigstorage-project.eu/.
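
    HetFS itself is implemented inside ZFS and its filter interface is not shown here; purely as an illustration of the "preprogrammed filter" idea, a rule set mapping file attributes to a device class could look like the following. All names and thresholds are hypothetical.

```python
# Conceptual sketch only, not the HetFS implementation: map a file's
# attributes to a device class (HDD, SSD, NVM) via simple filter rules.
from dataclasses import dataclass

@dataclass
class FileInfo:
    path: str
    size: int          # bytes
    write_heavy: bool  # observed or hinted access pattern

def choose_device(f: FileInfo) -> str:
    """Return a device class for the file according to filter rules."""
    if f.path.endswith((".log", ".tmp")) or f.write_heavy:
        return "hdd"   # spare SSD write cycles for write-heavy, low-value data
    if f.size < 64 * 1024:
        return "nvm"   # small, latency-sensitive files go to non-volatile memory
    return "ssd"       # default: throughput-oriented reads

print(choose_device(FileInfo("/data/index.db", 32 * 1024, False)))  # -> "nvm"
```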

    Ironman: Open Source Containers and Virtualization in bare metal

    Master’s project report, Informatics Engineering (Software Engineering), Universidade de Lisboa, Faculdade de Ciências, 2021. Computer virtualization has become prevalent over the years for both business and personal use. It allows new machines to be hosted on otherwise unused computational resources and run as independent computers. Apart from traditional virtual machines, a more recent form of virtualization will be explored in this project: containers, more specifically Linux Containers. While multiple virtualization tools are available, some of them require a premium payment, while others do not support container virtualization. For this project, LXD, an open source virtual instance manager, will be used to manage both virtual machines and containers. For added service availability, clustering support will also be developed. Clustering enables multiple physical computers to host virtual instances as if they were a single machine. Coupled with the Ceph storage back end, it allows data to be replicated across all computers in the same cluster, enabling instance recovery when a computer in the cluster is faulty. The infrastructure deployment tool Puppet will be used to automate the installation and configuration of an LXD virtualization system for both clustered and non-clustered environments. This allows simple and automatic physical host configuration, limiting the required user input and thus decreasing the chance of system misconfiguration. LXD was tested in both environments and ultimately considered an effective virtualization tool which, when configured accordingly, can be made ready for a production environment.
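
    As a minimal sketch of the clustered setup described above (not the project's Puppet manifests), the snippet below lists the members of an already-initialized LXD cluster and launches a container backed by a shared Ceph storage pool, so it can be recovered on another member. The instance, image, and pool names are assumptions.

```python
# Sketch assuming the lxc CLI is installed and the host already joined an LXD
# cluster; "ceph-pool", "web01" and the image alias are placeholder names.
import subprocess

def sh(*cmd: str) -> str:
    """Run a command and return its stdout, raising on failure."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

# Show which physical hosts are members of the cluster.
print(sh("lxc", "cluster", "list"))

# Launch a container on the shared Ceph-backed storage pool so its root disk
# survives the failure of any single cluster member.
sh("lxc", "launch", "ubuntu:22.04", "web01", "--storage", "ceph-pool")
print(sh("lxc", "list", "web01"))
```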

    Performance Implications of Memory Affinity on Filesystem Caches in a Non-Uniform Memory Access Environment

    Non-Uniform Memory Access (NUMA) imposes unique challenges on every component of an operating system and on the applications that run on it. One such component is the filesystem which, while not directly impacted by NUMA in most cases, typically has some form of cache whose performance is constrained by the latency and bandwidth of the memory in which it is stored. One such filesystem is ZFS, which contains its own custom caching system known as the Adaptive Replacement Cache. This work examines the impact of NUMA on this cache via sequential read operations, shows how current solutions intended to reduce this impact do not adequately account for these caches, and develops a prototype that reduces the impact of memory affinity by relocating applications closer to the caches they use. This prototype is then tested and shown, in some situations, to restore the performance that would otherwise be lost.
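
    The relocation idea can be pictured with a rough sketch (this is not the thesis prototype): pin the reading process onto the NUMA node assumed to hold the cached data, so that cache hits are served from node-local memory. The node number and file path below are placeholders.

```python
# Rough illustration: bind this process to the CPUs of the NUMA node that is
# assumed to hold the filesystem cache, then do a sequential read.
import os

def cpus_of_node(node: int) -> set[int]:
    """Parse the kernel's cpulist (e.g. '0-7,16-23') into a set of CPU ids."""
    cpus: set[int] = set()
    with open(f"/sys/devices/system/node/node{node}/cpulist") as f:
        for part in f.read().strip().split(","):
            lo, _, hi = part.partition("-")
            cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus

CACHE_NODE = 0  # assumption: node whose memory holds the cached file data
os.sched_setaffinity(0, cpus_of_node(CACHE_NODE))  # relocate this process

with open("/tank/dataset/large_file", "rb") as f:  # hypothetical ZFS path
    while f.read(1 << 20):
        pass  # sequential reads now hit node-local cache memory
```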

    A Performance Evaluation of Hypervisor, Unikernel, and Container Network I/O Virtualization

    Hypervisors and containers are the two main virtualization techniques that enable cloud computing. Both techniques have performance overheads on CPU, memory, networking, and disk performance compared to bare metal. Unikernels have recently been proposed as an optimization for hypervisor-based virtualization to reduce performance overheads. In this thesis, we evaluate network I/O performance overheads for hypervisor-based virtualization using the Kernel-based Virtual Machine (KVM) and the OSv unikernel, and for container-based virtualization using Docker, comparing different configurations and optimizations. We measure raw networking latency, throughput, and CPU utilization using the Netperf benchmarking tool, and measure network-intensive application performance using the Memcached key-value store and the Mutilate benchmarking tool. We show that, compared to bare-metal Linux, Docker with bridged networking has the least performance overhead, with OSv using vhost-net coming a close second.
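
    As a minimal sketch of the style of raw-network measurement described above (not the thesis harness), the snippet below drives netperf request/response and bulk-stream tests against a target guest. The hostname is a placeholder and netperf/netserver are assumed to be installed.

```python
# Compare round-trip latency (TCP_RR) and bulk throughput (TCP_STREAM) against
# one virtualized target, also reporting CPU utilization (-c / -C).
import subprocess

TARGET = "guest-vm.example.org"  # placeholder: bare metal, KVM, OSv, or Docker guest

def netperf(test: str, seconds: int = 30) -> str:
    cmd = ["netperf", "-H", TARGET, "-t", test, "-l", str(seconds),
           "-c", "-C"]  # -c/-C add local/remote CPU utilization to the report
    return subprocess.run(cmd, capture_output=True, text=True).stdout

print(netperf("TCP_RR"))      # transactions/s as a latency proxy
print(netperf("TCP_STREAM"))  # bulk throughput in Mbit/s
```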

    Estrategias de optimización y análisis de performance en sistemas de almacenamiento distribuido

    The complexity of storage systems keeps growing, given the number and simultaneity of connected clients, data accessed concurrently, geographically distributed users, bounded response times, and an exponentially expanding volume of transferred information. In response to these needs, Software Defined Storage solutions have emerged, in which different devices connected through a data network form a cluster that offers a set of interfaces to applications and clients, and provide a complex system for managing, maintaining, and monitoring the various components. This thesis centers its analysis on Ceph, an open-source distributed storage system that runs on commodity hardware and is designed to provide scalability, reliability, and high performance. A methodology and metrics for performance analysis are defined, and a layer-by-layer optimization procedure is carried out, from the network interface and the disks up to the client-side configuration, in order to minimize latency and maximize throughput. Finally, a performance model is established as a baseline, to be used for monitoring the operation of the cluster and as a reference for future tuning.
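
    The "baseline" idea can be sketched with the stock rados bench tool: record write and sequential-read figures once, and compare future tuning runs against them. This is an illustrative sketch, not the thesis methodology; the pool name and output file are placeholders and a reachable Ceph cluster is assumed.

```python
# Record a simple Ceph performance baseline using rados bench.
import json, subprocess, time

def rados_bench(mode: str, seconds: int = 60) -> str:
    cmd = ["rados", "bench", "-p", "bench", str(seconds), mode]
    if mode == "write":
        cmd.append("--no-cleanup")  # keep objects so the read phase has data
    return subprocess.run(cmd, capture_output=True, text=True).stdout

baseline = {
    "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
    "write": rados_bench("write"),
    "seq_read": rados_bench("seq"),
}
with open("ceph_baseline.json", "w") as f:
    json.dump(baseline, f, indent=2)  # reference point for future tuning runs
```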

    Accelerating Network Communication and I/O in Scientific High Performance Computing Environments

    High performance computing has become one of the major drivers behind technology inventions and science discoveries. Originally driven by increases in operating frequency and technology scaling, a recent slowdown in this evolution has led to the development of multi-core architectures, which are supported by accelerator devices such as graphics processing units (GPUs). With the upcoming exascale era, overall power consumption and the gap between compute capabilities and I/O bandwidth have become major challenges. Nowadays, system performance is dominated by the time spent in communication and I/O, which highly depends on the capabilities of the network interface. In order to cope with the extreme concurrency and heterogeneity of future systems, the software ecosystem of the interconnect needs to be carefully tuned to excel in reliability, programmability, and usability. This work identifies and addresses three major gaps in today's interconnect software systems. The I/O gap describes the disparity in operating speeds between compute capabilities and secondary storage tiers. The communication gap is introduced by the communication overhead needed to synchronize distributed large-scale applications and by mixed workloads. The last gap is the so-called concurrency gap, which is introduced by the extreme concurrency and the steep learning curve imposed on scientific application developers to exploit the hardware capabilities. The first contribution is the introduction of the network-attached accelerator approach, which moves accelerators into a "stand-alone" cluster connected through the Extoll interconnect. The novel communication architecture enables direct accelerator communication without any host interaction and an optimal application-to-compute-resources mapping. The effectiveness of this approach is evaluated for two classes of accelerators: Intel Xeon Phi coprocessors and NVIDIA GPUs. The next contribution comprises the design, implementation, and evaluation of support for legacy codes and protocols over the Extoll interconnect technology. By providing TCP/IP protocol support over Extoll, it is shown that the performance benefits of the interconnect can be fully leveraged by a broader range of applications, including seamless support of legacy codes. The third contribution is twofold. First, a comprehensive analysis of the Lustre networking protocol semantics and interfaces is presented. These insights are then used to map the LNET protocol semantics onto the Extoll networking technology. The result is a fully functional Lustre network driver for Extoll. An initial performance evaluation demonstrates promising bandwidth and message-rate results. The last contribution comprises the design, implementation, and evaluation of two easy-to-use load-balancing frameworks, which transparently distribute the I/O workload across all available storage system components. The solutions maximize the parallelization and throughput of file I/O. The frameworks are evaluated on the Titan supercomputing system for three I/O interfaces. For example, for large-scale application runs, POSIX I/O and MPI-IO can be improved by up to 50% on a per-job basis, while HDF5 shows performance improvements of up to 32%.
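
    As a rough illustration of distributing file I/O across storage components on Lustre (not the thesis frameworks themselves), the sketch below sets a directory's default striping so that large files fan out over all available OSTs. The directory path and stripe size are assumptions.

```python
# Spread new files in a Lustre directory across all OSTs to parallelize I/O.
import subprocess

DATADIR = "/lustre/project/output"  # placeholder Lustre directory

# Stripe count -1 means "use all available OSTs"; 4 MiB stripe size is an
# arbitrary example value for large sequential writes.
subprocess.run(["lfs", "setstripe", "-c", "-1", "-S", "4M", DATADIR], check=True)

# Verify the default layout that new files in this directory will inherit.
subprocess.run(["lfs", "getstripe", "-d", DATADIR], check=True)
```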