
    The evolution of bits and bottlenecks in a scientific workflow trying to keep up with technology: Accelerating 4D image segmentation applied to NASA data

    In 2016, a team of earth scientists directly engaged a team of computer scientists to identify cyberinfrastructure (CI) approaches that would speed up an earth science workflow. This paper describes the evolution of that workflow as the two teams bridged CI and an image segmentation algorithm to do large-scale earth science research. The Pacific Research Platform (PRP) and the Cognitive Hardware and Software Ecosystem Community Infrastructure (CHASE-CI) resources were used to decrease the earth science workflow's wall-clock time significantly, from 19.5 days to 53 minutes. The improvement in wall-clock time comes from the use of network appliances, improved image segmentation, deployment of a containerized workflow, and the increase in CI experience and training for the earth scientists. This paper presents the evolving innovations used to improve the workflow, the bottlenecks identified in each workflow version, and the improvements made in each version over a three-year period.
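
    The abstract includes no code, but the gains it describes come largely from containerizing the workflow and processing independent time steps in parallel. A minimal Python sketch of that dispatch pattern, with a hypothetical segment_volume function and file layout (nothing below is taken from the paper itself):

    ```python
    # Hypothetical sketch: fan the segmentation of a 4D dataset (a time series
    # of 3D volumes) out across worker processes. Paths and names are illustrative.
    from concurrent.futures import ProcessPoolExecutor
    from pathlib import Path

    def segment_volume(path: Path) -> Path:
        """Placeholder for one 3D segmentation step; returns the output path."""
        out = path.with_suffix(".seg.npy")
        # ... run the actual segmentation kernel here ...
        return out

    def segment_time_series(volume_paths: list[Path], workers: int = 8) -> list[Path]:
        # Each time step is independent, so the loop parallelizes trivially.
        with ProcessPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(segment_volume, volume_paths))

    if __name__ == "__main__":
        paths = sorted(Path("volumes").glob("t*.npy"))
        print(f"segmented {len(segment_time_series(paths))} time steps")
    ```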

    Deployment of NFV and SFC scenarios

    This item contains the original work, publicly defended on 24 February 2017, as well as an improved version dated 28 February 2017. The changes introduced in the second version are 1) correction of errors and 2) the procedure in the final annex. Telecommunications services have traditionally been designed by linking hardware devices and providing mechanisms so that they can interoperate. Those devices are usually specific to a single service and are based on proprietary technology. The current model works by defining standards and strict protocols to achieve the high levels of quality and reliability that define the carrier-class provider environment. Provisioning new services presents challenges at several levels, because inserting the required devices involves changes in the network topology. This leads to slow deployment times and increased operational costs. To overcome these burdens, the processes for installing network functions and inserting them into the current service topology need to be streamlined to allow greater flexibility. The traditional service provider model has been disrupted by over-the-top Internet content providers (Facebook, Netflix, etc.), with their short product cycles and fast pace of new service development. The irruption of content providers has created competition and stress on service providers' infrastructure and has forced telco companies to research new technologies to recover market share through flexible, revenue-generating services. Network Function Virtualization (NFV) and Service Function Chaining (SFC) are among the initiatives led by communication service providers to regain the lost leadership. This project focuses on experimenting with some of these already available technologies, which are expected to be the foundation of new network paradigms (5G, IoT) and to support new value-added services over cost-efficient telecommunication infrastructures. Specifically, SFC scenarios have been deployed with the Open Platform for NFV (OPNFV), a Linux Foundation project. Some use cases of NFV technology applied to teaching laboratories are also demonstrated. Although the current implementation does not achieve a production degree of reliability, it provides a suitable environment for developing new functional improvements and evaluating the performance of virtualized network infrastructures.
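
    As a rough illustration of what service function chaining means, the sketch below models a chain as an ordered list of functions a packet passes through. All names are illustrative; a real OPNFV deployment programs the chain into the data plane rather than composing Python callables:

    ```python
    # Toy model of SFC: classify a flow, then steer its packets through an
    # ordered list of virtual network functions (VNFs). Illustrative only.
    from typing import Callable, Optional

    Packet = dict  # stand-in for a parsed packet
    VNF = Callable[[Packet], Optional[Packet]]  # a VNF may drop (return None)

    def firewall(pkt: Packet) -> Optional[Packet]:
        return None if pkt.get("port") == 23 else pkt  # drop telnet traffic

    def nat(pkt: Packet) -> Optional[Packet]:
        pkt["src"] = "203.0.113.1"  # rewrite the source address
        return pkt

    def apply_chain(pkt: Packet, chain: list[VNF]) -> Optional[Packet]:
        for vnf in chain:
            pkt = vnf(pkt)
            if pkt is None:  # a VNF dropped the packet
                return None
        return pkt

    web_chain = [firewall, nat]  # a classifier would pick this chain per flow
    print(apply_chain({"src": "10.0.0.5", "port": 80}, web_chain))
    ```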

    Network Optimizations for Distributed Storage Networks

    Distributed file systems enable the reliable storage of exabytes of information on thousands of servers distributed throughout a network. These systems achieve reliability and performance by storing three or more copies of data in different locations across the network. The management of these copies is commonly handled by intermediate servers that track and coordinate the placement of data in the network. This introduces potential network bottlenecks, as multiple transfers to fast storage nodes can saturate the network links connecting intermediate servers to the storage. The advent of open Network Operating Systems presents an opportunity to alleviate this bottleneck, as it is now possible to treat network elements as intermediate nodes in the distributed file system and have them perform the task of replicating data across storage nodes. In this thesis, we propose a new design paradigm for distributed file systems, driven by a new fundamental component of the system that runs on network elements such as switches or routers. We describe the component's architecture and how it can be integrated into existing distributed file systems to increase their performance. To measure this performance increase over current approaches, we emulate a distributed file system by creating a block-level storage array distributed across multiple iSCSI targets presented in a network. Furthermore, we emulate more complicated redundancy schemes likely to be used in distributed file systems in the future to determine what effect this approach may have on those systems and what benefits it offers. We find that the new component offers a decrease in request latency proportional to the number of storage nodes involved in the request. We also find that the benefits of this approach are limited by the ability of switch hardware to process incoming data from the request, but that these limitations can be surmounted through the proposed design paradigm.
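
    The latency argument is easiest to see as a fan-out: the client sends one copy to the network element, and the element emits the N replicas in parallel. A schematic Python sketch of that replication step (addresses and transport are placeholders; the thesis's component runs on a switch or router, not a host):

    ```python
    # Schematic fan-out replication: receive a block once, forward N copies
    # concurrently. Replica addresses below are placeholders.
    import socket
    from concurrent.futures import ThreadPoolExecutor

    REPLICAS = [("10.0.0.11", 3260), ("10.0.0.12", 3260), ("10.0.0.13", 3260)]

    def send_block(addr: tuple[str, int], block: bytes) -> int:
        with socket.create_connection(addr, timeout=5) as s:
            s.sendall(block)
        return len(block)

    def replicate(block: bytes) -> None:
        # One inbound copy, N outbound copies emitted in parallel by the element.
        with ThreadPoolExecutor(max_workers=len(REPLICAS)) as pool:
            list(pool.map(lambda addr: send_block(addr, block), REPLICAS))
    ```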

    Survey of storage systems for high-performance computing

    In current supercomputers, storage is typically provided by parallel distributed file systems for hot data and tape archives for cold data. These file systems are often compatible with local file systems due to their use of the POSIX interface and semantics, which eases development and debugging because applications can easily run both on workstations and supercomputers. There is a wide variety of file systems to choose from, each tuned for different use cases and implementing different optimizations. However, overall application performance is often held back by I/O bottlenecks due to insufficient performance of file systems or I/O libraries for highly parallel workloads. Performance problems are dealt with using novel storage hardware technologies as well as alternative I/O semantics and interfaces. These approaches have to be integrated into the storage stack seamlessly to make them convenient to use. Upcoming storage systems abandon the traditional POSIX interface and semantics in favor of alternative concepts such as object and key-value storage; moreover, they rely heavily on technologies such as NVM and burst buffers to improve performance. Additional tiers of storage hardware will increase the importance of hierarchical storage management. Many of these changes will be disruptive and require application developers to rethink their approaches to data management and I/O. A thorough understanding of today's storage infrastructures, including their strengths and weaknesses, is crucially important for designing and implementing scalable storage systems suitable for the demands of exascale computing.
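
    The survey's contrast between POSIX and key-value semantics is easy to see side by side. In the schematic below, a plain dict stands in for an object or key-value store:

    ```python
    # Schematic contrast: POSIX byte streams vs. whole-object key-value access.
    # The dict is a stand-in for a real object/key-value storage system.

    # POSIX: paths, byte offsets, partial reads/writes, strong semantics.
    with open("/tmp/results.dat", "wb") as f:
        f.write(b"checkpoint-0042")
    with open("/tmp/results.dat", "rb") as f:
        data = f.read()

    # Key-value: whole objects addressed by key; no seek, rename, or append.
    store: dict[str, bytes] = {}
    store["results/checkpoint/42"] = b"checkpoint-0042"  # put
    data = store["results/checkpoint/42"]                # get
    ```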

    Astro - A Low-Cost, Low-Power Cluster for CPU-GPU Hybrid Computing using the Jetson TK1

    With the rising costs of large-scale distributed systems, many researchers have begun looking at low-power architectures for clusters. In this paper, we describe our Astro cluster, which consists of 46 NVIDIA Jetson TK1 nodes, each equipped with an ARM Cortex-A15 CPU, a 192-core Kepler GPU, 2 GB of RAM, and 16 GB of flash storage. The cluster has a number of advantages over conventional clusters, including lower power usage, ambient cooling, shared memory between the CPU and GPU, and affordability. The cluster is built from commodity hardware and can be set up at relatively low cost while providing up to 190 single-precision GFLOPS of computing power per node due to its combined GPU/CPU architecture. The cluster currently uses one 48-port Gigabit Ethernet switch and runs Linux for Tegra, a modified version of Ubuntu provided by NVIDIA, as its operating system. Common file systems such as PVFS, Ceph, and NFS are supported by the cluster, and benchmarks such as HPL, LAPACK, and LAMMPS are used to evaluate the system. At peak performance, the cluster produces 328 GFLOPS of double-precision performance while drawing a peak of 810 W on the LINPACK benchmark, placing it at 324th place on the Green500. Single-precision benchmarks reach a peak performance of 6,800 GFLOPS. The Astro cluster aims to be a proof of concept for future low-power clusters utilizing a similar architecture. The cluster is installed with many of the same applications used by top supercomputers and is validated using several standard supercomputing benchmarks. We show that, with the rise of low-power CPUs and GPUs and the need for lower server costs, this cluster provides insight into how ARM and CPU-GPU hybrid chips will perform in high-performance computing.
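
    The quoted figures support a quick back-of-envelope check: 46 nodes at 190 single-precision GFLOPS each gives a theoretical peak of 8,740 GFLOPS, so the measured 6,800 GFLOPS is roughly 78% of peak, and 328 double-precision GFLOPS at 810 W is about 0.405 GFLOPS/W:

    ```python
    # Back-of-envelope check of the figures quoted in the abstract.
    nodes = 46
    sp_per_node = 190.0   # single-precision GFLOPS per node (quoted)
    sp_measured = 6800.0  # single-precision GFLOPS, whole cluster (quoted)
    dp_measured = 328.0   # double-precision GFLOPS on LINPACK (quoted)
    peak_watts = 810.0    # peak power draw during LINPACK (quoted)

    sp_theoretical = nodes * sp_per_node  # 8740 GFLOPS
    print(f"SP efficiency: {sp_measured / sp_theoretical:.1%}")       # ~77.8%
    print(f"DP efficiency: {dp_measured / peak_watts:.3f} GFLOPS/W")  # ~0.405
    ```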

    High-Performance Persistent Caching in Multi- and Hybrid- Cloud Environments

    The working model known as Multi-Cloud is emerging as a natural evolution of Cloud Computing in response to companies' new business needs. A typical example is the model known as Hybrid Cloud, where a Private Cloud is connected to a Public Cloud so that applications can scale on demand while also meeting privacy, cost, and security requirements. Because data is distributed across different infrastructures, when applications running in one data center must use remotely stored data, they have to traverse the network connecting the infrastructures. This has a strong negative impact on data-intensive workloads, which suffer from the low bandwidth and high latency typical of network connections. Artificial Intelligence and Scientific Computing applications are examples of such workloads: thanks to the growing use of accelerators such as GPUs and FPGAs, they can consume data faster than it becomes available. Implementing a cache layer that supplies and stores computation data, moving it from the slow (remote) storage device to the faster (but more expensive) one where the computation runs, appears to be the best way to strike an optimal compromise between the cost of storage devices offered as Cloud services and the high computing speed of modern applications. The cache system presented in this work was developed taking into account the peculiarities of Cloud storage services that use the S3 API to communicate with clients. The proposed solution was built on the Ceph distributed storage system, which implements many of the services characterizing S3 semantics and, being designed for Cloud environments, fits well into Multi-Cloud scenarios.
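
    A minimal read-through version of such a cache can be sketched with boto3 against an S3-compatible endpoint such as Ceph's RADOS Gateway. The endpoint, bucket, and cache directory below are assumptions, and the thesis's actual system also handles eviction, concurrency, and writes:

    ```python
    # Minimal read-through cache sketch over an S3-compatible endpoint.
    # Endpoint URL and cache directory are illustrative placeholders.
    import boto3
    from pathlib import Path

    CACHE_DIR = Path("/tmp/s3cache")
    s3 = boto3.client("s3", endpoint_url="http://ceph-rgw.example:7480")

    def cached_get(bucket: str, key: str) -> bytes:
        local = CACHE_DIR / bucket / key
        if local.exists():  # fast path: the object is already in the hot tier
            return local.read_bytes()
        obj = s3.get_object(Bucket=bucket, Key=key)  # slow path: remote tier
        data = obj["Body"].read()
        local.parent.mkdir(parents=True, exist_ok=True)
        local.write_bytes(data)  # populate the cache for later reads
        return data
    ```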

    Establishing Scientific Computing Clouds on Limited Resources using OpenStack

    This work explores how the OpenStack cloud platform can be used on limited hardware resources for scientific computing and teaching purposes. OpenStack has a steep learning curve, and most of the documentation targets the creation of large-scale clouds with hundreds of servers. OpenStack also has many components and configuration options that are hard to navigate at first. This work therefore provides the rationale for the technology choices made, based on a sample two-server setup belonging to the Tartu University Mobile Cloud Lab.
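
    As a hypothetical illustration of driving such a small cloud programmatically, the sketch below boots one instance with openstacksdk; the cloud name, image, flavor, and network are placeholders that would come from the deployment's clouds.yaml and its Glance/Nova/Neutron setup:

    ```python
    # Hedged sketch: boot a small instance on a two-server OpenStack cloud.
    # All resource names are placeholders, not taken from the thesis.
    import openstack

    conn = openstack.connect(cloud="lab")  # credentials come from clouds.yaml

    image = conn.compute.find_image("ubuntu-20.04")
    flavor = conn.compute.find_flavor("m1.small")
    network = conn.network.find_network("lab-net")

    server = conn.compute.create_server(
        name="worker-1",
        image_id=image.id,
        flavor_id=flavor.id,
        networks=[{"uuid": network.id}],
    )
    server = conn.compute.wait_for_server(server)
    print(server.status)
    ```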