419 research outputs found

    Big data workflows: Locality-aware orchestration using software containers

    The emergence of the Edge computing paradigm has shifted data processing from centralised infrastructures to heterogeneous and geographically distributed infrastructures. Therefore, data processing solutions must consider data locality to reduce the performance penalties from data transfers among remote data centres. Existing Big Data processing solutions provide limited support for handling data locality and are inefficient in processing small and frequent events specific to the Edge environments. This article proposes a novel architecture and a proof-of-concept implementation for software container-centric Big Data workflow orchestration that puts data locality at the forefront. The proposed solution considers the available data locality information, leverages long-lived containers to execute workflow steps, and handles the interaction with different data sources through containers. We compare the proposed solution with Argo Workflows and demonstrate a significant performance improvement in the execution speed for processing the same data units. Finally, we carry out experiments with the proposed solution under different configurations and analyze individual aspects affecting the performance of the overall solution.

    Making Scientific Applications Portable: Software Containers and Package Managers

    Scientific workflows for high-performance computing (HPC) are becoming increasingly complex. Developing a way to simplify these workflows could save many hours for both HPC users and developers, potentially eliminating any time spent managing software dependencies and experiment set-up. To accomplish this, we propose using two programs together: Docker and Spack. Docker is a container platform and Spack is a package manager designed specifically for HPC. In this paper, we show how Docker and Spack can be used to containerize the extreme-scale Scientific Software Development Kit (xSDK). Doing this makes the xSDK far more accessible to non-computer scientists and reduces the time developers spend on dependency management. Implementing a system such as this on a large scale could change the functioning of the HPC industry.

    BioExcel Building Blocks, a software library for interoperable biomolecular simulation workflows.

    In recent years, the improvement of software and hardware performance has made biomolecular simulations a mature tool for the study of biological processes. Simulation length and the size and complexity of the analyzed systems make simulations both complementary and compatible with other bioinformatics disciplines. However, the characteristics of the software packages used for simulation have prevented the adoption of technologies accepted in other bioinformatics fields, such as automated deployment systems, workflow orchestration, or the use of software containers. We present here a comprehensive exercise to bring biomolecular simulations to the "bioinformatics way of working". The exercise has led to the development of the BioExcel Building Blocks (BioBB) library. BioBBs are built as Python wrappers to provide an interoperable architecture. BioBBs have been integrated into a chain of common software management tools to generate data ontologies, documentation, installation packages, software containers, and ways of integrating with workflow managers, which make them usable in most computational environments.

    MeDICINE: Rapid Prototyping of Production-Ready Network Services in Multi-PoP Environments

    Virtualized network services consisting of multiple individual network functions are already deployed today across multiple sites, so-called multi-PoP (points of presence) environments. This makes it possible to improve service performance by optimizing its placement in the network. But prototyping and testing these complex distributed software systems becomes extremely challenging. The reason is that not only the network service itself has to be tested but also its integration with management and orchestration systems. Existing solutions, like simulators, basic network emulators, or local cloud testbeds, do not support all aspects of these tasks. To this end, we introduce MeDICINE, a novel NFV prototyping platform that is able to execute production-ready network functions, provided as software containers, in an emulated multi-PoP environment. These network functions can be controlled by any third-party management and orchestration system that connects to our platform through standard interfaces. Based on this, a developer can use our platform to prototype and test complex network services in a realistic environment running on their laptop. Comment: 6 pages, pre-prin

    Distributing Data and Analysis Software Containers For Better Data Sharing in Clinical Research

    Introduction: Data sharing in clinical research is critical for increasing knowledge discovery. Data and software tools should be FAIR: Findable, Accessible, Interoperable and Reusable. Many bottlenecks exist in the process of a clinical investigator using shared data, including data acquisition and statistical analysis. The objective of this project is to develop a structure for sharing data and providing rapid automated statistical analysis through creation of a pre-packaged, open-source software container. Methods: We use the open-source software container technologies VirtualBox and Vagrant to create a template for sharing clinical data and analysis scripts as a single container. We use a timer to record the time necessary to set up and initialize the software container and view the results. Results: We have created a template for sharing data and analysis scripts together using the open-source software container technologies VirtualBox and Vagrant. We found the time needed to initialize the container to be 5 minutes and 36 seconds for a macOS-based machine and 7 minutes and 2 seconds for a Windows-based machine. Containers can be downloaded and executed from any Mac or Windows computer, allowing both the reuse of and interaction with the data. This greatly reduces the time and effort needed to obtain and analyze clinical data. Conclusion: Reducing the time and effort needed to obtain and analyze clinical data increases the time available for data exploration and the discovery of new knowledge. This can be effectively achieved using software containers and virtualization.

    Käyttäjätason ohjelmistokontittaminen pilviradioliityntäverkossa (User-plane software containerization in a cloud radio access network)

    The number of devices connected through mobile networks has been growing rapidly. This growth will create a demand for network capacity that cannot be met with traditional methods. This problem could be solved by implementing a cloud radio access network (RAN), a new concept that adapts cloud computing technologies, such as software containers, from the software industry to RANs. This adaptation will also create a need to modify working practices in order to better comply with these new cloud computing technologies. While cloud RAN has recently received much research attention, the actual software implementations have not been widely discussed in the literature. Therefore, this thesis evaluates the feasibility of using software containers in the user-plane applications of cloud RAN in terms of networking and inter-container communications (ICC). This is accomplished by identifying potential approaches for ICC and for container networking as well as measuring the performance of these approaches. Two approaches are proposed for ICC and container networking. The approaches were evaluated in terms of throughput and latency. These approaches were found to be suitable for use in cloud RAN user-plane applications. However, since the measurements were performed in a simplified environment, implementing the approaches in a cloud RAN component will require further work.