
    Deploying Jupyter Notebooks at scale on XSEDE resources for Science Gateways and workshops

    Jupyter Notebooks have become a mainstream tool for interactive computing in every field of science. Jupyter Notebooks are suitable as companion applications for Science Gateways, providing more flexibility and post-processing capability to users. Moreover, they are often used in training events and workshops to provide immediate access to a pre-configured interactive computing environment. The Jupyter team released the JupyterHub web application to provide a platform where multiple users can log in and access a Jupyter Notebook environment. When the number of users and their memory requirements are low, it is easy to set up JupyterHub on a single server. However, the setup becomes more complicated when Jupyter Notebooks must be served at scale to tens or hundreds of users. In this paper we present three strategies for deploying JupyterHub at scale on XSEDE resources. All options share the deployment of JupyterHub on a virtual machine on XSEDE Jetstream. In the first scenario, JupyterHub connects to a supercomputer, launches a single-node job on behalf of each user, and proxies the Notebook from the computing node back to the user's browser. In the second scenario, implemented in the context of an XSEDE consultation for the IRIS consortium for Seismology, we deploy Docker in Swarm mode to coordinate many XSEDE Jetstream virtual machines and provide Notebooks with persistent storage and quotas. In the last scenario, we install the Kubernetes container orchestration framework on Jetstream to provide a fault-tolerant JupyterHub deployment with a distributed filesystem and the capability to scale to thousands of users. In the conclusion we provide a link to step-by-step tutorials, complete with all the necessary commands and configuration files, to replicate these deployments.
    Comment: 7 pages, 3 figures, PEARC '18: Practice and Experience in Advanced Research Computing, July 22-26, 2018, Pittsburgh, PA, US
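
    The first scenario, where JupyterHub submits a single-node batch job per user and proxies the Notebook back, is typically implemented with a spawner such as batchspawner. Below is a minimal jupyterhub_config.py sketch under that assumption; the Slurm scheduler, partition name, and resource requests are illustrative placeholders, not values from the paper.

        # jupyterhub_config.py -- minimal sketch of the "one batch job per
        # user" scenario; assumes the batchspawner package is installed and
        # that the Hub host can submit jobs to a Slurm-managed supercomputer.
        c = get_config()  # provided by JupyterHub at startup

        # Spawn one single-node Slurm job per user.
        c.JupyterHub.spawner_class = 'batchspawner.SlurmSpawner'

        # Hypothetical partition and resource requests; replace with site values.
        c.SlurmSpawner.req_partition = 'compute'
        c.SlurmSpawner.req_runtime = '8:00:00'
        c.SlurmSpawner.req_memory = '4gb'

        # The single-user server runs on a compute node, so the Hub's API
        # must be reachable from that node for the proxying to work.
        c.JupyterHub.hub_ip = '0.0.0.0'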

    Transparent Orchestration of Task-based Parallel Applications in Containers Platforms

    This paper presents a framework to easily build and execute parallel applications on container-based distributed computing platforms in a user-transparent way. The proposed framework combines the COMP Superscalar (COMPSs) programming model and runtime, which provides a straightforward way to develop task-based parallel applications from sequential code, with container management platforms that ease the deployment of applications in computing environments (such as Docker, Mesos, or Singularity). This framework provides scientists and developers with an easy way to implement parallel distributed applications and deploy them in a one-click fashion. We have built a prototype that integrates COMPSs with different container engines in three scenarios: i) a Docker cluster, ii) a Mesos cluster, and iii) Singularity in an HPC cluster. We have evaluated the overhead in the building, deployment, and execution phases of two benchmark applications, compared both to a cloud testbed based on KVM and OpenStack and to bare-metal nodes. We observed an important gain over cloud environments during the building and deployment phases, which enables better adaptation of resources to the computational load. In contrast, we detected extra overhead during execution, mainly due to multi-host Docker networking.
    This work is partly supported by the Spanish Government through Programa Severo Ochoa (SEV-2015-0493), by the Spanish Ministry of Science and Technology through the TIN2015-65316 project, by the Generalitat de Catalunya under contracts 2014-SGR-1051 and 2014-SGR-1272, and by the European Union through the Horizon 2020 research and innovation program under grant 690116 (EUBra-BIGSEA Project). Results presented in this paper were obtained using the Chameleon testbed supported by the National Science Foundation.
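
    COMPSs turns ordinary sequential code into task-based parallel applications by annotation; in its Python binding (PyCOMPSs) this is done with decorators. The following is a minimal sketch under that assumption; the increment function and its input values are illustrative, not taken from the paper's benchmarks.

        # Minimal PyCOMPSs sketch: a sequential-looking program whose
        # decorated functions become tasks scheduled by the COMPSs runtime
        # (e.g. on Docker-, Mesos-, or Singularity-backed workers).
        from pycompss.api.task import task
        from pycompss.api.api import compss_wait_on


        @task(returns=1)
        def increment(value):
            # Each call becomes an asynchronous task; the runtime tracks
            # data dependencies between task inputs and outputs.
            return value + 1


        if __name__ == '__main__':
            data = list(range(4))
            # Task calls return future objects immediately.
            futures = [increment(v) for v in data]
            # Synchronize: block until the tasks finish and fetch results.
            results = compss_wait_on(futures)
            print(results)  # [1, 2, 3, 4]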

    Implementation of Persistent Data Storage on Docker Swarm using the Network File System (NFS)

    Docker Swarm is a distributed-systems technology for managing a group of Docker machines. With Docker Swarm, many containers can be run at once across the group of Docker machines. Deploying a distributed system with Docker Swarm requires persistent data storage. The problem is that Docker Swarm stores data inside containers, so if a container is deleted its data is deleted with it. An alternative form of persistent data storage is therefore needed. Previous research used Storage Class Memory (SCM), a new hardware technology that offers fast, persistent storage for containers; however, SCM is specialized hardware and is expensive. Another alternative is the Network File System (NFS), an open protocol for sharing files across many computer networks and operating systems. The NFS architecture on Docker Swarm is designed as a client-server architecture: Docker Swarm acts as the client and NFS acts as the server. NFS can provide persistent data storage on Docker Swarm by synchronizing data even when containers are deleted and machines are restarted, allowing Docker Swarm to retrieve the data already stored on NFS so that it remains persistent. The measured average write speed on NFS was 30.168 KB, while the average read speed was 63.939 KB.
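
    The client-server pattern described above is commonly realized by declaring a named volume backed by the local driver's NFS options, which Swarm services then mount. Below is a minimal sketch using the Docker SDK for Python; the NFS server address, export path, and volume name are placeholders, not values from the paper.

        # Sketch: create an NFS-backed named volume so that data survives
        # container removal and machine restarts. The server address and
        # export path below are hypothetical.
        import docker

        client = docker.from_env()

        client.volumes.create(
            name='app-data',
            driver='local',  # local driver with NFS mount options
            driver_opts={
                'type': 'nfs',
                'o': 'addr=10.0.0.10,rw,nfsvers=4',  # NFS server (placeholder)
                'device': ':/export/app-data',       # exported path (placeholder)
            },
        )

        # Any container (or Swarm service) mounting this volume reads and
        # writes through the NFS server, keeping the data persistent.
        client.containers.run(
            'alpine',
            'sh -c "echo hello > /data/hello.txt"',
            volumes={'app-data': {'bind': '/data', 'mode': 'rw'}},
            remove=True,
        )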

    Comparison of application container orchestration platforms

    This article presents a comparative analysis of three well-known container orchestration platforms, Docker Swarm, Kubernetes, and Apache Mesos, focusing on the deployment of a test application and measuring parameters such as deployment time; memory, CPU, and disk utilization; application response time; and the time to restore a replica of the application using the auto-recovery mechanism. The aim of the research is to verify the performance and efficiency of the analyzed platforms, facilitating informed decisions when choosing an orchestrator for containerized applications. Two research hypotheses were stated. The first assumes that the time required to launch an application using the Docker Swarm tool is the shortest among the analyzed platforms. The second is that Kubernetes provides the most efficient results in terms of load scheduling and application scaling. The analysis, performed on the Jenkins application, showed the superiority of the Docker Swarm platform over the other studied tools in terms of performance.
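
    A deployment-time metric of this kind can be measured with a simple timing harness around the orchestrator's CLI. Below is a sketch for the Docker Swarm case; the service name, image, and replica count are illustrative, and the paper's actual measurement harness is not published here.

        # Sketch: time how long a Swarm service takes from creation until all
        # replicas report as running, one of the metrics compared in the paper.
        import subprocess
        import time


        def wait_for_replicas(service, timeout=120.0):
            """Poll `docker service ls` until replicas read N/N; return elapsed seconds."""
            start = time.monotonic()
            while time.monotonic() - start < timeout:
                out = subprocess.run(
                    ['docker', 'service', 'ls', '--filter', f'name={service}',
                     '--format', '{{.Replicas}}'],
                    capture_output=True, text=True, check=True,
                ).stdout.strip()
                current, _, desired = out.partition('/')
                if current and current == desired:
                    return time.monotonic() - start
                time.sleep(0.5)
            raise TimeoutError(f'{service} did not converge within {timeout}s')


        subprocess.run(['docker', 'service', 'create', '--name', 'jenkins-test',
                        '--replicas', '3', 'jenkins/jenkins:lts'], check=True)
        print(f'deployment took {wait_for_replicas("jenkins-test"):.1f}s')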

    A Security Monitoring Framework For Virtualization Based HEP Infrastructures

    High Energy Physics (HEP) distributed computing infrastructures require automatic tools to monitor, analyze, and react to potential security incidents. These tools should collect and inspect data such as resource consumption, logs, and sequences of system calls to detect anomalies that indicate the presence of a malicious agent. They should also be able to perform automated reactions to attacks without administrator intervention. We describe a novel framework that fulfills these requirements, with a proof-of-concept implementation for the ALICE experiment at CERN. We show how we achieve a fully virtualized environment that improves security by isolating services and Jobs without a significant performance impact. We also describe a dataset collected for Machine Learning based Intrusion Prevention and Detection Systems on Grid computing. This dataset is composed of resource consumption measurements (such as CPU, RAM, and network traffic), log files from operating system services, and system call data collected from production Jobs running in an ALICE Grid test site, together with a large set of malware collected from security research sites. Based on this dataset, we will proceed to develop Machine Learning algorithms able to detect malicious Jobs.
    Comment: Proceedings of the 22nd International Conference on Computing in High Energy and Nuclear Physics, CHEP 2016, 10-14 October 2016, San Francisco. Submitted to Journal of Physics: Conference Series (JPCS)
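
    Resource-consumption features like those in the described dataset can be gathered with a small periodic collector. The following is a minimal sketch using psutil; the sampling interval, duration, and feature set are assumptions for illustration, not the paper's actual collector.

        # Sketch: periodically sample per-host resource consumption (CPU, RAM,
        # network traffic) into rows suitable for ML-based anomaly detection.
        # Interval, duration, and feature choice are illustrative assumptions.
        import csv
        import time

        import psutil


        def sample():
            net = psutil.net_io_counters()
            return {
                'timestamp': time.time(),
                'cpu_percent': psutil.cpu_percent(interval=None),
                'ram_percent': psutil.virtual_memory().percent,
                'bytes_sent': net.bytes_sent,
                'bytes_recv': net.bytes_recv,
            }


        with open('resource_samples.csv', 'w', newline='') as f:
            writer = csv.DictWriter(f, fieldnames=list(sample().keys()))
            writer.writeheader()
            for _ in range(60):      # one minute of samples
                writer.writerow(sample())
                f.flush()
                time.sleep(1.0)      # 1 Hz sampling (assumption)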