
    Taking the C out of CVMFS

    The Cern Virtual Machine File System is best known as a distribution mechanism for the WLCG VOs' experiment software; as a result, almost all existing expertise is in installing clients to mount the central Cern repositories. We report the results of an initial experiment in using the cvmfs server packages to provide a Glasgow-based repository aimed at software provisioning for small UK-local VOs. In general, although the documentation is sparse, server configuration is reasonably easy with some experimentation. We discuss the advantages of local CVMFS repositories for sites, with some examples from our test VOs, vo.optics.ac.uk and neiss.org.uk.
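    The abstract above describes standing up a site-local repository with the stock cvmfs-server tools. As a rough illustration only (not the Glasgow setup itself), the sketch below drives the standard cvmfs_server commands from Python; the repository name and software path are hypothetical.

        # Minimal sketch of creating and publishing a local CVMFS repository with
        # the stock cvmfs-server tools; repository name and paths are hypothetical.
        import shutil
        import subprocess

        REPO = "software.example.ac.uk"  # hypothetical repository name

        # Create the repository (run once, as root, on the Stratum-0 machine).
        subprocess.run(["cvmfs_server", "mkfs", REPO], check=True)

        # Open a transaction, copy software into the repository tree, publish it.
        subprocess.run(["cvmfs_server", "transaction", REPO], check=True)
        shutil.copytree("/opt/vo-software/optics-1.0", f"/cvmfs/{REPO}/optics-1.0")
        subprocess.run(["cvmfs_server", "publish", REPO], check=True)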

    A Subset of the CERN Virtual Machine File System: Fast Delivering of Complex Software Stacks for Supercomputing Resources

    Delivering a reproducible environment along with complex and up-to-date software stacks on thousands of distributed and heterogeneous worker nodes is a critical task. The CernVM File System (CVMFS) has been designed to help various communities deploy software on worldwide distributed computing infrastructures by decoupling the software from the operating system. However, installing this file system requires collaboration with the system administrators of the remote resources and HTTP connectivity to fetch dependencies from external sources. Supercomputers, which offer tremendous computing power, generally have more restrictive policies than grid sites and do not easily provide the conditions required to exploit CVMFS. Different solutions have been developed to tackle the issue, but they are often specific to a scientific community and do not address the problem in its entirety. In this paper, we provide a generic utility to assist any community in the installation of complex software dependencies on supercomputers with no external connectivity. The approach consists of capturing the dependencies of applications of interest, building a subset of dependencies, testing it in a given environment, and deploying it to a remote computing resource. We test this proposal on a real use case by exporting Gauss, a Monte Carlo simulation program from the LHCb experiment, to Mare Nostrum, one of the world's top supercomputers. We provide steps to encapsulate the minimum required files and deliver a light and easy-to-update subset of CVMFS: 12.4 gigabytes instead of 5.2 terabytes for the whole LHCb repository.
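    To make the "capture dependencies, build a subset" idea concrete, here is an illustrative sketch (not the paper's actual utility): it reads a list of files that a traced job actually opened under /cvmfs and copies only those files into a portable tree. The trace-file format and paths are assumptions.

        # Illustrative sketch: copy only the /cvmfs files a job actually opened
        # into an exportable subset. TRACE_FILE format and paths are assumed.
        import os
        import shutil

        TRACE_FILE = "accessed_files.txt"    # one absolute /cvmfs path per line (assumed)
        EXPORT_ROOT = "/tmp/cvmfs-subset"    # subset to ship to the supercomputer

        with open(TRACE_FILE) as fh:
            for line in fh:
                path = line.strip()
                if not path.startswith("/cvmfs/") or not os.path.isfile(path):
                    continue
                target = os.path.join(EXPORT_ROOT, path.lstrip("/"))
                os.makedirs(os.path.dirname(target), exist_ok=True)
                shutil.copy2(path, target)  # keep permissions/timestamps for reproducibility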

    Streamlined HPC Environments with CVMFS and CyberGIS-Compute

    High-Performance Computing (HPC) resources provide the potential for complex, large-scale modeling and analysis and have fueled scientific progress over the last few decades, but these advances are not equally distributed across disciplines. Researchers in computational disciplines are often trained with the technical skills needed to use HPC (e.g. familiarity with the terminal), but many other disciplines face technical hurdles when trying to apply HPC resources to their work. This unequal familiarity with HPC is increasingly a problem as cross-discipline teams work to tackle critical interdisciplinary issues like climate change and sustainability. CyberGIS-Compute is middleware designed to democratize access to HPC services with the goal of empowering domain scientists, but a key challenge facing model developers on CyberGIS-Compute is creating a containerized software environment for their models. In this paper, we discuss our work to integrate the Cern Virtual Machine File System (CVMFS) into CyberGIS-Compute to provide consistent software environments across science gateways and HPC resources.
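    The combination described above, a containerized environment whose software comes from CVMFS, typically amounts to bind-mounting the /cvmfs tree into the container on the HPC node. A minimal sketch follows, assuming a Singularity/Apptainer installation; the image path and model command are hypothetical, not CyberGIS-Compute's actual interface.

        # Sketch: run a model inside a container with the /cvmfs software tree
        # bind-mounted read-only; image name and model command are hypothetical.
        import subprocess

        cmd = [
            "singularity", "exec",
            "--bind", "/cvmfs:/cvmfs:ro",                    # expose the CVMFS tree
            "/cvmfs/software.example.org/images/model.sif",  # hypothetical image
            "python3", "run_model.py", "--input", "data.nc",
        ]
        subprocess.run(cmd, check=True)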

    The case for preserving our knowledge and data in physics experiments

    This proceeding covers tools and technologies at our disposal for scientific data preservation and shows that this extends the scientific reach of our experiments. It is cost-efficient to warehouse data from completed experiments on the tape archives of our national and international laboratories. These subject-specific data stores also offer the technologies to capture and archive knowledge about experiments in the form of technical notes, electronic logs, websites, etc. Furthermore, it is possible to archive our source code and computing environments. The paper illustrates these challenges with experience from preserving the LEP data for the long term. Comment: 5 pages, 1 figure.

    Storageless and caching Tier-2 models in the UK context

    Operational and other pressures have led to WLCG experiments moving increasingly to a stratified model for Tier-2 resources, where "fat" Tier-2s (T2Ds) and "thin" Tier-2s (T2Cs) provide different levels of service. In the UK, this distinction is also encouraged by the terms of the current GridPP5 funding model. In anticipation of this, testing has been performed on the implications, and potential implementation, of such a distinction in our resources. In particular, we present the results of testing storageless T2Cs, where the "thin" nature is expressed by the site having either no local data storage or only a thin caching layer; data is streamed or copied from a "nearby" T2D when needed by jobs. In OSG, this model has been adopted successfully for CMS AAA sites, but the network topology and capacity in the USA are significantly different from those in the UK (and much of Europe). We present the results of several operational tests: the in-production University College London (UCL) site, which runs ATLAS workloads using storage at the Queen Mary University of London (QMUL) site; the Oxford site, which has had scaling tests performed against T2Ds in various locations in the UK (to test network effects); and the Durham site, which has been testing the specific ATLAS caching solution of "Rucio Cache" integration with ARC's caching layer.
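    The access pattern a storageless or caching T2C relies on can be pictured as a simple local-cache-with-remote-fallback. The sketch below is only an illustration under assumed endpoints and paths, not the sites' actual configuration; it assumes the xrdcp client is available.

        # Sketch of the storageless/caching access pattern: reuse a locally
        # cached copy if present, otherwise copy the file from a "nearby" T2D.
        # Endpoint and paths are hypothetical.
        import os
        import subprocess

        REMOTE = "root://se.nearby-t2d.example.ac.uk//atlas/data/file.root"
        CACHE = "/scratch/cache/file.root"

        if not os.path.exists(CACHE):
            os.makedirs(os.path.dirname(CACHE), exist_ok=True)
            subprocess.run(["xrdcp", REMOTE, CACHE], check=True)  # fetch once over the WAN

        # the job then reads CACHE as if it were local site storage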

    Micro-CernVM: Slashing the Cost of Building and Deploying Virtual Machines

    The traditional virtual machine building and deployment process is centered around the virtual machine hard disk image. The packages comprising the VM operating system are carefully selected, hard disk images are built for a variety of different hypervisors, and images have to be distributed and decompressed in order to instantiate a virtual machine. Within the HEP community, the CernVM File System has been established in order to decouple the distribution of the experiment software from the building and distribution of the VM hard disk images. We show how to get rid of such pre-built hard disk images altogether. Due to the high requirements on POSIX compliance imposed by HEP application software, CernVM-FS can also be used to host and boot a Linux operating system. This allows the use of a tiny bootable CD image that comprises only a Linux kernel, while the rest of the operating system is provided on demand by CernVM-FS. This approach speeds up the initial instantiation time and reduces virtual machine image sizes by an order of magnitude. Furthermore, security updates can be distributed instantaneously through CernVM-FS. By leveraging the fact that CernVM-FS is a versioning file system, a historic analysis environment can easily be re-spawned by selecting the corresponding CernVM-FS file system snapshot. Comment: Conference paper at the 2013 Computing in High Energy Physics (CHEP) Conference, Amsterdam.
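    Selecting a historic file system snapshot, as mentioned at the end of the abstract, is done on the client side by pinning the mount to a named tag. The sketch below only illustrates that mechanism via the CVMFS_REPOSITORY_TAG client parameter; the repository and tag names are hypothetical.

        # Sketch: pin a CVMFS client to a named repository snapshot so a historic
        # analysis environment can be re-spawned. Repository and tag are hypothetical.
        CONFIG = "/etc/cvmfs/config.d/cernvm-prod.example.org.local"

        with open(CONFIG, "w") as fh:
            fh.write("CVMFS_REPOSITORY_TAG=snapshot-2013-10-01\n")  # mount this tag instead of the latest revision

        # afterwards, reload the client configuration, e.g. with:
        #   cvmfs_config reload cernvm-prod.example.org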

    HVSTO: Efficient Privacy Preserving Hybrid Storage in Cloud Data Center

    In cloud data centers, well-managed shared storage is the main structure used for the storage of virtual machines (VMs). In this paper, we propose Hybrid VM Storage (HVSTO), a privacy-preserving shared storage system designed for virtual machine storage in large-scale cloud data centers. Unlike traditional shared storage, HVSTO adopts a distributed structure to preserve the privacy of virtual machines, which is threatened in the traditional centralized structure. To improve I/O latency in this distributed structure, we use a hybrid system that combines solid-state disks and distributed storage. From the evaluation of our demonstration system, HVSTO provides scalable and sufficient throughput for the platform-as-a-service infrastructure. Comment: 7 pages, 8 figures, in proceedings of the Second International Workshop on Security and Privacy in Big Data (BigSecurity 2014).
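    The hybrid SSD-plus-distributed-storage idea can be pictured as a fast tier in front of a slower backend. The sketch below is a generic illustration of such a read path, not HVSTO's actual design; all names and paths are made up.

        # Generic illustration of a hybrid read path (not HVSTO itself): serve
        # blocks from an SSD tier, falling back to the distributed store on a miss.
        import os
        import shutil

        SSD_TIER = "/ssd/vmstore"
        DISTRIBUTED_TIER = "/mnt/distributed/vmstore"  # stand-in for any distributed backend

        def read_block(block_id: str) -> bytes:
            fast = os.path.join(SSD_TIER, block_id)
            slow = os.path.join(DISTRIBUTED_TIER, block_id)
            if not os.path.exists(fast):           # miss: fetch from the distributed tier
                os.makedirs(SSD_TIER, exist_ok=True)
                shutil.copyfile(slow, fast)        # warm the SSD tier for later reads
            with open(fast, "rb") as fh:
                return fh.read()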

    An ARM cluster for running CMSSW jobs

    The ARM platform extends from the mobile phone area to development-board computers and servers. The importance of the ARM platform for High Performance Computing/High Throughput Computing (HPC/HTC) could increase in the future if new, more powerful (server) boards are released. For this reason the Compact Muon Solenoid Software (CMSSW) has been ported to ARM in earlier work. CMSSW is deployed using the CERN Virtual Machine File System (CVMFS) and the jobs are run inside Singularity containers. Some ARM AArch64 CMSSW releases are available in CVMFS for testing and development. In this work, CVMFS and Singularity have been compiled and installed on an ARM cluster and the AArch64 CMSSW releases in CVMFS have been used. We report on our experiences with this ARM cluster for CMSSW jobs.

    Commodity hardware designed around the 64-bit architecture has been the basis of current virtualization trends, with the advantage of emulating diverse environments for a wide range of computational scenarios. In parallel, however, the mobile revolution has given rise to ARM SoCs with a primary focus on power efficiency. While still in the experimental phase, their power efficiency and 64-bit heterogeneous computing already point to an alternative to traditional x86_64 CPU servers for data centers.

    In this paper we present the latest CMS open data release published on the CERN Open Data portal. Samples of collision and simulated datasets were released together with detailed information about the data provenance. The associated data production chains cover the necessary computing environments, the configuration files and the computational procedures used in each data production step. We describe the data curation techniques used to obtain and publish the data provenance information and we study the possibility of reproducing parts of the released data using the publicly available information. The present work demonstrates the usefulness of releasing selected samples of raw and primary data in order to fully ensure the completeness of information about the data production chain, for the attention of general data scientists and other non-specialists interested in using particle physics data for education or research purposes. Peer reviewed.
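    The CVMFS-plus-Singularity setup described for the ARM cluster boils down to bind-mounting /cvmfs into a container and bootstrapping the CMSSW environment from the CMS repository. The sketch below is only an illustration under assumptions: the container image and release name are hypothetical, and the cmsset_default.sh bootstrap is the usual entry point in /cvmfs/cms.cern.ch.

        # Sketch: run a CMSSW job from /cvmfs inside a Singularity container on an
        # aarch64 node. Image and release names are hypothetical.
        import subprocess

        shell_script = (
            "source /cvmfs/cms.cern.ch/cmsset_default.sh && "
            "scram project CMSSW CMSSW_10_6_0 && cd CMSSW_10_6_0/src && "  # hypothetical release
            "eval $(scram runtime -sh) && cmsRun ../../config.py"
        )
        subprocess.run(
            ["singularity", "exec", "--bind", "/cvmfs", "centos7-aarch64.sif",
             "bash", "-c", shell_script],
            check=True,
        )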

    A Survey on Security Aspects of Server Virtualization in Cloud Computing

    The significant exploitation and utilization of cloud computing in industry is accompanied, and at the same time hampered, by concerns regarding the protection of data held by cloud computing providers. One of the consequences of moving data processing and storage off the business site is that organizations have less control over their infrastructure. Consequently, cloud service customers must trust that the cloud service (CS) provider is able to protect their data and infrastructure from both external and internal attacks. Presently, however, such trust can only rely on organizational procedures stated by the CS provider and cannot be remotely verified and validated by an external party. The central distinction between cloud computing and conventional enterprise-internal Information Technology services is that the owner and the consumer of cloud Information Technology infrastructures are separated in the cloud. This change requires a separation of security responsibilities in cloud computing: cloud service providers (CSPs) should secure the services they offer and cannot exceed the customers' authority. Virtualization is a buzzword in the Information Technology world. With the promise of reducing the ever-growing infrastructure inside data centers, together with other important concerns such as availability and scalability, virtualization technology has been gaining recognition not only with IT experts but also among administrators and executives. The rapidly growing rate of adoption of this technology has exposed these systems to new security concerns which in recent history have gone unnoticed or have simply been overlooked. This paper presents an in-depth, state-of-the-art look at the most widely used server virtualization solutions, as well as a literature study on the different security issues found within this virtualization technology. These problems apply to all the existing virtualization technologies, without focusing on a specific solution. Nevertheless, we perform a vulnerability analysis of two of the most widely recognized virtualization solutions: VMware ESX and Xen. Finally, we present some suggestions on how to improve the security of online banking and electronic commerce using virtualization.