Third-party transfers in WLCG using HTTP
Since its earliest days, the Worldwide LHC Computing Grid (WLCG) has
relied on GridFTP to transfer data between sites. The announcement that Globus
is dropping support for its open-source Globus Toolkit (GT), which forms the
basis for several FTP clients and servers, has created an opportunity to
reevaluate the use of FTP. HTTP-TPC, an extension to HTTP compatible with
WebDAV, has arisen as a strong contender for an alternative approach.
In this paper, we describe the HTTP-TPC protocol itself, along with the
current status of its support in different implementations, and the
interoperability testing done within the WLCG DOMA working group's TPC
activity. This protocol also provides the first real use-case for token-based
authorisation for this community. We will demonstrate the benefits of such
authorisation by showing how it allows HTTP-TPC to support new technologies
(such as OAuth, OpenID Connect, Macaroons and SciTokens) without changing the
protocol. We will also discuss the next steps for HTTP-TPC and the plans to use
the protocol for WLCG transfers.
Comment: 7 pages, 3 figures, to appear in the proceedings of CHEP 202
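As a rough illustration of the protocol, the sketch below shows how a pull-mode HTTP-TPC transfer can be triggered with plain HTTP tooling: a WebDAV COPY request is sent to the destination endpoint, the Source header names the file to fetch, and the credential for the remote side travels in a TransferHeader-prefixed header, which is what keeps the protocol independent of the token technology in use. The endpoint URLs and tokens are placeholders, and real deployments may differ in detail.

    import requests

    # Hypothetical endpoints and tokens; placeholders only.
    destination = "https://dest.example.org/store/run1/file.root"
    source = "https://source.example.org/data/run1/file.root"
    dest_token = "DEST_TOKEN"      # credential accepted by the destination endpoint
    source_token = "SOURCE_TOKEN"  # credential the destination presents to the source

    # Pull-mode third-party copy: the COPY request goes to the destination,
    # which then fetches the file from the source itself.  The credential for
    # the remote endpoint is wrapped in a TransferHeader* header, so macaroons,
    # SciTokens or OIDC tokens can all be carried without changing the protocol.
    resp = requests.request(
        "COPY",
        destination,
        headers={
            "Authorization": f"Bearer {dest_token}",
            "Source": source,
            "TransferHeaderAuthorization": f"Bearer {source_token}",
        },
        stream=True,
    )

    # While the copy runs, the active party streams back performance markers;
    # the final marker reports overall success or failure.
    for line in resp.iter_lines(decode_unicode=True):
        print(line)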
CS3 2021- Cloud Storage Synchronization and Sharing
The Reva component, at the heart of the CERNBox project at CERN,
will soon get new plugins that build on the experience
accumulated with the current production deployment,
where the data is stored centrally in a system called EOS. For the past
ten years, EOS has been the focus of a sustained development effort
to provide an extremely scalable data storage system that
supports the demanding requirements of massive physics analysis
together with the more regular requirements of a wider community (scientists, engineers, administration): synchronisation and sharing, online and universal access, and real-time collaborative workflows.
Interfacing Reva natively to EOS through high-performance gRPC
and standard HTTPS interfaces will open a new scenario in terms
of the scalability and manageability of the CERNBox service,
whose data requirements will continue to grow over the
next decade. In this contribution we will give a technical introduction to
this near-future scenario.
EOS workshop
The Reva component, at the heart of the CERNBox project at CERN,
will soon get new plugins that build on the experience
accumulated with the current production deployment,
where the data is stored centrally in EOS at CERN.
Interfacing Reva natively to EOS through high-performance gRPC
and standard HTTPS interfaces will open a new scenario in terms
of the scalability and manageability of the CERNBox service,
whose data requirements will continue to grow over the
next decade. In this contribution we will give a technical introduction to
this near-future scenario.
Lecture 7: Worldwide LHC Computing Grid Overview
This presentation will introduce, in an informal but technically correct way, the challenges linked to the needs of massively distributed computing architectures in the context of LHC offline computing. The topics cover technological and organizational aspects of LHC computing, from data access, to the maintenance of large databases and huge collections of files, to the organization of computing farms and their monitoring.
Fabrizio Furano holds a Ph.D. in Computer Science and has worked in the field of computing for High Energy Physics for many years. His preferred topics include application architectures, system design and project management, with a focus on the performance and scalability of data access. Fabrizio has experience in a wide variety of environments, from private companies to academic research, in particular with object-oriented methodologies, mainly using C++. He also has university-level teaching experience in Software Engineering and C++ programming.
Workshop on Cloud Services for File Synchronisation and Sharing
The Dynamic Federation project aims to provide tools and methods to
federate, on the fly, different storage repositories whose
content satisfies some basic requirements of homogeneity.
Dynafed has been designed to work over WANs, has so far also given excellent
results on LANs, and is well adapted to the HTTP/WebDAV protocols
and their derivatives, thus covering a broad range of Cloud storage technologies.
In this talk we will introduce the system and its recent larger
deployments, and discuss the improvements and configurations
that can make it work seamlessly with Cloud storage providers;
a minimal client-side sketch of how such a federation is accessed follows the list below.
Among the deployment possibilities we cite:
- seamlessly using different cloud storage providers at once,
thus creating a federation of personal cloud storage
providers
- boosting client data access performance by optimizing redirections to
data that is globally replicated
- easy, catalogue-free insertion/deletion of transient endpoints
- seamlessly mixing cloud storage with WebDAV-enabled Grid storage
- giving a WAN-distributed WebDAV access backend to services like
OwnCloud/CERNBox, to enable collaboration across administrative
domains/sites
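As a purely illustrative sketch of the client's view, assuming a hypothetical federation endpoint: a single WebDAV PROPFIND against the federator returns one merged listing of the federated namespace, and a plain GET is redirected to whichever replica (cloud or Grid) the federator considers best placed. The endpoint and paths below are placeholders.

    import requests
    import xml.etree.ElementTree as ET

    # Hypothetical federation endpoint and paths; placeholders only.
    federation = "https://dynafed.example.org/myfed"

    # One PROPFIND against the federator yields a single merged directory
    # listing, regardless of which storage endpoints actually hold the data.
    listing = requests.request(
        "PROPFIND",
        f"{federation}/datasets/",
        headers={"Depth": "1"},
    )
    for href in ET.fromstring(listing.content).iter("{DAV:}href"):
        print(href.text)

    # A plain GET is answered with a redirect to the replica judged best
    # placed (e.g. geographically closest); requests follows it transparently.
    data = requests.get(f"{federation}/datasets/sample.dat")
    print(data.status_code, len(data.content))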
Data Access Performance Through Parallelization and Vectored Access: Some Results
High Energy Physics data processing and analysis applications typically face the problem of accessing and processing data at high speed. Recent studies, development and test work have shown that the latencies due to data access can often be hidden by parallelizing them with the data processing, thus allowing applications to process remote data with a high level of efficiency. Techniques and algorithms able to reach this result have been implemented in the client side of the Scalla/xrootd system, and in this contribution we describe the results of tests done in order to compare their performance and characteristics. These techniques, if used together with multiple-stream data access, can also be effective in dealing efficiently and transparently with data repositories accessible via a Wide Area Network.
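The latency-hiding idea can be sketched generically; this is not the Scalla/xrootd client code, just a minimal single-reader illustration with a made-up chunk size. While the application processes chunk i, a background worker already reads chunk i+1, so access latency overlaps with computation.

    from concurrent.futures import ThreadPoolExecutor

    CHUNK = 4 * 1024 * 1024  # illustrative 4 MiB read-ahead block

    def read_chunk(f, offset, size):
        # Blocking read of one block; stands in for a (possibly remote) data source.
        f.seek(offset)
        return f.read(size)

    def process(block):
        # Placeholder for the CPU-bound work done on each block.
        return sum(block)

    def overlapped_scan(path):
        # Prefetch block i+1 in a worker thread while block i is being processed.
        total, offset = 0, 0
        with open(path, "rb") as f, ThreadPoolExecutor(max_workers=1) as pool:
            future = pool.submit(read_chunk, f, offset, CHUNK)
            while True:
                block = future.result()      # wait for the prefetched block
                if not block:
                    break
                offset += CHUNK
                future = pool.submit(read_chunk, f, offset, CHUNK)  # start next read
                total += process(block)      # compute while the next read runs
        return total

Vectored access complements this: many small (offset, length) requests are coalesced into a single round trip, which is the other technique the title refers to.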
Dynamic federation of grid and cloud storage
The Dynamic Federations project (“Dynafed”) enables the deployment of scalable, distributed storage systems composed of independent storage endpoints. While the Uniform Generic Redirector at the heart of the project is protocol-agnostic, we have focused our effort on HTTP-based protocols, including S3 and WebDAV. The system has been deployed on testbeds covering the majority of the ATLAS and LHCb data, and supports geography-aware replica selection. The work done exploits the federation potential of HTTP to build systems that offer uniform, scalable, catalogue-less access to the storage and metadata ensemble, together with the possibility of seamlessly integrating other compatible resources such as those from cloud providers. Dynafed can exploit the potential of the S3 delegation scheme, effectively federating on the fly any number of S3 buckets from different providers and applying a uniform authorization to them. This feature has been used to deploy in production the BOINC Data Bridge, which uses the Uniform Generic Redirector with S3 buckets to harmonize the BOINC authorization scheme with the Grid/X.509 one. The Data Bridge has been deployed in production with good results. We believe that the features of a loosely coupled federation of open-protocol-based storage elements open many possibilities for smoothly evolving the current computing models and for supporting new scientific computing projects that rely on massive distribution of data and would benefit from systems that can more easily be interfaced with commercial providers and can work natively with Web browsers and clients.
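The S3 delegation mentioned above can be pictured with a short, hedged sketch: the federating service alone holds the S3 credentials and, rather than proxying the data, redirects the client to a short-lived pre-signed URL. The endpoint, bucket and key below are placeholders, and the snippet only illustrates the signing step, not Dynafed itself.

    import boto3

    # Placeholder credentials, endpoint and bucket; illustrative only.
    s3 = boto3.client(
        "s3",
        endpoint_url="https://s3.example-provider.org",
        aws_access_key_id="FEDERATOR_KEY_ID",
        aws_secret_access_key="FEDERATOR_SECRET",
    )

    # The federator keeps the S3 credentials to itself and answers a client
    # request with a redirect to a short-lived pre-signed URL, so buckets from
    # different providers can sit behind one namespace with a uniform
    # authorization layer in front of them.
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": "experiment-data", "Key": "datasets/sample.dat"},
        ExpiresIn=300,  # seconds of validity
    )
    print(url)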
A world-wide databridge supported by a commercial cloud provider
Volunteer computing has the potential to provide significant additional computing capacity for the LHC experiments. One of the challenges of exploiting volunteer computing is supporting a global community of volunteers that provides heterogeneous resources. However, high energy physics applications require more data input and output than the CPU-intensive applications typically used by other volunteer computing projects. While the so-called databridge has already been successfully proposed as a method to span the untrusted and trusted domains of volunteer computing and Grid computing respectively, globally transferring data between potentially poor-performing residential networks and CERN can be unreliable, leading to wasted resource usage. The expectation is that by placing a storage endpoint that is part of a wider, flexible, geographically distributed databridge deployment closer to the volunteers, the transfer success rate and the overall performance can be improved. This contribution investigates the provision of a globally distributed databridge implemented on a commercial cloud provider.