12 research outputs found
Site Authorization Service (SAZ)
In this paper we present a methodology to provide an additional level of
centralized control for the grid resources. This centralized control is applied
to site-wide distribution of various grids and thus providing an upper hand in
the maintenance.Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics
(CHEP03), La Jolla, CA, USA, March 2003, 3 pages, PSN TUBT00
Recommended from our members
Run II data analysis on the grid
In this document, we begin the technical design for the distributed RunII computing for CDF and D0. The present paper defines the three components of the data handling area of Run II computing, namely the Data Handling System, the Storage System and the Application. We outline their functionality and interaction between them. We identify necessary and desirable elements of the interfaces
Computing in High Energy and Nuclear Physics (CHEP) 2012
In HEP, scientific research is performed by large collaborations of organizations and individuals. Log book of a scientific collaboration is important part of the collaboration record. Often, it contains experimental data.
At FNAL, we developed an Electronic Collaboration Logbook (ECL) application which is used by about 20 different collaborations, experiments and groups at FNAL. ECL is the latest iteration of the project formerly known as Control Room Logbook (CRL).
We have been working on mobile (IOS and Android) clients for ECL.
We will present history, current status and future plans of the project, as well as design, implementation and support solutions made by the project
25th International Conference on Computing in High Energy & Nuclear Physics
Metadata management is one of three major areas and parts of functionality of scientific data management along with replica management and workflow management. Metadata is the information describing the data stored in a data item, a file or an object. It includes the data item provenance, recording conditions, format and other attributes. MetaCat is a metadata management database designed and developed for High Energy Physics experiments. As a component of a data management system, itâs main objectives are to provide efficient metadata storage and management and fast data items selection functionality. MetaCat is supposed to work on the scale of 100 million files (or objects) and beyond. The article will discuss the functionality of MetaCat and technological solutions used to implement the product
MetaCat - metadata catalog for data management systems
Metadata management is one of three major areas of scientific data management along with replica management and workflow management. Metadata is the information describing the data stored in a data item, a file or an object. It includes the data item provenance, recording conditions, format and other attributes. MetaCat is a metadata management database designed and developed for High Energy Physics experiments. As a component of a data management system, itâs main objectives are to provide efficient metadata storage and management and fast data selection functionality. MetaCat is required to work on the scale of 100 million files (or objects) and beyond. The article will discuss the functionality of MetaCat and technological solutions used to implement the product
Striped Data Analysis Framework
A columnar data representation is known to be an efficient way for data storage, specifically in cases when the analysis is often done based only on a small fragment of the available data structures. A data representation like Apache Parquet is a step forward from a columnar representation, which splits data horizontally to allow for easy parallelization of data analysis. Based on the general idea of columnar data storage, working on the [LDRD Project], we have developed a striped data representation, which, we believe, is better suited to the needs of High Energy Physics data analysis. A traditional columnar approach allows for efficient data analysis of complex structures. While keeping all the benefits of columnar data representations, the striped mechanism goes further by enabling easy parallelization of computations without requiring special hardware. We will present an implementation and some performance characteristics of such a data representation mechanism using a distributed no-SQL database or a local file system, unified under the same API and data representation model. The representation is efficient and at the same time simple so that it allows for a common data model and APIs for wide range of underlying storage mechanisms such as distributed no-SQL databases and local file systems. Striped storage adopts Numpy arrays as its basic data representation format, which makes it easy and efficient to use in Python applications. The Striped Data Server is a web service, which allows to hide the server implementation details from the end user, easily exposes data to WAN users, and allows to utilize well known and developed data caching solutions to further increase data access efficiency. We are considering the Striped Data Server as the core of an enterprise scale data analysis platform for High Energy Physics and similar areas of data processing. We have been testing this architecture with a 2TB dataset from a CMS dark matter search and plan to expand it to multiple 100 TB or even PB scale. We will present the striped format, Striped Data Server architecture and performance test results
Striped Data Analysis Framework
A columnar data representation is known to be an efficient way for data storage, specifically in cases when the analysis is often done based only on a small fragment of the available data structures. A data representation like Apache Parquet is a step forward from a columnar representation, which splits data horizontally to allow for easy parallelization of data analysis. Based on the general idea of columnar data storage, working on the [LDRD Project], we have developed a striped data representation, which, we believe, is better suited to the needs of High Energy Physics data analysis. A traditional columnar approach allows for efficient data analysis of complex structures. While keeping all the benefits of columnar data representations, the striped mechanism goes further by enabling easy parallelization of computations without requiring special hardware. We will present an implementation and some performance characteristics of such a data representation mechanism using a distributed no-SQL database or a local file system, unified under the same API and data representation model. The representation is efficient and at the same time simple so that it allows for a common data model and APIs for wide range of underlying storage mechanisms such as distributed no-SQL databases and local file systems. Striped storage adopts Numpy arrays as its basic data representation format, which makes it easy and efficient to use in Python applications. The Striped Data Server is a web service, which allows to hide the server implementation details from the end user, easily exposes data to WAN users, and allows to utilize well known and developed data caching solutions to further increase data access efficiency. We are considering the Striped Data Server as the core of an enterprise scale data analysis platform for High Energy Physics and similar areas of data processing. We have been testing this architecture with a 2TB dataset from a CMS dark matter search and plan to expand it to multiple 100 TB or even PB scale. We will present the striped format, Striped Data Server architecture and performance test results
DUNE Offline Computing Conceptual Design Report
This document describes Offline Software and Computing for the Deep Underground Neutrino Experiment (DUNE) experiment, in particular, the conceptual design of the offline computing needed to accomplish its physics goals. Our emphasis in this document is the development of the computing infrastructure needed to acquire, catalog, reconstruct, simulate and analyze the data from the DUNE experiment and its prototypes. In this effort, we concentrate on developing the tools and systems thatfacilitate the development and deployment of advanced algorithms. Rather than prescribing particular algorithms, our goal is to provide resources that are flexible and accessible enough to support creative software solutions as HEP computing evolves and to provide computing that achieves the physics goals of the DUNE experiment
DUNE Offline Computing Conceptual Design Report
This document describes Offline Software and Computing for the Deep Underground Neutrino Experiment (DUNE) experiment, in particular, the conceptual design of the offline computing needed to accomplish its physics goals. Our emphasis in this document is the development of the computing infrastructure needed to acquire, catalog, reconstruct, simulate and analyze the data from the DUNE experiment and its prototypes. In this effort, we concentrate on developing the tools and systems that
facilitate the development and deployment of advanced algorithms. Rather than prescribing particular algorithms, our goal is to provide resources that are flexible and accessible enough to support creative software solutions as HEP computing evolves and to provide computing that achieves the physics goals of the DUNE experiment.This document describes the conceptual design for the Offline Software and Computing for the Deep Underground Neutrino Experiment (DUNE). The goals of the experiment include 1) studying neutrino oscillations using a beam of neutrinos sent from Fermilab in Illinois to the Sanford Underground Research Facility (SURF) in Lead, South Dakota, 2) studying astrophysical neutrino sources and rare processes and 3) understanding the physics of neutrino interactions in matter. We describe the development of the computing infrastructure needed to achieve the physics goals of the experiment by storing, cataloging, reconstructing, simulating, and analyzing 30 PB of data/year from DUNE and its prototypes. Rather than prescribing particular algorithms, our goal is to provide resources that are flexible and accessible enough to support creative software solutions and advanced algorithms as HEP computing evolves. We describe the physics objectives, organization, use cases, and proposed technical solutions
DUNE Offline Computing Conceptual Design Report
This document describes Offline Software and Computing for the Deep Underground Neutrino Experiment (DUNE) experiment, in particular, the conceptual design of the offline computing needed to accomplish its physics goals. Our emphasis in this document is the development of the computing infrastructure needed to acquire, catalog, reconstruct, simulate and analyze the data from the DUNE experiment and its prototypes. In this effort, we concentrate on developing the tools and systems thatfacilitate the development and deployment of advanced algorithms. Rather than prescribing particular algorithms, our goal is to provide resources that are flexible and accessible enough to support creative software solutions as HEP computing evolves and to provide computing that achieves the physics goals of the DUNE experiment