Enabling parallel and interactive distributed computing data analysis for the ALICE experiment
AliEn (ALICE Environment) is the production environment developed by
the ALICE collaboration at CERN. It provides a set of Grid tools enabling
the full offline computational work-flow of the experiment (simulation, reconstruction
and data analysis) in a distributed and heterogeneous computing
environment.
In addition to the analysis on the Grid, ALICE users perform local interactive
analysis using ROOT and the Parallel ROOT Facility (PROOF).
PROOF enables physicists to analyse medium-sized (200-300 TB) data sets in parallel on a short time scale.
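The key idea PROOF exploits is that event data are inherently parallel: a data set can be split into chunks, each chunk processed independently, and the partial results merged. A minimal sketch of that map-reduce pattern in plain Python (this illustrates the concept only, not PROOF's actual implementation; the `pt` field and `analyse_chunk` function are invented for the example):

```python
from multiprocessing import Pool

def analyse_chunk(events):
    """Worker task: produce a partial result from one chunk of events
    (here simply the sum of a hypothetical per-event quantity)."""
    return sum(e["pt"] for e in events)

def parallel_analysis(events, n_workers=4):
    """Split the event list into chunks, process them in parallel,
    then merge the partial results -- the essence of event-level parallelism."""
    chunk = max(1, len(events) // n_workers)
    chunks = [events[i:i + chunk] for i in range(0, len(events), chunk)]
    with Pool(n_workers) as pool:
        partial = pool.map(analyse_chunk, chunks)
    return sum(partial)  # merge step: combine the workers' partial results

if __name__ == "__main__":
    data = [{"pt": float(i)} for i in range(1000)]
    print(parallel_analysis(data))  # identical to a serial loop over all events
```

Because the chunks share no state, the merge is a simple reduction; PROOF applies the same principle across the nodes of a cluster rather than local processes.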
The default installation of PROOF is on a static dedicated cluster, typically of
200-300 cores. This well-proven approach has limitations, in particular for the
analysis of larger datasets or when the installation of a dedicated cluster is
not possible. Using a new framework called PROOF on Demand (PoD), PROOF can be
used directly on Grid-enabled clusters by dynamically assigning interactive
nodes on user request.
This thesis presents the PoD on AliEn project. The integration of Proof on
Demand in the AliEn framework provides private dynamic PROOF clusters
as a Grid service. This functionality is transparent to the user, who simply
submits interactive jobs to the AliEn system.
The ROOT framework is used by physicists, among other things, to carry out the
Monte Carlo simulation of the detector. The engineers working on the mechanical
design of the detector need to collaborate with the physicists; however, the
software used by the engineers is not compatible with ROOT.
This thesis describes a second result obtained during this PhD project: the
implementation of the TGeoCad Interface that allows the conversion of
ROOT geometries to STEP format, compatible with CAD systems. The
interface provides an important communication and collaboration tool between
physicists and engineers dealing with the simulation and the design of the
detector geometry.
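At its core, any conversion from a hierarchical detector description to a CAD exchange format requires a depth-first walk of the volume tree, emitting one record per placed solid. A hedged Python sketch of that traversal (the `Volume` class and the flat output records are illustrative inventions, not the actual TGeoCad or STEP data model):

```python
from dataclasses import dataclass, field

@dataclass
class Volume:
    """Toy stand-in for a node in a geometry tree (e.g. a ROOT TGeoVolume)."""
    name: str
    shape: str                       # e.g. "box", "tube"
    children: list = field(default_factory=list)

def flatten(volume, path=""):
    """Depth-first traversal producing (full_path, shape) records,
    analogous to expanding a detector hierarchy into individual CAD solids."""
    full = f"{path}/{volume.name}"
    records = [(full, volume.shape)]
    for child in volume.children:
        records.extend(flatten(child, full))
    return records

# A tiny invented hierarchy: a world volume containing a barrel and an endcap.
world = Volume("world", "box", [
    Volume("barrel", "tube", [Volume("layer1", "tube")]),
    Volume("endcap", "tube"),
])

for path, shape in flatten(world):
    print(path, shape)
```

A real converter would additionally carry each node's transformation matrix and translate shape parameters into STEP entities; the traversal skeleton, however, is the same.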
The AliEn system, status and perspectives
AliEn is a production environment that implements several components of the
Grid paradigm needed to simulate, reconstruct and analyse HEP data in a
distributed way. The system is built around Open Source components, uses the
Web Services model and standard network protocols to implement the computing
platform that is currently being used to produce and analyse Monte Carlo data
at over 30 sites on four continents. The aim of this paper is to present the
current AliEn architecture and outline its future developments in the light of
emerging standards.
Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics
(CHEP03), La Jolla, CA, USA, March 2003, 10 pages, Word, 10 figures. PSN
MOAT00
The PROOF Distributed Parallel Analysis Framework based on ROOT
The development of the Parallel ROOT Facility, PROOF, enables a physicist to
analyze and understand much larger data sets on a shorter time scale. It makes
use of the inherent parallelism in event data and implements an architecture
that optimizes I/O and CPU utilization in heterogeneous clusters with
distributed storage. The system provides transparent and interactive access to
gigabyte-scale data sets today. Being part of the ROOT framework, PROOF inherits
the benefits of a performant object storage system and a wealth of statistical
and visualization tools.
visualization tools. This paper describes the key principles of the PROOF
architecture and the implementation of the system. We will illustrate its
features using a simple example and present measurements of the scalability of
the system. Finally we will discuss how PROOF can be interfaced with and make
use of the different Grid solutions.
Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics
(CHEP03), La Jolla, CA, USA, March 2003, 5 pages, LaTeX, 4 eps figures. PSN
TULT00
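One way a master can optimize CPU utilization in a heterogeneous cluster, as the abstract above describes, is pull-based scheduling: workers request the next packet of events whenever they are free, so faster workers naturally receive more work. A simplified simulation of that idea in Python (this is an illustrative model, not PROOF's actual packetizer; the speeds and packet size are invented):

```python
import heapq
from collections import defaultdict

def pull_schedule(n_events, packet_size, worker_speeds):
    """Simulate a pull-based scheduler: the worker that frees up first
    requests the next packet. worker_speeds gives events processed per
    unit time for each worker; returns events assigned per worker."""
    next_event = 0
    assigned = defaultdict(int)
    # Priority queue of (time_when_free, worker_id); pop the earliest-free worker.
    free_at = [(0.0, w) for w in range(len(worker_speeds))]
    heapq.heapify(free_at)
    while next_event < n_events:
        t, w = heapq.heappop(free_at)
        packet = min(packet_size, n_events - next_event)
        next_event += packet
        assigned[w] += packet
        # The worker becomes free again after processing the packet.
        heapq.heappush(free_at, (t + packet / worker_speeds[w], w))
    return dict(assigned)

# A worker twice as fast ends up with roughly twice the events.
print(pull_schedule(1000, 10, [1.0, 2.0]))
```

Because no events are pre-assigned, slow or busy nodes never become a bottleneck: load balancing emerges from the pull protocol itself.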
ROOT Status and Future Developments
In this talk we will review the major additions and improvements made to the
ROOT system in the last 18 months and present our plans for future
developments. The additions and improvements range from modifications to the I/O
sub-system to allow users to save and restore objects of classes that have not
been instrumented by special ROOT macros, to the addition of a geometry package
designed for building, browsing, tracking and visualizing detector geometries.
Other improvements include enhancements to the quick analysis sub-system
(TTree::Draw()), the addition of classes that allow inter-file object
references (TRef, TRefArray), better support for templated and STL classes,
amelioration of the Automatic Script Compiler and the incorporation of new
fitting and mathematical tools. Efforts have also been made to increase the
modularity of the ROOT system with the introduction of more abstract interfaces
and the development of a plug-in manager. In the near future, we intend to
continue the development of PROOF and its interfacing with GRID environments.
We plan on providing an interface between Geant3, Geant4 and Fluka and the new
geometry package. The ROOT GUI classes will finally be available on Windows and
we plan to release a GUI inspector and builder. In the last year, ROOT has
drawn the endorsement of additional experiments and institutions. It is now
officially supported by CERN and used as a key I/O component by the LCG project.
Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics
(CHEP03), La Jolla, CA, USA, March 2003, 5 pages, MSWord. PSN MOJT00
A Computer Aided Detection system for mammographic images implemented on a GRID infrastructure
The use of an automatic system for the analysis of mammographic images has
proven to be very useful to radiologists in the investigation of breast cancer,
especially in the framework of mammographic-screening programs. A breast
neoplasia is often marked by the presence of microcalcification clusters and
massive lesions in the mammogram: hence the need for tools able to recognize
such lesions at an early stage. In the framework of the GPCALMA (GRID Platform
for Computer Assisted Library for MAmmography) project, a collaboration of
Italian physicists and radiologists built a large distributed database of
digitized mammographic images (about 5500 images corresponding to 1650
patients) and developed a CAD (Computer Aided Detection) system able to search
automatically for massive lesions and microcalcification clusters. The CAD is
implemented in the GPCALMA integrated station, which can also be used for
digitization, as an archive, and for statistical analyses. Some GPCALMA
integrated stations have already been deployed and are currently in clinical
trial in some Italian hospitals. The emerging GRID technology can be used to
connect the GPCALMA integrated stations operating in different medical centers.
The GRID approach will support effective tele- and co-working between
radiologists, cancer specialists and epidemiology experts by allowing remote
image analysis and interactive online diagnosis.
Comment: 5 pages, 5 figures, to appear in the Proceedings of the 13th
IEEE-NPSS Real Time Conference 2003, Montreal, Canada, May 18-23, 2003
Distributed storage and cloud computing: a test case
Since 2003 the computing farm hosted by the INFN Tier3 facility in Trieste has supported the activities of many scientific communities. Hundreds of jobs from 45 different VOs, including those of the LHC experiments, are processed simultaneously. Given that the requirements of the different computational communities are normally not synchronized, the probability that at any given time the resources owned by one of the participants are not fully utilized is quite high. A balanced compensation should in principle allocate the free resources to other users, but there are limits to this mechanism. In fact, the Trieste site may not hold the amount of data needed to attract enough analysis jobs, and even in that case there could be a lack of bandwidth for their access. The Trieste ALICE and CMS computing groups, in collaboration with other Italian groups, aim to overcome the limitations of existing solutions through two approaches: sharing the data among all the participants, taking full advantage of the GARR-X wide area network (10 Gb/s), and integrating the resources dedicated to batch analysis with those reserved for dynamic interactive analysis, through modern solutions such as cloud computing.
Improvements of LHC data analysis techniques at Italian WLCG sites. Case-study of the transfer of this technology to other research areas
In 2012, 14 Italian institutions participating in the LHC experiments won a grant from the Italian Ministry of Research (MIUR) with the aim of optimising analysis activities and, in general, the Tier2/Tier3 infrastructure. We report on the research activities carried out and on the considerable improvement in the ease of access to resources for physicists, including those with no specific computing expertise. We focused on items such as distributed storage federations, access to batch-like facilities, provisioning of user interfaces on demand, and cloud systems. R&D on next-generation databases, distributed analysis interfaces, and new computing architectures was also carried out. The project, ending in the first months of 2016, will produce a white paper with recommendations on best practices for data-analysis support by computing centers.