HEPCloud, a New Paradigm for HEP Facilities: CMS Amazon Web Services Investigation
Historically, high energy physics computing has been performed on large
purpose-built computing systems. These began as single-site compute facilities,
but have evolved into the distributed computing grids used today. Recently,
there has been an exponential increase in the capacity and capability of
commercial clouds. Cloud resources are highly virtualized and intended to be
able to be flexibly deployed for a variety of computing tasks. There is a
growing interest among the cloud providers to demonstrate the capability to
perform large-scale scientific computing. In this paper, we discuss results
from the CMS experiment using the Fermilab HEPCloud facility, which utilized
both local Fermilab resources and virtual machines in the Amazon Web Services
Elastic Compute Cloud. We discuss the planning, technical challenges, and
lessons learned involved in performing physics workflows on a large-scale set
of virtualized resources. In addition, we will discuss the economics and
operational efficiencies when executing workflows both in the cloud and on
dedicated resources.
Comment: 15 pages, 9 figures
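The elasticity described above comes down to programmatically requesting and releasing virtual machines. As a purely illustrative sketch (HEPCloud itself drives provisioning through glideinWMS/HTCondor rather than direct API calls like this), the snippet below uses the boto3 Python SDK to request a burst of EC2 Spot instances; the AMI ID, instance type, key pair, and bid cap are placeholder assumptions.

```python
# Minimal sketch of elastic provisioning against EC2 using boto3.
# The AMI ID, instance type, key name, and Spot price cap are placeholders;
# the HEPCloud facility itself schedules cloud resources through its batch
# infrastructure rather than calling the EC2 API directly in this way.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Request a batch of Spot instances to absorb a burst of CMS-style jobs.
response = ec2.request_spot_instances(
    SpotPrice="0.10",               # assumed bid cap in USD per hour
    InstanceCount=100,              # size of the burst we want to absorb
    LaunchSpecification={
        "ImageId": "ami-0123456789abcdef0",  # placeholder worker-node image
        "InstanceType": "m4.xlarge",         # placeholder instance type
        "KeyName": "hepcloud-demo",          # placeholder key pair
    },
)

# Keep the request IDs so the requests can be polled and later cancelled
# once the workflow drains.
request_ids = [r["SpotInstanceRequestId"] for r in response["SpotInstanceRequests"]]
print(f"Submitted {len(request_ids)} Spot requests")
```

Instances acquired this way can then be handed to the batch system as additional worker nodes and released when no longer needed, which is what makes the cost comparison against dedicated resources meaningful.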
Tupleware: Redefining Modern Analytics
There is a fundamental discrepancy between the targeted and actual users of
current analytics frameworks. Most systems are designed for the data and
infrastructure of the Googles and Facebooks of the world: petabytes of data
distributed across large cloud deployments consisting of thousands of cheap
commodity machines. Yet, the vast majority of users operate clusters ranging
from a few to a few dozen nodes, analyze relatively small datasets of up to a
few terabytes, and perform primarily compute-intensive operations. Targeting
these users fundamentally changes the way we should build analytics systems.
This paper describes the design of Tupleware, a new system specifically aimed
at the challenges faced by the typical user. Tupleware's architecture brings
together ideas from the database, compiler, and programming languages
communities to create a powerful end-to-end solution for data analysis. We
propose novel techniques that consider the data, computations, and hardware
together to achieve maximum performance on a case-by-case basis. Our
experimental evaluation quantifies the impact of our novel techniques and shows
orders-of-magnitude performance improvements over alternative systems.
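To make the target workload concrete, the sketch below shows the kind of compute-intensive operation on modestly sized data that such users run routinely: one k-means iteration over a dataset that fits comfortably on a few nodes. It is written in plain NumPy purely for illustration and does not reproduce Tupleware's actual UDF API.

```python
# Illustrative example of the workload class the paper targets: a
# compute-intensive analytics step (one k-means iteration) over data that is
# small by warehouse standards. Plain NumPy, not Tupleware's interface.
import numpy as np

def kmeans_step(points: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """One Lloyd's-algorithm iteration: assign points, recompute centroids."""
    # Distance from every point to every centroid (n_points x k).
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Recompute each centroid as the mean of its assigned points.
    return np.array([
        points[labels == i].mean(axis=0) if np.any(labels == i) else centroids[i]
        for i in range(len(centroids))
    ])

rng = np.random.default_rng(0)
points = rng.normal(size=(100_000, 8))   # a few MB, not petabytes
centroids = points[rng.choice(len(points), size=4, replace=False)]
for _ in range(10):
    centroids = kmeans_step(points, centroids)
```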
Automation of Determination of Optimal Intra-Compute Node Parallelism
Maximizing the productivity of modern multicore and manycore chips requires optimizing parallelism at the compute-node level. This is, however, a complex multi-step process: an iterative method that requires determining optimal degrees of parallel scalability and optimizing memory access behavior. Further, there are multiple cases to consider: programs that use only MPI or only OpenMP, and hybrid (MPI+OpenMP) programs. This paper presents a set of three coordinated workflows for determining the optimal parallelism at the program level for MPI programs and at the loop level for hybrid (MPI+OpenMP) cases. The paper also details mostly automated implementations of these workflows using the PerfExpert infrastructure. Finally, the paper presents case studies demonstrating both the applicability and the effectiveness of optimizing parallelism at the compute-node level; the results provide valuable information for further advancing the full automation of the workflows. The software implementing the parallelism scalability optimization is open source and available for download.
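As a hedged illustration of the first step in such a workflow, the sketch below performs an empirical scalability sweep: it runs a program under varying OMP_NUM_THREADS settings, measures wall time, and picks the smallest thread count close to the best speedup. The binary name "./app" and the thread counts tried are assumptions; PerfExpert automates measurements of this kind and combines them with memory-access analysis.

```python
# Sketch of an empirical OpenMP scalability sweep; "./app" is a placeholder
# for the program under study, and the thread counts are assumptions.
import os
import subprocess
import time

def run_with_threads(cmd: list[str], nthreads: int) -> float:
    """Run the command once with OMP_NUM_THREADS=nthreads; return wall time."""
    env = dict(os.environ, OMP_NUM_THREADS=str(nthreads))
    start = time.perf_counter()
    subprocess.run(cmd, env=env, check=True)
    return time.perf_counter() - start

timings = {n: run_with_threads(["./app"], n) for n in (1, 2, 4, 8, 16, 32)}
baseline = timings[1]
for n, t in timings.items():
    print(f"{n:3d} threads: {t:7.2f} s  speedup {baseline / t:5.2f}x")

# Pick the smallest thread count within 5% of the best runtime, so cores
# beyond the scalability knee are left free for other work.
best = min(timings.values())
optimal = min(n for n, t in timings.items() if t <= 1.05 * best)
print(f"chosen degree of parallelism: {optimal}")
```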
Big Data in HEP: A comprehensive use case study
Experimental Particle Physics has been at the forefront of analyzing the
world's largest datasets for decades. The HEP community was the first to develop
suitable software and computing tools for this task. In recent times, new
toolkits and systems collectively called Big Data technologies have emerged to
support the analysis of Petabyte and Exabyte datasets in industry. While the
principles of data analysis in HEP have not changed (filtering and transforming
experiment-specific data formats), these new technologies use different
approaches and promise a fresh look at analysis of very large datasets and
could potentially reduce the time-to-physics with increased interactivity. In
this talk, we present an active LHC Run 2 analysis, searching for dark matter
with the CMS detector, as a testbed for Big Data technologies. We directly
compare the traditional NTuple-based analysis with an equivalent analysis using
Apache Spark on the Hadoop ecosystem and beyond. In both cases, we start the
analysis with the official experiment data formats and produce publication-quality
physics plots. We will discuss advantages and disadvantages of each approach
and give an outlook on further studies needed.
Comment: Proceedings for the 22nd International Conference on Computing in High
Energy and Nuclear Physics (CHEP 2016)
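As a rough sketch of what the Spark side of such a comparison looks like, the PySpark snippet below reads events that have already been converted to a columnar format, applies a simple selection, and histograms one variable on the cluster. The file path, column names, and cut values are illustrative assumptions, not the analysis code from this talk.

```python
# Hedged PySpark sketch: columnar event data, a simple selection, and a
# coarse histogram computed on the cluster. Paths, columns, and cuts are
# illustrative placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("cms-darkmatter-sketch").getOrCreate()

events = spark.read.parquet("hdfs:///user/analysis/events.parquet")  # placeholder path

# Example selection: large missing transverse energy and at least one jet.
selected = events.filter((F.col("met_pt") > 200.0) & (F.col("n_jets") >= 1))

# Fill a coarse MET histogram (50 GeV bins) directly on the cluster.
hist = (
    selected
    .withColumn("met_bin", (F.col("met_pt") / 50.0).cast("int") * 50)
    .groupBy("met_bin")
    .count()
    .orderBy("met_bin")
)
hist.show()
```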
CMS Monte Carlo production in the WLCG computing Grid
Monte Carlo production in CMS has received a major boost in performance and
scale since the previous CHEP06 conference. The production system has been re-engineered in order
to incorporate the experience gained in running the previous system and to integrate production
with the new CMS event data model, data management system and data processing framework.
The system is interfaced to the two major computing Grids used by CMS, the LHC Computing
Grid (LCG) and the Open Science Grid (OSG).
Operational experience and integration aspects of the new CMS Monte Carlo production
system are presented, together with an analysis of production statistics. The new system
automatically handles job submission, resource monitoring, job queuing, job distribution
according to the available resources, data merging, registration of data into the data
bookkeeping, data location, data transfer and placement systems. Compared to the previous
production system, automation, reliability, and performance have been considerably improved. A
more efficient use of computing resources and a better handling of the inherent Grid unreliability
have resulted in an increase of the production scale by about an order of magnitude: the
system is capable of running on the order of ten thousand jobs in parallel and of yielding
more than two million events per day.
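Two of the tasks listed above, distribution of jobs according to available resources and merging of outputs, lend themselves to a compact illustration. The sketch below is not the production system's code; it merely shows, under assumed site names and file sizes, what proportional job distribution and greedy merge grouping can look like.

```python
# Illustrative sketch only: proportional job distribution across Grid sites
# and greedy grouping of output files into merge jobs. Site names, slot
# counts, file sizes, and the target merge size are placeholder assumptions.

def distribute_jobs(n_jobs: int, free_slots: dict[str, int]) -> dict[str, int]:
    """Assign jobs to sites in proportion to their currently free slots."""
    total = sum(free_slots.values())
    plan = {site: (n_jobs * slots) // total for site, slots in free_slots.items()}
    # Hand any remainder to the sites with the most free slots first.
    leftover = n_jobs - sum(plan.values())
    for site in sorted(free_slots, key=free_slots.get, reverse=True)[:leftover]:
        plan[site] += 1
    return plan

def group_for_merge(file_sizes: list[int], target: int) -> list[list[int]]:
    """Greedily group unmerged output files into merge jobs near `target` bytes."""
    groups, current, size = [], [], 0
    for f in file_sizes:
        if current and size + f > target:
            groups.append(current)
            current, size = [], 0
        current.append(f)
        size += f
    if current:
        groups.append(current)
    return groups

print(distribute_jobs(10_000, {"FNAL": 4000, "CERN": 3000, "PIC": 1000}))
print(group_for_merge([300, 700, 500, 900, 200], target=1000))
```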