18 research outputs found

    ReSS: A Resource Selection Service for the Open Science Grid

    The Open Science Grid offers access to hundreds of computing and storage resources via standard Grid interfaces. Before the deployment of an automated resource selection system, users had to submit jobs directly to these resources: they would manually select a resource and specify all relevant attributes in the job description prior to submitting the job. This need for human intervention in resource selection and attribute specification prevents automated job management components from accessing OSG resources and is inconvenient for users. The Resource Selection Service (ReSS) project addresses these shortcomings. The system integrates Condor technology, for the core matchmaking service, with the gLite CEMon component, for gathering and publishing resource information in the Glue Schema format. These components communicate over secure protocols via web services interfaces. The system is currently used in production on OSG by the DZero Experiment, the Engagement Virtual Organization, and the Dark Energy Survey. It is also the resource selection service for the Fermilab Campus Grid, FermiGrid. ReSS is considered a lightweight solution to push-based workload management. This paper describes the architecture, performance, and typical usage of the system.
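    As a rough illustration of the matchmaking idea described above, the Python sketch below matches a job's requirements against published resource attributes and ranks the candidates. It is a toy model under assumed attribute names; it does not reproduce the Condor ClassAd language, the Glue Schema, or the ReSS interfaces.

```python
# Minimal sketch of attribute-based resource matchmaking, in the spirit of
# ReSS/Condor matching. Attribute names are illustrative only.

# Resource descriptions, as a resource-information service might publish them.
resources = [
    {"Name": "site_a", "FreeCPUs": 120, "MaxWallTimeMin": 2880, "SupportedVO": {"dzero", "des"}},
    {"Name": "site_b", "FreeCPUs": 4,   "MaxWallTimeMin": 720,  "SupportedVO": {"engage"}},
]

def requirements(res, job):
    """Return True if the resource satisfies the job's requirements."""
    return (res["FreeCPUs"] >= job["Cpus"]
            and res["MaxWallTimeMin"] >= job["WallTimeMin"]
            and job["VO"] in res["SupportedVO"])

def rank(res):
    """Prefer resources with more free CPUs (a simple ranking policy)."""
    return res["FreeCPUs"]

def match(job):
    candidates = [r for r in resources if requirements(r, job)]
    return max(candidates, key=rank) if candidates else None

job = {"Cpus": 1, "WallTimeMin": 600, "VO": "dzero"}
print(match(job)["Name"])   # -> site_a
```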

    Metrics Correlation and Analysis Service (MCAS)

    The complexity of Grid workflow activities and their associated software stacks inevitably involves multiple organizations, ownership, and deployment domains. In this setting, important and common tasks such as the correlation and display of metrics and debugging information (fundamental ingredients of troubleshooting) are challenged by the informational entropy inherent to independently maintained and operated software components. Because such an information pool is disorganized, it is a difficult environment for business intelligence analysis, i.e., troubleshooting, incident investigation, and trend spotting. The mission of the MCAS project is to deliver a software solution to help with the adaptation, retrieval, correlation, and display of workflow-driven data and of type-agnostic events generated by loosely coupled or fully decoupled middleware.
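    The kind of correlation described above can be pictured as merging independently produced metric and event streams onto a single timeline. The Python sketch below does this for two hypothetical sources; the record layout and field names are assumptions, not the MCAS data model.

```python
# Illustrative sketch of time-based correlation of metrics and events coming
# from independently operated components; the records here are hypothetical.
from datetime import datetime

transfer_metrics = [
    {"t": "2010-05-01T12:00:05", "src": "gridftp", "msg": "rate=12MB/s"},
    {"t": "2010-05-01T12:03:40", "src": "gridftp", "msg": "rate=0.4MB/s"},
]
site_events = [
    {"t": "2010-05-01T12:03:10", "src": "storage", "msg": "raid rebuild started"},
]

def timeline(*streams):
    """Merge heterogeneous event streams into one time-ordered view."""
    merged = [rec for stream in streams for rec in stream]
    return sorted(merged, key=lambda r: datetime.fromisoformat(r["t"]))

# The merged view makes the drop in transfer rate easy to line up with the
# storage event that preceded it.
for rec in timeline(transfer_metrics, site_events):
    print(rec["t"], rec["src"], rec["msg"])
```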

    Optimizing Large Data Transfers over 100Gbps Wide Area Networks

    This work uses the Advanced Networking Initiative (ANI) testbed, which offers the opportunity to evaluate applications and middleware used by scientific experiments. This testbed is a prototype of a 100 Gbps wide-area network backbone that links several Department of Energy (DOE) national laboratories, universities, and other research institutions. These scientific experiments involve the movement of large datasets for collaborations among researchers at different sites and thus require advanced infrastructure supporting large and fast data transfers. The 100 Gbps network testbed is a key component of the ANI project and is used for DOE's science research programs. This work presents results towards obtaining maximum throughput in large data transfers by optimizing and fine-tuning scientific applications and middleware to use this advanced infrastructure efficiently. A detailed performance evaluation is presented, measuring both High Energy Physics (HEP) applications and data transfer middleware (GridFTP, Globus Online, Storage Resource Management, XrootD, and Squid) at 100 Gbps speeds and 53 ms of latency. Results show that up to 97% efficiency on such a high-bandwidth, high-latency network is possible, achieving 80-90 Gbps in most test cases with a peak transfer rate of 100 Gbps.
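    A back-of-the-envelope way to see why such transfers need tuning is the bandwidth-delay product: at 100 Gbps and 53 ms of latency, a large amount of data is in flight at any moment, so TCP windows and parallel streams must be sized accordingly. The Python sketch below does the arithmetic, treating the quoted 53 ms as the round-trip time and assuming an example 4 MiB per-stream window (neither assumption comes from the paper).

```python
# Bandwidth-delay product arithmetic for a 100 Gbps, 53 ms path.
# The per-stream TCP window below is an assumed example value.

bandwidth_bps = 100e9          # 100 Gbps link
rtt_s = 0.053                  # 53 ms, taken here as the round-trip time

bdp_bytes = bandwidth_bps * rtt_s / 8
print(f"Bandwidth-delay product: ~{bdp_bytes / 2**20:.0f} MiB in flight")

window_bytes = 4 * 2**20       # assumed 4 MiB TCP window per stream
streams = bdp_bytes / window_bytes
print(f"Parallel streams needed to fill the pipe: ~{streams:.0f}")
```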

    The pilot way to Grid resources using glideinWMS

    Grid computing has become very popular in large, widespread scientific communities with high computing demands, such as high energy physics. Computing resources are distributed over many independent sites with only a thin layer of grid middleware shared between them. This deployment model has proven very convenient for computing resource providers, but it has introduced several problems for the users of the system, the three major ones being the complexity of job scheduling, the non-uniformity of compute resources, and the lack of good job monitoring. Pilot jobs address all of the above problems by creating a virtual private computing pool on top of grid resources. This paper presents both the general pilot concept and a concrete implementation, called glideinWMS, deployed in the Open Science Grid.
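    The pilot (or "glidein") model described above can be sketched as follows: a pilot lands on a grid site, validates the local environment, and only then pulls a real user job from a central queue, hiding site non-uniformity from the user. The Python below is a toy model with made-up site attributes; it is not glideinWMS code.

```python
# Toy sketch of the pilot pull model: pilots land on heterogeneous grid sites,
# check the local environment, and only then pull user jobs from a central queue.
from queue import Queue, Empty

user_jobs = Queue()
for i in range(3):
    user_jobs.put(f"user_job_{i}")

def environment_ok(site):
    """Pilot-side validation hides site non-uniformity from user jobs."""
    return site.get("os") == "linux" and site.get("disk_gb", 0) >= 10

def run_pilot(site):
    if not environment_ok(site):
        print(f"pilot on {site['name']}: validation failed, exiting quietly")
        return
    try:
        job = user_jobs.get_nowait()    # pull work only when a slot is ready
    except Empty:
        print(f"pilot on {site['name']}: no work, exiting")
        return
    print(f"pilot on {site['name']}: running {job}")

for site in [{"name": "site_a", "os": "linux", "disk_gb": 50},
             {"name": "site_b", "os": "linux", "disk_gb": 2}]:
    run_pilot(site)
```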

    HPC resource integration into CMS Computing via HEPCloud

    The higher energy and luminosity from the LHC in Run 2 have put increased pressure on CMS computing resources. Extrapolating to even higher luminosities (and thus higher event complexities and trigger rates) beyond Run 3, it becomes clear that simply scaling up the current model of CMS computing alone will become economically unfeasible. High Performance Computing (HPC) facilities, widely used in scientific computing outside of HEP, have the potential to help fill the gap. Here we describe the U.S. CMS efforts to integrate US HPC resources into CMS computing via the HEPCloud project at Fermilab. We present advancements in our ability to use NERSC resources at scale and efforts to integrate other HPC sites as well. We present experience with the elastic use of HPC resources, quickly scaling up usage when required by CMS workflows. We also present performance studies of the CMS multi-threaded framework on both Haswell and KNL HPC resources.
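    The elastic use of HPC resources mentioned above amounts to a provisioning loop that requests more slots when queued work grows and releases them as demand drains. The Python sketch below shows one such decision rule with made-up numbers; it is not the HEPCloud decision engine.

```python
# Schematic sketch of an elastic provisioning decision: target more HPC slots
# when queued work exceeds what is already provisioned, ramp down otherwise.
# All numbers are illustrative.

def desired_slots(queued_jobs, running_slots, max_slots, jobs_per_slot=1):
    """Return how many slots to target for the next provisioning cycle."""
    needed = queued_jobs // jobs_per_slot + running_slots
    return min(needed, max_slots)

# A burst of workflow jobs arrives, then drains away.
for queued, running in [(50_000, 0), (20_000, 30_000), (0, 10_000)]:
    target = desired_slots(queued, running, max_slots=60_000)
    print(f"queued={queued:>6} running={running:>6} -> target slots {target}")
```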
