Resource Management Services for a Grid Analysis Environment
Selecting optimal resources for submitting jobs on a computational Grid or
accessing data from a data grid is one of the most important tasks of any Grid
middleware. Most modern Grid software takes on this responsibility and
makes a best-effort attempt to solve this problem. Almost all decisions
regarding scheduling and data access are made by the software automatically,
giving users little or no control over the entire process. To solve this
problem, a more interactive set of services and middleware is desired that
provides users more information about Grid weather, and gives them more control
over the decision making process. This paper presents a set of services that
have been developed to provide more interactive resource management
capabilities within the Grid Analysis Environment (GAE) being developed
collaboratively by Caltech, NUST and several other institutes. These include a
steering service, a job monitoring service and an estimator service that have
been designed and written using a common Grid-enabled Web Services framework
named Clarens. The paper also presents a performance analysis of the developed
services to show that they have indeed resulted in a more interactive and
powerful system for user-centric Grid-enabled physics analysis.
Comment: 8 pages, 7 figures. Workshop on Web and Grid Services for Scientific
Data Analysis at the Int Conf on Parallel Processing (ICPP05). Norway June
200
ScotGrid: Providing an Effective Distributed Tier-2 in the LHC Era
ScotGrid is a distributed Tier-2 centre in the UK with sites in Durham,
Edinburgh and Glasgow. ScotGrid has undergone a huge expansion in hardware in
anticipation of the LHC and now provides more than 4MSI2K and 500TB to the LHC
VOs. Scaling up to this level of provision has brought many challenges to the
Tier-2 and we show in this paper how we have adopted new methods of organising
the centres, from fabric management and monitoring to remote management of
sites to management and operational procedures, to meet these challenges. We
describe how we have coped with different operational models at the sites,
where the Glasgow and Durham sites are managed "in house" but resources at
Edinburgh are managed as a central university resource. This required the
adoption of a different fabric management model at Edinburgh and a special
engagement with the cluster managers. Challenges arose from the different job
models of local and grid submission that required special attention to resolve.
We show how ScotGrid has successfully provided an infrastructure for ATLAS and
LHCb Monte Carlo production. Special attention has been paid to ensuring that
user analysis functions efficiently, which has required optimisation of local
storage and networking to cope with the demands of user analysis. Finally,
although these Tier-2 resources are pledged to the whole VO, we have
established close links with our local physics user communities as being the
best way to ensure that the Tier-2 functions effectively as a part of the LHC
grid computing framework.
Comment: Preprint for 17th International Conference on Computing in High
Energy and Nuclear Physics, 7 pages, 1 figure
A Security Monitoring Framework For Virtualization Based HEP Infrastructures
High Energy Physics (HEP) distributed computing infrastructures require
automatic tools to monitor, analyze and react to potential security incidents.
These tools should collect and inspect data such as resource consumption, logs
and sequence of system calls for detecting anomalies that indicate the presence
of a malicious agent. They should also be able to perform automated reactions
to attacks without administrator intervention. We describe a novel framework
that accomplishes these requirements, with a proof of concept implementation
for the ALICE experiment at CERN. We show how we achieve a fully virtualized
environment that improves the security by isolating services and Jobs without a
significant performance impact. We also describe a collected dataset for
Machine Learning based Intrusion Prevention and Detection Systems on Grid
computing. This dataset is composed of resource consumption measurements (such
as CPU, RAM and network traffic), logfiles from operating system services, and
system call data collected from production Jobs running in an ALICE Grid test
site and a big set of malware. This malware was collected from security
research sites. Based on this dataset, we will proceed to develop Machine
Learning algorithms able to detect malicious Jobs.
Comment: Proceedings of the 22nd International Conference on Computing in High
Energy and Nuclear Physics, CHEP 2016, 10-14 October 2016, San Francisco.
Submitted to Journal of Physics: Conference Series (JPCS)