3,763 research outputs found
A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing
Data Grids have been adopted as the platform for scientific communities that
need to share, access, transport, process and manage large data collections
distributed worldwide. They combine high-end computing technologies with
high-performance networking and wide-area storage management techniques. In
this paper, we discuss the key concepts behind Data Grids and compare them with
other data sharing and distribution paradigms such as content delivery
networks, peer-to-peer networks and distributed databases. We then provide
comprehensive taxonomies that cover various aspects of architecture, data
transportation, data replication and resource allocation and scheduling.
Finally, we map the proposed taxonomy to various Data Grid systems not only to
validate the taxonomy but also to identify areas for future exploration.
Through this taxonomy, we aim to categorise existing systems to better
understand their goals and their methodology. This would help evaluate their
applicability for solving similar problems. This taxonomy also provides a "gap
analysis" of this area through which researchers can potentially identify new
issues for investigation. Finally, we hope that the proposed taxonomy and
mapping also helps to provide an easy way for new practitioners to understand
this complex area of research.Comment: 46 pages, 16 figures, Technical Repor
Addressing the Challenges in Federating Edge Resources
This book chapter considers how Edge deployments can be brought to bear in a
global context by federating them across multiple geographic regions to create
a global Edge-based fabric that decentralizes data center computation. This is
currently impractical, not only because of technical challenges, but is also
shrouded by social, legal and geopolitical issues. In this chapter, we discuss
two key challenges - networking and management in federating Edge deployments.
Additionally, we consider resource and modeling challenges that will need to be
addressed for a federated Edge.Comment: Book Chapter accepted to the Fog and Edge Computing: Principles and
Paradigms; Editors Buyya, Sriram
Data acquisition software for the CMS strip tracker
The CMS silicon strip tracker, providing a sensitive area of approximately 200 m2 and comprising 10 million readout channels, has recently been completed at the tracker integration facility at CERN. The strip tracker community is currently working to develop and integrate the online and offline software frameworks, known as XDAQ and CMSSW respectively, for the purposes of data acquisition and detector commissioning and monitoring. Recent developments have seen the integration of many new services and tools within the online data acquisition system, such as event building, online distributed analysis, an online monitoring framework, and data storage management. We review the various software components that comprise the strip tracker data acquisition system, the software architectures used for stand-alone and global data-taking modes. Our experiences in commissioning and operating one of the largest ever silicon micro-strip tracking systems are also reviewed
Monitoring the CMS strip tracker readout system
The CMS Silicon Strip Tracker at the LHC comprises a sensitive area of approximately 200 m2 and 10 million readout channels. Its data acquisition system is based around a custom analogue front-end chip. Both the control and the readout of the front-end electronics are performed by off-detector VME boards in the counting room, which digitise the raw event data and perform zero-suppression and formatting. The data acquisition system uses the CMS online software framework to configure, control and monitor the hardware components and steer the data acquisition. The first data analysis is performed online within the official CMS reconstruction framework, which provides many services, such as distributed analysis, access to geometry and conditions data, and a Data Quality Monitoring tool based on the online physics reconstruction. The data acquisition monitoring of the Strip Tracker uses both the data acquisition and the reconstruction software frameworks in order to provide real-time feedback to shifters on the operational state of the detector, archiving for later analysis and possibly trigger automatic recovery actions in case of errors. Here we review the proposed architecture of the monitoring system and we describe its software components, which are already in place, the various monitoring streams available, and our experiences of operating and monitoring a large-scale system
Many-Task Computing and Blue Waters
This report discusses many-task computing (MTC) generically and in the
context of the proposed Blue Waters systems, which is planned to be the largest
NSF-funded supercomputer when it begins production use in 2012. The aim of this
report is to inform the BW project about MTC, including understanding aspects
of MTC applications that can be used to characterize the domain and
understanding the implications of these aspects to middleware and policies.
Many MTC applications do not neatly fit the stereotypes of high-performance
computing (HPC) or high-throughput computing (HTC) applications. Like HTC
applications, by definition MTC applications are structured as graphs of
discrete tasks, with explicit input and output dependencies forming the graph
edges. However, MTC applications have significant features that distinguish
them from typical HTC applications. In particular, different engineering
constraints for hardware and software must be met in order to support these
applications. HTC applications have traditionally run on platforms such as
grids and clusters, through either workflow systems or parallel programming
systems. MTC applications, in contrast, will often demand a short time to
solution, may be communication intensive or data intensive, and may comprise
very short tasks. Therefore, hardware and software for MTC must be engineered
to support the additional communication and I/O and must minimize task dispatch
overheads. The hardware of large-scale HPC systems, with its high degree of
parallelism and support for intensive communication, is well suited for MTC
applications. However, HPC systems often lack a dynamic resource-provisioning
feature, are not ideal for task communication via the file system, and have an
I/O system that is not optimized for MTC-style applications. Hence, additional
software support is likely to be required to gain full benefit from the HPC
hardware
Metadata for ATLAS
This document provides an overview of the metadata, which are needed to characterize ATLAS event data at different levels (a complete run, data streams within a run, luminosity blocks within a run, individual events)
Software Defined Networks based Smart Grid Communication: A Comprehensive Survey
The current power grid is no longer a feasible solution due to
ever-increasing user demand of electricity, old infrastructure, and reliability
issues and thus require transformation to a better grid a.k.a., smart grid
(SG). The key features that distinguish SG from the conventional electrical
power grid are its capability to perform two-way communication, demand side
management, and real time pricing. Despite all these advantages that SG will
bring, there are certain issues which are specific to SG communication system.
For instance, network management of current SG systems is complex, time
consuming, and done manually. Moreover, SG communication (SGC) system is built
on different vendor specific devices and protocols. Therefore, the current SG
systems are not protocol independent, thus leading to interoperability issue.
Software defined network (SDN) has been proposed to monitor and manage the
communication networks globally. This article serves as a comprehensive survey
on SDN-based SGC. In this article, we first discuss taxonomy of advantages of
SDNbased SGC.We then discuss SDN-based SGC architectures, along with case
studies. Our article provides an in-depth discussion on routing schemes for
SDN-based SGC. We also provide detailed survey of security and privacy schemes
applied to SDN-based SGC. We furthermore present challenges, open issues, and
future research directions related to SDN-based SGC.Comment: Accepte
Online Fault Classification in HPC Systems through Machine Learning
As High-Performance Computing (HPC) systems strive towards the exascale goal,
studies suggest that they will experience excessive failure rates. For this
reason, detecting and classifying faults in HPC systems as they occur and
initiating corrective actions before they can transform into failures will be
essential for continued operation. In this paper, we propose a fault
classification method for HPC systems based on machine learning that has been
designed specifically to operate with live streamed data. We cast the problem
and its solution within realistic operating constraints of online use. Our
results show that almost perfect classification accuracy can be reached for
different fault types with low computational overhead and minimal delay. We
have based our study on a local dataset, which we make publicly available, that
was acquired by injecting faults to an in-house experimental HPC system.Comment: Accepted for publication at the Euro-Par 2019 conferenc
- …