7,371 research outputs found

    High-Throughput Computing on High-Performance Platforms: A Case Study

    The computing systems used by LHC experiments have historically consisted of the federation of hundreds to thousands of distributed resources, ranging from small to mid-size resources. In spite of the impressive scale of the existing distributed computing solutions, the federation of small to mid-size resources will be insufficient to meet projected future demands. This paper is a case study of how the ATLAS experiment has embraced Titan, a DOE leadership facility, in conjunction with traditional distributed high-throughput computing to reach sustained production scales of approximately 52M core-hours a year. The three main contributions of this paper are: (i) a critical evaluation of design and operational considerations to support the sustained, scalable and production usage of Titan; (ii) a preliminary characterization of a next generation executor for PanDA to support new workloads and advanced execution modes; and (iii) early lessons for how current and future experimental and observational systems can be integrated with production supercomputers and other platforms in a general and extensible manner.

    Survey and Analysis of Production Distributed Computing Infrastructures

    This report has two objectives. First, we describe a set of the production distributed infrastructures currently available, so that the reader has a basic understanding of them. This includes explaining why each infrastructure was created and made available and how it has succeeded and failed. The set is not complete, but we believe it is representative. Second, we describe the infrastructures in terms of their use, which is a combination of how they were designed to be used and how users have found ways to use them. Applications are often designed and created with specific infrastructures in mind, with both an appreciation of the existing capabilities provided by those infrastructures and an anticipation of their future capabilities. Here, the infrastructures we discuss were often designed and created with specific applications in mind, or at least specific types of applications. The reader should understand how the interplay between the infrastructure providers and the users leads to such usages, which we call usage modalities. These usage modalities are really abstractions that exist between the infrastructures and the applications; they influence the infrastructures by representing the applications, and they influence the applications by representing the infrastructures.

    Observing the clouds : a survey and taxonomy of cloud monitoring

    This research was supported by a Royal Society Industry Fellowship and an Amazon Web Services (AWS) grant. Date of Acceptance: 10/12/2014. Monitoring is an important aspect of designing and maintaining large-scale systems. Cloud computing presents a unique set of challenges to monitoring, including on-demand infrastructure, unprecedented scalability, rapid elasticity and performance uncertainty. There is a wide range of monitoring tools originating from cluster and high-performance computing, grid computing and enterprise computing, as well as a series of newer bespoke tools, which have been designed exclusively for cloud monitoring. These tools express a number of common elements and designs, which address the demands of cloud monitoring to various degrees. This paper performs an exhaustive survey of contemporary monitoring tools from which we derive a taxonomy, which examines how effectively existing tools and designs meet the challenges of cloud monitoring. We conclude by examining the socio-technical aspects of monitoring, and investigate the engineering challenges and practices behind implementing monitoring strategies for cloud computing. Publisher PDF. Peer reviewed.

    High performance computing simulator for the performance assessment of trajectory based operations

    High performance computing (HPC), both at hardware and software level, has demonstrated significant improvements in processing large datasets in a timely manner. However, HPC in the field of air traffic management (ATM) can be much more than a time-reducing tool. It could also be used to build an ATM simulator for distributed scenarios in which decentralized elements (airspace users) interact through a centralized manager to generate a trajectory-optimized, conflict-free scenario. In this work, we introduce an early prototype of an ATM simulator, focusing on air traffic flow management at strategic, pre-tactical and tactical levels, which allows the calculation of safety and efficiency indicators for optimized trajectories, both at individual and network level. The software architecture of the simulator, relying on an HPC cluster of computers, has been preliminarily tested with a set of flights whose trajectory vertical profiles have been optimized according to two different concepts of operations: conventional cruise operations (i.e., flying at constant altitudes and according to the flight levels scheme rules) and continuous climb cruise operations (i.e., optimizing the trajectories with no vertical constraints). The novel ATM simulator has been tested to show preliminary benchmarking results between these two concepts of operations. The simulator presented here can contribute as a testbed to evaluate the potential benefits of future Trajectory Based Operations and to understand the complex relationships among the different ATM key performance areas. Peer reviewed. Postprint (published version).

    The Strategy of the Commons: Modelling the Annual Cost of Successful ICT Services for European Research

    The provision of ICT services for research is increasingly using Cloud services to complement the traditional federation of computing centres. Due to the complex funding structure and differences in the basic business model, comparing the cost-effectiveness of these options requires a new approach to cost assessment. This paper presents a cost assessment method addressing the limitations of the standard methods and some of the initial results of the study. This acts as an illustration of the kind of cost assessment issues that ICT services with high utilisation rates should consider when choosing between different infrastructure options. The research is co-funded by the European Commission Seventh Framework Programme through the e-FISCAL project (contract number RI-283449).