
    Predicting dataset popularity for the CMS experiment

    The CMS experiment at the LHC accelerator at CERN relies on its computing infrastructure to stay at the frontier of High Energy Physics, searching for new phenomena and making discoveries. Even though computing plays a significant role in physics analysis, we rarely use its data to predict the behavior of the system itself. Basic information about computing resources, user activities and site utilization can be very useful for improving the throughput of the system and its management. In this paper, we discuss a first CMS analysis of dataset popularity based on CMS meta-data, which can be used as a model for dynamic data placement and provides the foundation of a data-driven approach for the CMS computing infrastructure. Comment: Submitted to the proceedings of the 17th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT)
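    A minimal sketch of the kind of data-driven model such meta-data could feed, assuming popularity is framed as predicting next week's accesses from recent access counts; the input file, feature names and classifier choice are illustrative assumptions, not the analysis described in the paper.

    # Hypothetical sketch: predict whether a dataset will be accessed next week
    # from simple meta-data features. The input file, feature names and model
    # choice are illustrative assumptions, not the CMS analysis itself.
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    # Assumed input: one row per dataset per week with aggregated access metrics.
    df = pd.read_csv("dataset_popularity.csv")                    # hypothetical file
    features = ["naccesses_lastweek", "nusers_lastweek", "size_gb", "age_days"]
    X = df[features]
    y = (df["naccesses_nextweek"] > 0).astype(int)                # "popular" = accessed at all

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
    print("held-out accuracy:", clf.score(X_test, y_test))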

    PhEDEx Data Service

    The PhEDEx Data Service provides access to information from the central PhEDEx database, as well as certificate-authenticated managerial operations such as requesting the transfer or deletion of data. The Data Service is integrated with the SiteDB service for fine-grained access control, providing a safe and secure environment for operations. A plug-in architecture allows server-side modules to be developed rapidly and easily by anyone familiar with the schema, and can automatically return the data in a variety of formats for use by different client technologies. Using HTTP access via the Data Service instead of direct database connections makes it possible to build monitoring web pages with complex drill-down operations, suitable for debugging or presentation from many aspects. This will form the basis of the new PhEDEx website in the near future, as well as providing access to PhEDEx information and certificate-authenticated services for other CMS dataflow and workflow management tools such as CRAB, WMCore, DBS and the dashboard. A PhEDEx command-line client tool provides one-stop interactive access to all the functions of the PhEDEx Data Service, for use in simple scripts that do not access the service directly. The client tool provides certificate-authenticated access to managerial functions, so all the functions of the PhEDEx Data Service are available to it. The tool can be extended by plug-ins which combine or extend the client-side manipulation of data from the Data Service, providing a powerful environment for manipulating data within PhEDEx.
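    A minimal sketch of an HTTP query against the Data Service, assuming the public JSON instance at cmsweb.cern.ch and the blockreplicas API; the URL, API name, query parameters and returned structure are assumptions to be checked against the service documentation, and certificate-authenticated calls would additionally need a user certificate or grid proxy.

    # Minimal sketch: query the PhEDEx Data Service over HTTP and read JSON.
    # Instance URL, API name, parameters and response layout are assumptions.
    import json
    import urllib.request

    BASE = "https://cmsweb.cern.ch/phedex/datasvc/json/prod"            # assumed instance
    url = BASE + "/blockreplicas?dataset=/Primary/Processed/RAW"        # hypothetical dataset name

    with urllib.request.urlopen(url) as resp:
        payload = json.load(resp)

    # Walk the assumed JSON layout: blocks, each with a list of site replicas.
    for block in payload["phedex"]["block"]:
        for replica in block["replica"]:
            print(block["name"], replica["node"], replica["complete"])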

    The Spring 2002 DAQ TDR Production

    In Spring 2002, the CMS Production team produced a large sample of Monte Carlo events for the CMS DAQ TDR. This note documents the process by which those events were produced, with details of the tools, the architecture of the production machinery, and the individual sites that took part.

    Resource Monitoring Tool for CMS production

    A monitoring tool is described which not only tracks and recognises errors but also works together with a management system responsible for resource allocation. In cluster/grid computing, the resources of all accessible computers are at the disposal of end users. With that much power at hand, the responsibility of the software managing these resources also increases. Better utilization of resources requires that a monitoring system make the collected data persistent, so that the management system not only has up-to-date information but also a meaningful historical record. This database can then be consulted to find the best available resources in a given scenario, and can also be used to understand historical trends. The Resource Monitoring Tool, RMT, is such a tool, catering for these needs. Its framework is designed in such a way that its capabilities can be extended easily by adding more modules.
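    A minimal sketch of the persistence idea described above: a collector writes periodic resource samples to a small database so that both the current state and the history can be queried when allocating resources. The table layout, metrics and host name are assumptions for illustration, not the actual RMT schema.

    # Illustrative sketch of persistent resource monitoring: sample a node's load
    # and store it with a timestamp so current status and history can be queried.
    # Table layout, metrics and host names are assumptions, not the RMT schema.
    import os
    import sqlite3
    import time

    db = sqlite3.connect("rmt.db")
    db.execute("CREATE TABLE IF NOT EXISTS samples (host TEXT, ts REAL, load1 REAL)")

    def collect_sample(host: str) -> None:
        load1 = os.getloadavg()[0]                      # 1-minute load average (Unix only)
        db.execute("INSERT INTO samples VALUES (?, ?, ?)", (host, time.time(), load1))
        db.commit()

    collect_sample("worker01")                          # hypothetical worker node

    # "Best available resource" here = least loaded host seen in the last 5 minutes.
    row = db.execute("SELECT host, load1 FROM samples WHERE ts > ? "
                     "ORDER BY load1 LIMIT 1", (time.time() - 300,)).fetchone()
    print("least loaded host:", row)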

    Software packaging with DAR

    One of the important tasks in distributed computing is delivering software applications to the computing resources. The Distribution after Release (DAR) tool is used to package software applications for worldwide event production by the CMS Collaboration. This presentation will focus on the concept of packaging applications based on the runtime environment. We discuss solutions for more effective software distribution based on two years' experience with DAR. Finally, we will give an overview of the application distribution process and the interfaces to the CMS production tools.
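    A minimal sketch of the underlying idea of packaging by runtime environment: collect the shared libraries an executable actually links against and bundle them with the binary into a single relocatable archive. This is a deliberate simplification of the concept, not the DAR tool itself; the helper names and example binary are hypothetical.

    # Illustrative sketch of "packaging based on the runtime environment":
    # list the shared libraries a binary loads (via ldd) and bundle them with
    # the binary into one tarball. A simplification of the DAR idea, not DAR.
    import subprocess
    import tarfile

    def runtime_libs(binary: str) -> list[str]:
        """Return resolved shared-library paths reported by ldd."""
        out = subprocess.run(["ldd", binary], capture_output=True, text=True, check=True)
        libs = []
        for line in out.stdout.splitlines():
            parts = line.split("=>")
            if len(parts) == 2 and parts[1].strip().startswith("/"):
                libs.append(parts[1].split()[0])
        return libs

    def package(binary: str, archive: str = "app-runtime.tar.gz") -> None:
        with tarfile.open(archive, "w:gz") as tar:
            tar.add(binary)
            for lib in runtime_libs(binary):
                tar.add(lib)

    package("/usr/bin/python3")                         # hypothetical application binary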

    Towards Managed Terabit/s Scientific Data Flows

    Scientific collaborations on a global scale, such as the LHC experiments at CERN [1], rely today on the presence of high-performance, high-availability networks. In this paper we review the developments performed over the last several years on high-throughput applications, multilayer software-defined network path provisioning, path selection and load balancing methods, and the integration of these methods with the mainstream data transfer and management applications of CMS [2], one of the major LHC experiments. These developments are folded into a compact system capable of moving data among research sites at the 1 Terabit per second scale. Several aspects that went into the design and target different components of the system are presented, including: evaluation of 40 and 100 Gbps capable hardware on both the network and server side, data movement applications, flow management, and the network-application interface leveraging advanced network services. We report on comparative results between several multi-path algorithms and the performance increase obtained using this approach, and present results from the related SC'13 demonstration.
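    A minimal sketch of the flow-to-path assignment problem mentioned above: a greedy balancer places each transfer on whichever candidate path currently has the most spare capacity. Path names, capacities and flow rates are made-up values for illustration; the actual system negotiates paths with SDN controllers and the CMS transfer tools.

    # Illustrative greedy multi-path load balancer: assign each flow to the
    # candidate path with the most remaining capacity. Paths, capacities (Gbps)
    # and flow demands are made-up values for illustration only.
    paths = {"path_A": 100.0, "path_B": 100.0, "path_C": 40.0}    # capacity in Gbps
    used = {p: 0.0 for p in paths}

    def assign(flow_gbps: float) -> str:
        """Place a flow on the path with the largest spare capacity."""
        best = max(paths, key=lambda p: paths[p] - used[p])
        used[best] += flow_gbps
        return best

    for i, rate in enumerate([30, 25, 40, 10, 50]):               # flow demands in Gbps
        print(f"flow {i} ({rate} Gbps) -> {assign(rate)}")
    print("utilisation:", {p: round(used[p] / paths[p], 2) for p in paths})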