100 research outputs found

    Simulating Distributed Systems

    The simulation framework developed within the "Models of Networked Analysis at Regional Centers" (MONARC) project as a design and optimization tool for large-scale distributed systems is presented. The goals are to provide a realistic simulation of distributed computing systems, customized for specific physics data processing tasks, and to offer a flexible and dynamic environment for evaluating the performance of a range of possible distributed computing architectures. A detailed simulation of a large system, the CMS High Level Trigger (HLT) production farm, is also presented.

    The MONARC toolset for simulating large network-distributed processing systems

    The next generation of High Energy Physics experiments have envisaged the use of network-distributed Petabyte-scale data handling and computing systems of unprecedented complexity. The general concept is that of a "Data Grid Hierarchy" in which the central facility at the European Laboratory for Particle Physics (CERN) in Geneva will interact and coherently manage tasks shared by and distributed amongst national "Tier1 (National) Regional Centres" situated in the US, Europe, and Asia. CERN and the Tier1 Centers will further communicate and task-share with the Tier2 Regional Centers, Tier3 centers serving individual universities or research groups, and thousands of "Tier4" desktops and small servers. The design and optimization of systems with this level of complexity requires a realistic description and modeling of the data access patterns, the data flow across the local and wide area networks, and the scheduling and workload presented by hundreds of jobs running concurrently on large-scale distributed systems exchanging very large amounts of data. The simulation toolset developed within the "Models Of Networked Analysis at Regional Centers" (MONARC) project provides a code and execution time-efficient design and optimisation framework for large-scale distributed systems. A process-oriented approach for discrete event simulation has been adopted because it is well suited to describe various activities running concurrently, as well as the stochastic arrival patterns typical of this class of simulations. Threaded objects or "Active Objects" provide a natural way to map the specific behaviour of distributed data processing (and the required flows of data across the networks) into the simulation program. This simulation program is based on Java2(™) technology because of the support for the necessary methods and techniques needed to develop an efficient and flexible distributed process-oriented simulation. This includes a convenient set of interactive graphical presentation and analysis tools, which are essential for the development and effective use of the simulation system. The design elements, status and features of the MONARC simulation tool are presented. The program allows realistic modelling of complex data access patterns by multiple concurrent users in large-scale computing systems in a wide range of possible architectures. Comparison between queuing theory and realistic client-server measurements is also presented.

    A Self-Organizing Neural Network for Job Scheduling in Distributed Systems

    The aim of this work is to describe a possible approach for the optimization of job scheduling in large distributed systems, based on a self-organizing Neural Network. This dynamic scheduling system should be seen as adaptive middle-layer software, aware of currently available resources and making the scheduling decisions using "past experience". It aims to optimize job-specific parameters as well as the resource utilization. The scheduling system is able to dynamically learn and cluster information in a large dimensional parameter space and at the same time to explore new regions in the parameter space. This self-organizing scheduling system may offer a possible solution to provide an effective use of resources for the off-line data processing jobs for future HEP experiments.

    A monitoring framework for large scale networks

    Network monitoring is vital to ensure proper network operation over time, and is tightly integrated with data-intensive processing tasks used by modern large-scale distributed systems. We present a set of dedicated services developed within the MonALISA framework to provide network management. Such services provide in near real-time the globally aggregated status of an entire network. The time evolution of global network topology is presented in a dedicated GUI. Changes in the global topology at this level occur quite frequently and even small modifications in the connectivity map may significantly affect the network performance. The global topology graphs are correlated with active end-to-end network performance measurements, performed using the Fast Data Transfer application, between all sites. Access to both real-time and historical data, as provided by MonALISA, is also important for developing services able to predict the usage pattern, to aid in efficiently allocating resources globally.

    A Distributed Agent-based Architecture for Dynamic Services

    A prototype system for agent-based distributed dynamic services that will be applied to the development of Data Grids for high-energy physics is presented. The agent-based systems we are designing and developing gather, disseminate and coordinate configuration, time-dependent state and other information in the Grid system as a whole. Thes

    DIAMOnDS - DIstributed Agents for MObile and Dynamic Services

    This paper highlights the main features of a new mobile agents' framework for dynamic services based on Jini [2]. Section 2 discusses how code mobility is achieved in DIAMOnDS. Section 3 explains the system architecture and the four main modules developed in the prototype system. Section 4 discusses the security mechanism implemented. Sections 5 and 6 describe the specific agents built using the prototype system and the tests conducted to prove the validity of the system. After the conclusion in Section 7, the current status and future work are discussed in Section