
    3D analytical modelling and iterative solution for high performance computing clusters

    Mobile Cloud Computing enables the migration of services to the edge of the Internet. Therefore, high-performance computing clusters are widely deployed to improve the computational capabilities of such environments. However, they are prone to failures and need analytical models to predict their behaviour in order to deliver the desired quality of service and quality of experience to mobile users. This paper proposes a 3D analytical model and a problem-solving approach for the sustainability evaluation of high-performance computing clusters. The proposed solution uses an iterative approach to obtain performance measures and overcome the state space explosion problem. The availability modelling and evaluation of master and computing nodes are performed using a multi-repairman approach. The optimum number of repairmen is also obtained to get realistic results and reduce the overall cost. The proposed model is validated using discrete event simulation. The analytical approach is much faster than, and in good agreement with, the simulations. The analysis focuses on mean queue length, throughput, and mean response time. The maximum differences between analytical and simulation results in the considered scenarios of up to a billion states are less than 1.149%, 3.82%, and 3.76%, respectively. These differences are well within the 5% confidence interval of the simulation and the proposed model.
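
    As a rough companion to the availability modelling above, the sketch below solves a classic one-dimensional machine-repairman (M/M/R//N) birth-death chain and reports steady-state node availability for different repairman counts. It is not the paper's 3D model or its iterative solver; the node count, failure/repair rates and function names are hypothetical.

```python
# Illustrative sketch: steady-state machine-repairman (M/M/R//N) model.
# N nodes fail independently at rate lam; R repairmen each repair at rate mu.
# All parameters below are hypothetical, not taken from the paper.

def repairman_steady_state(N, R, lam, mu):
    """Return steady-state probabilities p[k] of having k failed nodes (k = 0..N)."""
    # Birth-death balance: p[k+1] = p[k] * (N - k) * lam / (min(k + 1, R) * mu)
    weights = [1.0]
    for k in range(N):
        rate_fail = (N - k) * lam          # failure rate with k nodes already down
        rate_repair = min(k + 1, R) * mu   # repair rate with k + 1 nodes down
        weights.append(weights[-1] * rate_fail / rate_repair)
    total = sum(weights)
    return [w / total for w in weights]

def availability(N, R, lam, mu):
    """Expected fraction of nodes that are up in steady state."""
    p = repairman_steady_state(N, R, lam, mu)
    expected_up = sum((N - k) * pk for k, pk in enumerate(p))
    return expected_up / N

if __name__ == "__main__":
    N, lam, mu = 64, 1 / 1000.0, 1 / 24.0   # hypothetical: MTTF 1000 h, MTTR 24 h
    for R in range(1, 6):
        print(f"repairmen={R}  availability={availability(N, R, lam, mu):.5f}")
```

    Sweeping R this way mirrors the idea of choosing the smallest number of repairmen beyond which the availability gain no longer justifies the extra cost.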

    Architecture of Job scheduling simulator for demand response based resource provisioning

    We study a new service model based on Demand Response (DR) resource provisioning at High Performance Computing (HPC) centers. This DR-based resource provisioning model allows administrators of HPC centers to provide computing services with incentives that compensate users for the performance loss due to power-saving operations. In a power conservation mode, a job’s performance may degrade in terms of both waiting time and execution time. With DR-based resource provisioning, submitted jobs are divided into two categories, allowed jobs and disallowed jobs, depending on the user’s tolerance of performance degradation. Allowed jobs that are in fact affected by the power-saving operations receive compensation according to an incentive system that determines the reward to the user. For designing an appropriate demand response model, we need to focus on the increase in a job’s execution time and waiting time, and the corresponding decrease in power consumption; these are important factors in deriving an incentive system. Currently, no existing approach can reliably quantify the effectiveness and contribution of these factors in HPC job scheduling and resource provisioning. In this paper, we propose a newly developed job scheduling simulator that can evaluate the DR-based resource provisioning approach under various operating conditions. We designed and implemented the job scheduling simulator for HPC demand-response resource provisioning using a general-purpose discrete-event simulator. Our experiments show that the job scheduling simulator can properly represent demand-response resource provisioning under different job scheduling scenarios.
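
    To make the allowed/disallowed split concrete, the following is a minimal FCFS discrete-event sketch in which allowed jobs started inside a power-saving window run with a stretch factor and accrue a credit proportional to the extra execution time. The node count, stretch factor, credit rate and job mix are hypothetical; this is not the simulator built in the paper.

```python
# Minimal FCFS sketch of demand-response (DR) job scheduling.
# Allowed jobs started during a power-saving window are slowed by a stretch
# factor and credited for the extra runtime. All parameters are hypothetical.
import heapq
from dataclasses import dataclass

@dataclass
class Job:
    submit: float    # submission time (hours)
    runtime: float   # nominal execution time at full power (hours)
    allowed: bool    # True = user accepts DR slowdown in exchange for credit

def in_dr_window(t, windows):
    """Return True if time t falls inside any power-saving window."""
    return any(start <= t < end for start, end in windows)

def simulate(jobs, nodes, dr_windows, stretch=1.3, credit_per_hour=1.0):
    """Run jobs FCFS over a pool of identical nodes; return per-job statistics."""
    free_at = [0.0] * nodes              # next free time of each node
    heapq.heapify(free_at)
    results = []
    for job in sorted(jobs, key=lambda j: j.submit):
        start = max(job.submit, heapq.heappop(free_at))
        slowed = job.allowed and in_dr_window(start, dr_windows)
        runtime = job.runtime * (stretch if slowed else 1.0)
        credit = credit_per_hour * (runtime - job.runtime) if slowed else 0.0
        heapq.heappush(free_at, start + runtime)
        results.append({"wait": start - job.submit, "runtime": runtime, "credit": credit})
    return results

if __name__ == "__main__":
    jobs = [Job(submit=i * 0.5, runtime=2.0, allowed=(i % 2 == 0)) for i in range(20)]
    stats = simulate(jobs, nodes=4, dr_windows=[(3.0, 8.0)])
    print("total credit paid:", sum(s["credit"] for s in stats))
    print("mean wait (h):", sum(s["wait"] for s in stats) / len(stats))
```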

    Optimised access to user analysis data using the gLite DPM

    The ScotGrid distributed Tier-2 now provides more than 4 MSI2K and 500 TB for LHC computing, spread across three sites at Durham, Edinburgh and Glasgow. Tier-2 sites have a dual role to play in the computing models of the LHC VOs. Firstly, their CPU resources are used for the generation of Monte Carlo event data. Secondly, end-user analysis data is distributed across the grid to each site's storage system and held on disk ready for processing by physicists' analysis jobs. In this paper we show how we have designed the ScotGrid storage and data management resources in order to optimise physicists' access to LHC data. Within ScotGrid, all sites use the gLite DPM storage manager middleware. Using the EGEE grid to submit real ATLAS analysis code that processes VO data stored on the ScotGrid sites, we present an analysis of the performance of the architecture at one site, and the procedures that may be undertaken to improve it. The results are presented from the point of view of the end user (in terms of number of events processed per second) and from the point of view of the site, which wishes to minimise the load and impact that analysis activity has on other users of the system.
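
    The two reporting viewpoints mentioned above can be summarised with a small aggregation sketch: per-job events per second for the user, and the aggregate read rate the storage system would have to sustain for the site. The job records, sizes and timings below are invented for illustration only.

```python
# Hypothetical per-job records: (events processed, wall-clock seconds, bytes read).
jobs = [
    (50_000, 1_800, 12 * 1024**3),
    (48_000, 2_100, 11 * 1024**3),
    (52_000, 1_750, 13 * 1024**3),
]

# User view: how fast each analysis job churns through events.
for i, (events, wall, _) in enumerate(jobs):
    print(f"job {i}: {events / wall:.1f} events/s")

# Site view: rough aggregate read rate the storage system must sustain
# if these jobs run concurrently over the longest job's wall time.
total_bytes = sum(b for _, _, b in jobs)
window = max(wall for _, wall, _ in jobs)
print(f"aggregate read load: {total_bytes / window / 1024**2:.1f} MB/s")
```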

    CMS Monte Carlo production in the WLCG computing Grid

    Monte Carlo production in CMS has received a major boost in performance and scale since the previous CHEP06 conference. The production system has been re-engineered to incorporate the experience gained in running the previous system and to integrate production with the new CMS event data model, data management system and data processing framework. The system is interfaced to the two major computing Grids used by CMS, the LHC Computing Grid (LCG) and the Open Science Grid (OSG). Operational experience and integration aspects of the new CMS Monte Carlo production system are presented, together with an analysis of production statistics. The new system automatically handles job submission, resource monitoring, job queuing, job distribution according to the available resources, data merging, and registration of data into the data bookkeeping, data location, data transfer and placement systems. Compared to the previous production system, automation, reliability and performance have been considerably improved. A more efficient use of computing resources and better handling of the inherent Grid unreliability have resulted in an increase of production scale by about an order of magnitude, with on the order of ten thousand jobs running in parallel and more than two million events produced per day.
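
    The kind of automation described above can be sketched as a chain of production steps with a retry-and-backoff loop around each one to absorb transient Grid failures. The step names, failure probability and retry policy are assumptions for illustration; this is not the actual CMS production system.

```python
# Toy production pipeline with retries, standing in for automated handling of
# submission, monitoring, merging, registration and transfer. Step names,
# failure rates and the retry policy are hypothetical.
import random
import time

STEPS = ["submit", "monitor", "merge", "register_bookkeeping", "transfer"]

def run_step(step, workflow_id):
    """Pretend to execute one production step; fail randomly to mimic Grid flakiness."""
    if random.random() < 0.2:
        raise RuntimeError(f"{step} failed for workflow {workflow_id}")
    return f"{step} ok"

def run_workflow(workflow_id, max_retries=3, backoff=0.1):
    """Run all steps for one workflow, retrying each step with a simple backoff."""
    for step in STEPS:
        for attempt in range(1, max_retries + 1):
            try:
                print(workflow_id, run_step(step, workflow_id))
                break
            except RuntimeError as err:
                print(f"{err} (attempt {attempt}/{max_retries})")
                time.sleep(backoff * attempt)
        else:
            print(f"workflow {workflow_id}: giving up after step '{step}'")
            return False
    return True

if __name__ == "__main__":
    random.seed(7)
    completed = sum(run_workflow(f"wf-{i}") for i in range(5))
    print(f"{completed}/5 workflows completed")
```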

    Mobile Computing in Physics Analysis - An Indicator for eScience

    This paper presents the design and implementation of a Grid-enabled physics analysis environment for handheld and other resource-limited computing devices, as one example of the use of mobile devices in eScience. Handheld devices offer great potential because they provide ubiquitous access to data and round-the-clock connectivity over wireless links. Our solution aims to give users of handheld devices the capability to launch heavy computational tasks on computational and data Grids, monitor job status during execution, and retrieve results after job completion. Users carry their jobs on their handheld devices in the form of executables (and associated libraries). Users can transparently view the status of their jobs and get back their outputs without having to know where they are being executed. In this way, our system is able to act as a high-throughput computing environment where devices ranging from powerful desktop machines to small handhelds can employ the power of the Grid. The results shown in this paper are readily applicable to the wider eScience community. Comment: 8 pages, 7 figures. Presented at the 3rd International Conference on Mobile Computing and Ubiquitous Networking (ICMU06), London, October 2006.
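
    The submit / poll / retrieve pattern described above can be sketched as follows. The gateway class is a local stub standing in for the Grid-facing service so that the example runs on its own; its method names and job statuses are assumptions rather than the paper's actual interface.

```python
# Sketch of the submit / poll / retrieve pattern for a resource-limited client.
# FakeGateway is a stand-in for a remote Grid gateway; nothing here reflects
# the real middleware interface used in the paper.
import time

class FakeGateway:
    """Stub gateway so the sketch is runnable without any Grid infrastructure."""
    def __init__(self):
        self._jobs = {}

    def submit(self, executable, inputs):
        job_id = f"job-{len(self._jobs)}"
        self._jobs[job_id] = {"polls": 0}
        return job_id

    def status(self, job_id):
        self._jobs[job_id]["polls"] += 1
        return "DONE" if self._jobs[job_id]["polls"] >= 3 else "RUNNING"

    def output(self, job_id):
        return f"results for {job_id}"

def run_from_handheld(gateway, executable, inputs, poll_interval=1.0):
    """Submit a job, poll until it finishes, then fetch its output."""
    job_id = gateway.submit(executable, inputs)
    while gateway.status(job_id) != "DONE":
        time.sleep(poll_interval)   # the handheld only polls; heavy work stays on the Grid
    return gateway.output(job_id)

if __name__ == "__main__":
    print(run_from_handheld(FakeGateway(), "analysis.exe", ["data.root"], poll_interval=0.1))
```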

    A parallel grid-based implementation for real time processing of event log data in collaborative applications

    Collaborative applications usually register user interaction in the form of semi-structured plain-text event log data. Extracting and structuring these data is a prerequisite for later key processes such as the analysis of interactions, assessment of group activity, or the provision of awareness and feedback. Yet, in real situations of online collaborative activity, the processing of log data is usually done offline, since structuring event log data is, in general, a computationally costly process and the amount of log data tends to be very large. Techniques to speed and scale up the structuring and processing of log data with minimal impact on the performance of the collaborative application are thus desirable in order to process log data in real time. In this paper, we present a parallel grid-based implementation for real-time processing of the event log data generated in collaborative applications. Our results show the feasibility of using grid middleware to speed and scale up the process of structuring and processing semi-structured event log data. The Grid prototype follows the Master-Worker (MW) paradigm; it is implemented using the Globus Toolkit (GT) and tested on the PlanetLab platform.
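
    A minimal Master-Worker sketch of the log-structuring step is shown below, using Python multiprocessing in place of the Globus Toolkit / PlanetLab deployment described above. The log line format and regular expression are hypothetical examples of semi-structured event records.

```python
# Master-Worker sketch: workers turn semi-structured event log lines into
# structured records; the master distributes chunks and collects results.
# The log format below is a hypothetical example.
import re
from multiprocessing import Pool

LINE_RE = re.compile(r"^(?P<ts>\S+) (?P<user>\S+) (?P<action>\S+) (?P<detail>.*)$")

def parse_line(line):
    """Worker task: parse one plain-text event line into a dict, or None if malformed."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None

def structure_log(lines, workers=4, chunksize=64):
    """Master: fan lines out to worker processes and collect structured records."""
    with Pool(processes=workers) as pool:
        records = pool.map(parse_line, lines, chunksize=chunksize)
    return [r for r in records if r is not None]

if __name__ == "__main__":
    sample = [
        "2024-01-01T10:00:00 alice post-message forum=42",
        "2024-01-01T10:00:05 bob open-document doc=7",
        "!!corrupted-entry!!",
    ]
    for record in structure_log(sample, workers=2):
        print(record)
```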