    Checkpointing algorithms and fault prediction

    This paper deals with the impact of fault prediction techniques on checkpointing strategies. We extend the classical first-order analysis of Young and Daly in the presence of a fault prediction system, characterized by its recall and its precision. In this framework, we provide an optimal algorithm to decide when to take predictions into account, and we derive the optimal value of the checkpointing period. These results make it possible to assess analytically the key parameters that impact the performance of fault predictors at very large scale. Comment: Supported in part by ANR Rescue. Published in Journal of Parallel and Distributed Computing. arXiv admin note: text overlap with arXiv:1207.693
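
    For orientation, the classical first-order result that this abstract builds on is Young's approximation of the checkpointing period, T = sqrt(2 * C * mu), where C is the time to write one checkpoint and mu is the platform mean time between failures. The sketch below computes only that baseline. The function and parameter names are illustrative assumptions, and the paper's prediction-aware period (which depends on recall and precision) is not reproduced here.

```python
import math


def young_period(checkpoint_cost_s: float, mtbf_s: float) -> float:
    """Young's first-order checkpointing period, sqrt(2 * C * mu).

    checkpoint_cost_s: time to write one checkpoint, in seconds (C).
    mtbf_s: platform mean time between failures, in seconds (mu).
    Both names are hypothetical; they do not come from the paper.
    """
    if checkpoint_cost_s <= 0 or mtbf_s <= 0:
        raise ValueError("checkpoint cost and MTBF must be positive")
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)


if __name__ == "__main__":
    # Example: a 50 s checkpoint on a platform with a 10,000 s MTBF.
    print(young_period(50.0, 10_000.0))
```

    The approximation is only reasonable when the checkpoint cost is small compared with the MTBF, which is the regime the first-order analysis assumes.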

    Load Balancing and Job Migration Algorithms for Autonomic Grid Environment

    Resource management and load balancing are the main areas of concern in a distributed, heterogeneous, and dynamic environment such as a Grid. Load balancing may in turn cause job migration or, in some cases, job resubmission. This paper surveys a number of job migration algorithms that have arisen from the load balancing problem. It also presents a comparative analysis that summarizes the utility and applicability of these algorithms in different environments and circumstances.

    Maintenance Modelling

    Performance Evaluation of Scheduling Algorithms for Real Time Cloud Computing Systems

    Cloud computing shares data and offers services transparently among its users. As the number of cloud users grows, the number of tasks to be scheduled increases. The performance of a cloud depends on the task scheduling algorithms used in its scheduling or brokering components. Scheduling tasks on cloud computing systems is an open research problem in which the matching of tasks to machines and the completion time of the tasks are considered. The task-to-machine matching problem is as follows: assume the number of active hosts is Y and the number of VMs in each host is Z. The maximum number of virtual machines (VMs) on which a single task can be scheduled is Y*Z. If X tasks must be scheduled, the number of possibilities is (Y*Z)^X. Task scheduling is therefore an NP-hard problem. NP-hard here means that no polynomial-time algorithm is known for scheduling tasks on VMs, although there may be an algorithm for verifying a candidate solution. Fault tolerance is key to establishing dependability in a cloud computing system. In task scheduling, a task that does not complete by its deadline is one type of fault. This thesis addresses faults of this type and tries to overcome them. We present a non-preemptive scheduling algorithm that inserts idle time to postpone a task while ensuring that the other task completes its execution within its deadline. In simulation, the proposed algorithm increases profit by 25% and throughput by 25%, and reduces the penalty by 20%, compared with EDF.
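
    The size of the search space stated in this abstract, (Y*Z)^X, can be checked on a small instance. The sketch below is illustrative only: the function names are assumptions, and the brute-force enumeration is there to confirm the count, not to represent the thesis's scheduling algorithm.

```python
from itertools import product


def count_assignments(hosts: int, vms_per_host: int, tasks: int) -> int:
    """Number of ways to place `tasks` tasks on hosts * vms_per_host VMs."""
    return (hosts * vms_per_host) ** tasks


def enumerate_assignments(hosts: int, vms_per_host: int, tasks: int):
    """Yield every assignment as a tuple of (host, vm) pairs, one per task.

    Only feasible for tiny inputs, which is the point of the example.
    """
    vms = [(h, v) for h in range(hosts) for v in range(vms_per_host)]
    return product(vms, repeat=tasks)


if __name__ == "__main__":
    # 2 hosts with 3 VMs each and 4 tasks: 6 choices per task.
    print(count_assignments(2, 3, 4))
    print(sum(1 for _ in enumerate_assignments(2, 3, 4)))
    # 10 hosts with 10 VMs each and 20 tasks is already far too many to enumerate.
    print(count_assignments(10, 10, 20))
```

    The exponential growth in the number of tasks is why exhaustive search is impractical and heuristics are used instead.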

    An Automated Approach of Detection of Memory Leaks for Remote Server Controllers

    Memory leaks are a major concern for long-running applications such as servers, because they cause the working set to grow as the program runs. This eventually leads to a system crash. This paper discusses a staged approach to detecting leaks in the firmware of a remote server controller. A remote server controller monitors the server remotely, with many processes running in the background. Any memory leak in these long-running applications poses a threat to the performance of the system. In the first stage, the approach filters the processes running in the system for those with leaks, based on a time threshold. These leaking processes are passed to the next stage, where precise memory leak detection is done using the open-source dynamic instrumentation tool Valgrind. The system uses an automated approach that invokes the leak detection process when it encounters a severity condition in the system and generates a consolidated leak report. The proposed approach has little impact on system performance and is faster than many available systems because there is no need to modify or recompile the program. In addition, the automated approach offers an effective technique for detecting possible leaks in early software development phases.
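
    A two-stage pipeline of this kind might look like the sketch below. Everything in it is an assumption made for illustration: the abstract does not say how the time threshold is applied, so the first stage here flags a process whose sampled memory never shrinks and grows overall across a window of at least a given duration. The second stage only builds a Valgrind command line (using the real `--leak-check=full` and `--log-file` options) for relaunching a suspect program; it does not run anything.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    t: float     # seconds since monitoring started
    rss_kb: int  # resident set size in KiB at time t


def grows_for(samples, min_duration_s):
    """Assumed stage-one rule: memory never shrinks and ends higher than it
    started, over a window at least min_duration_s long."""
    if len(samples) < 2:
        return False
    span = samples[-1].t - samples[0].t
    non_decreasing = all(
        later.rss_kb >= earlier.rss_kb
        for earlier, later in zip(samples, samples[1:])
    )
    return (
        span >= min_duration_s
        and non_decreasing
        and samples[-1].rss_kb > samples[0].rss_kb
    )


def stage_one(history, min_duration_s):
    """Return the names of processes that pass the coarse filter."""
    return sorted(
        name for name, samples in history.items()
        if grows_for(samples, min_duration_s)
    )


def valgrind_command(program_argv, log_path):
    """Build (not run) a Valgrind invocation for a suspect program."""
    return ["valgrind", "--leak-check=full", f"--log-file={log_path}", *program_argv]
```

    Because Valgrind instruments a program from launch, a real second stage would have to restart each suspect under Valgrind; how the paper's system arranges that is not stated in the abstract.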

    Extended Abstracts: PMCCS3: Third International Workshop on Performability Modeling of Computer and Communication Systems

    Coordinated Science Laboratory was formerly known as Control Systems Laboratory. The pages of the front matter that are missing from the PDF were blank.

    Data Analytics as a Service: A look inside the PANACEA project
