A Survey of Fault-Tolerance and Fault-Recovery Techniques in Parallel Systems
Supercomputing systems today often come in the form of large numbers of
commodity systems linked together into a computing cluster. These systems, like
any distributed system, can have large numbers of independent hardware
components cooperating or collaborating on a computation. Unfortunately, any of
this vast number of components can fail at any time, resulting in potentially
erroneous output. In order to improve the robustness of supercomputing
applications in the presence of failures, many techniques have been developed
to provide resilience to these kinds of system faults. This survey provides an
overview of these various fault-tolerance techniques.
Comment: 11 pages
Resource and Application Models for Advanced Grid Schedulers
As Grid computing becomes an inevitable part of the future, managing, scheduling, and monitoring dynamic, heterogeneous resources will present new challenges. Solutions will have to be agile and adaptive, and support self-organization and autonomous management while maintaining optimal resource utilisation. This paper presents basic principles and architectural concepts for efficient resource allocation in heterogeneous Grid environments.
Multi-round Master-Worker Computing: a Repeated Game Approach
We consider a computing system where a master processor assigns tasks for
execution to worker processors through the Internet. We model the workers'
decision of whether to comply (compute the task) or not (return a bogus result
to save the computation cost) as a mixed extension of a strategic game among
workers. That is, we assume that workers are rational in a game-theoretic
sense, and that they randomize their strategic choice. Workers are assigned
multiple tasks in subsequent rounds. We model the system as an infinitely
repeated game of the mixed extension of the strategic game. In each round, the
master decides stochastically whether to accept the answer of the majority or
verify the answers received, at some cost. Incentives and/or penalties are
applied to workers accordingly. Under the above framework, we study the
conditions under which the master can reliably obtain task results, exploiting
the fact that the repeated-games model captures the effect of long-term interaction.
That is, workers take into account that their behavior in one computation will
have an effect on the behavior of other workers in the future. Indeed, should a
worker be found to deviate from some agreed strategic choice, the remaining
workers would change their own strategy to penalize the deviator. Hence, being
rational, workers do not deviate. We identify analytically the parameter
conditions to induce a desired worker behavior, and we evaluate experimentally
the mechanisms derived from such conditions. We also compare the
performance of our mechanisms with a previously known multi-round mechanism
based on reinforcement learning.
Comment: 21 pages, 3 figures
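The round structure described in this abstract (workers comply or return a bogus result, and the master either audits or accepts the majority answer) can be illustrated with a minimal Python sketch. This is not the authors' mechanism: the function names `settle_round` and `play_round`, the reward and penalty values, and the rule of paying workers who agree with the majority in unaudited rounds are all assumptions made for illustration, and the repeated-game punishment of deviators is not modeled.

```python
import random
from collections import Counter


def settle_round(answers, correct, verify, reward=1.0, penalty=2.0):
    """Return (accepted_answer, payoffs) for a single round.

    answers: values returned by the workers
    correct: the true task result (the master learns it only by verifying)
    verify:  True if the master audits this round
    reward, penalty: hypothetical payoff values, not taken from the paper
    """
    if verify:
        # Auditing reveals the truth: pay compliant workers, fine the rest.
        payoffs = [reward if a == correct else -penalty for a in answers]
        return correct, payoffs
    # No audit: accept the majority answer and pay workers who agree with it
    # (an assumed payment rule; the abstract does not specify one).
    majority, _ = Counter(answers).most_common(1)[0]
    payoffs = [reward if a == majority else 0.0 for a in answers]
    return majority, payoffs


def play_round(n_workers, p_cheat, p_verify, rng, correct=42, bogus=0):
    """Simulate one round with randomized worker and master choices."""
    # Each worker independently returns a bogus result with probability p_cheat.
    answers = [bogus if rng.random() < p_cheat else correct
               for _ in range(n_workers)]
    # The master audits with probability p_verify.
    verify = rng.random() < p_verify
    return settle_round(answers, correct, verify)


if __name__ == "__main__":
    rng = random.Random(0)
    accepted, payoffs = play_round(5, p_cheat=0.3, p_verify=0.5, rng=rng)
    print(accepted, payoffs)
```

In this sketch a cheating majority wins whenever the master does not audit, which is the risk that the verification probability and penalties are meant to counter.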
Report of the 2014 NSF Cybersecurity Summit for Large Facilities and Cyberinfrastructure
This event was supported in part by the National Science Foundation under Grant Number 1234408. Any opinions, findings, and conclusions or recommendations expressed at the event or in this report are those of the authors and do not necessarily reflect the views of the National Science Foundation.
Ransomware: Current Trend, Challenges, and Research Directions
Ransomware attacks have become a global occurrence, with the primary aim of making monetary gains through illicit means. The attacks started through e-mails and have expanded through spamming and phishing. Ransomware encrypts targets' files and displays notifications requesting payment before the data can be unlocked. The ransom demand is usually in the form of a virtual currency, such as bitcoin, because it is difficult to track. In this paper, we give a brief overview of the current trends, challenges, and research progress in the bid to find lasting solutions to the menace of ransomware that currently challenges computer and network security and data privacy.