
    The Robustness of Resource Allocation in Parallel and Distributed Computing Systems

    This paper gives an overview of the material to be discussed in the invited keynote presentation by H. J. Siegel. Performing computing and communication tasks on parallel and distributed systems involves the coordinated use of different types of machines, networks, interfaces, and other resources. Decisions about how best to allocate resources are often based on estimated values of task and system parameters, due to uncertainties in the system environment. An important research problem is the development of resource management strategies that can guarantee a particular system performance given such uncertainties. We have designed a methodology for deriving the degree of robustness of a resource allocation: the maximum amount of collective uncertainty in system parameters within which a user-specified level of system performance (QoS) can be guaranteed. Our four-step procedure for deriving a robustness metric for an arbitrary system will be presented. We will illustrate this procedure and its usefulness by deriving robustness metrics for some example distributed systems.

    Robustness of resource allocation in parallel and distributed computing systems, The

    Includes bibliographical references (page [9]). This paper gives an overview of the material to be discussed in the invited keynote presentation by H. J. Siegel; it summarizes our research in [1]. Performing computing and communication tasks on parallel and distributed systems involves the coordinated use of different types of machines, networks, interfaces, and other resources. Decisions about how best to allocate resources are often based on estimated values of task and system parameters, due to uncertainties in the system environment. An important research problem is the development of resource management strategies that can guarantee a particular system performance given such uncertainties. We have designed a methodology for deriving the degree of robustness of a resource allocation: the maximum amount of collective uncertainty in system parameters within which a user-specified level of system performance (QoS) can be guaranteed. Our four-step procedure for deriving a robustness metric for an arbitrary system will be presented. We will illustrate this procedure and its usefulness by deriving robustness metrics for some example distributed systems.

    Models and heuristics for robust resource allocation in parallel and distributed computing systems

    Includes bibliographical references. This is an overview of the robust resource allocation research efforts that have been and continue to be conducted by the CSU Robustness in Computer Systems Group. Parallel and distributed computing systems, consisting of a (usually heterogeneous) set of machines and networks, frequently operate in environments where delivered performance degrades due to unpredictable circumstances. Such unpredictability can be the result of sudden machine failures, increases in system load, or errors caused by inaccurate initial estimation. Research on developing models and heuristics that create robust resource allocations for parallel and distributed computing systems is presented. This research was supported by NSF under grant No. CNS-0615170 and by the Colorado State University George T. Abell Endowment.

    Measuring robustness for distributed computing systems

    Includes bibliographical references (page 6). Performing computing and communication tasks on parallel and distributed systems may involve the coordinated use of different types of machines, networks, interfaces, and other resources. All of these resources should be allocated in a way that maximizes some system performance measure. However, allocation decisions and performance prediction are often based on "nominal" values of application and system parameters. The actual values of these parameters may differ from the nominal ones, e.g., because of inaccuracies in the initial estimation or because of changes over time caused by an unpredictable system environment. An important question then arises: given a system design, what extent of departure from the assumed circumstances will cause the performance to be unacceptably degraded? That is, how robust is the system? To address this issue, one needs a methodology for deriving the degree of robustness of a resource allocation: the maximum amount of collective uncertainty in application and system parameters within which a user-specified level of performance can be guaranteed. Our procedure for this is presented in this paper. The main contributions of this research are (1) a mathematical description of a metric for the robustness of a resource allocation with respect to desired system performance features against multiple perturbations in multiple system and environmental conditions, (2) a procedure for deriving a robustness metric for an arbitrary system, and (3) example applications of this procedure to several different systems.
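
The idea of "the maximum amount of collective uncertainty within which a performance level can be guaranteed" can be illustrated with a small sketch. It is not taken from the paper. It assumes a simple hypothetical setting: each machine's finish time is the sum of the estimated execution times of the tasks assigned to it, the performance requirement is that no machine finishes later than `tau` times the nominal makespan, and uncertainty is measured as the Euclidean distance between actual and estimated execution times. The function name and parameters are illustrative only.

```python
import math


def robustness_radius(finish_times, task_counts, tau):
    """Illustrative robustness radius under the assumptions stated above.

    finish_times: nominal finish time of each machine (sum of the
                  estimated execution times of its assigned tasks).
    task_counts:  number of tasks assigned to each machine.
    tau:          tolerance factor (>= 1) on the nominal makespan.

    Returns the smallest Euclidean perturbation of the execution-time
    estimates that can push some machine past tau * makespan.
    """
    makespan = max(finish_times)
    bound = tau * makespan

    radii = []
    for finish, count in zip(finish_times, task_counts):
        if count == 0:
            continue  # an idle machine cannot violate the bound
        # Distance from the nominal point to the hyperplane
        # "sum of this machine's execution times == bound".
        radii.append((bound - finish) / math.sqrt(count))

    # The allocation is only as robust as its most fragile machine.
    return min(radii)


# Two machines: one runs 4 tasks and finishes at 10.0, the other
# runs 1 task and finishes at 8.0; a 20% makespan increase is tolerated.
radius = robustness_radius([10.0, 8.0], [4, 1], tau=1.2)
```

In this example the bound is 12.0, so the first machine tolerates a perturbation of (12 - 10) / sqrt(4) = 1.0 and the second tolerates (12 - 8) / sqrt(1) = 4.0. The smaller value, 1.0, is the radius for the allocation as a whole.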

    Management and Service-aware Networking Architectures (MANA) for Future Internet Position Paper: System Functions, Capabilities and Requirements

    Future Internet (FI) research and development threads have recently been gaining momentum all over the world, and the international race to create a new-generation Internet is in full swing: GENI, Asia Future Internet, Future Internet Forum Korea, and the European Union Future Internet Assembly (FIA). This position paper identifies the research orientation over a 10-year time horizon, together with the key challenges for the capabilities in the Management and Service-aware Networking Architectures (MANA) part of the Future Internet (FI), allowing for parallel and federated Internet(s).