4 research outputs found

    Towards an Autonomic Cluster Management System (ACMS) with Reflex Autonomicity

    Cluster computing, whereby a large number of simple processors or nodes are combined so that they appear to function as a single powerful computer, has emerged as a research area in its own right. The approach offers a relatively inexpensive means of providing a fault-tolerant environment and achieving significant computational capabilities for high-performance computing applications. However, the task of manually managing and configuring a cluster quickly becomes daunting as the cluster grows in size. Autonomic computing, with its vision of self-management, can potentially solve many of the problems inherent in cluster management. We describe the development of a prototype Autonomic Cluster Management System (ACMS) that exploits autonomic properties to automate cluster management, and its evolution to include reflex reactions via pulse monitoring.

    Birds of a Feather Session: “Autonomic Computing: Panacea or Poppycock?”

    Autonomic Pulse Communications for Adaptive Transmission Range in Decentralised Robot Swarms

    The SysMES Framework: System Management for Networked Embedded Systems and Clusters

    Automated system management for large distributed and heterogeneous environments is a common challenge in modern computer science. Desired properties of such a management system include minimal dependency on human operators for problem recognition and solution, adaptability to increasing loads, fault tolerance, and the flexibility to integrate new management resources at runtime. Existing tools address parts of these requirements; however, there is no single integrated framework that possesses all of the characteristics mentioned. SysMES was developed as an integrated framework for automated monitoring and management of networked devices. To meet the requirements of scalability and fault tolerance, a fully distributed and decentralized architecture was chosen. The framework comprises a monitoring module, a rule engine, and an executive module for the execution of actions. A formal language has been defined that allows administrators to specify complex spatial and temporal rule conditions for failure states and the corresponding reactions. These rules reduce the number and duration of manual interventions in the managed environment by resolving problems automatically. SysMES is based on standards, ensuring interoperability and manufacturer independence. The object-oriented modeling of management resources allows several abstraction levels for handling the complexity of managing large and heterogeneous environments. Management resources can be extended and (re)configured without downtime for increased flexibility. Multiple tests and a reference installation demonstrate the suitability of SysMES for automated management of large heterogeneous environments.