359,842 research outputs found

    Towards Automated Network Configuration Management

    Get PDF
    Modern networks are designed to satisfy a wide variety of competing goals related to network operation requirements such as reachability, security, performance, reliability and availability. These high level goals are realized through a complex chain of low level configuration commands performed on network devices. As networks become larger, more complex and more heterogeneous, human errors become the most significant threat to network operation and the main cause of network outage. In addition, the gap between high-level requirements and low-level configuration data is continuously increasing and difficult to close. Although many solutions have been introduced to reduce the complexity of configuration management, network changes, in most cases, are still manually performed via low--level command line interfaces (CLIs). The Internet Engineering Task Force (IETF) has introduced NETwork CONFiguration (NETCONF) protocol along with its associated data--modeling language, YANG, that significantly reduce network configuration complexity. However, NETCONF is limited to the interaction between managers and agents, and it has weak support for compliance to high-level management functionalities. We design and develop a network configuration management system called AutoConf that addresses the aforementioned problems. AutoConf is a distributed system that manages, validates, and automates the configuration of IP networks. We propose a new framework to augment NETCONF/YANG framework. This framework includes a Configuration Semantic Model (CSM), which provides a formal representation of domain knowledge needed to deploy a successful management system. Along with CSM, we develop a domain--specific language called Structured Configuration language to specify configuration tasks as well as high--level requirements. CSM/SCL together with NETCONF/YANG makes a powerful management system that supports network--wide configuration. AutoConf supports two levels of verifications: consistency verification and behavioral verification. We apply a set of logical formalizations to verifying the consistency and dependency of configuration parameters. In behavioral verification, we present a set of formal models and algorithms based on Binary Decision Diagram (BDD) to capture the behaviors of forwarding control lists that are deployed in firewalls, routers, and NAT devices. We also adopt an enhanced version of Dyna-Q algorithm to support dynamic adaptation of network configuration in response to changes occurred during network operation. This adaptation approach maintains a coherent relationship between high level requirements and low level device configuration. We evaluate AutoConf by running several configuration scenarios such as interface configuration, RIP configuration, OSPF configuration and MPLS configuration. We also evaluate AutoConf by running several simulation models to demonstrate the effectiveness and the scalability of handling large-scale networks

    Autonomic Management in a Distributed Storage System

    Get PDF
    This thesis investigates the application of autonomic management to a distributed storage system. Effects on performance and resource consumption were measured in experiments, which were carried out in a local area test-bed. The experiments were conducted with components of one specific distributed storage system, but seek to be applicable to a wide range of such systems, in particular those exposed to varying conditions. The perceived characteristics of distributed storage systems depend on their configuration parameters and on various dynamic conditions. For a given set of conditions, one specific configuration may be better than another with respect to measures such as resource consumption and performance. Here, configuration parameter values were set dynamically and the results compared with a static configuration. It was hypothesised that under non-changing conditions this would allow the system to converge on a configuration that was more suitable than any that could be set a priori. Furthermore, the system could react to a change in conditions by adopting a more appropriate configuration. Autonomic management was applied to the peer-to-peer (P2P) and data retrieval components of ASA, a distributed storage system. The effects were measured experimentally for various workload and churn patterns. The management policies and mechanisms were implemented using a generic autonomic management framework developed during this work. The experimental evaluations of autonomic management show promising results, and suggest several future research topics. The findings of this thesis could be exploited in building other distributed storage systems that focus on harnessing storage on user workstations, since these are particularly likely to be exposed to varying, unpredictable conditions.Comment: PhD Thesis, University of St Andrews, 2009. Supervisor: Graham Kirb

    Automatic performance optimisation of component-based enterprise systems via redundancy

    Get PDF
    Component technologies, such as J2EE and .NET have been extensively adopted for building complex enterprise applications. These technologies help address complex functionality and flexibility problems and reduce development and maintenance costs. Nonetheless, current component technologies provide little support for predicting and controlling the emerging performance of software systems that are assembled from distinct components. Static component testing and tuning procedures provide insufficient performance guarantees for components deployed and run in diverse assemblies, under unpredictable workloads and on different platforms. Often, there is no single component implementation or deployment configuration that can yield optimal performance in all possible conditions under which a component may run. Manually optimising and adapting complex applications to changes in their running environment is a costly and error-prone management task. The thesis presents a solution for automatically optimising the performance of component-based enterprise systems. The proposed approach is based on the alternate usage of multiple component variants with equivalent functional characteristics, each one optimized for a different execution environment. A management framework automatically administers the available redundant variants and adapts the system to external changes. The framework uses runtime monitoring data to detect performance anomalies and significant variations in the application's execution environment. It automatically adapts the application so as to use the optimal component configuration under the current running conditions. An automatic clustering mechanism analyses monitoring data and infers information on the components' performance characteristics. System administrators use decision policies to state high-level performance goals and configure system management processes. A framework prototype has been implemented and tested for automatically managing a J2EE application. Obtained results prove the framework's capability to successfully manage a software system without human intervention. The management overhead induced during normal system execution and through management operations indicate the framework's feasibility

    Improving efficiency and resilience in large-scale computing systems through analytics and data-driven management

    Full text link
    Applications running in large-scale computing systems such as high performance computing (HPC) or cloud data centers are essential to many aspects of modern society, from weather forecasting to financial services. As the number and size of data centers increase with the growing computing demand, scalable and efficient management becomes crucial. However, data center management is a challenging task due to the complex interactions between applications, middleware, and hardware layers such as processors, network, and cooling units. This thesis claims that to improve robustness and efficiency of large-scale computing systems, significantly higher levels of automated support than what is available in today's systems are needed, and this automation should leverage the data continuously collected from various system layers. Towards this claim, we propose novel methodologies to automatically diagnose the root causes of performance and configuration problems and to improve efficiency through data-driven system management. We first propose a framework to diagnose software and hardware anomalies that cause undesired performance variations in large-scale computing systems. We show that by training machine learning models on resource usage and performance data collected from servers, our approach successfully diagnoses 98% of the injected anomalies at runtime in real-world HPC clusters with negligible computational overhead. We then introduce an analytics framework to address another major source of performance anomalies in cloud data centers: software misconfigurations. Our framework discovers and extracts configuration information from cloud instances such as containers or virtual machines. This is the first framework to provide comprehensive visibility into software configurations in multi-tenant cloud platforms, enabling systematic analysis for validating the correctness of software configurations. This thesis also contributes to the design of robust and efficient system management methods that leverage continuously monitored resource usage data. To improve performance under power constraints, we propose a workload- and cooling-aware power budgeting algorithm that distributes the available power among servers and cooling units in a data center, achieving up to 21% improvement in throughput per Watt compared to the state-of-the-art. Additionally, we design a network- and communication-aware HPC workload placement policy that reduces communication overhead by up to 30% in terms of hop-bytes compared to existing policies.2019-07-02T00:00:00

    CACHE DATA REPLACEMENT POLICY BASED ON RECENTLY USED ACCESS DATA AND EUCLIDEAN DISTANCE

    Get PDF
    Data access management in web-based applications that use relational databases must be well thought out because the data continues to grow every day. The Relational Database Management System (RDBMS) has a relatively slow access speed because the data is stored on disk. This causes problems with decreased database server performance and slow response times. One strategy to overcome this is to implement caching at the application level. This paper proposed SIMGD framework that models Application Level Caching (ALC) to speed up relational data access in web applications. The ALC strategy maps each controller and model that has access to the database into a node-data in the in-Memory Database (IMDB). Not all node-data can be included in IMDB due to limited capacity. Therefore, the SIMGD framework uses the Euclidean distance calculation method for each node-data with its top access data as a cache replacement policy. Node-data with Euclidean distance closer to their top access data have a high priority to be maintained in the caching server. Simulation results show at the 25KB cache configuration, the SIMGD framework excels in achieving hit ratios compared to the LRU algorithm of 6.46% and 6.01%, respectively

    Future MAUS payload and the TWIN-MAUS configuration

    Get PDF
    The German MAUS project (materials science autonomous experiments in weightlessness) was initiated in 1979 for optimum utilization of NASA's Get Away Special (GAS) program. The standard MAUS system was developed to meet GAS requirements and can accommodate a wide variety of GAS-type experiments. The system offers a range of services to experimenters within the framework of standardized interfaces. Four MAUS payloads being prepared for future space shuttle flight opportunities are described. The experiments include critical Marangoni convection, oscillatory Marangoni convection, pool boiling, and gas bubbles in glass melts. Scientific objectives as well as equipment hardware are presented together with recent improvements to the MAUS standard system, e.g., a new experiment control and data management unit and a semiconductor memory. A promising means of increasing resources in the field of GAS experiments is the interconnection of GAS containers. This important feature has been studied to meet the challenge of future advanced payloads. In the TWIN-MAUS configuration, electrical power and data will be transferred between two containers mounted adjacent to each other
    • …
    corecore