
    Experimental analysis of computer system dependability

    This paper reviews an area which has evolved over the past 15 years: experimental analysis of computer system dependability. Methodologies and advances are discussed for three basic approaches used in the area: simulated fault injection, physical fault injection, and measurement-based analysis. The three approaches are suited, respectively, to dependability evaluation in the three phases of a system's life: the design phase, the prototype phase, and the operational phase. Before the discussion of these phases, several statistical techniques used in the area are introduced. For each phase, a classification of research methods or study topics is outlined, followed by a discussion of these methods or topics as well as representative studies. The statistical techniques introduced include the estimation of parameters and confidence intervals, probability distribution characterization, and several multivariate analysis methods. Importance sampling, a statistical technique used to accelerate Monte Carlo simulation, is also introduced. The discussion of simulated fault injection covers electrical-level, logic-level, and function-level fault injection methods as well as representative simulation environments such as FOCUS and DEPEND. The discussion of physical fault injection covers hardware, software, and radiation fault injection methods as well as several software and hybrid tools including FIAT, FERRARI, HYBRID, and FINE. The discussion of measurement-based analysis covers measurement and data processing techniques, basic error characterization, dependency analysis, Markov reward modeling, software dependability, and fault diagnosis. The discussion involves several important issues studied in the area, including fault models, fast simulation techniques, workload/failure dependency, correlated failures, and software fault tolerance.
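    The survey introduces importance sampling only at a conceptual level; purely as an illustration of the idea (not taken from the paper), the following is a minimal sketch that estimates a rare failure probability by sampling from a biased exponential failure law and reweighting each hit by the likelihood ratio. The rates, mission time, and function name are assumptions chosen for the example.

```python
import math
import random

def importance_sampling_failure_prob(true_rate, biased_rate, mission_time,
                                     n_samples, seed=0):
    """Estimate P(failure before mission_time) for an exponential failure law
    with rate `true_rate`, sampling from a biased exponential with the larger
    `biased_rate` so that rare failures occur often, then reweighting each
    sample by the likelihood ratio f_true(t) / f_biased(t)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Draw a failure time from the biased (easier-to-hit) distribution.
        t = rng.expovariate(biased_rate)
        if t <= mission_time:
            # Likelihood ratio of the true density to the biased density at t.
            weight = (true_rate * math.exp(-true_rate * t)) / (
                biased_rate * math.exp(-biased_rate * t))
            total += weight
    return total / n_samples

if __name__ == "__main__":
    # True failure probability is 1 - exp(-rate * T); compare with the estimate.
    rate, T = 1e-5, 100.0
    est = importance_sampling_failure_prob(rate, biased_rate=1e-2,
                                           mission_time=T, n_samples=100_000)
    print(f"estimate = {est:.3e}, exact = {1 - math.exp(-rate * T):.3e}")
```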

    Performance Benchmarks for Custom Applications: Considerations and Strategies

    The motivation for this research came from the need to solve a problem affecting not only the company used in this study, but also many other companies in the information technology industry facing a similar problem: how to conduct performance benchmarks for custom applications in an effective, unbiased, and accurate manner. This paper presents the pros and cons of existing benchmark methodologies. It proposes a combination of the best characteristics of these benchmarks into a methodology that addresses the problem from an application perspective, considering the overall synergy between the operating system and the software. The author also discusses a software design to implement the proposed methodology. The proposed methodology is generic enough to be adapted to any particular application performance-benchmarking situation.
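    The paper describes its methodology and software design at a high level without giving code; the sketch below is only a generic application-level timing harness in that spirit, discarding warm-up runs and reporting summary statistics. The function name, warm-up count, and stand-in workload are assumptions, not the author's design.

```python
import statistics
import time

def benchmark(workload, warmup=3, repetitions=10):
    """Time a callable `workload` the way a simple application-level benchmark
    might: discard warm-up runs (caches, JIT, OS page cache), then report
    summary statistics over repeated measured runs."""
    for _ in range(warmup):
        workload()
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return {
        "mean_s": statistics.mean(samples),
        "stdev_s": statistics.stdev(samples),
        "min_s": min(samples),
        "max_s": max(samples),
    }

if __name__ == "__main__":
    # Stand-in workload; a real benchmark would drive the custom application
    # end to end (e.g. a representative transaction mix).
    result = benchmark(lambda: sum(i * i for i in range(200_000)))
    print(result)
```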

    Performance Modeling of the PeopleSoft Multi-Tier Remote Computing Architecture

    Complex client-server configurations being designed today require a new and closely coordinated approach to analytic modeling and measurement. A closed queuing network model for a two-tiered PeopleSoft 6 client-server system with an Oracle database server is demonstrated using a new performance modeling tool that applies mean value analysis. The focus of this work is on the measurement and modeling of the PeopleSoft architecture to provide useful capacity planning insights for an actual large-scale university-wide deployment. A testbed and database exerciser are then developed to measure model parameters and perform the initial validation tests. The testbed also provides preliminary test data on a proposed three-tiered deployment architecture that includes the Citrix WinFrame environment as an intermediate level between the client and the Oracle server.
    http://deepblue.lib.umich.edu/bitstream/2027.42/107929/1/citi-tr-97-5.pd
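    The modeling tool referenced above applies mean value analysis, but its implementation is not given in the abstract; the following is a textbook sketch of exact single-class MVA for a closed queuing network, not the paper's tool. The station demands, population, and think time in the example are hypothetical.

```python
def mva(service_demands, n_customers, think_time=0.0):
    """Exact mean value analysis for a closed, single-class, product-form
    queuing network. `service_demands` holds the total service demand D_k
    (visits * service time) at each queueing station; `think_time` models an
    optional delay (client think) station. Returns throughput, per-station
    residence times, and per-station mean queue lengths at full population."""
    queue_len = [0.0] * len(service_demands)   # Q_k(0) = 0
    throughput = 0.0
    residence = [0.0] * len(service_demands)
    for n in range(1, n_customers + 1):
        # Residence time at each station with n customers in the network.
        residence = [d * (1.0 + q) for d, q in zip(service_demands, queue_len)]
        throughput = n / (think_time + sum(residence))
        # Little's law applied per station.
        queue_len = [throughput * r for r in residence]
    return throughput, residence, queue_len

if __name__ == "__main__":
    # Illustrative two-tier transaction: client CPU, server CPU, and server
    # disk, with per-transaction service demands in seconds.
    X, R, Q = mva([0.010, 0.030, 0.020], n_customers=25, think_time=5.0)
    print(f"throughput = {X:.2f} tx/s, response time = {sum(R) * 1000:.1f} ms")
```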

    Performance studies of file system design choices for two concurrent processing paradigms


    Real-time co-ordinated resource management in a computational environment

    Design co-ordination is an emerging engineering design management philosophy with an emphasis on timeliness and appropriateness. A key element of design co-ordination has been identified as resource management, the aim of which is to facilitate the optimised use of resources throughout a dynamic and changeable process. An approach to operational design co-ordination has been developed that incorporates the appropriate techniques to ensure that the aim of co-ordinated resource management can be fulfilled. This approach has been realised within an agent-based software system, called the Design Coordination System (DCS), such that a computational design analysis can be managed in a coherent and co-ordinated manner. The DCS is applied to a computational analysis for turbine blade design provided by industry. The application of the DCS involves resources, i.e. workstations within a computer network, being utilised to perform the computational analysis, using a suite of software tools to calculate the stress and vibration characteristics of turbine blades. The application of the system shows that the utilisation of resources can be optimised throughout the computational design analysis despite the variable nature of the computer network.
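    The internals of the agent-based DCS are not given in the abstract; purely as an illustration of co-ordinated resource allocation, the sketch below greedily assigns analysis tasks to whichever workstation becomes free earliest. The workstation names and task durations are hypothetical, and a real system would also react to workstations joining or leaving the network.

```python
import heapq

def schedule_tasks(task_durations, workstations):
    """Greedy list scheduling: assign each task (estimated duration in
    seconds) to the workstation that becomes free earliest. A toy stand-in
    for co-ordinated resource management, not the DCS implementation."""
    # Min-heap of (time the workstation becomes free, workstation name).
    heap = [(0.0, w) for w in workstations]
    heapq.heapify(heap)
    assignment = {w: [] for w in workstations}
    for task_id, duration in enumerate(task_durations):
        free_at, station = heapq.heappop(heap)
        assignment[station].append(task_id)
        heapq.heappush(heap, (free_at + duration, station))
    makespan = max(t for t, _ in heap)
    return assignment, makespan

if __name__ == "__main__":
    # Hypothetical stress/vibration analysis steps with estimated run times.
    tasks = [120, 45, 300, 60, 90, 180]
    plan, makespan = schedule_tasks(tasks, ["ws01", "ws02", "ws03"])
    print(plan, f"estimated makespan: {makespan:.0f} s")
```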

    A real-time diagnostic and performance monitor for UNIX

    There are now over one million UNIX sites, and the pace at which new installations are added is steadily increasing. With this increase comes a need to develop simple, efficient, effective and adaptable ways of simultaneously collecting real-time diagnostic and performance data. This need exists because distributed systems can give rise to complex failure situations that are often unidentifiable with single-machine diagnostic software. The simultaneous collection of error and performance data is also important for research in failure prediction and error/performance studies. This paper introduces a portable method to concurrently collect real-time diagnostic and performance data on a distributed UNIX system. The combined diagnostic/performance data collection is implemented on a distributed multi-computer system using SUN4s as servers. The approach uses existing UNIX system facilities to gather system dependability information such as error and crash reports. In addition, performance data such as CPU utilization, disk usage, I/O transfer rate and network contention are also collected. In the future, the collected data will be used to identify dependability bottlenecks and to analyze the impact of failures on system performance.
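    The collector described above is built on SunOS facilities that are not reproduced in the abstract; the sketch below only illustrates the general idea of sampling performance indicators and a coarse error-log signal from standard UNIX interfaces. The log path, sampled fields, and interval are assumptions for a modern Linux-like host, not the paper's SUN4 setup.

```python
import datetime
import json
import os
import shutil
import time

def sample(log_path="/var/log/syslog"):
    """Collect one combined diagnostic/performance sample from standard UNIX
    facilities: load averages (CPU pressure), root filesystem usage, and the
    size of the system log as a coarse proxy for new error reports."""
    load1, load5, load15 = os.getloadavg()
    disk = shutil.disk_usage("/")
    try:
        log_bytes = os.path.getsize(log_path)
    except OSError:
        log_bytes = None  # log not readable on this host
    return {
        "timestamp": datetime.datetime.now().isoformat(timespec="seconds"),
        "load_avg": [load1, load5, load15],
        "disk_used_pct": round(100.0 * disk.used / disk.total, 1),
        "syslog_bytes": log_bytes,
    }

if __name__ == "__main__":
    # Append one sample per interval to a line-delimited JSON trace for
    # later error/performance correlation studies.
    for _ in range(3):
        print(json.dumps(sample()))
        time.sleep(1)
```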