2,440 research outputs found

    Checkpointing algorithms and fault prediction

    Get PDF
    This paper deals with the impact of fault prediction techniques on checkpointing strategies. We extend the classical first-order analysis of Young and Daly in the presence of a fault prediction system, characterized by its recall and its precision. In this framework, we provide an optimal algorithm to decide when to take predictions into account, and we derive the optimal value of the checkpointing period. These results allow to analytically assess the key parameters that impact the performance of fault predictors at very large scale.Comment: Supported in part by ANR Rescue. Published in Journal of Parallel and Distributed Computing. arXiv admin note: text overlap with arXiv:1207.693

    Review of recent research towards power cable life cycle management

    Get PDF
    Power cables are integral to modern urban power transmission and distribution systems. For power cable asset managers worldwide, a major challenge is how to manage effectively the expensive and vast network of cables, many of which are approaching, or have past, their design life. This study provides an in-depth review of recent research and development in cable failure analysis, condition monitoring and diagnosis, life assessment methods, fault location, and optimisation of maintenance and replacement strategies. These topics are essential to cable life cycle management (LCM), which aims to maximise the operational value of cable assets and is now being implemented in many power utility companies. The review expands on material presented at the 2015 JiCable conference and incorporates other recent publications. The review concludes that the full potential of cable condition monitoring, condition and life assessment has not fully realised. It is proposed that a combination of physics-based life modelling and statistical approaches, giving consideration to practical condition monitoring results and insulation response to in-service stress factors and short term stresses, such as water ingress, mechanical damage and imperfections left from manufacturing and installation processes, will be key to success in improved LCM of the vast amount of cable assets around the world

    Prediction of remaining life of power transformers based on left truncated and right censored lifetime data

    Get PDF
    Prediction of the remaining life of high-voltage power transformers is an important issue for energy companies because of the need for planning maintenance and capital expenditures. Lifetime data for such transformers are complicated because transformer lifetimes can extend over many decades and transformer designs and manufacturing practices have evolved. We were asked to develop statistically-based predictions for the lifetimes of an energy company's fleet of high-voltage transmission and distribution transformers. The company's data records begin in 1980, providing information on installation and failure dates of transformers. Although the dataset contains many units that were installed before 1980, there is no information about units that were installed and failed before 1980. Thus, the data are left truncated and right censored. We use a parametric lifetime model to describe the lifetime distribution of individual transformers. We develop a statistical procedure, based on age-adjusted life distributions, for computing a prediction interval for remaining life for individual transformers now in service. We then extend these ideas to provide predictions and prediction intervals for the cumulative number of failures, over a range of time, for the overall fleet of transformers.Comment: Published in at http://dx.doi.org/10.1214/00-AOAS231 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Quality-driven optimized resource allocation

    Get PDF
    The assurance of a good software product quality necessitates a managed software process. Periodic product evaluation (inspection and testing) should be executed during the development process in order to simultaneously guarantee the timeliness and quality aspects of the development workflow. A faithful prediction of the efforts needed forms the basis of a project management (PM) in order to perform a proper human resource allocation to the different development and QA activities. However, even robust resource demand and quality estimation tools, like COCOMO II and COQUALMO do not cover the timeliness point of view sufficiently due to their static nature. Correspondingly, continuous quality monitoring and quality driven supervisory control of the development process became vital aspects in PM. A well-established complementary approach uses the Weibull model to describe the dynamics of the development and QA process by a mathematical model based on the observations gained during the development process. Supervisory PM control has to concentrate development and QA resources to eliminate quality bottlenecks, as different parts (modules) of the product under development may reveal different defect density levels. Nevertheless, traditional heuristic quality management is unable to perform optimal resource allocation in the case of complex target programs. This paper presents a model-based quality-driven optimized resource allocation method. It combines the COQUALMO model as early quality predictor and empirical knowledge formulated by a Weibull model gained by the continuous monitoring of the QA process flow. An exact mathematical optimization technique is used for human resource, like tester allocation

    ENHANCEMENT AND COMPARISON OF ANT COLONY OPTIMIZATION FOR SOFTWARE RELIABILITY MODELS

    Get PDF
    In Common parlance, the traditional software reliability estimation methods often rely on assumptions like statistical distributions that are often dubious and unrealistic. The ability to predict the number of faults during development phase and a proper testing process helps in specifying timely release of software and efficient management of project resources. In the Present Study Enhancement and Comparison of Ant Colony Optimization Methods for Software Reliability Models are studied and the estimation accuracy was calculated. The Enhanced method shows significant advantages in finding the goodness of fit for software reliability model such as finite and infinite failure Poisson model and binomial models

    Experimental analysis of computer system dependability

    Get PDF
    This paper reviews an area which has evolved over the past 15 years: experimental analysis of computer system dependability. Methodologies and advances are discussed for three basic approaches used in the area: simulated fault injection, physical fault injection, and measurement-based analysis. The three approaches are suited, respectively, to dependability evaluation in the three phases of a system's life: design phase, prototype phase, and operational phase. Before the discussion of these phases, several statistical techniques used in the area are introduced. For each phase, a classification of research methods or study topics is outlined, followed by discussion of these methods or topics as well as representative studies. The statistical techniques introduced include the estimation of parameters and confidence intervals, probability distribution characterization, and several multivariate analysis methods. Importance sampling, a statistical technique used to accelerate Monte Carlo simulation, is also introduced. The discussion of simulated fault injection covers electrical-level, logic-level, and function-level fault injection methods as well as representative simulation environments such as FOCUS and DEPEND. The discussion of physical fault injection covers hardware, software, and radiation fault injection methods as well as several software and hybrid tools including FIAT, FERARI, HYBRID, and FINE. The discussion of measurement-based analysis covers measurement and data processing techniques, basic error characterization, dependency analysis, Markov reward modeling, software-dependability, and fault diagnosis. The discussion involves several important issues studies in the area, including fault models, fast simulation techniques, workload/failure dependency, correlated failures, and software fault tolerance

    Software timing analysis for complex hardware with survivability and risk analysis

    Get PDF
    The increasing automation of safety-critical real-time systems, such as those in cars and planes, leads, to more complex and performance-demanding on-board software and the subsequent adoption of multicores and accelerators. This causes software's execution time dispersion to increase due to variable-latency resources such as caches, NoCs, advanced memory controllers and the like. Statistical analysis has been proposed to model the Worst-Case Execution Time (WCET) of software running such complex systems by providing reliable probabilistic WCET (pWCET) estimates. However, statistical models used so far, which are based on risk analysis, are overly pessimistic by construction. In this paper we prove that statistical survivability and risk analyses are equivalent in terms of tail analysis and, building upon survivability analysis theory, we show that Weibull tail models can be used to estimate pWCET distributions reliably and tightly. In particular, our methodology proves the correctness-by-construction of the approach, and our evaluation provides evidence about the tightness of the pWCET estimates obtained, which allow decreasing them reliably by 40% for a railway case study w.r.t. state-of-the-art exponential tails.This work is a collaboration between Argonne National Laboratory and the Barcelona Supercomputing Center within the Joint Laboratory for Extreme-Scale Computing. This research is supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, under contract number DE-AC02- 06CH11357, program manager Laura Biven, and by the Spanish Government (SEV2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), by Generalitat de Catalunya (contract 2014-SGR-1051).Peer ReviewedPostprint (author's final draft
    • 

    corecore