
    Two-Dimensional Software Defect Models with Test Execution History


    Stochastic orderings and Bayesian inference in software reliability (Ordenaciones estocásticas e inferencia bayesiana en fiabilidad de software)

    Within the last decade of the 20th century and the first few years of the 21st, the demand for complex software systems has increased, and the reliability of software systems has therefore become a major concern for modern society. Software reliability is defined as the probability of failure-free software operation for a specified period of time in a specified environment. Many current software reliability techniques and practices are detailed by Lyu and Pham. From a statistical point of view, the random variables that characterize software reliability are the epoch times at which software failures take place or the times between failures. Most of the well-known models for software reliability are centered around the interfailure times or the point processes that they generate. A software reliability model specifies the general form of the dependence of the failure process on the principal factors that affect it: fault introduction, fault removal, and the operational environment. The purpose of this thesis is threefold: (1) to study stochastic properties of times between failures relative to independent but not identically distributed random variables; (2) to investigate properties of the epoch times of nonhomogeneous pure birth processes, as an extension of the nonhomogeneous Poisson processes used in the software reliability literature; and (3) to develop a software reliability model based on the use of covariate information such as software metrics. First, properties of statistics based on heterogeneous samples are investigated with the aid of stochastic orders. Stochastic orders between probability distributions are a widely studied concept; several kinds of stochastic orders are used to compare different aspects of probability distributions, such as location, variability, skewness, and dependence. Second, ageing notions and stochastic orderings of the epoch times of nonhomogeneous pure birth processes are studied. Ageing notions are another important concept in reliability theory, and many classes of life distributions are characterized or defined in the literature according to their ageing properties: positive ageing means that components tend to deteriorate with age through wear, as happens with hardware, whereas negative ageing means that a system improves as it passes successive tests, as happens with software. Finally, we exhibit a non-parametric model based on Gaussian processes to predict the number of software failures and the times between failures. Gaussian processes are a flexible and attractive method for a wide variety of supervised learning problems, such as regression and classification in machine learning. This thesis is organized as follows. In Chapter 1, we present some basic software reliability measures. After providing a brief review of stochastic point processes and models of ordered random variables, we discuss the relationship between these kinds of models and the types of failure data. This is followed by a brief review of some stochastic orderings and ageing notions. The chapter concludes with a review of some well-known software reliability models. The results of Chapter 2 concern stochastic orders for spacings of the order statistics of independent exponential random variables with different scale parameters. These results are based on the relation between the spacings and the times between successive software failures. Because the distributions involved have complicated expressions in the non-iid case, only limited results are available in the literature.
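
    For reference, the two stochastic orders used most heavily in Chapter 2 can be stated as follows; these are the standard textbook definitions, not results of the thesis. For random variables $X$ and $Y$ with densities $f_X, f_Y$ and survival functions $\bar F_X, \bar F_Y$,

$$X \le_{\mathrm{hr}} Y \iff \frac{\bar F_Y(t)}{\bar F_X(t)}\ \text{is increasing in } t, \qquad\qquad X \le_{\mathrm{lr}} Y \iff \frac{f_Y(t)}{f_X(t)}\ \text{is increasing in } t,$$

    where $\bar F = 1 - F$. The likelihood ratio order implies the hazard rate order, which in turn implies the usual stochastic order $\bar F_X(t) \le \bar F_Y(t)$ for all $t$.
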
In the first part of this chapter, we investigate the hazard rate ordering of simple spacings and normalized spacings of a sample of heterogeneous exponential random variables. Kochar and Korwar proved that, for three heterogeneous exponential random variables, the normalized spacings are hazard rate ordered, and conjectured that the same holds for the general case of n variables; Section 2.2 presents progress on this conjecture. In the second part of the chapter, we study the two-sample problem. Specifically, we compare both simple spacings and normalized spacings from two samples of heterogeneous exponential random variables according to the likelihood ratio ordering, and we show applications of these results to multiple-outlier models. In Chapter 3, motivated by the equality in distribution between sequential order statistics and the first n epoch times of a nonhomogeneous pure birth process, we consider the problem of comparing the components of sequential k-out-of-n systems according to magnitude and location orders. In particular, this chapter discusses conditions on the underlying distribution functions on which the sequential order statistics are based under which ageing notions and stochastic comparisons of the sequential order statistics are obtained. We also present a nonhomogeneous pure birth process approach to software reliability modelling. A large number of models have been proposed in the literature to predict software failures, but few incorporate the significant metrics data observed during software testing. In Chapter 4, we develop a new procedure to predict both interfailure times and numbers of software failures using metrics information, from a Bayesian perspective. In particular, we develop a hierarchical non-parametric regression model based on exponential interfailure times or Poisson failure counts, where the rates are modelled as Gaussian processes with software metrics data as inputs, and we illustrate it with concrete examples. In Chapter 5, we present general conclusions and describe the most significant contributions of this thesis.
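
    A minimal sketch of the Chapter 4 idea, written with scikit-learn for concreteness: regress the (log) failure count on a software metric with a Gaussian process. The thesis's actual model is hierarchical and Bayesian, with Poisson or exponential likelihoods whose rates follow a GP; the plain GP regression below only approximates that, and all data values and names are invented.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Invented example data: module size in lines of code (the software metric)
# and the number of failures observed while testing each module.
loc = np.array([120.0, 450.0, 800.0, 1500.0, 2300.0, 4000.0, 6500.0]).reshape(-1, 1)
failures = np.array([1, 3, 4, 9, 11, 18, 30])

# GP prior on the log failure rate, so predictions stay positive after exp().
kernel = ConstantKernel(1.0) * RBF(length_scale=1500.0)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(loc, np.log(failures))

# Predict the failure count for an unseen module, with a crude uncertainty band.
mean, sd = gp.predict(np.array([[3000.0]]), return_std=True)
print(f"predicted failures for a 3000-LOC module: {np.exp(mean[0]):.1f}",
      f"(1-sd band {np.exp(mean[0] - sd[0]):.1f} to {np.exp(mean[0] + sd[0]):.1f})")
```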

    Reliability models for HPC applications and a Cloud economic model

    With the enormous number of computing resources in HPC and Cloud systems, failures become a major concern. Failure behaviors such as reliability, failure rate, and mean time to failure therefore need to be understood to manage such large systems efficiently. This dissertation makes three major contributions to HPC and Cloud studies. First, a reliability model with correlated failures in a k-node system for HPC applications is studied. The model is extended to account for failure correlation, which improves its accuracy. The Marshall-Olkin multivariate Weibull distribution is refined via the excess life (conditional Weibull) to better estimate system reliability, and a univariate method is proposed for estimating the Marshall-Olkin multivariate Weibull parameters of a system composed of a large number of nodes. The failure rate and mean time to failure are then derived. The model is validated using log data from the Blue Gene/L system at LLNL. Results show that when node failures are correlated, the system becomes less reliable. Second, a reliability model for Cloud computing is proposed. The reliability, mean time to failure, and failure rate are estimated for a system of k nodes and s virtual machines under four scenarios: (1) hardware and software components fail independently; (2) software components fail independently, while hardware component failures are correlated; (3) software failures are correlated, while hardware components fail independently; and (4) software and hardware failures are dependent. Results show that if the failures of the nodes and/or the software possess a degree of dependency, the system becomes less reliable; an increase in the number of computing components also decreases the reliability of the system. Finally, an economic model for a Cloud service provider is proposed. This economic model aims to maximize profit through right pricing and rightsizing in the Cloud data center. Total cost is a key element of the model and is analyzed by considering the Total Cost of Ownership (TCO) of the Cloud.
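
    To make the correlation effect concrete, here is a small Monte Carlo sketch of the common-shock idea underlying the Marshall-Olkin construction (our illustration, not the dissertation's estimation procedure; every name and parameter value is invented). Each node fails at the earlier of its own shock and a shock shared by all nodes; a k-node application that needs every node alive fails at the minimum node lifetime. Note that the shared shock also adds an extra failure source, so, in line with the dissertation's qualitative finding, the correlated system shows a shorter mean time to failure.

```python
import numpy as np

rng = np.random.default_rng(42)

def node_lifetimes(k, shape, scale_ind, scale_common=None, n_sim=100_000):
    """Node lifetimes under a Marshall-Olkin-style common-shock construction.

    A finite scale_common adds a shared shock that can take down every node
    at once, which is what correlates the node failures."""
    t = scale_ind * rng.weibull(shape, size=(n_sim, k))
    if scale_common is not None:
        common = scale_common * rng.weibull(shape, size=(n_sim, 1))
        t = np.minimum(t, common)  # each node fails at its earliest shock
    return t

k, shape = 64, 0.7  # decreasing-hazard Weibull, often reported for HPC failure data
indep = node_lifetimes(k, shape, scale_ind=1000.0)
corr = node_lifetimes(k, shape, scale_ind=1000.0, scale_common=3000.0)

# A k-node job runs only while all its nodes are up (series system), so the
# job's time to failure is the minimum of the node lifetimes.
print("estimated MTTF, independent node failures:", indep.min(axis=1).mean())
print("estimated MTTF, correlated node failures :", corr.min(axis=1).mean())
```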

    Reliability analysis of a repairable dependent parallel system


    Vol. 10, No. 2 (Full Issue)


    Decision support model for condition-based maintenance of production equipment (Modelo de apoio à decisão para a manutenção condicionada de equipamentos produtivos)

    Doctoral thesis for the PhD degree in Industrial and Systems Engineering. Introduction: This thesis describes a methodology that combines a Bayesian control chart with condition-based maintenance (CBM) to develop a new integrated model. In maintenance management, making appropriate and accurate decisions is a challenging task for the decision-maker, and well-designed CBM models are valuable aids to maintenance decision making. The integration of a Bayesian control chart with CBM yields an intelligent model and a suitable strategy for forecasting item failures while keeping maintenance costs under control. CBM models provide lower inventory costs for spare parts, reduce unplanned outages, and minimize the risk of catastrophic failure, avoiding the high penalties associated with production losses or delays and increasing availability. However, CBM models need to integrate new types of information into maintenance modelling to improve their results. Objective: The thesis aims to develop a new methodology, based on a Bayesian control chart, for predicting item failures by simultaneously incorporating two types of data: key quality-control measurements and equipment condition parameters. In other words, the research questions are directed at achieving lower maintenance costs in real process control. Method: The mathematical approach adopted for developing an optimal condition-based maintenance policy includes Weibull analysis for verifying the Markov property, the delay-time concept for deterioration modelling, and particle swarm optimization (PSO) together with Monte Carlo simulation. These models are used to find the upper control limit and the monitoring interval that minimize the maintenance cost function. Result: The main contribution of this thesis is that the proposed model performs better than previous models, supporting the hypothesis that simultaneously using equipment condition parameters and quality-control measurements improves the effectiveness of the integrated Bayesian control chart model for condition-based maintenance.
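
    The optimization step can be pictured with a deliberately simple renewal-reward sketch (ours, not the thesis's model: there is no Bayesian chart or PSO here, the search is a plain grid, and every constant is invented). A unit degrades at a random rate, is inspected every tau hours at a cost, is preventively replaced when the noisy condition signal crosses the control limit h, and incurs a much larger cost if it fails first; the long-run cost per hour is cycle cost over cycle length.

```python
import numpy as np

rng = np.random.default_rng(7)

FAIL_LEVEL = 10.0                       # true degradation level at which the unit fails
C_INSPECT, C_PREVENT, C_FAIL = 10.0, 200.0, 2000.0

def simulate_cycle(tau, h):
    """One replacement cycle; returns (cost, length in hours)."""
    rate = rng.gamma(2.0, 0.5)          # this unit's random degradation rate
    t, cost = 0.0, 0.0
    while True:
        t += tau
        if rate * t >= FAIL_LEVEL:      # failed before this inspection
            return cost + C_FAIL, t
        cost += C_INSPECT
        signal = rate * t + rng.normal(0.0, 0.3)   # noisy condition measurement
        if signal >= h:                 # control limit crossed: preventive replacement
            return cost + C_PREVENT, t

def cost_rate(tau, h, n=4000):
    costs, lengths = zip(*(simulate_cycle(tau, h) for _ in range(n)))
    return sum(costs) / sum(lengths)    # renewal-reward: long-run cost per hour

# Crude grid search for the (monitoring interval, control limit) pair
# that minimizes the long-run maintenance cost rate.
grid = [(tau, h) for tau in (1.0, 2.0, 5.0) for h in (6.0, 7.5, 9.0)]
tau_opt, h_opt = min(grid, key=lambda p: cost_rate(*p))
print("approximately optimal (tau, h):", (tau_opt, h_opt))
```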

    Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis

    This document, Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis, is intended to provide guidelines for the collection and evaluation of risk- and reliability-related data. It is aimed at scientists and engineers familiar with risk and reliability methods and provides a hands-on approach to the investigation and application of a variety of risk and reliability data assessment methods, tools, and techniques. The document provides both a broad perspective on data collection and evaluation issues and a narrow focus on the methods needed to implement a comprehensive information repository. The topics addressed cover the fundamentals of how data and information are to be used in risk and reliability analysis models and their potential role in decision making. Understanding these topics is essential to attaining the risk-informed decision-making environment sought by NASA requirements and procedures such as NPR 8000.4 (Agency Risk Management Procedural Requirements), NPR 8705.05 (Probabilistic Risk Assessment Procedures for NASA Programs and Projects), and the System Safety requirements of NPR 8715.3 (NASA General Safety Program Requirements).
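
    As a flavor of the kind of calculation such a handbook walks through (a canonical conjugate-prior example of ours, not an excerpt from the document): if a failure probability $p$ has a $\mathrm{Beta}(\alpha, \beta)$ prior and $x$ failures are observed in $n$ independent demands, then

$$p \mid x \sim \mathrm{Beta}(\alpha + x,\ \beta + n - x), \qquad E[p \mid x] = \frac{\alpha + x}{\alpha + \beta + n},$$

    so the posterior mean blends the prior mean $\alpha/(\alpha+\beta)$ with the observed frequency $x/n$ in proportion to their effective sample sizes.
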

    Optimal Policies in Reliability Modelling of Systems Subject to Sporadic Shocks and Continuous Healing

    Indiana University-Purdue University Indianapolis (IUPUI). Recent years have seen a growth in research on system reliability and maintenance. Various studies in the scientific fields of reliability engineering, quality and productivity analysis, risk assessment, software reliability, and probabilistic machine learning are being undertaken in the present era. The dependency of human life on technology has made it more important to maintain such systems and maximize their potential. In this dissertation, we present methodologies that maximize certain measures of system reliability, explain the underlying stochastic behavior of certain systems, and prevent the risk of system failure. An overview of the dissertation is provided in Chapter 1, where we briefly discuss some useful definitions and concepts in probability theory and stochastic processes and present some mathematical results required in later chapters. Thereafter, we present the motivation and outline of each subsequent chapter. In Chapter 2, we compute the limiting average availability of a one-unit repairable system with spare units and repair facilities. Formulas for the limiting average availability of a repairable system exist only for some special cases: (1) either the lifetime or the repair time is exponential; or (2) there is one spare unit and one repair facility. In contrast, we consider a more general setting involving several spare units and several repair facilities, and we allow arbitrary life- and repair-time distributions. Under periodic monitoring, which essentially discretizes the time variable, we compute the limiting average availability. The discretization approach closely approximates the existing results in the special cases and demonstrates, as anticipated, that the limiting average availability increases with additional spare units and/or repair facilities. In Chapter 3, the system experiences two types of sporadic impact: valid shocks that cause damage instantaneously and positive interventions that induce partial healing. Whereas each shock inflicts a fixed magnitude of damage, the accumulated effect of $k$ positive interventions nullifies the damaging effect of one shock. The system is said to be in Stage 1, where it can possibly heal, until the net count of impacts (valid shocks registered minus valid shocks nullified) reaches a threshold $m_1$. The system then enters Stage 2, where no further healing is possible. The system fails when the net count of valid shocks reaches another threshold $m_2\ (> m_1)$. The inter-arrival times between successive valid shocks and those between successive positive interventions are independent and follow arbitrary distributions; we thus remove the restrictive assumption of an exponential distribution often found in the literature. We find the distributions of the sojourn time in Stage 1 and of the failure time of the system. Finally, we find the optimal values of the choice variables that minimize the expected maintenance cost per unit time for three different maintenance policies. In Chapter 4, the Stage 1 defined above is further subdivided into two parts: in the early part, called Stage 1A, healing happens faster than in the later part, called Stage 1B. The system stays in Stage 1A until the net count of impacts reaches a predetermined threshold $m_A$; then the system enters Stage 1B and stays there until the net count reaches another predetermined threshold $m_1\ (> m_A)$. Subsequently, the system enters Stage 2, where it can no longer heal. The system fails when the net count of valid shocks reaches another predetermined, higher threshold $m_2\ (> m_1)$. All other assumptions are the same as those in Chapter 3. We calculate the percentage improvement in the lifetime of the system due to the subdivision of Stage 1. Finally, we make optimal choices to minimize the expected maintenance cost per unit time for two maintenance policies. Next, we eliminate the restrictive assumptions that all valid shocks and all positive interventions have equal magnitude and that the boundary threshold is a preset constant. In Chapter 5, we study a system that experiences damaging external shocks of random magnitude at stochastic intervals, continuous degradation, and self-healing. The system fails if the cumulative damage exceeds a time-dependent threshold. We develop a preventive maintenance policy to replace the system so that its lifetime is utilized prudently. Further, we consider three variations on the healing pattern: (1) shocks heal for a fixed finite duration $\tau$; (2) a fixed proportion of shocks are non-healable (that is, $\tau = 0$); and (3) there are two types of shocks: self-healable shocks, which heal for a finite duration, and non-healable shocks. We implement the proposed preventive maintenance policy and compare the optimal replacement times in these new cases with those in the original case, where all shocks heal indefinitely. Finally, in Chapter 6, we present a summary of the dissertation with conclusions and future research potential.
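
    A small Monte Carlo sketch of the Chapter 3 net-count mechanism (our illustration under simplifying assumptions: unit-magnitude shocks, the net count floored at zero, and invented inter-arrival distributions and thresholds; the dissertation derives these distributions analytically rather than by simulation).

```python
import numpy as np

rng = np.random.default_rng(1)

def failure_time(m1, m2, k, shock_gap, heal_gap):
    """Failure time of the two-stage shock/healing model.

    Valid shocks raise the net damage count by one; while the system is in
    Stage 1, every k-th positive intervention cancels one registered shock.
    Stage 1 ends once the net count first reaches m1 (healing then stops),
    and the system fails when the net count reaches m2 (> m1)."""
    t_shock, t_heal = shock_gap(), heal_gap()
    net, pending, stage1 = 0, 0, True
    while True:
        if t_shock <= t_heal:              # next event: a valid shock
            t = t_shock
            net += 1
            if net >= m2:
                return t                   # system failure
            if stage1 and net >= m1:
                stage1 = False             # Stage 2: no further healing
            t_shock = t + shock_gap()
        else:                              # next event: a positive intervention
            t = t_heal
            if stage1:
                pending += 1
                if pending == k:           # k interventions nullify one shock
                    pending = 0
                    net = max(net - 1, 0)  # cannot cancel shocks never registered
            t_heal = t + heal_gap()

# Arbitrary (non-exponential) inter-arrival distributions, as the model allows.
shock_gap = lambda: rng.gamma(2.0, 0.5)        # mean 1.0 between valid shocks
heal_gap = lambda: 0.6 * rng.weibull(1.5)      # mean ~0.54 between interventions

samples = [failure_time(m1=5, m2=12, k=3, shock_gap=shock_gap, heal_gap=heal_gap)
           for _ in range(10_000)]
print("estimated mean time to failure:", np.mean(samples))
```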

    Proceedings of the 35th International Workshop on Statistical Modelling: July 20-24, 2020, Bilbao, Basque Country, Spain

    466 p. The International Workshop on Statistical Modelling (IWSM) is a reference workshop for promoting statistical modelling and applications of statistics, in a broad sense, among researchers, academics, and industrialists. Unfortunately, the global COVID-19 pandemic did not allow the 35th edition of the IWSM to be held in Bilbao in July 2020. Despite the situation, and following the spirit of the Workshop and the Statistical Modelling Society, we are delighted to bring you this proceedings book of extended abstracts.

    NHPP-Based Software Reliability Model with Marshall-Olkin Failure Time Distribution
