767 research outputs found

    Stochastic model checking for predicting component failures and service availability

    When a component fails in a critical communications service, how urgent is a repair? If we repair within 1 hour, 2 hours, or n hours, how does this affect the likelihood of service failure? Can a formal model support assessing the impact, prioritisation, and scheduling of repairs in the event of component failures, and forecasting of maintenance costs? These are some of the questions posed to us by a large organisation and here we report on our experience of developing a stochastic framework based on a discrete space model and temporal logic to answer them. We define and explore both standard steady-state and transient temporal logic properties concerning the likelihood of service failure within certain time bounds, forecasting maintenance costs, and we introduce a new concept of envelopes of behaviour that quantify the effect of the status of lower level components on service availability. The resulting model is highly parameterised and user interaction for experimentation is supported by a lightweight, web-based interface

    Hybrid Stochastic Models for Remaining Lifetime Prognosis

    The United States Air Force is developing its next generation aircraft and is seeking to reduce the risk of catastrophic failures, maintenance activities, and the logistics footprint while improving its sortie generation rate through a process called autonomic logistics. Vital to the successful implementation of this process is remaining lifetime prognosis of critical aircraft components. Complicating this problem is the absence of failure time information; however, sensors located on the aircraft are providing degradation measures. This research has provided a method to address at least a portion of this problem by uniting analytical lifetime distribution models with environment and/or degradation measures to obtain the remaining lifetime distribution

    Analytical Results for a Single-Unit System Subject To Markovian Wear and Shocks

    This thesis develops and analyzes a mathematical model for the reliability measures of a single-unit system subject to continuous wear due to its operating environment and randomly occurring shocks that inflict a random amount of damage to the unit. Assuming a Markovian operating environment and shock arrival mechanism, Laplace-Stieltjes transform expressions are obtained for the failure time distribution and all of its moments. Moreover, an analytical expression is derived for the long-run availability of the single-unit system when it is subject to an inspect-and-replace maintenance policy. The analytical results are illustrated and compared with those obtained from Monte Carlo-simulated failure data. The numerical results indicate that the reliability measures may be accurately computed via numerical inversion of the transform expressions in a straightforward manner when the input parameters are known a priori. In stark contrast to the simulation model, which requires several hours to obtain the reliability measures, the analytical procedure computes the same measures in only a few seconds
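
The simulation side of such a comparison can be sketched roughly as follows. This is a minimal illustration, not the thesis's model: the two-state environment, the linear wear rates, the exponential shock damage, and every parameter name (`wear_rates`, `switch_rates`, `shock_rate`, `mean_damage`, `threshold`) are assumptions introduced here.

```python
import random


def simulate_failure_time(wear_rates, switch_rates, shock_rate, mean_damage,
                          threshold, rng):
    """Simulate one failure time under a cumulative-damage threshold.

    Hypothetical setup: the environment alternates between states 0 and 1,
    leaving state i at rate switch_rates[i] and wearing the unit linearly at
    wear_rates[i]. Shocks arrive as a Poisson process and each adds an
    exponentially distributed amount of damage.
    """
    t, damage, state = 0.0, 0.0, 0
    while True:
        wear = wear_rates[state]
        total_rate = switch_rates[state] + shock_rate
        if total_rate == 0.0 and wear == 0.0:
            raise ValueError("the unit never fails with these parameters")
        # Time until the next event (environment switch or shock).
        dt = rng.expovariate(total_rate) if total_rate > 0.0 else float("inf")
        # Continuous wear alone may reach the threshold before that event.
        if wear > 0.0 and damage + wear * dt >= threshold:
            return t + (threshold - damage) / wear
        t += dt
        damage += wear * dt
        # Choose the event type in proportion to the competing rates.
        if rng.random() * total_rate < shock_rate:
            damage += rng.expovariate(1.0 / mean_damage)
            if damage >= threshold:
                return t
        else:
            state = 1 - state


def estimate_mean_failure_time(n_runs, seed=0, **params):
    """Average simulated failure time over n_runs independent replications."""
    rng = random.Random(seed)
    times = [simulate_failure_time(rng=rng, **params) for _ in range(n_runs)]
    return sum(times) / n_runs
```

With shocks switched off and equal wear rates in both states, the simulated failure time reduces to `threshold / wear`, which gives a quick sanity check before adding shocks.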

    Reactive point processes: A new approach to predicting power failures in underground electrical systems

    Reactive point processes (RPPs) are a new statistical model designed for predicting discrete events in time based on past history. RPPs were developed to handle an important problem within the domain of electrical grid reliability: short-term prediction of electrical grid failures ("manhole events"), including outages, fires, explosions and smoking manholes, which can cause threats to public safety and reliability of electrical service in cities. RPPs incorporate self-exciting, self-regulating and saturating components. The self-excitement occurs as a result of a past event, which causes a temporary rise in vulnerability to future events. The self-regulation occurs as a result of an external inspection which temporarily lowers vulnerability to future events. RPPs can saturate when too many events or inspections occur close together, which ensures that the probability of an event stays within a realistic range. Two of the operational challenges for power companies are (i) making continuous-time failure predictions, and (ii) cost/benefit analysis for decision making and proactive maintenance. RPPs are naturally suited for handling both of these challenges. We use the model to predict power-grid failures in Manhattan over a short-term horizon, and to provide a cost/benefit analysis of different proactive maintenance programs. Comment: Published at http://dx.doi.org/10.1214/14-AOAS789 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Unreliable Retrial Queues in a Random Environment

    This dissertation investigates stability conditions and approximate steady-state performance measures for unreliable, single-server retrial queues operating in a randomly evolving environment. In such systems, arriving customers that find the server busy or failed join a retrial queue from which they attempt to regain access to the server at random intervals. Such models are useful for the performance evaluation of communications and computer networks which are characterized by time-varying arrival, service and failure rates. To model this time-varying behavior, we study systems whose parameters are modulated by a finite Markov process. Two distinct cases are analyzed. The first considers systems with Markov-modulated arrival, service, retrial, failure and repair rates assuming all interevent and service times are exponentially distributed. The joint process of the orbit size, environment state, and server status is shown to be a tri-layered, level-dependent quasi-birth-and-death (LDQBD) process, and we provide a necessary and sufficient condition for the positive recurrence of LDQBDs using classical techniques. Moreover, we apply efficient numerical algorithms, designed to exploit the matrix-geometric structure of the model, to compute the approximate steady-state orbit size distribution and mean congestion and delay measures. The second case assumes that customers bring generally distributed service requirements while all other processes are identical to the first case. We show that the joint process of orbit size, environment state and server status is a level-dependent, M/G/1-type stochastic process. By employing regenerative theory, and exploiting the M/G/1-type structure, we derive a necessary and sufficient condition for stability of the system. Finally, for the exponential model, we illustrate how the main results may be used to simultaneously select mean time customers spend in orbit, subject to bound and stability constraints

    Phase-Type Approximations for Wear Processes in A Semi-Markov Environment

    The reliability of a single-unit system experiencing degradation (wear) due to the influence of a general, observable environment process is considered. In particular, the failure time distribution is evaluated using only observations of the unit's current operating environment which is characterized as a finite semi-Markov process (SMP). In order to impose the Markov property, generally distributed environment state sojourn times are approximated as phase-type (PH) random variables using observations of state holding times and transition rates. The use of PH distributions facilitates the use of existing analytical results for reliability evaluation of units subject to an environment process that evolves as a continuous-time Markov chain. The procedure is illustrated through three numerical examples, and results are compared with those obtained via Monte Carlo simulation. The maximum absolute deviation in probability for failure time distributions was on the order of 0.004. The results of this thesis provide a novel approach to the reliability analysis of units operating in randomly evolving environments for which degradation or failure time observations are difficult or impossible to obtain
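
One simple way to turn observed holding times into a PH distribution is moment matching with an Erlang distribution, the most basic PH family. The sketch below shows that idea only; the thesis may well use a different fitting procedure, and the function names and the `max_phases` cap are assumptions.

```python
import math
from statistics import mean, pvariance


def fit_erlang(sojourn_times, max_phases=50):
    """Fit an Erlang(k, rate) distribution to observed holding times.

    The mean is matched exactly (k / rate); the number of phases k is chosen
    so that the Erlang squared coefficient of variation, 1 / k, is close to
    the sample value. Samples at least as variable as an exponential fall
    back to k = 1.
    """
    m = mean(sojourn_times)
    scv = pvariance(sojourn_times) / (m * m)
    if scv <= 0.0:
        k = max_phases
    else:
        k = min(max_phases, max(1, round(1.0 / scv)))
    return k, k / m


def erlang_cdf(t, k, rate):
    """P(T <= t) for an Erlang distribution with k phases of the given rate."""
    terms = sum((rate * t) ** n / math.factorial(n) for n in range(k))
    return 1.0 - math.exp(-rate * t) * terms
```

Each Erlang phase is exponential, so replacing a general sojourn time with k such phases is what restores the Markov property at the cost of a larger state space.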

    Optimal Periodic Inspection of a Stochastically Degrading System

    This thesis develops and analyzes a procedure to determine the optimal inspection interval that maximizes the limiting average availability of a stochastically degrading component operating in a randomly evolving environment. The component is inspected periodically, and if the total observed cumulative degradation exceeds a fixed threshold value, the component is instantly replaced with a new, statistically identical component. Degradation is due to a combination of continuous wear caused by the component's random operating environment, as well as damage due to randomly occurring shocks of random magnitude. In order to compute an optimal inspection interval and corresponding limiting average availability, a nonlinear program is formulated and solved using a direct search algorithm in conjunction with numerical Laplace transform inversion. Techniques are developed to significantly decrease the time required to compute the approximate optimal solutions. The mathematical programming formulation and solution techniques are illustrated through a series of increasingly complex example problems

    Age Replacement and Service Rate Control of Stochastically Degrading Queues

    This thesis considers the problem of optimally selecting a periodic replacement time for a multiserver queueing system in which each server is subject to degradation as a function of the mean service rate and a stochastic and dynamic environment. Also considered is the problem of optimal service rate selection for such a system. In both cases, the performance metric is the long-run average cost rate. Analytical expressions are obtained, in terms of Laplace transforms, for the nonlinear objective functions, necessitating the use of numerical Laplace transform inversion to evaluate candidate solutions in conjunction with standard numerical algorithms. Due to the convexity of the objective function, the optimal replacement time is computed using a hybrid bisection-secant method which yields globally optimal solutions. The optimal service rates are obtained via gradient search methods but are only guaranteed to provide locally optimal solutions. The analytical results are implemented on three notional examples that demonstrate the benefits of dynamically adjusting service rates under the described maintenance policy
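
Convexity matters because the minimizer of a convex cost rate is the point where its derivative changes sign, so a bracketing root finder applied to the derivative returns a global optimum. The sketch below combines secant and bisection steps in that spirit. It is one possible hybrid, not necessarily the thesis's algorithm, and the toy cost rate `c_replace / T + c_degrade * T` stands in for the Laplace-transform objective, which is not reproduced here.

```python
def minimize_convex(derivative, lo, hi, tol=1e-10, max_iter=200):
    """Locate the minimizer of a convex function from its derivative.

    Requires derivative(lo) < 0 < derivative(hi). Each pass takes a secant
    step through the bracket endpoints and, if that step failed to halve the
    bracket, follows it with a bisection step, so the bracket shrinks by at
    least a factor of two per pass.
    """
    g_lo, g_hi = derivative(lo), derivative(hi)
    if not (g_lo < 0.0 < g_hi):
        raise ValueError("derivative must go from negative to positive")
    for _ in range(max_iter):
        width = hi - lo
        if width <= tol:
            break
        # Secant step: root of the line through (lo, g_lo) and (hi, g_hi).
        x = hi - g_hi * width / (g_hi - g_lo)
        g_x = derivative(x)
        if g_x == 0.0:
            return x
        if g_x < 0.0:
            lo, g_lo = x, g_x
        else:
            hi, g_hi = x, g_x
        # Safeguard: bisect when the secant step made too little progress.
        if hi - lo > 0.5 * width:
            mid = 0.5 * (lo + hi)
            g_mid = derivative(mid)
            if g_mid == 0.0:
                return mid
            if g_mid < 0.0:
                lo, g_lo = mid, g_mid
            else:
                hi, g_hi = mid, g_mid
    return 0.5 * (lo + hi)


def toy_cost_rate_derivative(T, c_replace=100.0, c_degrade=4.0):
    """Derivative of the hypothetical cost rate c_replace / T + c_degrade * T."""
    return -c_replace / (T * T) + c_degrade
```

For the toy cost rate, `minimize_convex(toy_cost_rate_derivative, 0.1, 100.0)` converges to a replacement time of about 5, matching the closed-form minimizer `sqrt(c_replace / c_degrade)`.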