Stochastic model checking for predicting component failures and service availability
When a component fails in a critical communications service, how urgent is a repair? If we repair within 1 hour, 2 hours, or
n hours, how does this affect the likelihood of service failure? Can a formal model support assessing the impact, prioritisation, and
scheduling of repairs in the event of component failures, and forecasting of maintenance costs? These are some of the questions
posed to us by a large organisation and here we report on our experience of developing a stochastic framework based on a discrete
space model and temporal logic to answer them. We define and explore both standard steady-state and transient temporal logic
properties concerning the likelihood of service failure within certain time bounds, forecasting maintenance costs, and we introduce a
new concept of envelopes of behaviour that quantify the effect of the status of lower level components on service availability. The
resulting model is highly parameterised and user interaction for experimentation is supported by a lightweight, web-based interface
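To make the repair-urgency question concrete, here is a minimal sketch in Python of a time-bounded (transient) failure probability for a toy continuous-time Markov chain. It is not the framework described in the abstract: the three-state model, the exponentially distributed repair time, the rates, and the names `degraded_service_generator` and `transient_distribution` are all assumptions made for illustration. The transient distribution is computed by uniformization, a standard technique for this kind of property.

```python
import math

def degraded_service_generator(failure_rate, repair_rate):
    """Generator matrix of a hypothetical 3-state model.

    State 0: one component has failed and is under repair (service still up).
    State 1: repair completed first (absorbing).
    State 2: a second failure occurred first, so the service failed (absorbing).
    """
    return [
        [-(failure_rate + repair_rate), repair_rate, failure_rate],
        [0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0],
    ]

def transient_distribution(Q, p0, t, tol=1e-12, max_terms=100000):
    """State distribution of a CTMC at time t, computed by uniformization."""
    n = len(Q)
    q = max(-Q[i][i] for i in range(n))  # uniformization rate
    if q == 0.0 or t == 0.0:
        return list(p0)
    # Embedded DTMC of the uniformized chain: P = I + Q / q
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / q for j in range(n)]
         for i in range(n)]
    weight = math.exp(-q * t)  # Poisson probability of 0 jumps by time t
    v = list(p0)
    result = [weight * x for x in v]
    accumulated = weight
    k = 0
    while 1.0 - accumulated > tol and k < max_terms:
        k += 1
        v = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= q * t / k  # Poisson probability of k jumps
        accumulated += weight
        result = [r + weight * x for r, x in zip(result, v)]
    return result

if __name__ == "__main__":
    failure_rate = 0.02  # assumed failures per hour of the remaining component
    horizon = 24.0       # assumed time bound in hours
    for mean_repair_hours in (1.0, 2.0, 4.0):
        Q = degraded_service_generator(failure_rate, 1.0 / mean_repair_hours)
        p = transient_distribution(Q, [1.0, 0.0, 0.0], horizon)
        print(mean_repair_hours, p[2])
```

Under these assumptions, the printed values show how the probability of service failure within the horizon grows as the mean repair time increases.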
Hybrid Stochastic Models for Remaining Lifetime Prognosis
The United States Air Force is developing its next generation aircraft and is seeking to reduce the risk of catastrophic failures, maintenance activities, and the logistics footprint while improving its sortie generation rate through a process called autonomic logistics. Vital to the successful implementation of this process is remaining lifetime prognosis of critical aircraft components. Complicating this problem is the absence of failure time information; however, sensors located on the aircraft are providing degradation measures. This research has provided a method to address at least a portion of this problem by uniting analytical lifetime distribution models with environment and/or degradation measures to obtain the remaining lifetime distribution
Analytical Results for a Single-Unit System Subject To Markovian Wear and Shocks
This thesis develops and analyzes a mathematical model for the reliability measures of a single-unit system subject to continuous wear due to its operating environment and randomly occurring shocks that inflict a random amount of damage to the unit. Assuming a Markovian operating environment and shock arrival mechanism, Laplace-Stieltjes transform expressions are obtained for the failure time distribution and all of its moments. Moreover, an analytical expression is derived for the long-run availability of the single-unit system when it is subject to an inspect-and-replace maintenance policy. The analytical results are illustrated and compared with those of Monte Carlo-simulated failure data. The numerical results indicate that the reliability measures may be accurately computed via numerical inversion of the transform expressions in a straightforward manner when the input parameters are known a priori. In stark contrast to the simulation model, which requires several hours to obtain the reliability measures, the analytical procedure computes the same measures in only a few seconds
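The Monte Carlo side of such a comparison can be sketched in a few lines of Python. This is a simplified illustration, not the thesis's model: it assumes a constant wear rate (a single environment state rather than a Markovian environment), Poisson shock arrivals, exponentially distributed shock damage, and failure when cumulative damage reaches a fixed threshold. The function names and all parameter values are hypothetical.

```python
import random

def simulate_failure_time(wear_rate, shock_rate, mean_shock_damage,
                          threshold, rng):
    """Simulate one failure time for a unit under constant wear plus shocks.

    Assumptions (illustrative only): wear accrues linearly at wear_rate,
    shocks arrive as a Poisson process with rate shock_rate, and each shock
    adds an exponentially distributed amount of damage.
    """
    t = 0.0
    damage = 0.0
    while True:
        gap = rng.expovariate(shock_rate)  # time until the next shock
        time_to_fail_by_wear = (threshold - damage) / wear_rate
        if time_to_fail_by_wear <= gap:
            # Wear alone reaches the threshold before the next shock arrives.
            return t + time_to_fail_by_wear
        t += gap
        damage += wear_rate * gap                            # wear during the gap
        damage += rng.expovariate(1.0 / mean_shock_damage)   # shock damage
        if damage >= threshold:
            return t  # the shock pushed damage over the threshold

def estimate_mean_failure_time(n_runs, seed=0, **params):
    """Average simulated failure time over n_runs independent replications."""
    rng = random.Random(seed)
    total = sum(simulate_failure_time(rng=rng, **params) for _ in range(n_runs))
    return total / n_runs

if __name__ == "__main__":
    print(estimate_mean_failure_time(
        10000, wear_rate=1.0, shock_rate=0.5,
        mean_shock_damage=2.0, threshold=10.0))
```

Because every estimate requires many replications, a simulation of this kind becomes slow as the model grows, which is the cost the transform-based approach in the abstract avoids.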
Reactive point processes: A new approach to predicting power failures in underground electrical systems
Reactive point processes (RPPs) are a new statistical model designed for
predicting discrete events in time based on past history. RPPs were developed
to handle an important problem within the domain of electrical grid
reliability: short-term prediction of electrical grid failures ("manhole
events"), including outages, fires, explosions and smoking manholes, which can
cause threats to public safety and reliability of electrical service in cities.
RPPs incorporate self-exciting, self-regulating and saturating components. The
self-excitement occurs as a result of a past event, which causes a temporary
rise in vulnerability to future events. The self-regulation occurs as a result
of an external inspection which temporarily lowers vulnerability to future
events. RPPs can saturate when too many events or inspections occur close
together, which ensures that the probability of an event stays within a
realistic range. Two of the operational challenges for power companies are (i)
making continuous-time failure predictions, and (ii) cost/benefit analysis for
decision making and proactive maintenance. RPPs are naturally suited for
handling both of these challenges. We use the model to predict power-grid
failures in Manhattan over a short-term horizon, and to provide a cost/benefit
analysis of different proactive maintenance programs.
Comment: Published at http://dx.doi.org/10.1214/14-AOAS789 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
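The three ingredients named in the abstract (self-excitation, self-regulation, saturation) can be illustrated with a small Python sketch of a conditional intensity. The functional forms here are assumptions chosen for simplicity, not the ones used in the paper: exponential decay kernels for the influence of past events and inspections, and a saturating map `cap * (1 - exp(-x))`. All names and default parameter values are hypothetical.

```python
import math

def saturate(x, cap):
    """Increasing in x, zero at x = 0, and bounded above by cap."""
    return cap * (1.0 - math.exp(-x))

def intensity(t, base_rate, event_times, inspection_times,
              excite_cap=2.0, regulate_cap=0.8, decay=1.0):
    """Illustrative event rate at time t given past events and inspections.

    Past events raise the rate (self-excitation), past inspections lower it
    (self-regulation), and both effects pass through a saturating function
    so the rate stays in [base_rate * (1 - regulate_cap),
    base_rate * (1 + excite_cap)]. Keeping regulate_cap below 1 keeps the
    rate positive.
    """
    excite = sum(math.exp(-decay * (t - s)) for s in event_times if s < t)
    regulate = sum(math.exp(-decay * (t - s)) for s in inspection_times if s < t)
    return base_rate * (1.0
                        + saturate(excite, excite_cap)
                        - saturate(regulate, regulate_cap))

if __name__ == "__main__":
    base = 0.1
    print(intensity(5.0, base, [], []))          # no history
    print(intensity(5.0, base, [4.5], []))       # shortly after an event
    print(intensity(5.0, base, [], [4.5]))       # shortly after an inspection
    print(intensity(5.0, base, [4.9] * 50, []))  # many events: saturated
```

The design point is that the effects of history are temporary (they decay with elapsed time) and bounded (they saturate), which is what keeps the predicted vulnerability within a realistic range.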
Unreliable Retrial Queues in a Random Environment
This dissertation investigates stability conditions and approximate steady-state performance measures for unreliable, single-server retrial queues operating in a randomly evolving environment. In such systems, arriving customers that find the server busy or failed join a retrial queue from which they attempt to regain access to the server at random intervals. Such models are useful for the performance evaluation of communications and computer networks which are characterized by time-varying arrival, service and failure rates. To model this time-varying behavior, we study systems whose parameters are modulated by a finite Markov process. Two distinct cases are analyzed. The first considers systems with Markov-modulated arrival, service, retrial, failure and repair rates assuming all interevent and service times are exponentially distributed. The joint process of the orbit size, environment state, and server status is shown to be a tri-layered, level-dependent quasi-birth-and-death (LDQBD) process, and we provide a necessary and sufficient condition for the positive recurrence of LDQBDs using classical techniques. Moreover, we apply efficient numerical algorithms, designed to exploit the matrix-geometric structure of the model, to compute the approximate steady-state orbit size distribution and mean congestion and delay measures. The second case assumes that customers bring generally distributed service requirements while all other processes are identical to the first case. We show that the joint process of orbit size, environment state and server status is a level-dependent, M/G/1-type stochastic process. By employing regenerative theory, and exploiting the M/G/1-type structure, we derive a necessary and sufficient condition for stability of the system. Finally, for the exponential model, we illustrate how the main results may be used to simultaneously select mean time customers spend in orbit, subject to bound and stability constraints
Phase-Type Approximations for Wear Processes in A Semi-Markov Environment
The reliability of a single-unit system experiencing degradation (wear) due to the influence of a general, observable environment process is considered. In particular, the failure time distribution is evaluated using only observations of the unit's current operating environment which is characterized as a finite semi-Markov process (SMP). In order to impose the Markov property, generally distributed environment state sojourn times are approximated as phase-type (PH) random variables using observations of state holding times and transition rates. The use of PH distributions facilitates the use of existing analytical results for reliability evaluation of units subject to an environment process that evolves as a continuous-time Markov chain. The procedure is illustrated through three numerical examples, and results are compared with those obtained via Monte Carlo simulation. The maximum absolute deviation in probability for failure time distributions was on the order of 0.004. The results of this thesis provide a novel approach to the reliability analysis of units operating in randomly evolving environments for which degradation or failure time observations are difficult or impossible to obtain
Optimal Periodic Inspection of a Stochastically Degrading System
This thesis develops and analyzes a procedure to determine the optimal inspection interval that maximizes the limiting average availability of a stochastically degrading component operating in a randomly evolving environment. The component is inspected periodically, and if the total observed cumulative degradation exceeds a fixed threshold value, the component is instantly replaced with a new, statistically identical component. Degradation is due to a combination of continuous wear caused by the component's random operating environment, as well as damage due to randomly occurring shocks of random magnitude. In order to compute an optimal inspection interval and corresponding limiting average availability, a nonlinear program is formulated and solved using a direct search algorithm in conjunction with numerical Laplace transform inversion. Techniques are developed to significantly decrease the time required to compute the approximate optimal solutions. The mathematical programming formulation and solution techniques are illustrated through a series of increasingly complex example problems
Age Replacement and Service Rate Control of Stochastically Degrading Queues
This thesis considers the problem of optimally selecting a periodic replacement time for a multiserver queueing system in which each server is subject to degradation as a function of the mean service rate and a stochastic and dynamic environment. Also considered is the problem of optimal service rate selection for such a system. In both cases, the performance metric is the long-run average cost rate. Analytical expressions are obtained, in terms of Laplace transforms, for the nonlinear objective functions, necessitating the use of numerical Laplace transform inversion to evaluate candidate solutions in conjunction with standard numerical algorithms. Due to the convexity of the objective function, the optimal replacement time is computed using a hybrid bisection-secant method which yields globally optimal solutions. The optimal service rates are obtained via gradient search methods but are only guaranteed to provide locally optimal solutions. The analytical results are implemented on three notional examples that demonstrate the benefits of dynamically adjusting service rates under the described maintenance policy
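As a rough illustration of choosing a replacement time by minimizing a long-run average cost rate, the Python sketch below uses the classical single-unit age-replacement model rather than the queueing model of the thesis. It assumes a Weibull lifetime, a planned replacement cost `c_planned`, and a larger failure replacement cost `c_failure`; the cost rate is the expected cycle cost divided by the expected cycle length. Golden-section search is swapped in for the hybrid bisection-secant method mentioned in the abstract, and all names and values are hypothetical.

```python
import math

def weibull_survival(t, shape, scale):
    """Probability that a Weibull-distributed lifetime exceeds t."""
    return math.exp(-((t / scale) ** shape))

def cost_rate(T, c_planned, c_failure, shape, scale, steps=2000):
    """Long-run average cost per unit time when replacing at age T or failure."""
    # Expected cycle length = integral of the survival function over [0, T],
    # approximated with the trapezoidal rule.
    h = T / steps
    area = 0.5 * (1.0 + weibull_survival(T, shape, scale))
    area += sum(weibull_survival(i * h, shape, scale) for i in range(1, steps))
    cycle_length = area * h
    survive = weibull_survival(T, shape, scale)
    cycle_cost = c_planned * survive + c_failure * (1.0 - survive)
    return cycle_cost / cycle_length

def golden_section_min(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; reuse c as the new upper interior point.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; reuse d as the new lower interior point.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

if __name__ == "__main__":
    objective = lambda T: cost_rate(T, c_planned=1.0, c_failure=5.0,
                                    shape=2.0, scale=1.0)
    T_star = golden_section_min(objective, 0.05, 3.0)
    print(T_star, objective(T_star))
```

Like the bisection-secant approach, this search relies on the objective having a single minimum on the search interval; for objectives without that property, only a local optimum is guaranteed.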