Techniques for the Fast Simulation of Models of Highly Dependable Systems
With the ever-increasing complexity and requirements of highly dependable systems, their evaluation during design and operation is becoming more crucial. Realistic models of such systems are often not amenable to analysis using conventional analytic or numerical methods. Therefore, analysts and designers turn to simulation to evaluate these models. However, accurate estimation of dependability measures of these models requires that the simulation frequently observes system failures, which are rare events in highly dependable systems. This renders ordinary simulation impractical for evaluating such systems. To overcome this problem, simulation techniques based on importance sampling have been developed, and are very effective in certain settings. When importance sampling works well, simulation run lengths can be reduced by several orders of magnitude when estimating transient as well as steady-state dependability measures. This paper reviews some of the importance-sampling techniques that have been developed in recent years to estimate dependability measures efficiently in Markov and non-Markov models of highly dependable systems.
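The importance-sampling idea described in this abstract can be illustrated with a toy example that is not taken from the paper: estimating the rare tail probability P(X > a) for an exponential random variable X with rate 1. The function name `rare_tail_probability`, the threshold a = 20, and the heuristic biased rate 1/a are all assumptions made for illustration. Samples are drawn from a biased density under which the event is common, and each hit is weighted by the likelihood ratio of the original density to the biased one.

```python
import math
import random


def rare_tail_probability(a, n_samples, seed=0):
    """Estimate P(X > a) for X ~ Exp(1) by importance sampling.

    Samples come from Exp(theta) with theta = 1/a (an illustrative,
    heuristic choice), under which the event X > a is no longer rare.
    """
    rng = random.Random(seed)
    theta = 1.0 / a
    total = 0.0
    for _ in range(n_samples):
        x = rng.expovariate(theta)  # draw from the biased density g
        if x > a:
            # Likelihood ratio f(x) / g(x) corrects for the biased sampling.
            total += math.exp(-x) / (theta * math.exp(-theta * x))
    return total / n_samples


estimate = rare_tail_probability(a=20.0, n_samples=100_000)
exact = math.exp(-20.0)  # closed form, available only in this toy case
print(f"estimate={estimate:.3e} exact={exact:.3e}")
```

With ordinary simulation, 100,000 samples would almost surely never observe this event, whose probability is about 2e-9; the weighted estimator observes it in roughly a third of the samples.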
Failure distance based bounds for steady-state availability without the knowledge of minimal cuts
We propose an algorithm, based on the failure distance concept, to compute bounds for the steady-state unavailability using continuous-time Markov chains. The algorithm incrementally generates a subset of the state space until the bounds reach the specified tightness. In contrast with a previous algorithm also based on the failure distance concept, the proposed algorithm uses lower bounds for failure distances which are computed on the fault tree of the system, and does not require the knowledge of the minimal cuts. This is advantageous when the number of minimal cuts is large or their computation is time-consuming. Postprint (published version)
Failure distance based bounds of dependability measures
The subject of this dissertation is the development of bounding methods for a class of continuous-time Markov chain (CTMC) dependability models of fault-tolerant systems.
The systems considered in the dissertation are conceptualized as made up of components (hardware or software) that fail and, for repairable systems, are repaired. Components are grouped into classes, the components of the same class being indistinguishable. Thus, a component is regarded as an instance of some component class and the system includes a bag of component classes defined over a certain domain. The up/down state of the system is determined from the unfailed/failed state of the components through a coherent structure function specified by a fault tree with basic event classes. (A basic event class is the failure of a component of a component class.)
The class of CTMC models considered in the dissertation is quite wide and allows, for instance, modeling the fact that a component may have different failure modes. It also allows modeling coverage failures by introducing fictitious components that do not fail by themselves and to which uncovered failures of other components are propagated. In the case of repairable systems, the considered class of models supports very complex repair policies (e.g., limited repairpersons, priorities, repair preemption) as well as group repair (i.e., simultaneous repair of several components). However, deferred repair (i.e., the deferring of repair until some condition is met) is not allowed.
Two dependability measures are considered in the dissertation: the unreliability at a given time epoch for non-repairable systems and the steady-state unavailability for repairable systems.
The bounding methods developed in the dissertation are based on the concept of "failure distance from a state," which is defined as the minimum number of components that have to fail in addition to those already failed to take the system down.
We develop four bounding methods. The first method gives bounds for the unreliability of non-repairable fault-tolerant systems using (exact) failure distances. Those distances are computed using the set of minimal cuts of the structure function of the system. The set of minimal cuts is obtained using an algorithm developed in the dissertation that obtains the minimal cuts for fault trees with basic event classes. The second method gives bounds for the unreliability using easily computable lower bounds for failure distances. Those lower bounds are obtained by analyzing the fault tree of the system and do not require the knowledge of the set of minimal cuts. The third method gives bounds for the steady-state unavailability using (exact) failure distances. The fourth method gives bounds for the steady-state unavailability using the lower bounds for failure distances.
Finally, the performance of each method is illustrated by means of several large examples. We conclude that the methods can significantly outperform previously existing methods and significantly extend the complexity of the fault-tolerant systems for which tight bounds for the unreliability or steady-state unavailability can be computed.
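The definition of failure distance given in this abstract can be sketched in a few lines, assuming the minimal cuts are already known and that each cut is expressed as a bag (multiset) of component classes. The function name `failure_distance`, the class names, and the example cuts are hypothetical and are not taken from the dissertation; this is only a minimal reading of the definition, not the dissertation's algorithm.

```python
from collections import Counter


def failure_distance(failed, minimal_cuts):
    """Minimum number of additional component failures that take the system down.

    `failed` maps a component class to the number of its failed components.
    Each minimal cut maps a component class to the number of components of
    that class whose failure the cut requires.
    """
    failed = Counter(failed)  # missing classes count as zero failures
    return min(
        sum(max(0, needed - failed[cls]) for cls, needed in cut.items())
        for cut in minimal_cuts
    )


# Hypothetical system: it is down if two CPUs fail,
# or if two disks and the bus fail.
cuts = [Counter({"cpu": 2}), Counter({"disk": 2, "bus": 1})]

print(failure_distance({}, cuts))           # no failures yet
print(failure_distance({"disk": 2}, cuts))  # only the bus is still needed
print(failure_distance({"cpu": 2}, cuts))   # system already down
```

A distance of zero means the state is already a failed state; larger distances identify states that are far from failure.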
Efficient exploration of availability models guided by failure distances
Recently, a method to bound the steady-state availability using the failure distance concept has been proposed. In this paper we refine that method by introducing state space exploration techniques. In the methods proposed here, the state space is incrementally generated based on the contributions to the steady-state availability band of the states in the frontier of the currently generated state space. Several state space exploration algorithms are evaluated in terms of bounds quality and memory and CPU time requirements. The most efficient appears to be a waved algorithm which expands transition groups. We compare our new methods with the method based on the failure distance concept without state exploration and a method proposed by Souza e Silva and Ochoa which uses state space exploration but does not use the failure distance concept. Using typical examples we show that the methods proposed here can be significantly more efficient than any of the previous methods. Postprint (published version)
A method for the computation of reliability bounds for non-repairable fault-tolerant systems
Realistic modeling of fault-tolerant systems requires taking into account phenomena such as the dependence of component failure rates and coverage parameters on the operational configuration of the system, which cannot be properly captured using combinatorial techniques. Such dependencies can be modeled in detail using continuous-time Markov chains (CTMCs). However, the use of CTMC models is limited by the well-known state space explosion problem. In this paper we develop a method for the computation of bounds for the reliability of non-repairable fault-tolerant systems which requires the generation of only a subset of states. The tightness of the bounds increases as more detailed states are generated. The method uses the failure distance concept and is illustrated using an example of a quite complex fault-tolerant system whose failure behavior has the above-mentioned types of dependencies. Postprint (published version)
Tight steady-state availability bounds using the failure distance concept
Continuous-time Markov chains are commonly used for dependability modeling of repairable fault-tolerant computer systems. Realistic models of non-trivial fault-tolerant systems often have very large state spaces. An attractive approach for dealing with the largeness problem is the use of pruning methods with error bounds. Several such methods for computing steady-state availability bounds have been proposed recently. This paper presents a new method which exploits the failure distance concept to bound the behavior in the non-generated state space more efficiently. It is proved that the bounding method gives tighter bounds than previous methods. Numerical analysis shows that the new bounds can be significantly tighter. Postprint (published version)
Improving availability bounds using the failure distance concept
Continuous-time Markov chains are commonly used for dependability modeling of repairable fault-tolerant computer systems. Realistic models of non-trivial fault-tolerant systems easily have very large state spaces. An attractive approach which has been proposed to deal with the largeness problem is the use of pruning-based methods which provide error bounds. Using results from Courtois and Semal, a method for bounding the steady-state availability has been recently developed by Muntz, de Souza e Silva, and Goyal. This paper presents a new method based on a different approach which exploits the concept of failure distance to better bound the behavior in the non-generated state space. The proposed method yields tighter bounds. Numerical analysis shows that the improvement is typically significant. Postprint (published version)
Failure distance-based simulation of repairable fault-tolerant systems
This paper presents a new importance sampling scheme called failure distance biasing for the efficient simulation of Markovian models of repairable fault-tolerant systems. The new scheme enriches the previously proposed failure biasing scheme by exploiting the concept of failure distance. This results in a much more efficient simulation, with speedups over failure biasing of orders of magnitude in typical cases. The paper also discusses the efficient implementation of the new importance sampling scheme and presents a practical method for the optimization of the biasing parameters. Postprint (author's final draft)
Content-Aware Multimedia Communications
The demands for fast, economic and reliable dissemination of multimedia
information are steadily growing within our society. While people and
the economy increasingly rely on communication technologies, engineers still
struggle with their growing complexity.
Complexity in multimedia communication originates from several sources. The
most prominent is the unreliability of packet networks like the Internet.
Recent advances in scheduling and error control mechanisms for streaming
protocols have shown that the quality and robustness of multimedia delivery
can be improved significantly when protocols are aware of the content they
deliver. However, the proposed mechanisms require close cooperation between
transport systems and application layers which increases the overall system
complexity. Current approaches also require expensive metrics and focus on
special encoding formats only. A general and efficient model is missing so
far.
This thesis presents efficient and format-independent solutions to support
cross-layer coordination in system architectures. In particular, the first
contribution of this work is a generic dependency model that enables
transport layers to access content-specific properties of media streams,
such as dependencies between data units and their importance. The second
contribution is the design of a programming model for streaming
communication and its implementation as a middleware architecture. The
programming model hides the complexity of protocol stacks behind simple
programming abstractions, but exposes cross-layer control and monitoring
options to application programmers. For example, our interfaces allow
programmers to choose appropriate failure semantics at design time while
they can refine error protection and visibility of low-level errors at
run-time.
Based on some examples we show how our middleware simplifies the
integration of stream-based communication into large-scale application
architectures. An important result of this work is that despite cross-layer
cooperation, neither application nor transport protocol designers
experience an increase in complexity. Application programmers can even
reuse existing streaming protocols which effectively increases system
robustness.
Our society's demand for inexpensive and reliable communication is
growing steadily. While we make ourselves ever more dependent on modern
communication technologies, the engineers of these technologies must both
satisfy the demand for rapid introduction of new products and master the
growing complexity of the systems.
The transmission of multimedia content such as video and audio data in
particular is not trivial. One of the most prominent reasons is the
unreliability of today's networks, such as the Internet. Packet losses
and varying delays can severely degrade presentation quality. As recent
developments in the area of streaming protocols show, however, the quality
and robustness of transmission can be controlled efficiently when streaming
protocols exploit information about the content of the transported data.
Existing approaches that describe the content of multimedia data streams
are, however, mostly specialized to individual compression schemes and use
computationally expensive metrics. This clearly reduces their practical
value. In addition, the exchange of information requires close cooperation
between applications and transport layers. Since the interfaces of current
system architectures are not prepared for this, either the interfaces must
be extended or alternative architectural concepts must be created. The
danger of both options, however, is that they can further increase the
complexity of a system.
The central goal of this dissertation is therefore to achieve cross-layer
coordination while at the same time reducing complexity. Here the work
makes two contributions to the current state of research. First, it defines
a universal model for describing content attributes, such as importance and
dependency relations within a data stream. Transport layers can use this
knowledge for efficient error control. Second, the work describes the Noja
programming model for multimedia middleware. Noja defines abstractions for
the transmission and control of multimedia streams that enable the
coordination of streaming protocols with applications. For example,
programmers can choose suitable failure semantics and communication
topologies and then refine and control the concrete error protection at
run time.