Techniques for the Fast Simulation of Models of Highly Dependable Systems
With the ever-increasing complexity and requirements of highly dependable systems, their evaluation during design and operation is becoming increasingly crucial. Realistic models of such systems are often not amenable to analysis using conventional analytic or numerical methods. Therefore, analysts and designers turn to simulation to evaluate these models. However, accurate estimation of dependability measures of these models requires that the simulation frequently observes system failures, which are rare events in highly dependable systems. This renders ordinary simulation impractical for evaluating such systems. To overcome this problem, simulation techniques based on importance sampling have been developed, and they are very effective in certain settings. When importance sampling works well, simulation run lengths can be reduced by several orders of magnitude when estimating transient as well as steady-state dependability measures. This paper reviews some of the importance-sampling techniques that have been developed in recent years to estimate dependability measures efficiently in Markov and non-Markov models of highly dependable systems.
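As a concrete illustration of the rare-event problem described above, the following minimal Python sketch (a toy three-component repairable system with assumed failure and repair rates, not a model taken from the survey) estimates the probability that all components fail before the system returns to the all-up state. Crude Monte Carlo almost never observes the event, whereas simple failure biasing, a basic importance-sampling scheme, combined with the likelihood-ratio correction produces a stable estimate from the same number of runs.

import random

# Toy model (assumed, for illustration only): a repairable system with N
# identical components and a single repairman.  Component failure rate LAM,
# repair rate MU; the system is down when all N components have failed.
LAM, MU = 1e-3, 1.0
N = 3

def p_fail(k):
    # Embedded-chain probability that the next event is a failure when k
    # components are currently down.
    lam_k = (N - k) * LAM
    return lam_k / (lam_k + MU)

def one_cycle(bias=None):
    # Simulate one cycle starting just after the first failure.  Return the
    # importance-sampling weight (likelihood ratio) if the cycle ends in full
    # system failure, and 0 if the system is repaired back to all-up first.
    k, weight = 1, 1.0
    while 0 < k < N:
        p = p_fail(k)
        q = p if bias is None else bias      # biased failure probability
        if random.random() < q:              # next event: another failure
            weight *= p / q
            k += 1
        else:                                # next event: a repair
            weight *= (1 - p) / (1 - q)
            k -= 1
    return weight if k == N else 0.0

def estimate(runs, bias=None):
    return sum(one_cycle(bias) for _ in range(runs)) / runs

print("crude Monte Carlo:", estimate(100_000))            # usually 0: event unobserved
print("failure biasing  :", estimate(100_000, bias=0.5))  # stable estimate near 2e-6

The bias value of 0.5 is the usual heuristic for simple failure biasing; the likelihood-ratio weights keep the estimator unbiased regardless of the bias chosen.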
Cross-entropy optimisation of importance sampling parameters for statistical model checking
Statistical model checking avoids the exponential growth of states associated
with probabilistic model checking by estimating properties from multiple
executions of a system and by giving results within confidence bounds. Rare
properties are often very important but pose a particular challenge for simulation-based approaches; hence, a key objective under these circumstances is to reduce the number and length of simulations necessary to produce a given level of confidence. Importance sampling is a well-established technique that achieves this; however, to maintain the advantages of statistical model checking it is necessary to find good importance sampling distributions without considering the entire state space.
Motivated by the above, we present a simple algorithm that uses the notion of
cross-entropy to find the optimal parameters for an importance sampling
distribution. In contrast to previous work, our algorithm uses a low
dimensional vector of parameters to define this distribution and thus avoids
the often intractable explicit representation of a transition matrix. We show
that our parametrisation leads to a unique optimum and can produce many orders
of magnitude improvement in simulation efficiency. We demonstrate the efficacy
of our methodology by applying it to models from reliability engineering and
biochemistry.
Comment: 16 pages, 8 figures, LNCS style
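To make the cross-entropy idea concrete, the sketch below applies it to a deliberately simple one-parameter importance-sampling distribution; this is a generic textbook-style example (estimating a rare tail probability of an exponential random variable), not the parametrisation or the models used in the paper above. The constants THRESH, RHO and N_PER_ITER are illustrative assumptions; the update rule is the standard closed-form cross-entropy update for the exponential family.

import math
import random

# Estimate p = P(X > THRESH) for X ~ Exp(mean 1) by importance sampling from
# Exp(mean v), choosing v with the adaptive cross-entropy method.
THRESH = 20.0          # rare-event threshold; true p = exp(-20), about 2e-9
N_PER_ITER = 10_000    # samples per cross-entropy iteration
RHO = 0.1              # elite fraction used while the threshold is unreachable

def sample_exp(mean, n):
    return [random.expovariate(1.0 / mean) for _ in range(n)]

def likelihood_ratio(x, v):
    # Density ratio f(x; mean 1) / f(x; mean v) for the exponential family.
    return v * math.exp(-x + x / v)

v = 1.0                                    # start from the nominal distribution
for _ in range(10):
    xs = sample_exp(v, N_PER_ITER)
    # Intermediate level: the (1 - RHO) quantile, capped at the real threshold.
    level = min(THRESH, sorted(xs)[int((1 - RHO) * N_PER_ITER)])
    w = [likelihood_ratio(x, v) if x >= level else 0.0 for x in xs]
    # Closed-form cross-entropy update: likelihood-weighted mean of elite samples.
    v = sum(wi * xi for wi, xi in zip(w, xs)) / sum(w)
    if level >= THRESH:
        break

# Final importance-sampling estimate under the optimised parameter v.
xs = sample_exp(v, N_PER_ITER)
est = sum(likelihood_ratio(x, v) for x in xs if x >= THRESH) / N_PER_ITER
print(f"tilted mean v = {v:.2f}, estimate = {est:.3e}, exact = {math.exp(-THRESH):.3e}")

The point this toy example shares with the paper is that the importance-sampling distribution is described by a low-dimensional parameter rather than by an explicit transition matrix, and a few thousand weighted samples per iteration suffice to drive that parameter towards the cross-entropy optimum.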
Building Wavelet Histograms on Large Data in MapReduce
MapReduce is becoming the de facto framework for storing and processing
massive data, due to its excellent scalability, reliability, and elasticity. In
many MapReduce applications, obtaining a compact, accurate summary of the data is essential. Among various data summarization tools, histograms have proven to be particularly important and useful, and the wavelet histogram is one of the most widely used. In this paper, we
investigate the problem of building wavelet histograms efficiently on large
datasets in MapReduce. We measure the efficiency of the algorithms by both
end-to-end running time and communication cost. We demonstrate that straightforward adaptations of existing exact and approximate methods for building wavelet histograms to MapReduce clusters are highly inefficient. We therefore design new algorithms for computing exact and approximate wavelet histograms and discuss their implementation in MapReduce. We implement our techniques in Hadoop and compare them to baseline solutions through extensive experiments performed on a heterogeneous Hadoop cluster of 16 nodes, using large real and synthetic datasets of up to hundreds of gigabytes. The results suggest significant (often orders-of-magnitude) performance improvements achieved by our new algorithms.
Comment: VLDB201
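For readers unfamiliar with the summary itself, the sketch below shows the basic single-machine construction of a wavelet histogram (not the MapReduce algorithms of the paper): build the frequency vector over the value domain, take its Haar wavelet transform, and retain the k coefficients of largest magnitude. Function names and the small example data set are illustrative assumptions, and the coefficients are left unnormalised for brevity.

def haar_transform(freq):
    # One-dimensional Haar decomposition by repeated pairwise averaging;
    # detail coefficients are (left - right) / 2, collected level by level
    # with coarser levels first.  len(freq) must be a power of two.
    a = list(freq)
    details = []
    while len(a) > 1:
        avgs = [(a[2 * i] + a[2 * i + 1]) / 2 for i in range(len(a) // 2)]
        dets = [(a[2 * i] - a[2 * i + 1]) / 2 for i in range(len(a) // 2)]
        details = dets + details
        a = avgs
    return a + details          # [overall average, detail coefficients...]

def wavelet_histogram(values, domain_size, k):
    # k-term wavelet histogram: frequency vector -> Haar transform -> keep the
    # k coefficients of largest magnitude (all others are implicitly zero).
    # An error-optimal variant would rescale coefficients by level before ranking.
    freq = [0] * domain_size
    for value in values:
        freq[value] += 1
    coeffs = haar_transform(freq)
    top = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)[:k]
    return {i: coeffs[i] for i in top}

# Illustrative data over a domain of size 8, keeping the 3 largest coefficients.
data = [0, 0, 1, 1, 1, 2, 5, 5, 5, 5, 6, 7]
print(wavelet_histogram(data, domain_size=8, k=3))

In a distributed setting the expensive part is producing the global frequency vector (or its wavelet coefficients) from data scattered across many nodes, which is the kind of communication cost the paper measures and seeks to reduce.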