Abstract. We propose a storage workload model that processes discrete time series incrementally, updating its parameters as new data become available. Specifically, a Hidden Markov Model (HMM) with an adaptive Baum-Welch algorithm is trained on two raw traces: a NetApp network trace of timestamped I/O commands and a Microsoft trace of timestamped read and write entries. Each trace is analyzed statistically and HMM parameters are inferred, from which a fluid input model with rates modulated by a Markov chain is derived. We generate new data traces using this Markovian fluid workload model. To validate our parsimonious model, we compare statistics of the raw and generated traces and use the Viterbi algorithm to produce representative sequences of the hidden states. The incremental model is measured against both the standard model (parameterized on the whole dataset) and the raw data trace.
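As an illustration of the Viterbi step mentioned in the abstract, the following is a minimal sketch of decoding the most likely hidden-state sequence of a discrete HMM. It is not the authors' implementation. The function name `viterbi`, the parameter layout (lists indexed by state and by observation symbol), and the assumption that observations are integer symbols, for example binned I/O counts per time interval, are hypothetical choices made for this example.

```python
import math


def viterbi(obs, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for a discrete HMM.

    obs      -- list of observation symbols (integers), assumed pre-binned
    start_p  -- start_p[s]: probability of starting in state s
    trans_p  -- trans_p[r][s]: probability of moving from state r to state s
    emit_p   -- emit_p[s][o]: probability of emitting symbol o in state s
    """
    if not obs:
        return []

    def log(x):
        # Work in log space to avoid underflow on long traces.
        return math.log(x) if x > 0 else float("-inf")

    n_states = len(start_p)
    # Log-score of the best path ending in each state after the first symbol.
    score = [log(start_p[s]) + log(emit_p[s][obs[0]]) for s in range(n_states)]
    back = []  # back[t][s]: best predecessor of state s at step t + 1

    for o in obs[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(
                range(n_states),
                key=lambda r: score[r] + log(trans_p[r][s]),
            )
            pointers.append(best_prev)
            new_score.append(
                score[best_prev] + log(trans_p[best_prev][s]) + log(emit_p[s][o])
            )
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With illustrative (not fitted) parameters for a two-state model, a call such as `viterbi([0, 1, 2], [0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])` returns a list of state indices, one per observation. In practice the three parameter tables would come from the Baum-Welch estimates.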