When Do Redundant Requests Reduce Latency?
Several systems possess the flexibility to serve requests in more than one
way. For instance, a distributed storage system storing multiple replicas of
the data can serve a request from any of the multiple servers that store the
requested data, or a computational task may be performed in a compute-cluster
by any one of multiple processors. In such systems, the latency of serving the
requests may potentially be reduced by sending "redundant requests": a request
may be sent to more servers than needed, and it is deemed served when the
requisite number of servers complete service. Such a mechanism trades off the
possibility of faster execution of at least one copy of the request against the
increase in the delay due to an increased load on the system. Due to this
tradeoff, it is unclear when redundant requests may actually help. Several
recent works empirically evaluate the latency performance of redundant requests
in diverse settings.
This work aims at an analytical study of the latency performance of redundant
requests, with the primary goals of characterizing under what scenarios sending
redundant requests will help (and under what scenarios they will not help), as
well as designing optimal redundant-requesting policies. We first present a
model that captures the key features of such systems. We show that when service
times are i.i.d. memoryless or "heavier", and when the additional copies of
already-completed jobs can be removed instantly, redundant requests reduce the
average latency. On the other hand, when service times are "lighter" or when
service times are memoryless and removal of jobs is not instantaneous, then not
having any redundancy in the requests is optimal under high loads. Our results
hold for arbitrary arrival processes.
Comment: Extended version of paper presented at Allerton Conference 201
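The memoryless case can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the abstract: all servers are idle (no queueing), service times are i.i.d. exponential with a hypothetical rate parameter `rate`, and leftover copies are cancelled instantly. The function name `simulate_redundancy` is likewise illustrative. Under these assumptions the latency of a request sent to r servers is the minimum of r exponentials, with mean 1/(r * rate), while the expected total server time consumed stays at 1/rate, which gives some intuition for why redundancy need not add load in the memoryless case.

```python
import random


def simulate_redundancy(copies, rate=1.0, trials=100_000, seed=0):
    """Return (mean latency, mean total server time) for one request.

    Assumptions (not from the abstract): every server is idle, service times
    are i.i.d. Exp(rate), and leftover copies are cancelled instantly once
    the first copy finishes.
    """
    rng = random.Random(seed)
    latency_sum = 0.0
    for _ in range(trials):
        # The request is served as soon as the fastest copy completes.
        latency_sum += min(rng.expovariate(rate) for _ in range(copies))
    mean_latency = latency_sum / trials
    # Each of the `copies` servers is busy until the first completion.
    return mean_latency, copies * mean_latency


if __name__ == "__main__":
    for copies in (1, 2, 4):
        latency, work = simulate_redundancy(copies)
        print(f"copies={copies}  latency~{latency:.3f}  server time~{work:.3f}")
```

The sketch deliberately leaves out queueing and arrival processes, so it says nothing about the high-load regimes or the non-instantaneous removal case discussed in the abstract.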
The MDS Queue: Analysing the Latency Performance of Erasure Codes
In order to scale economically, data centers are increasingly evolving their
data storage methods from the use of simple data replication to the use of more
powerful erasure codes, which provide the same level of reliability as
replication but at a significantly lower storage cost. In particular, it is
well known that Maximum-Distance-Separable (MDS) codes, such as Reed-Solomon
codes, provide the maximum storage efficiency. While the use of codes for
providing improved reliability in archival storage systems, where the data is
less frequently accessed (or so-called "cold data"), is well understood, the
role of codes in the storage of more frequently accessed and active "hot data",
where latency is the key metric, is less clear.
In this paper, we study data storage systems based on MDS codes through the
lens of queueing theory, and term this the "MDS queue." We analytically
characterize the (average) latency performance of MDS queues, for which we
present insightful scheduling policies that form upper and lower bounds to
performance, and are observed to be quite tight. Extensive simulations are also
provided and used to validate our theoretical analysis. We also employ the
framework of the MDS queue to analyse different methods of performing so-called
degraded reads (reading of partial data) in distributed data storage
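As a rough illustration of why an (n, k) MDS code affects read latency, the sketch below considers a single request with no queueing. It assumes, beyond what the abstract states, that the request is sent to all n servers, that service times are i.i.d. exponential with a hypothetical rate `rate`, and that the request completes when any k servers finish. The service time is then the k-th order statistic of n exponentials, whose mean is (1/rate) * (1/n + 1/(n-1) + ... + 1/(n-k+1)). The function names are illustrative and are not taken from the paper.

```python
import random


def expected_kth_of_n(n, k, rate=1.0):
    """Closed-form mean of the k-th smallest of n i.i.d. Exp(rate) times."""
    return sum(1.0 / i for i in range(n - k + 1, n + 1)) / rate


def simulate_kth_of_n(n, k, rate=1.0, trials=50_000, seed=0):
    """Monte Carlo estimate of the same quantity (idle servers assumed)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        times = sorted(rng.expovariate(rate) for _ in range(n))
        total += times[k - 1]  # the request is done when the k-th copy finishes
    return total / trials


if __name__ == "__main__":
    for n, k in ((2, 2), (4, 2), (6, 2)):
        print(n, k, round(expected_kth_of_n(n, k), 4),
              round(simulate_kth_of_n(n, k), 4))
```

This covers only the service time of an isolated request; the bounds in the paper concern the full queueing system, which the sketch does not model.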
Bayesian Cointegrated Vector Autoregression models incorporating Alpha-stable noise for inter-day price movements via Approximate Bayesian Computation
We consider a statistical model for pairs of traded assets, based on a
Cointegrated Vector Auto Regression (CVAR) Model. We extend standard CVAR
models to incorporate estimation of model parameters in the presence of price
series level shifts which are not accurately modeled in the standard Gaussian
error correction model (ECM) framework. This involves developing a novel matrix
variate Bayesian CVAR mixture model comprised of Gaussian errors intra-day and
Alpha-stable errors inter-day in the ECM framework. To achieve this we derive a
novel conjugate posterior model for the Scaled Mixtures of Normals (SMiN CVAR)
representation of Alpha-stable inter-day innovations. These results are
generalized to asymmetric models for the innovation noise at inter-day
boundaries allowing for skewed Alpha-stable models.
Our proposed model and sampling methodology are general, incorporating the
current literature on Gaussian models as a special subclass and also allowing
for price series level shifts either at random estimated time points or known a
priori time points. We focus analysis on regularly observed non-Gaussian level
shifts that can have significant effect on estimation performance in
statistical models failing to account for such level shifts, such as at the
close and open of markets. We compare the estimation accuracy of our model and
estimation approach to standard frequentist and Bayesian procedures for CVAR
models when non-Gaussian price series level shifts are present in the
individual series, such as at inter-day boundaries. We fit a bi-variate
Alpha-stable model to the inter-day jumps and model the effect of such jumps on
estimation of matrix-variate CVAR model parameters using the likelihood based
Johansen procedure and a Bayesian estimation. We illustrate our model and the
corresponding estimation procedures we develop on both synthetic and actual
data.
Comment: 30 page
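For orientation, the standard textbook error correction form of a cointegrated VAR can be written as follows. The symbols are assumptions for illustration and may not match the authors' parameterization: x_t is the vector of asset prices, \beta holds the cointegrating vectors, \alpha the adjustment coefficients, \Gamma_i the short-run coefficient matrices, and p the lag order.

```latex
\Delta x_t = \alpha \beta^{\top} x_{t-1}
           + \sum_{i=1}^{p-1} \Gamma_i \, \Delta x_{t-i}
           + \varepsilon_t,
\qquad \varepsilon_t \sim \mathcal{N}(0, \Sigma)
```

In the model described in the abstract, the Gaussian assumption on \varepsilon_t is kept intra-day, while the innovations at inter-day boundaries are instead modeled as Alpha-stable.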