GPUs as Storage System Accelerators
Massively multicore processors, such as Graphics Processing Units (GPUs),
provide, at a comparable price, peak performance one order of magnitude higher
than that of traditional CPUs. This drop in the cost of computation, like any
order-of-magnitude drop in the cost per unit of performance for a class of
system components, creates an opportunity to redesign systems and to explore
new ways of engineering them that recalibrate the cost-to-performance relation. This
project explores the feasibility of harnessing GPUs' computational power to
improve the performance, reliability, or security of distributed storage
systems. In this context, we present the design of a storage system prototype
that uses GPU offloading to accelerate a number of computationally intensive
primitives based on hashing, and introduce techniques to efficiently leverage
the processing power of GPUs. We evaluate the performance of this prototype
under two configurations: as a content addressable storage system that
facilitates online similarity detection between successive versions of the same
file and as a traditional system that uses hashing to preserve data integrity.
Further, we evaluate the impact of offloading to the GPU on competing
applications' performance. Our results show that this technique can bring
tangible performance gains without negatively impacting the performance of
concurrently running applications.
Comment: IEEE Transactions on Parallel and Distributed Systems, 201
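The two hashing-based uses named in this abstract (similarity detection between file versions and integrity checking) can be illustrated with a minimal CPU-only sketch. It is not the prototype's GPU implementation: the fixed-size blocks, the SHA-256 choice, the tiny `BLOCK_SIZE`, and the function names `block_hashes`, `similarity`, and `verify` are all assumptions made for illustration.

```python
import hashlib

# Hypothetical block size, kept tiny so the example is easy to follow.
BLOCK_SIZE = 4


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Split data into fixed-size blocks and hash each block (assumed scheme)."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def similarity(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> float:
    """Fraction of blocks in `new` whose hash already appears in `old`."""
    known = set(block_hashes(old, block_size))
    new_hashes = block_hashes(new, block_size)
    if not new_hashes:
        return 1.0  # an empty new version needs no new blocks
    shared = sum(1 for h in new_hashes if h in known)
    return shared / len(new_hashes)


def verify(data: bytes, expected: list[str], block_size: int = BLOCK_SIZE) -> bool:
    """Integrity check: recompute the block hashes and compare with stored ones."""
    return block_hashes(data, block_size) == expected


v1 = b"aaaabbbbcccc"
v2 = b"aaaaXXXXcccc"  # only the middle block changed
print(similarity(v1, v2))            # two of three blocks are shared
print(verify(v1, block_hashes(v1)))  # True
print(verify(v2, block_hashes(v1)))  # False
```

In a content addressable store, only blocks whose hashes are not already known would need to be written. The per-block hashing is the computationally intensive step that the abstract proposes to offload.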
Reservation-Based Federated Scheduling for Parallel Real-Time Tasks
This paper considers the scheduling of parallel real-time tasks with
arbitrary deadlines. Each job of a parallel task is described as a directed
acyclic graph (DAG). In contrast to prior work in this area, where
decomposition-based scheduling algorithms are proposed based on the
DAG structure and inter-task interference is analyzed as self-suspending
behavior, this paper generalizes the federated scheduling approach. We propose
a reservation-based algorithm, called reservation-based federated scheduling,
that dominates federated scheduling. We provide general constraints for the
design of such systems and prove that reservation-based federated scheduling
has a constant speedup factor with respect to any optimal DAG task scheduler.
Furthermore, the presented algorithm can be used in conjunction with any
scheduler and scheduling analysis suitable for ordinary arbitrary-deadline
sporadic task sets, i.e., without parallelism.
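For context, the baseline federated scheduling rule that this abstract generalizes can be sketched as follows. This is the classic core-assignment rule from prior work on implicit-deadline DAG tasks, not the reservation-based algorithm proposed here. It follows from the list-scheduling bound that a DAG with total work C and critical-path length L finishes within L + (C - L)/n on n cores. The `DagTask` type and `dedicated_cores` function are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass
class DagTask:
    work: float      # C: total execution time over all nodes of the DAG
    span: float      # L: length of the critical path
    deadline: float  # D: relative deadline (assumed equal to the period)


def dedicated_cores(task: DagTask) -> int:
    """Cores reserved for a task under the classic federated rule (assumed model).

    Returns 0 for a light task (C <= D), which runs sequentially on the
    cores shared by all light tasks instead of getting dedicated ones.
    """
    if task.span >= task.deadline:
        raise ValueError("critical path does not fit within the deadline")
    if task.work <= task.deadline:
        return 0
    # Smallest n with L + (C - L) / n <= D.
    return math.ceil((task.work - task.span) / (task.deadline - task.span))


heavy = DagTask(work=10, span=2, deadline=4)
light = DagTask(work=3, span=1, deadline=5)
print(dedicated_cores(heavy))  # 4
print(dedicated_cores(light))  # 0
```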
Marginal productivity index policies for problems of admission control and routing to parallel queues with delay
In this paper we consider the problem of admission control of Bernoulli arrivals to a
buffer with a geometric server, in which the controller’s actions take effect one period
after the actual change in the queue length. An optimal policy in terms of marginal
productivity indices (MPI) is derived for this problem under the following three
performance objectives: (i) minimization of the expected total discounted sum of
holding costs and rejection costs, (ii) minimization of the expected time-average sum of
holding costs and rejection costs, and (iii) maximization of the expected time-average
number of job completions. Combining existing theoretical and algorithmic
results on restless bandit indexation with some new results yields a fast
algorithm that computes the MPI for a queue with buffer size I using only
O(I) arithmetic operations. Such MPI values can be used both to immediately obtain the
optimal thresholds for the admission control problem, and to design an index policy for
the routing problem (with possible admission control) in the multi-queue system. Thus,
this paper further addresses the problem of designing and computing a tractable
heuristic policy for dynamic job admission control and/or routing in a discrete time
Markovian model of parallel loss queues with one-period delayed state observation
and/or action implementation, which comes close to optimizing an infinite-horizon
problem under the above three objectives. Our approach also appears tractable for
the analogous problems with larger delays and, more generally, for arbitrary restless
bandits with delays.
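The kind of threshold admission policy that MPI values yield can be illustrated with a small discrete-time simulation. This sketch does not compute the MPI and ignores the one-period delay studied in the paper. The threshold is supplied by hand, and the event order within a period (service first, then arrival), the parameter names, and the `simulate_threshold` function are assumptions.

```python
import random


def simulate_threshold(threshold: int, buffer_size: int, p_arrival: float,
                       p_service: float, periods: int, seed: int = 0) -> dict:
    """Simulate a single queue with Bernoulli arrivals and a geometric server.

    An arriving job is admitted only while the queue length is below
    min(threshold, buffer_size); otherwise it is rejected.
    """
    rng = random.Random(seed)
    limit = min(threshold, buffer_size)
    queue = admitted = rejected = completed = max_queue = 0
    for _ in range(periods):
        # Geometric service: a waiting job completes with probability p_service.
        if queue > 0 and rng.random() < p_service:
            queue -= 1
            completed += 1
        # Bernoulli arrival: at most one job arrives per period.
        if rng.random() < p_arrival:
            if queue < limit:
                queue += 1
                admitted += 1
            else:
                rejected += 1
        max_queue = max(max_queue, queue)
    return {"admitted": admitted, "rejected": rejected, "completed": completed,
            "queue": queue, "max_queue": max_queue}


stats = simulate_threshold(threshold=3, buffer_size=10, p_arrival=0.6,
                           p_service=0.5, periods=1000)
print(stats)
```

Sweeping `threshold` and comparing holding and rejection costs would give a brute-force check on any threshold obtained from index values.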