D-SPACE4Cloud: A Design Tool for Big Data Applications
The last years have seen a steep rise in data generation worldwide, with the
development and widespread adoption of several software projects targeting the
Big Data paradigm. Many companies currently engage in Big Data analytics as
part of their core business activities; nonetheless, there are no tools and
techniques to support the design of the underlying hardware configuration
backing such systems. In particular, this report focuses on Cloud-deployed
clusters, which represent a cost-effective alternative to on-premises
installations. We propose a novel tool implementing a battery of optimization
and prediction techniques integrated so as to efficiently assess several
alternative resource configurations, in order to determine the minimum cost
cluster deployment satisfying QoS constraints. Further, the experimental
campaign conducted on real systems shows the validity and relevance of the
proposed method.
On Resource Pooling and Separation for LRU Caching
Caching systems using the Least Recently Used (LRU) principle have now become
ubiquitous. A fundamental question for these systems is whether the cache space
should be pooled together or divided to serve multiple flows of data item
requests in order to minimize the miss probabilities. In this paper, we show
that there is no simple yes-or-no answer to this question: it depends on
complex combinations of critical factors, including request rates,
overlapped data items across different request flows, and data item
popularities and sizes. Specifically, we characterize the asymptotic miss
probabilities for multiple competing request flows under resource pooling and
separation for LRU caching when the cache size is large.
Analytically, we show that it is asymptotically optimal to jointly serve
multiple flows if their data item sizes and popularity distributions are
similar and their arrival rates do not differ significantly; the
self-organizing property of LRU caching automatically optimizes the resource
allocation among them asymptotically. Otherwise, separating these flows could
be better, e.g., when data sizes vary significantly. We also quantify critical
points beyond which resource pooling is better than separation for each of the
flows when the overlapped data items exceed certain levels. Technically, we
generalize existing results on the asymptotic miss probability of LRU caching
for a broad class of heavy-tailed distributions and extend them to multiple
competing flows with varying data item sizes, which also validates the Che
approximation under certain conditions. These results provide new insights on
improving the performance of caching systems.
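The pooling-versus-separation question in this abstract can be explored empirically with a small simulation. The following is a minimal sketch, not the paper's analysis: it assumes unit-size data items, two non-overlapping flows with identical Zipf popularity, and independent requests. The helper names (`lru_miss_ratio`, `zipf_flow`) and all parameter values are hypothetical choices made for illustration.

```python
import random
from collections import OrderedDict
from itertools import accumulate


def lru_miss_ratio(requests, capacity):
    """Fraction of requests that miss in an LRU cache of `capacity` unit-size items."""
    cache = OrderedDict()
    misses = 0
    for item in requests:
        if item in cache:
            cache.move_to_end(item)        # hit: mark as most recently used
        else:
            misses += 1
            cache[item] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used item
    return misses / len(requests)


def zipf_flow(flow_id, n_items, alpha, n_requests, rng):
    """Independent requests with Zipf(alpha) popularity; items are tagged by flow."""
    weights = [1.0 / (rank ** alpha) for rank in range(1, n_items + 1)]
    cum_weights = list(accumulate(weights))
    ranks = rng.choices(range(n_items), cum_weights=cum_weights, k=n_requests)
    return [(flow_id, r) for r in ranks]


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
flow_a = zipf_flow("a", 1000, 1.2, 20000, rng)
flow_b = zipf_flow("b", 1000, 1.2, 20000, rng)

# Separation: each flow gets its own cache of 100 items.
separate_miss = (lru_miss_ratio(flow_a, 100) + lru_miss_ratio(flow_b, 100)) / 2

# Pooling: both flows share one cache of 200 items. Shuffling interleaves the
# flows; this is acceptable here because requests are independent draws.
pooled_requests = flow_a + flow_b
rng.shuffle(pooled_requests)
pooled_miss = lru_miss_ratio(pooled_requests, 200)

print(f"separate: {separate_miss:.3f}  pooled: {pooled_miss:.3f}")
```

Changing the arrival rates, the Zipf exponents, or the overlap between the two item sets is one way to probe the regimes the abstract describes. Varying item sizes would require a size-aware eviction loop, which this sketch omits.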
Optimal Posted Prices for Online Cloud Resource Allocation
We study online resource allocation in a cloud computing platform, through a
posted pricing mechanism: The cloud provider publishes a unit price for each
resource type, which may vary over time; upon arrival at the cloud system, a
cloud user either takes the current prices, renting resources to execute its
job, or refuses the prices without running its job there. We design pricing
functions based on the current resource utilization ratios, in a wide array of
demand-supply relationships and resource occupation durations, and prove
worst-case competitive ratios of the pricing functions in terms of social
welfare. In the basic case of a single-type, non-recycled resource (i.e.,
allocated resources are not later released for reuse), we prove that our
pricing function design is optimal, in that any other pricing function can only
lead to a worse competitive ratio. Insights obtained from the basic cases are
then used to generalize the pricing functions to more realistic cloud systems
with multiple types of resources, where a job occupies allocated resources for
a number of time slots until completion, at which time the resources are
returned to the cloud resource pool.
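The take-it-or-leave-it mechanism for the basic single-type, non-recycled case can be sketched as follows. This is an illustration under stated assumptions, not the paper's pricing function: the abstract says only that prices depend on the current utilization ratio, so the exponential curve in `posted_price`, its `p_min` and `p_max` bounds, and the `(demand, value)` job model are hypothetical.

```python
def posted_price(utilization, p_min=1.0, p_max=100.0):
    """Hypothetical unit price that rises from p_min to p_max with utilization in [0, 1]."""
    return p_min * (p_max / p_min) ** utilization


def run_posted_pricing(jobs, capacity):
    """Serve jobs in arrival order; each job is a (demand, value) pair.

    A user accepts only if its value covers the posted cost and enough
    capacity remains. Resources are non-recycled: once allocated, they are
    never released. Returns (social_welfare, accepted_job_indices).
    """
    used = 0.0
    welfare = 0.0
    accepted = []
    for index, (demand, value) in enumerate(jobs):
        unit_price = posted_price(used / capacity)
        cost = unit_price * demand
        if used + demand <= capacity and value >= cost:
            used += demand
            welfare += value       # social welfare sums the values of served jobs
            accepted.append(index)
    return welfare, accepted


jobs = [(5, 10.0), (5, 10.0), (5, 1000.0)]
welfare, accepted = run_posted_pricing(jobs, capacity=10)
print(welfare, accepted)  # 1010.0 [0, 2]
```

In this run the second job is turned away because the price has risen to 10 per unit after the first allocation, which leaves room for the much higher-valued third job. Reserving capacity through rising prices is the intuition behind utilization-based pricing.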
Power Management Techniques for Data Centers: A Survey
With the growing use of the internet and exponential growth in the amount of
data to be stored and processed (known as 'big data'), the size of data centers
has greatly increased. This, however, has resulted in a significant increase in
the power consumption of data centers. For this reason, managing power
consumption of data centers has become essential. In this paper, we highlight
the need to achieve energy efficiency in data centers and survey several
recent architectural techniques designed for power management of data centers.
We also present a classification of these techniques based on their
characteristics. This paper aims to provide insights into the techniques for
improving the energy efficiency of data centers and to encourage designers to
invent novel solutions for managing the large power dissipation of data
centers.
Comment: Keywords: Data Centers, Power Management, Low-power Design, Energy
Efficiency, Green Computing, DVFS, Server Consolidation