36 research outputs found
Reducing Electricity Demand Charge for Data Centers with Partial Execution
Data centers consume a large amount of energy and incur substantial
electricity cost. In this paper, we study the familiar problem of reducing data
center energy cost with two new perspectives. First, we find, through an
empirical study of contracts from electric utilities powering Google data
centers, that demand charge per kW for the maximum power used is a major
component of the total cost. Second, many services such as Web search tolerate
partial execution of the requests because the response quality is a concave
function of processing time. Data from Microsoft Bing search engine confirms
this observation.
We propose a simple idea of using partial execution to reduce the peak power
demand and energy cost of data centers. We systematically study the problem of
scheduling partial execution with stringent SLAs on response quality. For a
single data center, we derive an optimal algorithm to solve the workload
scheduling problem. In the case of multiple geo-distributed data centers, the
demand of each data center is controlled by the request routing algorithm,
which makes the problem much more involved. We decouple the two aspects, and
develop a distributed optimization algorithm to solve the large-scale request
routing problem. Trace-driven simulations show that partial execution reduces
cost by for one data center, and by for geo-distributed
data centers together with request routing.Comment: 12 page
When Two Choices Are not Enough: Balancing at Scale in Distributed Stream Processing
Carefully balancing load in distributed stream processing systems has a
fundamental impact on execution latency and throughput. Load balancing is
challenging because real-world workloads are skewed: some tuples in the stream
are associated to keys which are significantly more frequent than others. Skew
is remarkably more problematic in large deployments: more workers implies fewer
keys per worker, so it becomes harder to "average out" the cost of hot keys
with cold keys.
We propose a novel load balancing technique that uses a heaving hitter
algorithm to efficiently identify the hottest keys in the stream. These hot
keys are assigned to choices to ensure a balanced load, where is
tuned automatically to minimize the memory and computation cost of operator
replication. The technique works online and does not require the use of routing
tables. Our extensive evaluation shows that our technique can balance
real-world workloads on large deployments, and improve throughput and latency
by and respectively over the previous
state-of-the-art when deployed on Apache Storm.Comment: 12 pages, 14 Figures, this paper is accepted and will be published at
ICDE 201
PaRiS: Causally Consistent Transactions with Non-blocking Reads and Partial Replication
Geo-replicated data platforms are at the backbone of several large-scale
online services. Transactional Causal Consistency (TCC) is an attractive
consistency level for building such platforms. TCC avoids many anomalies of
eventual consistency, eschews the synchronization costs of strong consistency,
and supports interactive read-write transactions. Partial replication is
another attractive design choice for building geo-replicated platforms, as it
increases the storage capacity and reduces update propagation costs. This paper
presents PaRiS, the first TCC system that supports partial replication and
implements non-blocking parallel read operations, whose latency is paramount
for the performance of read-intensive applications. PaRiS relies on a novel
protocol to track dependencies, called Universal Stable Time (UST). By means of
a lightweight background gossip process, UST identifies a snapshot of the data
that has been installed by every DC in the system. Hence, transactions can
consistently read from such a snapshot on any server in any replication site
without having to block. Moreover, PaRiS requires only one timestamp to track
dependencies and define transactional snapshots, thereby achieving resource
efficiency and scalability. We evaluate PaRiS on a large-scale AWS deployment
composed of up to 10 replication sites. We show that PaRiS scales well with the
number of DCs and partitions, while being able to handle larger data-sets than
existing solutions that assume full replication. We also demonstrate a
performance gain of non-blocking reads vs. a blocking alternative (up to 1.47x
higher throughput with 5.91x lower latency for read-dominated workloads and up
to 1.46x higher throughput with 20.56x lower latency for write-heavy
workloads)
User Behavior in Mass Media Websites
Mass media websites can be worthy to understand user trends in web services. RTVE, the National Broadcaster in Spain is a sample of such kind of service. Trend points to a shorter user interaction over the last three years, and a more straight access to content. Besides the number of pages consumed in a visit is becoming smaller as well. This article reviews these trends with data obtained from public sources, and analyze the distribution of web pages in the client layer and the corresponding distribution observed in the server layer. The two distributions can be characterized by Zipf-like distributions and ?, the degree of disparity in the popularity distribution, is calculated for both. In all cases ? is higher to one implying a huge concentration of popularity on a few objects
A Reliable and Cost-Efficient Auto-Scaling System for Web Applications Using Heterogeneous Spot Instances
Cloud providers sell their idle capacity on markets through an auction-like
mechanism to increase their return on investment. The instances sold in this
way are called spot instances. In spite that spot instances are usually 90%
cheaper than on-demand instances, they can be terminated by provider when their
bidding prices are lower than market prices. Thus, they are largely used to
provision fault-tolerant applications only. In this paper, we explore how to
utilize spot instances to provision web applications, which are usually
considered availability-critical. The idea is to take advantage of differences
in price among various types of spot instances to reach both high availability
and significant cost saving. We first propose a fault-tolerant model for web
applications provisioned by spot instances. Based on that, we devise novel
auto-scaling polices for hourly billed cloud markets. We implemented the
proposed model and policies both on a simulation testbed for repeatable
validation and Amazon EC2. The experiments on the simulation testbed and the
real platform against the benchmarks show that the proposed approach can
greatly reduce resource cost and still achieve satisfactory Quality of Service
(QoS) in terms of response time and availability
ОПТИМИЗАЦИЯ ПОДДЕРЖКИ ВЫЧИСЛИТЕЛЬНЫХ РЕСУРСОВ НА ЖЕЛЕЗНОДОРОЖНОМ ТРАНСПОРТЕ
In this article the author focuses on peculiarities of providing virtual resources, used in cloud computing systems for guaranteed quality of service, taking into account the requirements of QoS. The article contains the description of adaptive mechanism and comparative analysis of static and adaptive mechanisms to provide resources through simulation models. Moreover it covers such computing figures for models as the average time for request processing, the level of service denial, the significance of general application of system resources with different inbound parameters.Статья посвящена особенностям предоставления виртуальных ресурсов в системах, основанных на облачных технологиях, но с учетом выполнения требований QoS. Описан адаптивный механизм, а также проведен сравнительный анализ работы адаптивного и статических механизмов предоставления ресурсов с помощью имитационных моделей. Рассмотрены такие вычислительные показатели для моделей, как среднее время обработки запроса, уровень отказа обслуживания, значение общего использования ресурсов системы при различных входных параметрах
Hierarchical Content Stores in High-speed ICN Routers: Emulation and Prototype Implementation
Recent work motivates the design of Information-centric rou-ters that make use of hierarchies of memory to jointly scale in the size and speed of content stores. The present paper advances this understanding by (i) instantiating a general purpose two-layer packet-level caching system, (ii) investigating the solution design space via emulation, and (iii) introducing a proof-of-concept prototype. The emulation-based study reveals insights about the broad design space, the expected impact of workload, and gains due to multi-threaded execution. The full-blown system prototype experimentally confirms that, by exploiting both DRAM and SSD memory technologies, ICN routers can sustain cache operations in excess of 10Gbps running on off-the-shelf hardware