Harmony: Towards automated self-adaptive consistency in cloud storage
In just a few years cloud computing has become a very popular paradigm and a business success story, with storage being one of the key features. To achieve high data availability, cloud storage services rely on replication. In this context, one major challenge is data consistency. In contrast to traditional approaches that are mostly based on strong consistency, many cloud storage services opt for weaker consistency models in order to achieve better availability and performance. This comes at the cost of a high probability of stale data being read, as the replicas involved in the reads may not always have the most recent write. In this paper, we propose a novel approach, named Harmony, which adaptively tunes the consistency level at run-time according to the application requirements. The key idea behind Harmony is an intelligent estimation model of stale reads, which allows the number of replicas involved in read operations to be scaled up or down elastically so as to maintain a low (possibly zero) tolerable fraction of stale reads. As a result, Harmony can meet the desired consistency of the applications while achieving good performance. We have implemented Harmony and performed extensive evaluations with the Cassandra cloud storage on the Grid'5000 testbed and on Amazon EC2. The results show that Harmony can achieve good performance without exceeding the tolerated number of stale reads. For instance, in contrast to the static eventual consistency used in Cassandra, Harmony reduces the stale data being read by almost 80% while adding only minimal latency. Meanwhile, when compared to the strong consistency model in Cassandra, it improves the throughput of the system by 45% while maintaining the desired consistency requirements of the applications.
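The idea of scaling the number of read replicas against a tolerated stale-read fraction can be illustrated with a toy model. The sketch below is not Harmony's estimation model, which the abstract does not spell out. It assumes, purely for illustration, that each replica independently holds the latest write with probability `p_fresh`, so a read that contacts `k` replicas is stale with probability `(1 - p_fresh) ** k`. The function name and parameters are hypothetical.

```python
def replicas_needed(p_fresh, tolerated_stale, n_replicas):
    """Smallest number of replicas to read so that the modelled stale-read
    probability stays within tolerated_stale.

    Toy assumption (not from the paper): replicas are fresh independently
    with probability p_fresh, so reading k of them is stale with
    probability (1 - p_fresh) ** k.
    """
    if not 0.0 <= p_fresh <= 1.0:
        raise ValueError("p_fresh must be in [0, 1]")
    if n_replicas < 1:
        raise ValueError("n_replicas must be at least 1")
    for k in range(1, n_replicas + 1):
        if (1.0 - p_fresh) ** k <= tolerated_stale:
            return k
    # Tolerance cannot be met under this model: read every replica.
    return n_replicas


if __name__ == "__main__":
    # Tighter tolerance means more replicas per read.
    for tolerance in (0.2, 0.05, 0.0):
        print(tolerance, replicas_needed(0.9, tolerance, 5))
```

Under this model a zero tolerance falls back to reading all replicas unless every replica is known to be fresh, which mirrors the trade-off between strong and eventual consistency described in the abstract.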
Consistency in the Cloud: When Money Does Matter!
With the emergence of cloud computing, many organizations have moved their data to the cloud in order to provide scalable, reliable and highly available services. To meet ever growing user needs, these services mainly rely on geographically-distributed data replication to guarantee good performance and high availability. However, with replication, consistency comes into question. Service providers in the cloud have the freedom to select the level of consistency according to the access patterns exhibited by the applications. Most optimization efforts then concentrate on how to provide adequate trade-offs between consistency guarantees and performance. However, as the monetary cost completely relies on the service providers, in this paper we argue that monetary cost should be taken into consideration when evaluating or selecting a consistency level in the cloud. Accordingly, we define a new metric called consistency-cost efficiency. Based on this metric, we present a simple, yet efficient economical consistency model, called Bismar, that adaptively tunes the consistency level at run-time in order to reduce the monetary cost while simultaneously maintaining a low fraction of stale reads. Experimental evaluations with the Cassandra cloud storage on a Grid'5000 testbed show the validity of the metric and demonstrate the effectiveness of the proposed consistency model.
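The abstract names a consistency-cost efficiency metric without giving its formula. As a hedged illustration only, the sketch below assumes one plausible reading: efficiency is the fraction of fresh reads divided by a relative monetary cost, and the chosen level is the most efficient one whose stale-read rate is within a tolerance. The formula, the function names, and all numbers in the example are assumptions made for illustration, not values from the paper; `ONE`, `QUORUM`, and `ALL` are Cassandra consistency level names.

```python
def consistency_cost_efficiency(stale_rate, relative_cost):
    """Assumed metric: fraction of fresh reads per unit of relative cost."""
    if relative_cost <= 0:
        raise ValueError("relative_cost must be positive")
    return (1.0 - stale_rate) / relative_cost


def pick_level(levels, max_stale_rate):
    """Pick the most cost-efficient level within the stale-read tolerance.

    levels maps a level name to (stale_rate, relative_cost).
    If no level meets the tolerance, fall back to the lowest stale rate.
    """
    admissible = {name: v for name, v in levels.items() if v[0] <= max_stale_rate}
    if not admissible:
        return min(levels, key=lambda name: levels[name][0])
    return max(admissible, key=lambda name: consistency_cost_efficiency(*admissible[name]))


if __name__ == "__main__":
    # Made-up measurements, for illustration only.
    measured = {"ONE": (0.30, 1.0), "QUORUM": (0.05, 1.4), "ALL": (0.0, 2.0)}
    print(pick_level(measured, max_stale_rate=0.10))
```

In a run-time setting the stale rates and costs would be re-estimated periodically and the level re-selected, which is the adaptive behaviour the abstract describes.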
Live migration of user environments across wide area networks
A complex challenge in mobile computing is to allow the user to migrate her highly customised environment while moving to a different location and to continue work without interruption. I motivate why this is a highly desirable capability, survey the current approaches towards this goal, and explain their limitations. I then propose a new architecture to support user mobility by live migration of a user's operating system instance over the network. Previous work includes the Collective and Internet Suspend/Resume projects, which have addressed migration of a user's environment by suspending the running state and resuming it at a later time. In contrast to previous work, this work addresses live migration of a user's operating system instance across wide area links. Live migration is done by performing most of the migration while the operating system is still running, achieving very little downtime and preserving all network connectivity.
I developed an initial proof of concept of this solution. It relies on migrating whole operating systems using the Xen virtual machine and provides a way to perform live migration of persistent storage as well as of network connections across subnets. These challenges have not been addressed previously in this scenario. In a virtual machine environment, persistent storage is provided by virtual block devices. The architecture supports decentralized virtual block device replication across wide area network links, as well as migrating network connections across subnetworks using the Host Identity Protocol. The proposed architecture is compared against existing solutions and an initial performance evaluation of the prototype implementation is presented, showing that such a solution is a promising step towards true seamless mobility of fully fledged computing environments.
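Performing most of the migration while the system keeps running is commonly done with an iterative pre-copy loop, the approach used by Xen live migration for memory. The abstract does not state the exact algorithm used for storage, so the following is only a minimal simulation under assumed parameters: each live round resends the pages (or blocks) dirtied during the previous round, and the system is paused only for the small remainder. All names and numbers are hypothetical.

```python
def simulate_precopy(total_pages, dirty_percent, stop_threshold, max_rounds=30):
    """Simulate iterative pre-copy.

    Returns (pages_sent_per_live_round, pages_left_for_final_pause).
    dirty_percent is the assumed share of the pages sent in one round
    that are modified again before that round finishes.
    """
    to_send = total_pages
    live_rounds = []
    for _ in range(max_rounds):
        if to_send <= stop_threshold:
            break  # small enough: pause and copy the remainder
        live_rounds.append(to_send)
        # Pages dirtied while this round was being copied must be resent.
        to_send = to_send * dirty_percent // 100
    return live_rounds, to_send


if __name__ == "__main__":
    rounds, final = simulate_precopy(total_pages=1000, dirty_percent=10, stop_threshold=50)
    print(rounds, final)  # prints [1000, 100] 10
```

Downtime is proportional to the final remainder rather than to the full state, which is why the loop only helps when the dirty rate is well below the copy rate; with `dirty_percent=100` the simulation never converges and stops at `max_rounds`.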
Dependable Embedded Systems
This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. This book introduces the most prominent reliability concerns from today's point of view and roughly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level, such as circuit level or system level alone, the focus of this book is to deal with the different reliability challenges across different levels, starting from the physical level all the way to the system level (cross-layer approaches). The book aims at demonstrating how new hardware/software co-design solutions can be proposed to effectively mitigate reliability degradation such as transistor aging, processor variation, temperature effects, soft errors, etc. Provides readers with the latest insights into novel, cross-layer methods and models with respect to dependability of embedded systems; Describes cross-layer approaches that can leverage reliability through techniques that are proactively designed with respect to techniques at other layers; Explains run-time adaptation and concepts/means of self-organization, in order to achieve error resiliency in complex, future many-core systems.
Clouder: a flexible large scale decentralized object store
Programa Doutoral em Informática MAP-i
Large scale data stores have been initially introduced to support a few concrete extreme
scale applications such as social networks. Their scalability and availability
requirements often outweigh sacrificing richer data and processing models, and even
elementary data consistency. In strong contrast with traditional relational databases
(RDBMS), large scale data stores present very simple data models and APIs, lacking
most of the established relational data management operations; and relax consistency
guarantees, providing eventual consistency.
With a number of alternatives now available and mature, there is an increasing
willingness to use them in a wider and more diverse spectrum of applications, by
skewing the current trade-off towards the needs of common business users, and easing
the migration from current RDBMS. This is particularly so when used in the context
of a Cloud solution such as in a Platform as a Service (PaaS).
This thesis aims at reducing the gap between traditional RDBMS and large scale
data stores, by seeking mechanisms to provide additional consistency guarantees and
higher level data processing primitives in large scale data stores. The devised mechanisms
should not hinder the scalability and dependability of large scale data stores.
Regarding higher level data processing primitives, this thesis explores two complementary
approaches: by extending data stores with additional operations such as general
multi-item operations; and by coupling data stores with other existent processing
facilities without hindering scalability.
We address these challenges with a new architecture for large scale data stores, efficient
multi item access for large scale data stores, and SQL processing atop large scale
data stores. The novel architecture makes it possible to find the right trade-offs among flexible
usage, efficiency, and fault-tolerance. To efficiently support multi item access we extend the data models of first generation large scale data stores with tags and a multi-tuple data
placement strategy, which make it possible to efficiently store and retrieve large sets of related data
at once. For efficient SQL support atop scalable data stores we devise design modifications
to existing relational SQL query engines, allowing them to be distributed.
We demonstrate our approaches with running prototypes and extensive experimental
evaluation using proper workloads.
Large scale data stores were initially developed
to support a narrow range of extreme scale applications, such as
social networks. Their scalability and high availability requirements led to
sacrificing richer data and processing models and even data consistency.
In contrast with traditional relational database management systems
(RDBMS), large scale data stores present very simple data models
and APIs. In particular, many of the well-known relational data management
operations are absent and consistency guarantees are relaxed,
providing eventual consistency.
Currently, with a number of alternatives available and mature, there is a growing
interest in using them in a wider and more diverse range of applications, steering the current
trade-off towards the needs of typical business customers and easing the
migration from current RDBMS. This is particularly important in the context of
cloud solutions such as Platform as a Service (PaaS).
This thesis aims at reducing the gap between traditional RDBMS and
large scale data stores, by seeking mechanisms that
provide stronger consistency guarantees and primitives with greater
processing capability. The devised mechanisms should not compromise the scalability
and reliability of large scale data stores. Regarding
the primitives with greater processing capability, this thesis explores
two complementary approaches: extending large scale data stores
with generic multi-object operations, and coupling large scale data stores with existing data processing
and query mechanisms, without jeopardizing their scalability.
To this end we present a new architecture for large scale
data stores, efficient access to multiple objects, and SQL processing
atop large scale data stores. The new architecture
makes it possible to find the right trade-offs among flexibility, efficiency and
fault tolerance. To efficiently support access to multiple objects
we extend the data model of first generation large scale data stores
with tags and define a data placement strategy
for multiple objects that makes it possible to efficiently store and retrieve
large amounts of data at once. For efficient SQL support atop
large scale data stores, we analysed the architecture of
RDBMS query engines and made modifications that allow them to be
distributed.
The proposed approaches are demonstrated through prototypes and an exhaustive
experimental evaluation using proper workloads based on real applications.
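The tag-based multi-tuple placement mentioned in the Clouder abstract can be sketched as follows. This is a minimal illustration under assumptions, not the thesis's design: items that share a tag are hashed by the tag rather than by their key, so a whole set of related items lands on one node and can be fetched in a single request. The class, the method names, and the simple modulo hashing (rather than consistent hashing) are all hypothetical choices.

```python
import hashlib


def node_for(label, nodes):
    """Map a label (tag or key) to a node with a stable hash."""
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


class TaggedStore:
    """In-memory stand-in for a cluster: one dict per node."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.data = {node: {} for node in self.nodes}  # node -> {key: (tag, value)}

    def put(self, key, value, tag=None):
        # Tagged items are placed by tag, untagged items by key.
        node = node_for(tag if tag is not None else key, self.nodes)
        self.data[node][key] = (tag, value)
        return node

    def get_by_tag(self, tag):
        # All items with this tag live on one node: a single lookup suffices.
        node = node_for(tag, self.nodes)
        return {k: v for k, (t, v) in self.data[node].items() if t == tag}


if __name__ == "__main__":
    store = TaggedStore(["n1", "n2", "n3"])
    store.put("u1:post1", "hello", tag="u1")
    store.put("u1:post2", "world", tag="u1")
    print(store.get_by_tag("u1"))
```

The sketch ignores replication and re-tagging of existing keys; it only shows why co-locating by tag turns a multi-item read into a single-node operation.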