2,126 research outputs found
The End of a Myth: Distributed Transactions Can Scale
The common wisdom is that distributed transactions do not scale. But what if
distributed transactions could be made scalable using the next generation of
networks and a redesign of distributed databases? There would be no need for
developers anymore to worry about co-partitioning schemes to achieve decent
performance. Application development would become easier as data placement
would no longer determine how scalable an application is. Hardware provisioning
would be simplified as the system administrator can expect a linear scale-out
when adding more machines rather than some complex sub-linear function, which
is highly application specific.
In this paper, we present the design of our novel scalable database system
NAM-DB and show that distributed transactions with the very common Snapshot
Isolation guarantee can indeed scale using the next generation of RDMA-enabled
network technology without any inherent bottlenecks. Our experiments with the
TPC-C benchmark show that our system scales linearly to over 6.5 million
new-order (14.5 million total) distributed transactions per second on 56
machines.Comment: 12 page
LogBase: A Scalable Log-structured Database System in the Cloud
Numerous applications such as financial transactions (e.g., stock trading)
are write-heavy in nature. The shift from reads to writes in web applications
has also been accelerating in recent years. Write-ahead-logging is a common
approach for providing recovery capability while improving performance in most
storage systems. However, the separation of log and application data incurs
write overheads observed in write-heavy environments and hence adversely
affects the write throughput and recovery time in the system. In this paper, we
introduce LogBase - a scalable log-structured database system that adopts
log-only storage for removing the write bottleneck and supporting fast system
recovery. LogBase is designed to be dynamically deployed on commodity clusters
to take advantage of elastic scaling property of cloud environments. LogBase
provides in-memory multiversion indexes for supporting efficient access to data
maintained in the log. LogBase also supports transactions that bundle read and
write operations spanning across multiple records. We implemented the proposed
system and compared it with HBase and a disk-based log-structured
record-oriented system modeled after RAMCloud. The experimental results show
that LogBase is able to provide sustained write throughput, efficient data
access out of the cache, and effective system recovery.Comment: VLDB201
Cache Serializability: Reducing Inconsistency in Edge Transactions
Read-only caches are widely used in cloud infrastructures to reduce access
latency and load on backend databases. Operators view coherent caches as
impractical at genuinely large scale and many client-facing caches are updated
in an asynchronous manner with best-effort pipelines. Existing solutions that
support cache consistency are inapplicable to this scenario since they require
a round trip to the database on every cache transaction.
Existing incoherent cache technologies are oblivious to transactional data
access, even if the backend database supports transactions. We propose T-Cache,
a novel caching policy for read-only transactions in which inconsistency is
tolerable (won't cause safety violations) but undesirable (has a cost). T-Cache
improves cache consistency despite asynchronous and unreliable communication
between the cache and the database. We define cache-serializability, a variant
of serializability that is suitable for incoherent caches, and prove that with
unbounded resources T-Cache implements this new specification. With limited
resources, T-Cache allows the system manager to choose a trade-off between
performance and consistency.
Our evaluation shows that T-Cache detects many inconsistencies with only
nominal overhead. We use synthetic workloads to demonstrate the efficacy of
T-Cache when data accesses are clustered and its adaptive reaction to workload
changes. With workloads based on the real-world topologies, T-Cache detects
43-70% of the inconsistencies and increases the rate of consistent transactions
by 33-58%.Comment: Ittay Eyal, Ken Birman, Robbert van Renesse, "Cache Serializability:
Reducing Inconsistency in Edge Transactions," Distributed Computing Systems
(ICDCS), IEEE 35th International Conference on, June~29 2015--July~2 201
CloudTPS: Scalable Transactions for Web Applications in the Cloud
NoSQL Cloud data services provide scalability and high availability properties for web applications but at the same time they sacrifice data consistency. However, many applications cannot afford any data inconsistency. CloudTPS is a scalable transaction manager to allow cloud database services to execute the ACID transactions of web applications, even in the presence of server failures and network partitions. We implement this approach on top of the two main families of scalable data layers: Bigtable and SimpleDB. Performance evaluation on top of HBase (an open-source version of Bigtable) in our local cluster and Amazon SimpleDB in the Amazon cloud shows that our system scales linearly at least up to 40 nodes in our local cluster and 80 nodes in the Amazon cloud
CumuloNimbo: parallel-distributed transactional processing
CumuloNimbo aims at solving the lack of scalability of transactional applications that represent a large fraction of existing applications. CumuloNimbo aims at conceiving, architecting and developing a transactional, coherent, elastic and ultra scalable Platform as a Service. Its goals are: Ultra scalable and dependable, able to scale from a few users to many millions of users while at the same time providing continuous availability; Support transparent migration of multi-tier applications (e.g. Java EE applications, relational DB applications, etc.) to the cloud with automatic scalability and elasticity. Avoid reprogramming of applications and non-transparent scalability techniques such as sharding. Support transactions for new data stores such as cloud data stores, graph databases, etc.The main challenges are: Update ultrascalability (million update transactions per second and as many read-only transactions as needed). Strong transactional consistency. Non-intrusive elasticity. Inexpensive high availability. Low latency. CumuloNimbo goes beyond the state of the art by scaling transparently transactional applications to very large rates without sharding, the current practice in Today?s cloud. In this paper we describe CumuloNimbo architecture and its performance
Transactions and data management in NoSQL cloud databases
NoSQL databases have become the preferred option for storing and processing data in cloud computing as they are capable of providing high data availability, scalability and efficiency. But in order to achieve these attributes, NoSQL databases make certain trade-offs. First, NoSQL databases cannot guarantee strong consistency of data. They only guarantee a weaker consistency which is based on eventual consistency model. Second, NoSQL databases adopt a simple data model which makes it easy for data to be scaled across multiple nodes. Third, NoSQL databases do not support table joins and referential integrity which by implication, means they cannot implement complex queries. The combination of these factors implies that NoSQL databases cannot support transactions. Motivated by these crucial issues this thesis investigates into the transactions and data management in NoSQL databases.
It presents a novel approach that implements transactional support for NoSQL databases in order to ensure stronger data consistency and provide appropriate level of performance. The novelty lies in the design of a Multi-Key transaction model that guarantees the standard properties of transactions in order to ensure stronger consistency and integrity of data. The model is implemented in a novel loosely-coupled architecture that separates the implementation of transactional logic from the underlying data thus ensuring transparency and abstraction in cloud and NoSQL databases. The proposed approach is validated through the development of a prototype system using real MongoDB system. An extended version of the standard Yahoo! Cloud Services Benchmark (YCSB) has been used in order to test and evaluate the proposed approach. Various experiments have been conducted and sets of results have been generated. The results show that the proposed approach meets the research objectives. It maintains stronger consistency of cloud data as well as appropriate level of reliability and performance
Scalable transactions in the cloud: partitioning revisited
Lecture Notes in Computer Science, 6427Cloud computing is becoming one of the most used paradigms to deploy highly available and scalable systems. These systems usually demand the management of huge amounts of data, which cannot be solved with traditional nor replicated database systems as we know them. Recent solutions store data in special key-value structures, in an approach that commonly lacks the consistency provided by transactional guarantees, as it is traded for high scalability and availability. In order to ensure consistent access to the information, the use of transactions is required. However, it is well-known that traditional replication protocols do not scale well for a cloud environment. Here we take a look at current proposals to deploy transactional systems in the cloud and we propose a new system aiming at being a step forward in achieving this goal. We proceed to focus on data partitioning and describe the key role it plays in achieving high scalability.This work has been partially supported by the Spanish Government under grant TIN2009-14460-C03-02 and by the Spanish MEC under grant BES-2007-17362 and by project ReD Resilient Database Clusters (PDTC/EIA-EIA/109044/2008)
- …