4,212 research outputs found
Enabling distributed key-value stores with low latency-impact snapshot support
Current distributed key-value stores generally provide greater scalability at the expense of weaker consistency and isolation. However, additional isolation support is becoming increasingly important in the environments in which these stores are deployed, where different kinds of applications with different needs are executed, from transactional workloads to data analytics. While fully-fledged ACID support may not be feasible, it is still possible to take advantage of the design of these data stores, which often include the notion of multiversion concurrency control, to enable them with additional features at a much lower performance cost and maintaining its scalability and availability. In this paper we explore the effects that additional consistency guarantees and isolation capabilities may have on a state of the art key-value store: Apache Cassandra. We propose and implement a new multiversioned isolation level that provides stronger guarantees without compromising Cassandra's scalability and availability. As shown in our experiments, our version of Cassandra allows Snapshot Isolation-like transactions, preserving the overall performance and scalability of the system.This work is partially supported by the Ministry of Science and Technology of Spain and the European Union’s FEDER funds (TIN2007-60625, TIN2012-34557), by the Generalitat de Catalunya (2009-SGR-980), by the BSC-CNS Severo Ochoa program (SEV-2011-00067), by the HiPEAC European Network of Excellence (IST-
004408, FP7-ICT-217068, FP7-ICT-287759), and by IBM through the 2008 and 2010 IBM Faculty Award program.Peer ReviewedPostprint (author’s final draft
The End of a Myth: Distributed Transactions Can Scale
The common wisdom is that distributed transactions do not scale. But what if
distributed transactions could be made scalable using the next generation of
networks and a redesign of distributed databases? There would be no need for
developers anymore to worry about co-partitioning schemes to achieve decent
performance. Application development would become easier as data placement
would no longer determine how scalable an application is. Hardware provisioning
would be simplified as the system administrator can expect a linear scale-out
when adding more machines rather than some complex sub-linear function, which
is highly application specific.
In this paper, we present the design of our novel scalable database system
NAM-DB and show that distributed transactions with the very common Snapshot
Isolation guarantee can indeed scale using the next generation of RDMA-enabled
network technology without any inherent bottlenecks. Our experiments with the
TPC-C benchmark show that our system scales linearly to over 6.5 million
new-order (14.5 million total) distributed transactions per second on 56
machines.Comment: 12 page
The End of Slow Networks: It's Time for a Redesign
Next generation high-performance RDMA-capable networks will require a
fundamental rethinking of the design and architecture of modern distributed
DBMSs. These systems are commonly designed and optimized under the assumption
that the network is the bottleneck: the network is slow and "thin", and thus
needs to be avoided as much as possible. Yet this assumption no longer holds
true. With InfiniBand FDR 4x, the bandwidth available to transfer data across
network is in the same ballpark as the bandwidth of one memory channel, and it
increases even further with the most recent EDR standard. Moreover, with the
increasing advances of RDMA, the latency improves similarly fast. In this
paper, we first argue that the "old" distributed database design is not capable
of taking full advantage of the network. Second, we propose architectural
redesigns for OLTP, OLAP and advanced analytical frameworks to take better
advantage of the improved bandwidth, latency and RDMA capabilities. Finally,
for each of the workload categories, we show that remarkable performance
improvements can be achieved
Optimal column layout for hybrid workloads
Data-intensive analytical applications need to support both efficient reads and writes. However, what is usually a good data layout for an update-heavy workload, is not well-suited for a read-mostly one and vice versa. Modern analytical data systems rely on columnar layouts and employ delta stores to inject new data and updates. We show that for hybrid workloads we can achieve close to one order of magnitude better performance by tailoring the column layout design to the data and query workload. Our approach navigates the possible design space of the physical layout: it organizes each column’s data by determining the number of partitions, their corresponding sizes and ranges, and the amount of buffer space and how it is allocated. We frame these design decisions as an optimization problem that, given workload knowledge and performance requirements, provides an optimal physical layout for the workload at hand. To evaluate this work, we build an in-memory storage engine, Casper, and we show that it outperforms state-of-the-art data layouts of analytical systems for hybrid workloads. Casper delivers up to 2.32x higher throughput for update-intensive workloads and up to 2.14x higher throughput for hybrid workloads. We further show how to make data layout decisions robust to workload variation by carefully selecting the input of the optimization.http://www.vldb.org/pvldb/vol12/p2393-athanassoulis.pdfPublished versionPublished versio
- …