Using distributed OLTP technology in a high performance storage system
The design of scalable mass storage systems requires various system components to be distributed across multiple processors. Most of these processes maintain persistent database-type information (i.e., metadata) on the resources they are responsible for managing (e.g., bitfiles, bitfile segments, physical volumes, virtual volumes, cartridges, etc.). These processes all participate in fulfilling end-user requests and updating metadata information. A number of challenges arise when distributed processes attempt to maintain separate metadata resources with production-level integrity and consistency. For example, when requests fail, metadata changes made by the various processes must be aborted or rolled back. When requests are successful, all metadata changes must be committed together. If all metadata changes cannot be committed together for some reason, then all metadata changes must be rolled back to the previous consistent state. Lack of metadata consistency jeopardizes storage system integrity. Distributed on-line transaction processing (OLTP) technology can be applied to distributed mass storage systems as the mechanism for managing the consistency of distributed metadata. OLTP concepts are familiar to many industries, such as banking and financial services, but are less well known and understood in scientific and technical computing. As mass storage systems and other products are designed using distributed processing and data-management strategies for performance, scalability, and/or availability reasons, distributed OLTP technology can be applied to solve the inherent challenges raised by such environments. This paper discusses the benefits of using distributed transaction processing products. Design and implementation experiences using the Encina OLTP product from Transarc in the High Performance Storage System are presented in more detail as a case study of how this technology can be applied to mass storage systems designed for distributed environments.
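The all-or-nothing requirement described in this abstract is commonly met with a two-phase commit protocol. The following is a minimal in-memory sketch of that idea, not the Encina API or the HPSS implementation; the `MetadataManager` class, the `run_transaction` function, and the `fail_on_prepare` flag are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class MetadataManager:
    """Hypothetical stand-in for one process's metadata store."""
    name: str
    committed: dict = field(default_factory=dict)
    pending: dict = field(default_factory=dict)
    fail_on_prepare: bool = False  # illustration hook: simulate a failed request

    def prepare(self, changes: dict) -> bool:
        """Phase 1: stage the changes and vote on whether they can commit."""
        if self.fail_on_prepare:
            return False
        self.pending = dict(changes)
        return True

    def commit(self) -> None:
        """Phase 2 on success: make the staged changes visible."""
        self.committed.update(self.pending)
        self.pending = {}

    def rollback(self) -> None:
        """Phase 2 on failure: discard the staged changes."""
        self.pending = {}


def run_transaction(work: List[Tuple[MetadataManager, dict]]) -> bool:
    """Apply every participant's changes, or none of them."""
    prepared = []
    for manager, changes in work:
        if manager.prepare(changes):
            prepared.append(manager)
        else:
            # One participant voted no: undo everything staged so far.
            for participant in prepared:
                participant.rollback()
            return False
    # Every participant voted yes: commit all changes together.
    for manager in prepared:
        manager.commit()
    return True
```

A real coordinator would also log its decisions durably and handle participants that crash between the two phases; the sketch omits both to keep the commit-together and roll-back-together paths visible.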
The End of Slow Networks: It's Time for a Redesign
Next generation high-performance RDMA-capable networks will require a
fundamental rethinking of the design and architecture of modern distributed
DBMSs. These systems are commonly designed and optimized under the assumption
that the network is the bottleneck: the network is slow and "thin", and thus
needs to be avoided as much as possible. Yet this assumption no longer holds
true. With InfiniBand FDR 4x, the bandwidth available to transfer data across
the network is in the same ballpark as the bandwidth of one memory channel, and it
increases even further with the most recent EDR standard. Moreover, with the
increasing advances of RDMA, the latency improves similarly fast. In this
paper, we first argue that the "old" distributed database design is not capable
of taking full advantage of the network. Second, we propose architectural
redesigns for OLTP, OLAP and advanced analytical frameworks to take better
advantage of the improved bandwidth, latency and RDMA capabilities. Finally,
for each of the workload categories, we show that remarkable performance
improvements can be achieved.
The End of a Myth: Distributed Transactions Can Scale
The common wisdom is that distributed transactions do not scale. But what if
distributed transactions could be made scalable using the next generation of
networks and a redesign of distributed databases? Developers would no longer
need to worry about co-partitioning schemes to achieve decent
performance. Application development would become easier as data placement
would no longer determine how scalable an application is. Hardware provisioning
would be simplified as the system administrator can expect a linear scale-out
when adding more machines rather than some complex sub-linear function, which
is highly application specific.
In this paper, we present the design of our novel scalable database system
NAM-DB and show that distributed transactions with the very common Snapshot
Isolation guarantee can indeed scale using the next generation of RDMA-enabled
network technology without any inherent bottlenecks. Our experiments with the
TPC-C benchmark show that our system scales linearly to over 6.5 million
new-order (14.5 million total) distributed transactions per second on 56
machines.
Comment: 12 pages
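For readers unfamiliar with the Snapshot Isolation guarantee named in this abstract, the following single-process sketch shows its two defining rules: each transaction reads from the snapshot taken when it began, and a commit fails if another transaction has already committed a write to the same key since that snapshot. It is an assumption-laden illustration and does not reflect NAM-DB's RDMA-based design; the `SnapshotStore` and `Transaction` classes are hypothetical.

```python
class SnapshotStore:
    """Hypothetical multi-version store with a logical commit clock."""

    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), oldest first
        self.clock = 0      # timestamp of the most recent commit

    def begin(self):
        # A transaction's snapshot is the state as of the current clock.
        return Transaction(self, self.clock)


class Transaction:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.writes = {}  # buffered until commit

    def read(self, key):
        # Own writes first, then the newest version visible in the snapshot.
        if key in self.writes:
            return self.writes[key]
        for commit_ts, value in reversed(self.store.versions.get(key, [])):
            if commit_ts <= self.start_ts:
                return value
        return None

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # First-committer-wins: abort on any write-write conflict.
        for key in self.writes:
            history = self.store.versions.get(key, [])
            if history and history[-1][0] > self.start_ts:
                return False
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append(
                (self.store.clock, value)
            )
        return True
```

In this sketch, two transactions that start from the same snapshot and both update the same key cannot both commit: the second one to commit is rejected and would have to retry.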
Some Considerations about Modern Database Machines
Optimizing the two resources of any computing system, time and space, has always been a priority objective for any database. A current and effective solution in this respect is the computer database. Optimizing computer applications by means of database machines has been a steady preoccupation of researchers since the late seventies. Several information technologies have revolutionized the present information framework. Among these, the ones that have contributed most to database optimization are: efficient handling of large volumes of data (Data Warehouse, Data Mining, OLAP (On Line Analytical Processing)); improved DBMS (Database Management System) facilities through the integration of new technologies; and the dramatic increase in computing power and its efficient use (computer networks, massively parallel computing, Grid Computing, and so on). These and other information technologies have revived research on database machines and, in the last few years, have produced some very good practical results in optimizing computing resources.
Keywords: Database Optimization, Database Machines, Data Warehouse, OLAP (On Line Analytical Processing), OLTP (On Line Transaction Processing), Parallel Processing