1,253 research outputs found

    Letter from the Special Issue Editor

    Get PDF
    Editorial work for DEBULL on a special issue on data management on Storage Class Memory (SCM) technologies

    On the use of NAND flash memory in high-performance relational databases

    Get PDF
    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.Includes bibliographical references (p. 47-49).High-density NAND flash storage has become relatively inexpensive due to the popularity of various consumer electronics. Recently, several manufacturers have released IDE-compatible NAND flash-based drives in sizes up to 64 GB at reasonable (sub-$1000) prices. Because flash is significantly more durable than mechanical hard drives and requires considerably less energy, there is some speculation that large data centers will adopt these devices. As database workloads make up a substantial fraction of the processing done by data centers, it is interesting to ask how switching to flash-based storage will affect the performance of database systems. We evaluate this question using IDE-based flash drives from two major manufacturers. We measure their read and write performance and find that flash has excellent random read performance, acceptable sequential read performance, and quite poor write performance compared to conventional IDE disks. We then consider how standard database algorithms are affected by these performance characteristics and find that the fast random read capability dramatically improves the performance of secondary indexes and index-based join algorithms. We next investigate using logstructured filesystems to mitigate the poor write performance of flash and find an 8.2x improvement in random write performance, but at the cost of a 3.7x decrease in random read performance. Finally, we study techniques for exploiting the inherent parallelism of multiple-chip flash devices, and we find that adaptive coding strategies can yield a 2x performance improvement over static ones. We conclude that in many cases flash disk performance is still worse than on traditional drives and that current flash technology may not yet be mature enough for widespread database adoption if performance is a dominant factor. Finally, we briefly speculate how this landscape may change based on expected performance of next-generation flash memories.by Daniel Myers.S.M

    BlueDBM: An Appliance for Big Data Analytics

    Get PDF
    Complex data queries, because of their need for random accesses, have proven to be slow unless all the data can be accommodated in DRAM. There are many domains, such as genomics, geological data and daily twitter feeds where the datasets of interest are 5TB to 20 TB. For such a dataset, one would need a cluster with 100 servers, each with 128GB to 256GBs of DRAM, to accommodate all the data in DRAM. On the other hand, such datasets could be stored easily in the flash memory of a rack-sized cluster. Flash storage has much better random access performance than hard disks, which makes it desirable for analytics workloads. In this paper we present BlueDBM, a new system architecture which has flash-based storage with in-store processing capability and a low-latency high-throughput inter-controller network. We show that BlueDBM outperforms a flash-based system without these features by a factor of 10 for some important applications. While the performance of a ram-cloud system falls sharply even if only 5%~10% of the references are to the secondary storage, this sharp performance degradation is not an issue in BlueDBM. BlueDBM presents an attractive point in the cost-performance trade-off for Big Data analytics.Quanta Computer (Firm)Samsung (Firm)Lincoln Laboratory (PO7000261350)Intel Corporatio
    • …
    corecore