20,403 research outputs found

    Flash memory management system and method utilizing multiple block list windows

    Get PDF
    The present invention provides a flash memory management system and method with increased performance. The flash memory management system provides the ability to efficiently manage and allocate flash memory use in a way that improves reliability and longevity, while maintaining good performance levels. The flash memory management system includes a free block mechanism, a disk maintenance mechanism, and a bad block detection mechanism. The free block mechanism provides efficient sorting of free blocks to facilitate selecting low use blocks for writing. The disk maintenance mechanism provides for the ability to efficiently clean flash memory blocks during processor idle times. The bad block detection mechanism provides the ability to better detect when a block of flash memory is likely to go bad. The flash status mechanism stores information in fast access memory that describes the content and status of the data in the flash disk. The new bank detection mechanism provides the ability to automatically detect when new banks of flash memory are added to the system. Together, these mechanisms provide a flash memory management system that can improve the operational efficiency of systems that utilize flash memory

    Flashing up the storage hierarchy

    Get PDF
    The focus of this thesis is on systems that employ both flash and magnetic disks as storage media. Considering the widely disparate I/O costs of flash disks currently on the market, our approach is a cost-aware one: we explore techniques that exploit the I/O costs of the underlying storage devices to improve I/O performance. We also study the asymmetric I/O properties of magnetic and flash disks and propose algorithms that take advantage of this asymmetry. Our work is geared towards database systems; however, most of the ideas presented in this thesis can be generalised to any data-intensive application. For the case of low-end, inexpensive flash devices with large capacities, we propose using them at the same level of the memory hierarchy as magnetic disks. In such setups, we study the problem of data placement, that is, on which type of storage medium each data page should be stored. We present a family of online algorithms that can be used to dynamically decide the optimal placement of each page. Our algorithms adapt to changing workloads for maximum I/O efficiency. We found that substantial performance benefits can be gained with such a design, especially for queries touching large sets of pages with read-intensive workloads. Moving one level higher in the storage hierarchy, we study the problem of buffer allocation in databases that store data across multiple storage devices. We present our novel approach to per-device memory allocation, under which both the I/O costs of the storage devices and the cache behaviour of the data stored on each medium determine the size of the main memory buffers that will be allocated to each device. Towards informed decisions, we found that the ability to predict the cache behaviour of devices under various cache sizes is of paramount importance. In light of this, we study the problem of efficiently tracking the hit ratio curve for each device and introduce a lowoverhead technique that provides high accuracy. The price and performance characteristics of high-end flash disks make them perfectly suitable for use as caches between the main memory and the magnetic disk(s) of a storage system. In this context, we primarily focus on the problem of deciding which data should be placed in the flash cache of a system: how the data flows from one level of the memory hierarchy to the others is crucial for the performance of such a system. Considering such decisions, we found that the I/O costs of the flash cache play a major role. We also study several implementation issues such as the optimal size of flash pages and the properties of the page directory of a flash cache. Finally, we explore sorting in external memory using external merge-sort, as the latter employs access patterns that can take full advantage of the I/O characteristics of flash memory. We study the problem of sorting hierarchical data, as such is necessary for a wide variety of applications including archiving scientific data and dealing with large XML datasets. The proposed algorithm efficiently exploits the hierarchical structure in order to minimize the number of disk accesses and optimise the utilization of available memory. Our proposals are not specific to sorting over flash memory: the presented techniques are highly efficient over magnetic disks as well

    Write-limited sorts and joins for persistent memory

    Get PDF
    To mitigate the impact of the widening gap between the memory needs of CPUs and what standard memory technology can deliver, system architects have introduced a new class of memory technology termed persistent memory. Persistent memory is byteaddressable, but exhibits asymmetric I/O: writes are typically one order of magnitude more expensive than reads. Byte addressability combined with I/O asymmetry render the performance profile of persistent memory unique. Thus, it becomes imperative to find new ways to seamlessly incorporate it into database systems. We do so in the context of query processing. We focus on the fundamental operations of sort and join processing. We introduce the notion of write-limited algorithms that effectively minimize the I/O cost. We give a high-level API that enables the system to dynamically optimize the workflow of the algorithms; or, alternatively, allows the developer to tune the write profile of the algorithms. We present four different techniques to incorporate persistent memory into the database processing stack in light of this API. We have implemented and extensively evaluated all our proposals. Our results show that the algorithms deliver on their promise of I/O-minimality and tunable performance. We showcase the merits and deficiencies of each implementation technique, thus taking a solid first step towards incorporating persistent memory into query processing. 1

    Sam2bam: High-Performance Framework for NGS Data Preprocessing Tools

    Full text link
    This paper introduces a high-throughput software tool framework called {\it sam2bam} that enables users to significantly speedup pre-processing for next-generation sequencing data. The sam2bam is especially efficient on single-node multi-core large-memory systems. It can reduce the runtime of data pre-processing in marking duplicate reads on a single node system by 156-186x compared with de facto standard tools. The sam2bam consists of parallel software components that can fully utilize the multiple processors, available memory, high-bandwidth of storage, and hardware compression accelerators if available. The sam2bam provides file format conversion between well-known genome file formats, from SAM to BAM, as a basic feature. Additional features such as analyzing, filtering, and converting the input data are provided by {\it plug-in} tools, e.g., duplicate marking, which can be attached to sam2bam at runtime. We demonstrated that sam2bam could significantly reduce the runtime of NGS data pre-processing from about two hours to about one minute for a whole-exome data set on a 16-core single-node system using up to 130 GB of memory. The sam2bam could reduce the runtime for whole-genome sequencing data from about 20 hours to about nine minutes on the same system using up to 711 GB of memory

    Wear Minimization for Cuckoo Hashing: How Not to Throw a Lot of Eggs into One Basket

    Full text link
    We study wear-leveling techniques for cuckoo hashing, showing that it is possible to achieve a memory wear bound of loglogn+O(1)\log\log n+O(1) after the insertion of nn items into a table of size CnCn for a suitable constant CC using cuckoo hashing. Moreover, we study our cuckoo hashing method empirically, showing that it significantly improves on the memory wear performance for classic cuckoo hashing and linear probing in practice.Comment: 13 pages, 1 table, 7 figures; to appear at the 13th Symposium on Experimental Algorithms (SEA 2014
    corecore