
    GPUs as Storage System Accelerators

    Massively multicore processors, such as Graphics Processing Units (GPUs), provide, at a comparable price, peak performance one order of magnitude higher than traditional CPUs. This drop in the cost of computation, like any order-of-magnitude drop in the cost per unit of performance for a class of system components, creates an opportunity to redesign systems and to explore new ways of engineering them that recalibrate the cost-to-performance relation. This project explores the feasibility of harnessing GPUs' computational power to improve the performance, reliability, or security of distributed storage systems. In this context, we present the design of a storage-system prototype that uses GPU offloading to accelerate a number of computationally intensive hashing-based primitives, and we introduce techniques to efficiently leverage the processing power of GPUs. We evaluate this prototype under two configurations: as a content-addressable storage system that enables online similarity detection between successive versions of the same file, and as a traditional system that uses hashing to preserve data integrity. Further, we evaluate the impact of GPU offloading on the performance of competing applications. Our results show that this technique can bring tangible performance gains without negatively impacting concurrently running applications.
    Comment: IEEE Transactions on Parallel and Distributed Systems, 201
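
    To make the hashing-based similarity detection concrete, the sketch below compares two versions of a file by the content hashes of their fixed-size chunks. This is an illustrative CPU-side sketch only: the chunk size, the use of SHA-1, and the function names are assumptions, and in the prototype described above it is precisely this kind of hash computation that would be offloaded to the GPU.

        import hashlib

        CHUNK_SIZE = 64 * 1024  # illustrative chunk size; the real system may use a different granularity

        def chunk_hashes(path, chunk_size=CHUNK_SIZE):
            """Return content hashes for the fixed-size chunks of a file."""
            hashes = []
            with open(path, "rb") as f:
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    hashes.append(hashlib.sha1(chunk).hexdigest())
            return hashes

        def similarity(old_path, new_path):
            """Fraction of chunks in the new version already present in the old version."""
            old = set(chunk_hashes(old_path))
            new = chunk_hashes(new_path)
            if not new:
                return 1.0
            return sum(1 for h in new if h in old) / len(new)

    In a content-addressable configuration, chunks whose hashes are already known need not be stored again, which is what makes online similarity detection between successive file versions cheap once the hashes are available.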

    JOR: A Journal-guided Reconstruction Optimization for RAID-Structured Storage Systems

    This paper proposes a simple and practical RAID reconstruction optimization scheme, called JOurnal-guided Reconstruction (JOR). JOR exploits the fact that significant portions of the data blocks in typical disk arrays are unused. It monitors storage-space utilization at the block level to guide the reconstruction process, so that only the failed data on used stripes is recovered to the spare disk. In JOR, data consistency is ensured by requiring that all blocks in a disk array be initialized to zero (written with the value zero) during synchronization, and that all blocks on the spare disk likewise be zeroed in the background. Because JOR is independent of and orthogonal to the underlying reconstruction algorithm, it can easily be incorporated into any existing reconstruction approach. Experimental results obtained from our JOR prototype implementation demonstrate that JOR reduces the reconstruction times of two state-of-the-art reconstruction schemes by an amount approximately proportional to the percentage of unused storage space, while ensuring data consistency.
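
    The core of JOR, rebuilding only the stripes that the block-level monitor has marked as used, can be sketched as follows. The function and data layout are hypothetical (in-memory lists standing in for block devices), assuming a RAID-5 style XOR reconstruction and a spare disk that has already been zero-filled in the background.

        def rebuild_to_spare(surviving, spare, used_stripes):
            """Recover only the used stripes of the failed disk onto the spare.

            surviving    -- surviving member disks, each a list of per-stripe byte blocks
            spare        -- spare disk blocks, assumed zero-initialized in the background
            used_stripes -- set of stripe indices marked as used by the block-level monitor
            """
            for s in range(len(spare)):
                if s not in used_stripes:
                    # Unused stripe: every block was written with zeros during
                    # synchronization, so the zeroed spare block is already consistent.
                    continue
                block = bytes(len(surviving[0][s]))
                for disk in surviving:
                    block = bytes(a ^ b for a, b in zip(block, disk[s]))
                spare[s] = block

    Because the skip-or-rebuild decision is taken per stripe, the same filter can wrap any underlying reconstruction algorithm, which is why JOR composes with existing reconstruction schemes.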

    Distributed object store principles of operation

    In this paper we look at the growth of distributed object stores (DOS) and examine the underlying mechanisms that guide their use and development. Our focus is on the fundamental principles of operation that define this class of system, how it has evolved, and where it is heading as new markets expand beyond the uses originally envisioned. We conclude by speculating about how object stores as a class must evolve to meet the more demanding requirements of future applications.

    Singleton: System-wide Page Deduplication in Virtual Environments

    We consider the problem of memory management in hypervisors and propose Singleton, a KVM-based system-wide page deduplication solution that increases memory usage efficiency. Specifically, we address the problem of double caching that occurs in KVM: the same disk blocks are cached in both the host (hypervisor) and the guest (VM) page caches. Singleton's main components are identical-page sharing across guest virtual machines and an implementation of an exclusive cache for the host and guest page-cache hierarchy. We use and improve KSM (Kernel Samepage Merging) to identify and share pages across guest virtual machines, and we utilize guest memory snapshots to scrub the host page cache and maintain a single copy of each page across the host and the guests. Singleton operates under a completely black-box assumption: we do not modify the guest or assume anything about its behaviour. We show that conventional operating-system cache-management techniques are suboptimal for virtual environments, and how Singleton supplements and improves the existing Linux kernel memory-management mechanisms. Singleton improves the utilization of the host cache by reducing its size (by up to an order of magnitude) and increasing the cache hit ratio (by a factor of two), which translates into better VM performance (40% faster I/O). Singleton's unified page deduplication and host cache scrubbing reclaims large amounts of memory and facilitates higher levels of memory overcommitment. The optimizations we have implemented for page deduplication keep its overhead below 20% CPU utilization.
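
    The identical-page sharing component can be illustrated with a simplified sketch: hash each guest page, then confirm candidate matches byte-for-byte before backing them with a single copy. The in-kernel KSM implementation organizes candidate pages differently, so this hash-indexed version and all names below are assumptions made for brevity.

        import hashlib
        from collections import defaultdict

        PAGE_SIZE = 4096  # typical x86 page size

        def share_identical_pages(pages):
            """Map each (vm_id, guest_pfn) to a canonical page buffer.

            pages: dict mapping (vm_id, guest_pfn) -> bytes of length PAGE_SIZE.
            Identical pages across VMs end up pointing at a single buffer.
            """
            by_digest = defaultdict(list)   # digest -> keys of candidate pages
            canonical = {}
            for key, data in pages.items():
                digest = hashlib.sha256(data).digest()
                for cand in by_digest[digest]:
                    # Confirm byte-for-byte before sharing, to rule out hash collisions.
                    if pages[cand] == data:
                        canonical[key] = canonical[cand]
                        break
                else:
                    by_digest[digest].append(key)
                    canonical[key] = data
            return canonical

    The host-cache scrubbing side of Singleton then drops host page-cache copies of blocks that are already cached inside a guest, which is what makes the two cache levels behave as an exclusive cache.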

    Fragmentation in storage systems with duplicate elimination

    Deduplication inevitably results in data fragmentation, because logically continuous data is scattered across many disk locations. Even though this significantly increases restore time from backup, the problem is still not well examined. In this work I close this gap by designing algorithms that reduce the negative impact of fragmentation on restore time for the two major types of fragmentation: internal and inter-version. Internal stream fragmentation is caused by blocks appearing many times within a single backup. This phenomenon happens surprisingly often and can reduce restore bandwidth by as much as a factor of three. With an algorithm that utilizes available forward knowledge to enable efficient caching, I improve restore bandwidth on average by 62%-88% with only about 5% extra memory used. Although these results are achieved with limited forward knowledge, they are very close to those measured with no such limitation. Inter-version fragmentation is caused by duplicates from previous backups of the same backup set. Since such duplicates are very common, owing to repeated full backups containing a lot of unchanged data, this type of fragmentation may double the restore time after even a few backups. The context-based rewriting algorithm minimizes this effect by selectively rewriting a small percentage of duplicates during backup, limiting the bandwidth drop from 21.3% to 2.48% on average with only a small increase in write time and temporary space overhead. The two algorithms combined form a very effective symbiosis, resulting in an average 142% restore-bandwidth increase with a standard 256MB of per-stream cache memory. In many cases such a setup achieves results close to the theoretical maximum achievable with unlimited cache size. Moreover, all the above experiments were performed assuming a single spindle, even though the majority of today's systems use many spindles. In a sample setup with ten spindles, the restore-bandwidth results are on average 5 times higher than in the standard LRU case.
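
    The forward-knowledge cache can be illustrated by an idealized simulation that assumes the entire restore sequence of block fingerprints is known in advance (the thesis works with a bounded forward-knowledge window, so this is a simplification), evicting the cached block whose next use lies farthest in the future; all names below are illustrative.

        def restore_with_forward_knowledge(sequence, fetch, capacity):
            """Simulate a restore whose caching is guided by future accesses.

            sequence -- ordered list of block fingerprints the restore will read
            fetch    -- callable returning a block's data given its fingerprint
            capacity -- maximum number of blocks held in the cache
            Returns the number of reads that had to go to disk.
            """
            # next_use[i]: position of the next occurrence of sequence[i], or infinity.
            next_use = [0] * len(sequence)
            last_seen = {}
            for i in range(len(sequence) - 1, -1, -1):
                next_use[i] = last_seen.get(sequence[i], float("inf"))
                last_seen[sequence[i]] = i

            cache = {}        # fingerprint -> (data, position of its next use)
            disk_reads = 0
            for i, fp in enumerate(sequence):
                if fp in cache:
                    data, _ = cache[fp]
                else:
                    disk_reads += 1
                    data = fetch(fp)
                if next_use[i] == float("inf"):
                    cache.pop(fp, None)          # never needed again: do not keep it
                else:
                    if fp not in cache and len(cache) >= capacity:
                        # Evict the block whose next use is farthest in the future.
                        victim = max(cache, key=lambda f: cache[f][1])
                        del cache[victim]
                    cache[fp] = (data, next_use[i])
            return disk_reads

    Roughly speaking, the context-based rewriting algorithm complements this at backup time: it rewrites the small fraction of duplicates whose on-disk neighbourhood diverges most from their position in the backup stream, so that later restores remain largely sequential.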