
    CRAID: Online RAID upgrades using dynamic hot data reorganization

    Current algorithms used to upgrade RAID arrays typically require large amounts of data to be migrated, even those that move only the minimum amount of data required to keep a balanced data load. This paper presents CRAID, a self-optimizing RAID array that performs an online block reorganization of frequently used, long-term accessed data in order to reduce this migration even further. To achieve this objective, CRAID tracks frequently used, long-term data blocks and copies them to a dedicated partition spread across all the disks in the array. When new disks are added, CRAID only needs to extend this process to the new devices to redistribute this partition, thus greatly reducing the overhead of the upgrade process. In addition, the reorganized access patterns within this partition improve the array's performance, amortizing the copy overhead and allowing CRAID to offer performance competitive with traditional RAIDs. We describe CRAID's motivation and design, and we evaluate it by replaying seven real-world workloads including a file server, a web server, and a user share. Our experiments show that CRAID can successfully detect hot-data variations and begin using new disks as soon as they are added to the array. The use of a dedicated partition also improves the sequentiality of access to relevant data, which amortizes the cost of reorganizations. Finally, we show that a full-HDD CRAID array with a small distributed partition (<1.28% per disk) can compete in performance with an ideally restriped RAID-5 and a hybrid RAID-5 with a small SSD cache.
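
    The mechanism the abstract describes can be illustrated with a small sketch. The toy Python model below is our own illustration, not the paper's implementation: the class, the decayed-counter heuristic, and the round-robin striping are all assumptions. Blocks accumulate a long-term access score, the hottest blocks are copied into a small dedicated partition striped over all disks, and adding disks restripes only that partition.

    ```python
    from collections import defaultdict

    class CraidSketch:
        """Toy model of CRAID-style hot-data reorganization. Illustrative
        only: names, the decayed counter, and round-robin striping are
        assumptions, not the paper's implementation."""

        def __init__(self, num_disks, hot_capacity, decay=0.95):
            self.num_disks = num_disks
            self.hot_capacity = hot_capacity  # blocks the partition can hold
            self.decay = decay                # dampens short-lived bursts
            self.score = defaultdict(float)   # block -> long-term access score
            self.partition = {}               # hot block -> disk with its copy

        def access(self, block):
            # Recency-weighted counter: repeated long-term use raises the
            # score, while decay keeps one burst from dominating forever.
            self.score[block] = self.score[block] * self.decay + 1.0
            self._maybe_promote(block)

        def _maybe_promote(self, block):
            if block in self.partition:
                return
            if len(self.partition) >= self.hot_capacity:
                coldest = min(self.partition, key=self.score.__getitem__)
                if self.score[block] <= self.score[coldest]:
                    return
                del self.partition[coldest]   # evict the coldest copy
            # Copy the hot block into the partition, striped round-robin.
            self.partition[block] = len(self.partition) % self.num_disks

        def add_disks(self, n):
            # Upgrade path: only the small dedicated partition is
            # redistributed over the enlarged array; the bulk of the
            # array's data never moves.
            self.num_disks += n
            for i, block in enumerate(list(self.partition)):
                self.partition[block] = i % self.num_disks
    ```

    Restriping only the hot partition is what keeps the upgrade cheap in this model: migration cost is bounded by the partition size rather than by the size of the array.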

    An Infrastructure for the Dynamic Distribution of Web Applications and Services

    This paper presents the design and implementation of an infrastructure that enables any Web application, regardless of its current state, to be stopped and uninstalled from a particular server, transferred to a new server, then installed, loaded, and resumed, with all of these events occurring "on the fly" and totally transparently to clients. Such functionality allows entire applications to move fluidly from server to server, reducing the overhead required to administer the system and increasing its performance in a number of ways: (1) dynamically replicating new instances of an application on several servers to raise throughput for scalability purposes, (2) moving applications between servers to achieve load balancing or other resource-management goals, and (3) caching entire applications on servers located closer to clients.
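
    In outline, such a migration is a stop/capture/transfer/resume cycle plus a registry update that reroutes clients. The sketch below is a rough Python rendering of that cycle under assumed interfaces; the server hooks, the registry object, and pickle-based state capture are our illustration, not the paper's API.

    ```python
    import pickle

    def migrate_app(app, source, target, registry):
        """Hypothetical migration cycle. The .stop(), .capture_state(),
        .uninstall(), .install(), .restore_state(), and .resume() hooks
        on the server objects are assumptions made for this sketch."""
        source.stop(app)                      # quiesce: drain in-flight requests
        state = pickle.dumps(source.capture_state(app))  # sessions, app state
        source.uninstall(app)                 # free the old server
        target.install(app)                   # deploy the code on the new server
        target.restore_state(app, pickle.loads(state))
        target.resume(app)                    # continue from the captured state
        registry.update(app, target)          # reroute clients transparently
    ```

    Run in "copy" rather than "move" mode (skipping the uninstall step), the same cycle would cover the dynamic-replication case; pointing the registry at a list of replicas would give load balancing.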

    Cooperative Interval Caching in Clustered Multimedia Servers

    In this project, we design a cooperative interval caching (CIC) algorithm for clustered video servers and evaluate its performance through simulation. The CIC algorithm describes how the distributed caches in the cluster cooperate to serve a given request. With CIC, a clustered server can accommodate nearly twice as many (95% more) cached streams as a clustered server without cache cooperation. CIC locates available cache space for a given request in two major steps: finding the server that holds the information about the given request's preceding request, and, if that server turns out not to have enough cache space, finding another server that may have space available. The performance study shows that it is better to direct requests for the same movie to the same server, so that a request can always find its preceding request's information there; CIC uses a scoreboard mechanism to achieve this. The performance results also show that when the current server fails to find cache space for a given request, selecting the next candidate server at random works well. The combination of the scoreboard (to find the preceding request's information) and random selection (to find the next available server) outperforms other combinations of approaches by 86%. With CIC, the cooperative distributed caches can support as many cached streams as one integrated cache does; in some cases, they accommodate more cached streams than one integrated cache would. The CIC algorithm has every server in the cluster perform identical tasks, eliminating any single point of failure and thereby increasing the availability of the server cluster. The CIC algorithm also specifies how to smoothly add a server to, or remove one from, the cluster, providing scalability.
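
    The two lookup steps, the scoreboard and the random fallback, can be sketched as follows. This is our illustration under assumed interfaces: the server objects and the hashing-based scoreboard are stand-ins, not the project's code.

    ```python
    import random

    class CicRouterSketch:
        """Toy sketch of CIC's two lookup steps (scoreboard + random
        fallback); the server interface is an assumption made here."""

        def __init__(self, servers):
            # Each server is assumed to expose .has_cache_space().
            self.servers = servers

        def scoreboard_server(self, movie_id):
            # Scoreboard: requests for the same movie always map to the
            # same server, so a request can find its preceding request's
            # interval information there.
            return self.servers[hash(movie_id) % len(self.servers)]

        def place_request(self, movie_id):
            server = self.scoreboard_server(movie_id)
            if server.has_cache_space():
                return server
            # Fallback: try the other servers in random order; the study
            # found random selection works well for locating spare space.
            candidates = [s for s in self.servers if s is not server]
            random.shuffle(candidates)
            for s in candidates:
                if s.has_cache_space():
                    return s
            return server  # no cache space anywhere: serve from disk
    ```

    Hashing the movie ID gives the "same movie, same server" property the study recommends, and the randomized walk over the remaining servers mirrors its finding that random selection is an effective way to locate spare cache space.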

    Privacy-Enhanced Dependable and Searchable Storage in a Cloud-of-Clouds

    There is a constant increase in the demand for cloud services, particularly cloud-based storage services. These services are currently used by different applications as outsourced storage, with some interesting advantages. Most personal and mobile applications also offer the user the choice of storing their data in the cloud to overcome local storage limitations, transparently and sometimes without the user's full awareness of the privacy conditions involved. Companies might also find it cheaper to outsource databases and key-value stores instead of relying on storage solutions in private data centers. This raises concerns about data-privacy guarantees and the danger of data leakage: a cloud system administrator can easily access unprotected data, and could also forge, modify, or delete data, violating privacy, integrity, availability, and authenticity conditions. A possible solution would be to encrypt all data and attach authenticity and integrity proofs before it is sent to the cloud, and to decrypt and verify it on download. However, this approach is suitable only for backups or when big data is not involved, and is not very practical when large amounts of cloud-stored data must be searched, accessed, and retrieved online in a dynamic way. Such solutions also impose high latency and a large amount of inbound/outbound cloud traffic, increasing operational costs. Moreover, in the case of mobile or embedded devices, power, computation, and communication constraints cannot be ignored, since indexing, encrypting/decrypting, and signing/verifying all data is computationally expensive. To overcome these drawbacks, in this dissertation we propose a trustable, privacy-enhanced storage architecture based on a multi-cloud approach, providing privacy-enhanced, dependable, and searchable storage. Our solution provides the necessary support for dependable cloud storage and multimodal online searching operations over always-encrypted data in a cloud-of-clouds. We implemented a system prototype and conducted an experimental evaluation of the proposed solution, involving conventional storage clouds as well as a high-speed in-memory cloud-storage backend. Our results show that the proposal offers the required dependability properties and privacy guarantees, and provides efficient information-retrieval capabilities without sacrificing precision and recall in the supported indexing and search operations.
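
    The overall pattern, client-side encryption, replication across several clouds, and keyword-token search, can be sketched briefly. The toy below is our illustration only: it keeps the search index client-side for simplicity (the dissertation supports searching over always-encrypted data at the cloud side), and the dict-like cloud stores, Fernet encryption, and HMAC keyword tokens are all assumptions.

    ```python
    import hashlib
    import hmac

    from cryptography.fernet import Fernet  # pip install cryptography

    class CloudOfCloudsSketch:
        """Minimal sketch: data is encrypted before upload, replicated
        across several clouds, and searched via keyword tokens. A toy
        illustration, not the dissertation's actual system."""

        def __init__(self, clouds, key=None, index_key=b"toy-index-key"):
            self.clouds = clouds                 # list of dict-like stores
            self.fernet = Fernet(key or Fernet.generate_key())
            self.index_key = index_key           # toy default; keep secret
            self.index = {}                      # keyword token -> doc ids

        def _token(self, keyword):
            # Deterministic keyword token (a toy searchable-encryption
            # stand-in): equality search without revealing the keyword.
            return hmac.new(self.index_key, keyword.encode(),
                            hashlib.sha256).hexdigest()

        def put(self, doc_id, text):
            blob = self.fernet.encrypt(text.encode())
            for cloud in self.clouds:            # no single cloud is trusted
                cloud[doc_id] = blob
            for word in set(text.split()):
                self.index.setdefault(self._token(word), set()).add(doc_id)

        def search(self, keyword):
            # Match on tokens; ciphertexts stay encrypted in every cloud.
            return sorted(self.index.get(self._token(keyword), set()))

        def get(self, doc_id):
            for cloud in self.clouds:            # first reachable replica wins
                if doc_id in cloud:
                    return self.fernet.decrypt(cloud[doc_id]).decode()
            raise KeyError(doc_id)
    ```

    Replicating the ciphertext to every cloud provides the dependability property (any single cloud can fail or misbehave without data loss), while the keyword tokens allow search to proceed without any cloud ever seeing plaintext.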

    Cooperative Cashing? An Economic Analysis of Document Duplication in Cooperative Web Caching

    Cooperative caching is a popular mechanism that allows an array of distributed caches to cooperate and serve each other's Web requests. Controlling the duplication of documents across cooperating caches is a challenging problem faced by cache managers. In this paper, we study the economics of document duplication in strategic and nonstrategic settings. We have three primary findings. First, we find that the optimal level of duplication at a cache is nondecreasing in intercache latency, cache size, and the extent of request locality. Second, in situations in which cache peering spans organizations, we find that the interaction between caches is a game of strategic substitutes, wherein a cache devotes fewer resources to eliminating duplicate documents when other caches devote more resources to eliminating duplicates at that cache. Thus, a significant challenge will be to simultaneously induce multiple caches to contribute more resources toward reducing duplicate documents in the system. Finally, centralized decision making, which as expected improves average latency over a decentralized setup, can entail highly asymmetric duplication levels at the caches; this in turn can benefit one set of users at the expense of another, and thus will be challenging to implement.

    A Guide to Distributed Digital Preservation

    This volume is devoted to the broad topic of distributed digital preservation, a still-emerging field of practice for the cultural memory arena. Replication and distribution hold out the promise of indefinite preservation of materials without degradation, but establishing effective organizational and technical processes to enable this form of digital preservation is daunting. Institutions need practical examples of how this task can be accomplished in manageable, low-cost ways.