8 research outputs found

    FixityBerry: Environmentally Sustainable Digital Preservation for Very Low Resourced Cultural Heritage Institutions

    Whereas large cultural heritage institutions have made significant headway in providing digital preservation for archival assets—for example, by setting up geographically redundant digital repositories—medium and small institutions have struggled to meet minimum digital preservation standards. This project explores one option for enhancing digital preservation capacity in very low-resourced environments. FixityBerry connects consumer-grade USB hard disks to the $35 Raspberry Pi computer, which checks file fixity weekly and powers down when checking is complete. This poster reports on an eight-month pilot of using FixityBerry to monitor the digital assets of several small cultural heritage institutions.
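The core of a fixity monitor like the one described above is a periodic checksum sweep. A minimal sketch in Python, assuming SHA-256 checksums and a JSON manifest mapping relative paths to expected digests (`sha256_of`, `check_fixity`, and the manifest format are illustrative, not FixityBerry's actual code):

```python
import hashlib
import json
import os

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large files never sit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def check_fixity(root, manifest_path):
    """Compare current checksums against a stored manifest; return mismatches."""
    with open(manifest_path) as f:
        manifest = json.load(f)
    failures = []
    for rel_path, expected in manifest.items():
        full = os.path.join(root, rel_path)
        if not os.path.exists(full):
            failures.append((rel_path, "missing"))
        elif sha256_of(full) != expected:
            failures.append((rel_path, "checksum mismatch"))
    return failures
```

On a low-power device, a weekly cron entry could run such a check and then shut the machine down, matching the power-down behavior described in the abstract.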

    A basic framework and overview of a network-based RAID-like distributed back-up system : NetRAID

    NetRAID is a framework for a simple, open, and free system that lets end users create a geographically distributed, secure, redundant back-up for important data. NetRAID is designed to be lightweight, cross-platform, low cost, extendable, and simple. As more important data is digitized, it is critical that even average home computer users be able to keep their data secure. Even for people who burn their data to DVD weekly, if the backups and their sources are kept in the same physical location, the value of the backup is greatly diminished. NetRAID can offer a more comprehensive end-user backup. NetRAID version 1 has some limitations in the types and speeds of networks it can run on; however, it provides a building block for future extension to almost any sort of TCP/IP network. NetRAID also has the potential to use a wide variety of encryption and data verification schemes to ensure that data is secure in transmission and storage. The NetRAID virtual file system, sockets, and program core are written in Visual Basic.NET 2003 and should be portable to a wide variety of operating systems and languages in the future.
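The "RAID-like" redundancy such a system generalizes rests on parity: XOR several data chunks together, store the parity on a separate node, and any single lost chunk can be rebuilt from the survivors. A sketch of that single-parity idea, assuming equal-length chunks (the function names are hypothetical; NetRAID itself is written in Visual Basic.NET 2003):

```python
def make_parity(chunks):
    """XOR equal-length data chunks into one parity chunk (RAID-4/5 style)."""
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            parity[i] ^= b
    return bytes(parity)

def recover_chunk(surviving_chunks, parity):
    """Rebuild one lost chunk: XOR of the parity with all surviving chunks."""
    return make_parity(list(surviving_chunks) + [parity])
```

Distributing the chunks and parity across machines in different physical locations is what turns this arithmetic into the geographically distributed backup the abstract describes.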

    Power and Performance Management of Virtualized Computing Environments Via Lookahead Control


    Reliability and Security of RAID Storage Systems and D2D Archives Using SATA Disk Drives

    Information storage reliability and security are addressed for enterprise-class nearline and archival storage systems built from personal computer disk drives. The low cost of these serial ATA (SATA) PC drives is a tradeoff against drive reliability design and demonstration test levels, which are higher in the more expensive SCSI and Fibre Channel drives. This article discusses that tradeoff: SATA has the advantage that fewer, higher-capacity drives are needed for a given system storage capacity, which further reduces cost; in exchange, additional storage system redundancy and drive failure prediction are used to maintain system data integrity with less reliable drives. RAID stripe failure probability is calculated using typical ATA and SCSI drive failure rates, for single- and double-parity data reconstruction failure and for failure due to unrecoverable drive block errors. The reliability improvement from drive failure prediction is also calculated, and can be significant. Today's SATA drive specifications for unrecoverable block errors appear to allow stripe reconstruction failure, and additional in-drive parity blocks are suggested as a solution. The possibility of using low-cost disk data storage for backup and archiving, replacing higher-cost magnetic tape, is discussed. This requires significantly better RAID stripe failure probability, and suitable drive technology alternatives are discussed. The failure rate of nonoperating drives i
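The stripe-failure calculation the article describes can be approximated with a back-of-the-envelope model: after one drive in a single-parity stripe fails, data is lost if a second drive fails during the rebuild window or if an unrecoverable read error occurs while reading the survivors in full. A sketch assuming a RAID-5 stripe, an annualized failure rate (AFR), and an unrecoverable bit error rate (UBER) per bit read (the formula and parameter values are illustrative, not the article's exact model):

```python
def raid5_rebuild_failure(n_drives, capacity_bytes, afr, rebuild_hours, uber):
    """Rough probability that a RAID-5 rebuild after one drive failure
    loses data: a second drive failure mid-rebuild, or at least one
    unrecoverable bit error while reading the n-1 surviving drives."""
    hours_per_year = 8766.0
    # Chance that at least one of the n-1 survivors fails during the rebuild.
    p_second = 1 - (1 - afr * rebuild_hours / hours_per_year) ** (n_drives - 1)
    # Chance of at least one unrecoverable bit error over all bits read.
    bits_read = (n_drives - 1) * capacity_bytes * 8
    p_ure = 1 - (1 - uber) ** bits_read
    return 1 - (1 - p_second) * (1 - p_ure)
```

With the UBER an order of magnitude worse on desktop-class drives (commonly specified around 10^-14 versus 10^-15 for enterprise drives), the unrecoverable-error term dominates on large stripes, which is consistent with the article's concern that SATA specifications appear to allow stripe reconstruction failure.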

    Combined power and performance management of virtualized computing environments using limited lookahead control

    There is growing incentive to reduce the power consumed by large-scale data centers that host online services such as banking, retail commerce, and gaming. Virtualization is a promising approach to consolidating multiple online services onto a smaller number of computing resources. A virtualized server environment allows computing resources to be shared among multiple performance-isolated platforms called virtual machines. By dynamically provisioning virtual machines, consolidating the workload, and turning servers on and off as needed, data center operators can maintain desired service-level agreements with end users while achieving higher server utilization and energy efficiency. This thesis develops an online resource provisioning framework for combined power and performance management in a virtualized computing environment serving session-based workloads. We pose this management problem as one of sequential optimization under uncertainty and solve it using limited lookahead control (LLC), a form of model-predictive control. The approach accounts for the switching costs incurred while provisioning physical and virtual machines, and explicitly encodes the risk of provisioning resources in an uncertain and dynamic operating environment. We experimentally validate the control framework on a multi-tier e-commerce architecture hosting multiple online services. When managed using LLC, the cluster saves, on average, 41% in power-consumption costs over a twenty-four-hour period compared to a system operating without dynamic control. The overhead of the controller is low compared to the control interval, on the order of a few seconds. We also use trace-based simulations to analyze LLC performance on server clusters larger than our testbed, and show how concepts from approximation theory can be used to further reduce the computational burden of controlling large systems. Ph.D., Computer Engineering -- Drexel University, 200
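A limited lookahead controller of this kind can be sketched as receding-horizon enumeration: at each control interval, score candidate server-count trajectories over a short horizon against a workload forecast, then apply only the first action of the cheapest trajectory. A toy Python version assuming a linear power cost, a per-request SLA penalty, and a switching cost (the cost model and all parameters are illustrative, not the thesis's actual formulation):

```python
from itertools import product

def llc_step(current_servers, forecast, max_servers,
             power_cost=1.0, sla_penalty=10.0, switch_cost=2.0,
             capacity_per_server=100, horizon=3):
    """One receding-horizon step: enumerate server-count trajectories over
    the horizon, score each by power + SLA-violation + switching cost
    against the forecast, and return the first action of the cheapest."""
    horizon = min(horizon, len(forecast))
    best_cost, best_first = float("inf"), current_servers
    for traj in product(range(1, max_servers + 1), repeat=horizon):
        cost, prev = 0.0, current_servers
        for k, servers in enumerate(traj):
            cost += servers * power_cost                      # energy
            cost += switch_cost * abs(servers - prev)         # provisioning churn
            shortfall = max(0, forecast[k] - servers * capacity_per_server)
            cost += sla_penalty * shortfall / capacity_per_server
            prev = servers
        if cost < best_cost:
            best_cost, best_first = cost, traj[0]
    return best_first
```

Exhaustive enumeration is exponential in the horizon, which is why the thesis keeps the lookahead limited and investigates approximation techniques for larger clusters.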

    Dependence-driven techniques in system design

    Burstiness in workloads is often found in multi-tier architectures, storage systems, and communication networks. This feature is extremely important in system design because it can significantly degrade system performance and availability. This dissertation focuses on how to use knowledge of burstiness to develop new techniques and tools for performance prediction, scheduling, and resource allocation under bursty workload conditions.

    For multi-tier enterprise systems, burstiness in the service times is catastrophic for performance. Via detailed experimentation, we identify the cause of performance degradation as the persistent bottleneck switch among the various servers. This results in unstable behavior that cannot be captured by existing capacity planning models. Beyond identifying the cause and effects of bottleneck switch in multi-tier systems, this dissertation also proposes modifications to the classic TPC-W benchmark to emulate bursty arrivals in multi-tier systems.

    This dissertation also demonstrates how burstiness can be used to improve system performance. Two dependence-driven scheduling policies, SWAP and ALoC, are developed. These general scheduling policies counteract burstiness in workloads and maintain high availability by delaying selected requests that contribute to burstiness. Extensive experiments show that both SWAP and ALoC achieve good estimates of service times based on knowledge of burstiness in the service process. As a result, SWAP successfully approximates shortest-job-first (SJF) scheduling without requiring a priori information of job service times. ALoC adaptively controls system load by infinitely delaying only a small fraction of the incoming requests.

    The knowledge of burstiness can also be used to forecast the length of idle intervals in storage systems. In practice, background activities are scheduled during system idle times. The scheduling of background jobs is crucial in terms of the performance degradation of foreground jobs and the utilization of idle times. In this dissertation, new background scheduling schemes are designed to determine when and for how long idle times can be used for serving background jobs without violating predefined performance targets of foreground jobs. Extensive trace-driven simulation results illustrate that the proposed schemes are effective and robust in a wide range of system conditions. Furthermore, if there is burstiness within idle times, then maintenance features such as disk scrubbing and intra-disk data redundancy can be successfully scheduled as background activities during idle times.
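The idle-time scheduling idea can be sketched as a simple admission test: predict the next idle interval from recent history and admit background jobs only while they fit in the predicted budget, holding back some headroom for early foreground arrivals. A minimal sketch (the mean-based prediction rule, the `wait_fraction` headroom, and the names are illustrative, not the dissertation's actual schemes):

```python
def schedule_background(idle_history, pending_jobs, wait_fraction=0.2):
    """Pick background jobs that fit in the next idle interval.
    Predict the interval as the mean of recent idle times, reserve a
    fraction as headroom, then greedily admit shortest jobs first."""
    if not idle_history:
        return []
    predicted = sum(idle_history) / len(idle_history)
    budget = predicted * (1 - wait_fraction)  # headroom for foreground work
    scheduled = []
    for job, duration in sorted(pending_jobs, key=lambda j: j[1]):
        if duration <= budget:
            scheduled.append(job)
            budget -= duration
    return scheduled
```

A burstiness-aware version would replace the mean with a predictor that conditions on the recent pattern of idle times, which is what lets long, clustered idle periods host heavier maintenance work such as scrubbing.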