Interactive media server with media synchronized raid storage system
We propose an efficient placement algorithm and a per-disk prefetching method to support interactive operations in a media server. Our placement policy is combined with an encoder that uses a special bitcount control scheme, which repeatedly tunes quantization parameters to adjust the bitcounts of video frames. This encoder can generate coded frames whose sizes are synchronized with the RAID stripe size, so that when various fast-forward levels are accessed, seek and rotational latency are reduced and the throughput of each disk in the RAID system improves. In the experimental results, the proposed placement policy and bitrate control scheme significantly improve the average service time, which increases the capacity of the interactive media server.
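The idea of tuning quantization until a coded frame fits a stripe unit can be illustrated with a minimal sketch. It is not the paper's bitcount control scheme: the `encode` callback, the quantization range 1 to 51, and the use of binary search are all assumptions made for illustration, and the sketch only assumes that coded size does not grow as quantization gets coarser.

```python
def fit_frame_to_stripe(encode, stripe_size, qp_min=1, qp_max=51):
    """Find the finest quantization parameter whose coded frame fits one stripe unit.

    `encode(qp)` is a hypothetical callback returning the coded frame size in
    bytes; it is assumed to be non-increasing as qp grows (coarser quantization).
    Returns (qp, coded_size).
    """
    if encode(qp_max) > stripe_size:
        raise ValueError("frame does not fit even at the coarsest quantization")
    lo, hi = qp_min, qp_max
    # Binary search for the smallest qp whose coded size is <= stripe_size.
    while lo < hi:
        mid = (lo + hi) // 2
        if encode(mid) <= stripe_size:
            hi = mid          # fits: try finer quantization
        else:
            lo = mid + 1      # too large: quantize more coarsely
    return lo, encode(lo)


# Toy size model (an assumption): coded size shrinks inversely with qp.
toy_encode = lambda qp: 200_000 // qp
qp, size = fit_frame_to_stripe(toy_encode, stripe_size=64 * 1024)
print(qp, size)
```

A real encoder would re-encode the frame at each step, so the number of trial encodings matters; the binary search here bounds it by the logarithm of the quantization range.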
CERN openlab Whitepaper on Future IT Challenges in Scientific Research
This whitepaper describes the major IT challenges in scientific research at CERN and several other European and international research laboratories and projects. Each challenge is exemplified through a set of concrete use cases drawn from the requirements of large-scale scientific programs. The paper is based on contributions from many researchers and IT experts of the participating laboratories, as well as input from the existing CERN openlab industrial sponsors. The views expressed in this document are those of the individual contributors and do not necessarily reflect the view of their organisations and/or affiliates.
Internet Predictions
More than a dozen leading experts give their opinions on where the Internet is headed and where it will be in the next decade in terms of technology, policy, and applications. They cover topics ranging from the Internet of Things to climate change to the digital storage of the future. A summary of the articles is available in the Web extras section.
A server-less architecture for building scalable, reliable, and cost-effective video-on-demand systems.
Leung Wai Tak. Thesis (M.Phil.)--Chinese University of Hong Kong, 2002. Includes bibliographical references (leaves 58-60). Abstracts in English and Chinese.
Contents:
Acknowledgement
Abstract
Abstract (in Chinese)
Chapter 1. Introduction
Chapter 2. Related Works
  2.1 Previous Works
  2.2 Contributions of this Study
Chapter 3. Architecture
  3.1 Data Placement Policy
  3.2 Retrieval and Transmission Scheduling
  3.3 Fault Tolerance
Chapter 4. Performance Modeling
  4.1 Storage Requirement
  4.2 Network Bandwidth Requirement
  4.3 Buffer Requirement
  4.4 System Response Time
Chapter 5. System Reliability
  5.1 System Failure Model
  5.2 Minimum System Repair Capability
  5.3 Redundancy Configuration
Chapter 6. System Dimensioning
  6.1 Storage Capacity
  6.2 Network Capacity
  6.3 Disk Access Bandwidth
  6.4 Buffer Requirement
  6.5 System Response Time
Chapter 7. Multiple Parity Groups
  7.1 System Failure Model
  7.2 Buffer Requirement
  7.3 System Response Time
  7.4 Redundancy Configuration
  7.5 Scalability
Chapter 8. Conclusions and Future Works
Appendix
  A. Derivation of the Artificial Admission Delay
  B. Derivation of the Receiver Buffer Requirement
Bibliography
Making Data Storage Efficient in the Era of Cloud Computing
Over the last decade we have entered the era of cloud computing, and many paradigm shifts are happening in how people write and deploy applications. Despite the advancement of cloud computing, data storage abstractions have not evolved much, causing inefficiencies in performance, cost, and security.
This dissertation proposes a novel approach to make data storage efficient in the era of cloud computing by building new storage abstractions and systems that bridge the gap between cloud computing and data storage and simplify development. We build four systems to address four data inefficiencies in cloud computing.
The first system, Grandet, solves the data storage inefficiency caused by the paradigm shift from upfront provisioning to a variety of pay-as-you-go cloud services. Grandet is an extensible storage system that significantly reduces storage costs for web applications deployed in the cloud. Under the hood, it supports multiple heterogeneous stores and unifies them by placing each data object at the store deemed most economical. Our results show that Grandet reduces the costs of these applications by an average of 42.4%, and that it is fast, scalable, and easy to use.
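Placing each object at the store deemed most economical can be illustrated with a minimal sketch. This is not Grandet's actual cost model or API: the `Store` type, the two-term price formula, and the prices below are hypothetical and chosen only to show how the cheapest store can change with an object's size and access pattern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Store:
    """A hypothetical cloud store with a simple two-part price."""
    name: str
    storage_per_gb_month: float  # price per GB stored per month (illustrative)
    cost_per_request: float      # price per read/write request (illustrative)


def monthly_cost(store: Store, size_gb: float, requests_per_month: int) -> float:
    """Estimated monthly cost of keeping one object in `store`."""
    return (store.storage_per_gb_month * size_gb
            + store.cost_per_request * requests_per_month)


def cheapest_store(stores, size_gb, requests_per_month) -> Store:
    """Pick the store with the lowest estimated monthly cost for this object."""
    return min(stores, key=lambda s: monthly_cost(s, size_gb, requests_per_month))


stores = [
    Store("blob", storage_per_gb_month=0.02, cost_per_request=0.000005),
    Store("block", storage_per_gb_month=0.10, cost_per_request=0.0),
]

# A large, rarely accessed object versus a tiny, frequently accessed one.
print(cheapest_store(stores, size_gb=1.0, requests_per_month=1_000).name)
print(cheapest_store(stores, size_gb=0.001, requests_per_month=1_000_000).name)
```

Under these made-up prices the large cold object lands in the per-request store and the small hot object in the flat-rate store, which is the kind of per-object decision the abstract describes.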
The second system, Unic, solves the data inefficiency caused by the paradigm shift from single-tenancy to multi-tenancy. Unic securely deduplicates general computations. It exports a cache service that allows cloud applications running on behalf of mutually distrusting users to memoize and reuse computation results, thereby improving performance. Unic achieves both integrity and secrecy through a novel use of code attestation, and it provides a simple yet expressive API that enables applications to deduplicate their own rich computations. Our results show that Unic is easy to use, speeds up applications by an average of 7.58x, and incurs little storage overhead.
The third system, Lambdata, solves the data inefficiency caused by the paradigm shift to serverless computing, where developers only write core business logic, and cloud service providers maintain all the infrastructure. Lambdata is a novel serverless computing system that enables developers to declare a cloud function's data intents, including both data read and data written. Once data intents are made explicit, Lambdata performs a variety of optimizations to improve speed, including caching data locally and scheduling functions based on code and data locality. Our results show that Lambdata achieves an average speedup of 1.51x on the turnaround time of practical workloads and reduces monetary cost by 16.5%.
The fourth system, CleanOS, solves the data inefficiency caused by the paradigm shift from desktop computers to smartphones always connected to the cloud. CleanOS is a new Android-based operating system that manages sensitive data rigorously and maintains a clean environment at all times. It identifies and tracks sensitive data, encrypts it with a key, and evicts that key to the cloud when the data is not in active use on the device. Our results show that CleanOS limits sensitive-data exposure drastically while incurring acceptable overheads on mobile networks.
What broke where for distributed and parallel applications — a whodunit story
Detection, diagnosis, and mitigation of performance problems in today's large-scale distributed and parallel systems are difficult tasks. These large distributed and parallel systems are composed of various complex software and hardware components. When the system experiences a performance or correctness problem, developers struggle to understand the root cause of the problem and fix it in a timely manner. In my thesis, I address these three components of performance problems in computer systems. First, we focus on diagnosing performance problems in large-scale parallel applications running on supercomputers. We developed techniques to localize the performance problem for root-cause analysis. Parallel applications, most of which are complex scientific simulations running on supercomputers, can create up to millions of parallel tasks that run on different machines and communicate using the message-passing paradigm. We developed a highly scalable and accurate automated debugging tool called PRODOMETER, which uses sophisticated algorithms in three steps. It first creates a logical progress dependency graph of the tasks to highlight how the problem spread through the system and manifested as a system-wide performance issue. It then uses this logical progress dependency graph to identify the task where the problem originated. Finally, PRODOMETER pinpoints the code region corresponding to the origin of the bug. Second, we developed a tool-chain that detects performance anomalies using machine-learning techniques and achieves a very low false-positive rate. Our input-aware performance anomaly detection system consists of a scalable data collection framework that collects performance-related metrics from code regions of different granularity, an offline model creation and prediction-error characterization technique, and a threshold-based anomaly detection engine for production runs.
Our system requires few training runs and can handle unknown inputs and parameter combinations by dynamically calibrating the anomaly detection threshold according to the characteristics of the input data and of the models' prediction error. Third, we developed a performance problem mitigation scheme for erasure-coded distributed storage systems. Repairing failed blocks in an erasure-coded distributed storage system takes a long time in network-constrained data centers. The reason is that, during the repair operation, a large amount of data from multiple nodes is gathered into a single node, where a mathematical operation is performed to reconstruct the missing part. This process severely congests the links toward the destination where the newly recreated data is to be hosted. We proposed a novel distributed repair technique, called Partial-Parallel-Repair (PPR), that performs this reconstruction in parallel on multiple nodes and eliminates network bottlenecks, and as a result greatly speeds up the repair process. Fourth, we study how, for a class of applications, performance can be improved (or performance problems can be mitigated) by selectively approximating some of the computations. For many applications, the main computation happens inside a loop that can be logically divided into a few temporal segments, which we call phases. We found that while approximating the initial phases might severely degrade the quality of the results, approximating the computation in the later phases has a very small impact on the final quality of the result. Based on this observation, we developed an optimization framework that, for a given quality-loss budget, finds the best approximation settings for each phase of the execution.
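The contrast between gathering all surviving blocks at one node and combining them in parallel, as in the Partial-Parallel-Repair description above, can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes a single-parity XOR code in place of a general erasure code, simulates the nodes in one process, and uses a simple pairwise (binary tree) combination order chosen for illustration.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def centralized_repair(survivors):
    """Baseline: one node receives every surviving block and XORs them all."""
    return reduce(xor_blocks, survivors)


def tree_repair(survivors):
    """Pairwise partial combination, simulating parallel repair rounds.

    In each round, neighbouring blocks are XORed in pairs, so no single node
    receives more than one block per round. Returns (block, rounds).
    """
    level = list(survivors)
    rounds = 0
    while len(level) > 1:
        nxt = [xor_blocks(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # an unpaired block moves up unchanged
            nxt.append(level[-1])
        level = nxt
        rounds += 1
    return level[0], rounds


# Four data blocks plus one XOR parity block; data block 2 is lost.
data = [bytes([v] * 4) for v in (1, 2, 4, 8)]
parity = centralized_repair(data)
survivors = [data[0], data[1], data[3], parity]

rebuilt, rounds = tree_repair(survivors)
print(rebuilt == data[2], rounds)
```

Both functions produce the same block because XOR is associative and commutative. The difference is in the traffic pattern: the centralized version funnels all four surviving blocks into one node, while the tree version finishes in two rounds with each receiving node taking one block per round.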
Sixth Goddard Conference on Mass Storage Systems and Technologies Held in Cooperation with the Fifteenth IEEE Symposium on Mass Storage Systems
This document contains copies of those technical papers received in time for publication prior to the Sixth Goddard Conference on Mass Storage Systems and Technologies, which is being held in cooperation with the Fifteenth IEEE Symposium on Mass Storage Systems at the University of Maryland-University College Inn and Conference Center, March 23-26, 1998. As one of an ongoing series, this Conference continues to provide a forum for discussion of issues relevant to the management of large volumes of data. The Conference encourages all interested organizations to discuss long-term mass storage requirements and experiences in fielding solutions. Emphasis is on current and future practical solutions addressing issues in data management, storage systems and media, data acquisition, long-term retention of data, and data distribution. This year's discussion topics include architecture, tape optimization, new technology, performance, standards, site reports, and vendor solutions. Tutorials will be available on shared file systems, file system backups, data mining, and the dynamics of obsolescence.
Rescuing the legacy project: a case study in digital preservation and technical obsolescence
The ability to maintain continuous access to digital documents and artifacts is one
of the most significant problems facing the archival, manuscript repository, and record
management communities in the twenty-first century. This problem with access is
particularly troublesome in the case of complex digital installments, which resist simple
migration and emulation strategies. The Legacy Project, which was produced by the
William Breman Jewish Heritage Museum in Atlanta, was created in the early 2000s as a
means of telling the stories of Holocaust survivors who settled in metropolitan Atlanta.
Legacy was an interactive multimedia kiosk that enabled museum visitors to read
accounts, watch digital video, and examine photographs about these survivors. However,
several years after Legacy was completed, it became inoperable, due to technological
obsolescence. By using Legacy as a case study, I examine how institutions can preserve
access to complex digital artifacts and how they can rescue digital information that is in
danger of being lost.
M.S. Committee Chair: Knoespel, Kenneth; Committee Member: Burnett, Rebecca; Committee Member: Fox Harrell; Committee Member: TyAnna Herringto
Design and analysis of stream scheduling algorithms in distributed reservation-based multimedia systems
Ph.D. DOCTOR OF PHILOSOPH