157 research outputs found
Priori information and sliding window based prediction algorithm for energy-efficient storage systems in cloud
One of the major challenges in cloud computing and data centers is the energy conservation and emission reduction. Accurate prediction algorithms are essential for building energy efficient storage systems in cloud computing. In this paper, we first propose a Three-State Disk Model (3SDM), which can describe the service quality and energy consumption states of a storage system accurately. Based on this model, we develop a method for achieving energy conservation without losing quality by skewing the workload among the disks to transmit the disk states of a storage system. The efficiency of this method is
highly dependent on the accuracy of the information predicting the blocks to be accessed and the blocks not be accessed in the near future. We develop a priori information and sliding window based prediction (PISWP) algorithm by taking advantage of the priori information about human behavior and selecting suitable size of sliding window. The PISWP method targets at streaming media applications, but we also check its efficiency on other two applications, news in webpage and new tool released. Disksim, an established storage system simulator, is applied in our experiments to verify the effect of our method for various users’ traces. The results show that this prediction method can bring a high degree energy saving for storage systems in cloud computing environment
Cloud computing: survey on energy efficiency
International audienceCloud computing is today’s most emphasized Information and Communications Technology (ICT) paradigm that is directly or indirectly used by almost every online user. However, such great significance comes with the support of a great infrastructure that includes large data centers comprising thousands of server units and other supporting equipment. Their share in power consumption generates between 1.1% and 1.5% of the total electricity use worldwide and is projected to rise even more. Such alarming numbers demand rethinking the energy efficiency of such infrastructures. However, before making any changes to infrastructure, an analysis of the current status is required. In this article, we perform a comprehensive analysis of an infrastructure supporting the cloud computing paradigm with regards to energy efficiency. First, we define a systematic approach for analyzing the energy efficiency of most important data center domains, including server and network equipment, as well as cloud management systems and appliances consisting of a software utilized by end users. Second, we utilize this approach for analyzing available scientific and industrial literature on state-of-the-art practices in data centers and their equipment. Finally, we extract existing challenges and highlight future research directions
The Design of a System Architecture for Mobile Multimedia Computers
This chapter discusses the system architecture of a portable computer, called Mobile Digital Companion, which provides support for handling multimedia applications energy efficiently. Because battery life is limited and battery weight is an important factor for the size and the weight of the Mobile Digital Companion, energy management plays a crucial role in the architecture. As the Companion must remain usable in a variety of environments, it has to be flexible and adaptable to various operating conditions. The Mobile Digital Companion has an unconventional architecture that saves energy by using system decomposition at different levels of the architecture and exploits locality of reference with dedicated, optimised modules. The approach is based on dedicated functionality and the extensive use of energy reduction techniques at all levels of system design. The system has an architecture with a general-purpose processor accompanied by a set of heterogeneous autonomous programmable modules, each providing an energy efficient implementation of dedicated tasks. A reconfigurable internal communication network switch exploits locality of reference and eliminates wasteful data copies
Recommended from our members
Making Data Storage Efficient in the Era of Cloud Computing
We enter the era of cloud computing in the last decade, as many paradigm shifts are happening on how people write and deploy applications. Despite the advancement of cloud computing, data storage abstractions have not evolved much, causing inefficiencies in performance, cost, and security.
This dissertation proposes a novel approach to make data storage efficient in the era of cloud computing by building new storage abstractions and systems that bridge the gap between cloud computing and data storage and simplify development. We build four systems to address four data inefficiencies in cloud computing.
The first system, Grandet, solves the data storage inefficiency caused by the paradigm shift from upfront provisioning to a variety of pay-as-you-go cloud services. Grandet is an extensible storage system that significantly reduces storage costs for web applications deployed in the cloud. Under the hood, it supports multiple heterogeneous stores and unifies them by placing each data object at the store deemed most economical. Our results show that Grandet reduces their costs by an average of 42.4%, and it is fast, scalable, and easy to use.
The second system, Unic, solves the data inefficiency caused by the paradigm shift from single-tenancy to multi-tenancy. Unic securely deduplicates general computations. It exports a cache service that allows cloud applications running on behalf of mutually distrusting users to memoize and reuse computation results, thereby improving performance. Unic achieves both integrity and secrecy through a novel use of code attestation, and it provides a simple yet expressive API that enables applications to deduplicate their own rich computations. Our results show that Unic is easy to use, speeds up applications by an average of 7.58x, and with little storage overhead.
The third system, Lambdata, solves the data inefficiency caused by the paradigm shift to serverless computing, where developers only write core business logic, and cloud service providers maintain all the infrastructure. Lambdata is a novel serverless computing system that enables developers to declare a cloud function's data intents, including both data read and data written. Once data intents are made explicit, Lambdata performs a variety of optimizations to improve speed, including caching data locally and scheduling functions based on code and data locality. Our results show that Lambdata achieves an average speedup of 1.51x on the turnaround time of practical workloads and reduces monetary cost by 16.5%.
The fourth system, CleanOS, solves the data inefficiency caused by the paradigm shift from desktop computers to smartphones always connected to the cloud. CleanOS is a new Android-based operating system that manages sensitive data rigorously and maintains a clean environment at all times. It identifies and tracks sensitive data, encrypts it with a key, and evicts that key to the cloud when the data is not in active use on the device. Our results show that CleanOS limits sensitive-data exposure drastically while incurring acceptable overheads on mobile networks
Doctor of Philosophy in Computing
dissertatio
Scalable, Data- intensive Network Computation
To enable groups of collaborating researchers at different locations to effectively share large datasets and investigate their spontaneous hypotheses on the fly, we are interested in de- veloping a distributed system that can be easily leveraged by a variety of data intensive applications. The system is composed of (i) a number of best effort logistical depots to en- able large-scale data sharing and in-network data processing, (ii) a set of end-to-end tools to effectively aggregate, manage and schedule a large number of network computations with attendant data movements, and (iii) a Distributed Hash Table (DHT) on top of the generic depot services for scalable data management.
The logistical depot is extended by following the end-to-end principles and is modeled with a closed queuing network model. Its performance characteristics are studied by solving the steady state distributions of the model using local balance equations. The modeling results confirm that the wide area network is the performance bottleneck and running concurrent jobs can increase resource utilization and system throughput.
As a novel contribution, techniques to effectively support resource demanding data- intensive applications using the ÂŻne-grained depot services are developed. These techniques include instruction level scheduling of operations, dynamic co-scheduling of computation and replication, and adaptive workload control. Experiments in volume visualization have proved the effectiveness of these techniques. Due to the unique characteristic of data- intensive applications and our co-scheduling algorithm, a DHT is implemented on top of the basic storage and computation services. It demonstrates the potential of the Logistical Networking infrastructure to serve as a service creation platform
A shared-disk parallel cluster file system
Dissertação apresentada para obtenção do Grau de Doutor em Informática Pela Universidade Nova de Lisboa, Faculdade de Ciências e TecnologiaToday, clusters are the de facto cost effective platform both for high performance
computing (HPC) as well as IT environments. HPC and IT are quite different environments
and differences include, among others, their choices on file systems and storage: HPC favours parallel file systems geared towards maximum I/O bandwidth, but which are not fully POSIX-compliant and were devised to run on top of (fault prone) partitioned storage; conversely, IT data centres favour both external disk arrays (to provide highly available storage) and POSIX compliant file systems, (either general purpose or shared-disk cluster file systems, CFSs).
These specialised file systems do perform very well in their target environments provided that applications do not require some lateral features, e.g., no file locking on parallel file systems, and no high performance writes over cluster-wide shared files on CFSs. In brief, we can say
that none of the above approaches solves the problem of providing high levels of reliability and performance to both worlds.
Our pCFS proposal makes a contribution to change this situation: the rationale is to take advantage on the best of both – the reliability of cluster file systems and the high performance of parallel file systems. We don’t claim to provide the absolute best of each, but we aim at full POSIX compliance, a rich feature set, and levels of reliability and performance good enough
for broad usage – e.g., traditional as well as HPC applications, support of clustered DBMS engines that may run over regular files, and video streaming. pCFS’ main ideas include:
· Cooperative caching, a technique that has been used in file systems for distributed disks but, as far as we know, was never used either in SAN based cluster file systems or in parallel file systems. As a result, pCFS may use all infrastructures (LAN and SAN) to move data.
· Fine-grain locking, whereby processes running across distinct nodes may define nonoverlapping byte-range regions in a file (instead of the whole file) and access them in parallel, reading and writing over those regions at the infrastructure’s full speed (provided that no major metadata changes are required).
A prototype was built on top of GFS (a Red Hat shared disk CFS): GFS’ kernel code was
slightly modified, and two kernel modules and a user-level daemon were added. In the
prototype, fine grain locking is fully implemented and a cluster-wide coherent cache is maintained through data (page fragments) movement over the LAN.
Our benchmarks for non-overlapping writers over a single file shared among processes
running on different nodes show that pCFS’ bandwidth is 2 times greater than NFS’ while
being comparable to that of the Parallel Virtual File System (PVFS), both requiring about 10 times more CPU. And pCFS’ bandwidth also surpasses GFS’ (600 times for small record sizes, e.g., 4 KB, decreasing down to 2 times for large record sizes, e.g., 4 MB), at about the same CPU usage.Lusitania, Companhia de Seguros S.A, Programa
IBM Shared University Research (SUR
In-Datacenter Performance Analysis of a Tensor Processing Unit
Many architects believe that major improvements in cost-energy-performance
must now come from domain-specific hardware. This paper evaluates a custom
ASIC---called a Tensor Processing Unit (TPU)---deployed in datacenters since
2015 that accelerates the inference phase of neural networks (NN). The heart of
the TPU is a 65,536 8-bit MAC matrix multiply unit that offers a peak
throughput of 92 TeraOps/second (TOPS) and a large (28 MiB) software-managed
on-chip memory. The TPU's deterministic execution model is a better match to
the 99th-percentile response-time requirement of our NN applications than are
the time-varying optimizations of CPUs and GPUs (caches, out-of-order
execution, multithreading, multiprocessing, prefetching, ...) that help average
throughput more than guaranteed latency. The lack of such features helps
explain why, despite having myriad MACs and a big memory, the TPU is relatively
small and low power. We compare the TPU to a server-class Intel Haswell CPU and
an Nvidia K80 GPU, which are contemporaries deployed in the same datacenters.
Our workload, written in the high-level TensorFlow framework, uses production
NN applications (MLPs, CNNs, and LSTMs) that represent 95% of our datacenters'
NN inference demand. Despite low utilization for some applications, the TPU is
on average about 15X - 30X faster than its contemporary GPU or CPU, with
TOPS/Watt about 30X - 80X higher. Moreover, using the GPU's GDDR5 memory in the
TPU would triple achieved TOPS and raise TOPS/Watt to nearly 70X the GPU and
200X the CPU.Comment: 17 pages, 11 figures, 8 tables. To appear at the 44th International
Symposium on Computer Architecture (ISCA), Toronto, Canada, June 24-28, 201
- …