
    Lifecycle-Aware Online Video Caching

    The current explosion of video traffic compels service providers to deploy caches at edge networks. Nowadays, most caching systems store data with a high programming voltage corresponding to the largest possible 'expiry date', typically on the order of years, which maximizes cache damage. However, popular videos rarely exhibit lifecycles longer than a couple of months, so the programming voltage can instead be adapted to fit the lifecycle and mitigate cache damage accordingly. In this paper, we propose LiA-cache, a Lifecycle-Aware caching policy for online videos. LiA-cache finds both near-optimal caching retention times and cache eviction policies by jointly optimizing traffic delivery cost and cache damage cost. We first investigate temporal patterns of video access in a real-world dataset covering 10 million online videos, collected by one of the largest mobile network operators in the world. We then cluster the videos by their access lifecycles and integrate the clustering into a general model of the caching system: LiA-cache analyzes videos and caches them according to their cluster label. Compared to other popular policies in real-world scenarios, LiA-cache reduces cache damage by up to 90% while keeping a cache hit ratio close to that of a policy relying purely on video popularity. Peer reviewed.
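
    Below is a minimal Python sketch of the lifecycle-aware idea: instead of always programming cached data for a worst-case multi-year retention, pick the shortest retention mode that covers a video's cluster lifecycle, and admit a video only when the expected delivery savings outweigh the expected cache damage. The cluster labels, lifecycle estimates, retention modes, and cost weights are hypothetical illustrations, not values from the paper.

    from dataclasses import dataclass

    # Hypothetical lifecycle estimate (days) per access-pattern cluster.
    CLUSTER_LIFECYCLE_DAYS = {"burst": 7, "steady": 60, "evergreen": 365}

    # Hypothetical retention modes a cache device could be programmed with:
    # shorter retention -> lower programming voltage -> less cell damage.
    RETENTION_MODES_DAYS = [7, 30, 90, 365]

    def pick_retention_mode(cluster: str) -> int:
        """Choose the shortest retention mode covering the cluster's lifecycle."""
        lifecycle = CLUSTER_LIFECYCLE_DAYS[cluster]
        return min(m for m in RETENTION_MODES_DAYS if m >= lifecycle)

    @dataclass
    class Video:
        vid: str
        cluster: str
        popularity: float  # recent request rate (requests per day)

    def admit(video: Video, damage_cost_per_day: float, miss_cost: float) -> bool:
        """Cache a video only if expected delivery savings outweigh flash damage."""
        retention_days = pick_retention_mode(video.cluster)
        expected_savings = video.popularity * miss_cost
        expected_damage = retention_days * damage_cost_per_day
        return expected_savings > expected_damage

    print(admit(Video("v1", "burst", popularity=120.0),
                damage_cost_per_day=0.5, miss_cost=1.0))  # -> True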

    Cross-Layer Optimization Techniques for Improving the Lifetime of NAND Flash-Based Storage Devices

    Ph.D. dissertation, Department of Electrical and Computer Engineering, Graduate School of Seoul National University, February 2016. Advisor: Jihong Kim.

    Replacing HDDs with NAND flash-based storage devices (SSDs) has been one of the major challenges in modern computing systems, especially with regard to better performance and higher mobility. Although uninterrupted semiconductor process scaling and multi-leveling techniques have lowered the price of SSDs to a level comparable to HDDs, the decreasing lifetime of NAND flash memory, a side effect of recent advanced device technologies, is emerging as one of the major barriers to the wide adoption of SSDs in high-performance computing systems. In this dissertation, we propose new cross-layer optimization techniques to extend the lifetime (in particular, the endurance) of NAND flash memory. Our techniques are motivated by the key observation that erasing a NAND block with a lower voltage or at a slower speed can significantly improve NAND endurance. However, using a lower voltage in erase operations causes adverse side effects on other NAND characteristics such as write performance and retention capability. The main goal of the proposed techniques is to improve NAND endurance without affecting the other NAND requirements. We first present Dynamic Erase Voltage and Time Scaling (DeVTS), a unified framework that enables system software to exploit the tradeoff between the endurance and the erase voltages/times of NAND flash memory. DeVTS includes erase voltage/time scaling and write capability tuning, each of which has a different impact on the endurance, performance, and retention capabilities of NAND flash memory. Second, we propose a lifetime improvement technique that takes advantage of idle times between write requests when erasing a NAND block at a slower speed or when writing data to a NAND block erased with a lower voltage. We have implemented a DeVTS-enabled FTL, called dvsFTL, which automatically and optimally adjusts the erase voltage/time and write performance of NAND devices. Our experimental results show that dvsFTL can improve NAND endurance by 62%, on average, over a DeVTS-unaware FTL with a negligible decrease in overall write performance. Third, we present a comprehensive lifetime improvement technique that exploits variations in the retention requirements, as well as the performance requirement, of SSDs when writing data to a NAND block erased with a lower voltage. We have implemented dvsFTL+, an extended version of dvsFTL, which fully utilizes DeVTS by accurately predicting the write performance and retention requirements at run time. Our experimental results show that dvsFTL+ can further improve NAND endurance by more than 50% over dvsFTL while preserving all the NAND requirements. Lastly, we present a reliability management technique that prevents retention failures when aggressive retention-capability tuning is employed in real environments. Our measurement results show that the proposed technique can recover corrupted data from retention failures up to 23 times faster than existing data recovery techniques. Furthermore, it can successfully recover severely retention-failed data, such as data that experienced retention times 8 times longer than the retention-time specification, which was not recoverable with the existing technique. Based on evaluation studies of the developed lifetime improvement techniques, we verified that the cross-layer optimization approach has a significant impact on extending the lifetime of NAND flash-based storage devices.
    We expect that our proposed techniques can positively contribute not only to the wide adoption of NAND flash memory in datacenter environments but also to the gradual acceleration of using flash as main memory.

    Table of contents:
    Chapter 1 Introduction
        1.1 Motivation
        1.2 Dissertation Goals
        1.3 Contributions
        1.4 Dissertation Structure
    Chapter 2 Background
        2.1 Threshold Voltage Window of NAND Flash Memory
        2.2 NAND Program Operation
        2.3 Related Work
            2.3.1 System-Level SSD Lifetime Improvement Techniques
            2.3.2 Device-Level Endurance-Enhancing Technique
            2.3.3 Cross-Layer Optimization Techniques Exploiting NAND Tradeoffs
    Chapter 3 Dynamic Erase Voltage and Time Scaling
        3.1 Erase Voltage and Time Scaling
            3.1.1 Motivation
            3.1.2 Erase Voltage Scaling
            3.1.3 Erase Time Scaling
        3.2 Write Capability Tuning
            3.2.1 Write Performance Tuning
            3.2.2 Retention Capability Tuning
            3.2.3 Disturbance Resistance Tuning
        3.3 NAND Endurance Model
    Chapter 4 Lifetime Improvement Technique Using Write-Performance Tuning
        4.1 Design and Implementation of dvsFTL
            4.1.1 Overview
            4.1.2 Write-Speed Mode Selection
            4.1.3 Erase Voltage Mode Selection
            4.1.4 Erase Speed Mode Selection
            4.1.5 DeVTS-wPT Aware FTL Modules
        4.2 Experimental Results
            4.2.1 Experimental Settings
            4.2.2 Workload Characteristics
            4.2.3 Endurance Gain Analysis
            4.2.4 Overall Write Throughput Analysis
            4.2.5 Detailed Analysis
    Chapter 5 Lifetime Improvement Technique Using Retention-Capability Tuning
        5.1 Design and Implementation of dvsFTL+
            5.1.1 Overview
            5.1.2 Retention Requirement Prediction
            5.1.3 Maximization of Endurance Benefit
            5.1.4 Minimization of Reclaim Overhead
        5.2 Experimental Results
            5.2.1 Experimental Settings
            5.2.2 Workload Characteristics
            5.2.3 Endurance Gain Analysis
            5.2.4 NAND Requirements Analysis
            5.2.5 Detailed Analysis of Retention-Time Predictor
            5.2.6 Detailed Analysis of Endurance Gain
    Chapter 6 Reliability Management Technique for NAND Flash Memory
        6.1 Overview
        6.2 Motivation
            6.2.1 Limitations of the Existing Retention-Error Management Policy
            6.2.2 Limitations of the Existing Retention-Failure Recovery Technique
        6.3 Retention Error Recovery Technique
            6.3.1 Charge Movement Model
            6.3.2 A Selective Error-Correction Procedure
            6.3.3 Implementation
        6.4 Experimental Results
    Chapter 7 Conclusions
        7.1 Summary and Conclusions
        7.2 Future Work
            7.2.1 Lifetime Improvement Technique Exploiting the Other NAND Tradeoffs
            7.2.2 Development of Extended Techniques for DRAM-Flash Hybrid Main Memory Systems
            7.2.3 Development of Specialized SSDs
    Bibliography
    Abstract (in Korean)
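
    Below is a minimal Python sketch of the DeVTS-style mode selection described in the dissertation abstract above: erase a block with the lowest voltage (and hence the least wear) whose side effects, slower writes and weaker retention, are still tolerable given the predicted idle time and retention requirement. The mode table, thresholds, and predictor interface are hypothetical illustrations, not the actual dvsFTL design.

    # Hypothetical erase modes: a lower erase voltage wears the block less but
    # yields blocks with slower writes and weaker retention capability.
    ERASE_MODES = [
        # (name, relative wear, write speed, max retention in days)
        ("low_voltage", 0.5, "slow", 30),
        ("mid_voltage", 0.8, "normal", 180),
        ("high_voltage", 1.0, "fast", 365),
    ]

    def select_erase_mode(predicted_idle_ms: float,
                          predicted_retention_days: float) -> str:
        """Pick the gentlest erase mode that still meets the predicted
        write-performance and retention requirements."""
        for name, _wear, speed, max_retention_days in ERASE_MODES:
            # Slow writes are acceptable only if enough idle time is expected
            # between write requests (5 ms is a hypothetical threshold).
            slow_write_ok = speed != "slow" or predicted_idle_ms > 5.0
            if predicted_retention_days <= max_retention_days and slow_write_ok:
                return name
        return "high_voltage"  # fall back to the conventional worst-case erase

    print(select_erase_mode(predicted_idle_ms=20.0, predicted_retention_days=14))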

    Approximate Computing Survey, Part II: Application-Specific & Architectural Approximation Techniques and Applications

    The challenging deployment of compute-intensive applications from domains such as Artificial Intelligence (AI) and Digital Signal Processing (DSP) forces the computing-systems community to explore new design approaches. Approximate Computing has emerged as a solution that allows designers to tune the quality of results in order to improve energy efficiency and/or performance. This radical paradigm shift has attracted interest from both academia and industry, resulting in significant research on approximation techniques and methodologies at different design layers (from system down to integrated circuits). Motivated by the wide appeal of Approximate Computing over the last 10 years, we conduct a two-part survey to cover key aspects (e.g., terminology and applications) and review the state-of-the-art approximation techniques from all layers of the traditional computing stack. In Part II of our survey, we classify and present the technical details of application-specific and architectural approximation techniques, both of which target the design of resource-efficient processors/accelerators and systems. Moreover, we present a detailed analysis of the application spectrum of Approximate Computing and discuss open challenges and future directions. (Under review at ACM Computing Surveys.)
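
    As a hedged illustration of the software-level techniques such a survey covers (this example is not drawn from the survey itself), the Python sketch below shows loop perforation: skipping a fraction of loop iterations to trade result quality for time and energy.

    def mean_exact(xs):
        return sum(xs) / len(xs)

    def mean_perforated(xs, skip=4):
        """Approximate the mean from every `skip`-th element only."""
        sampled = xs[::skip]
        return sum(sampled) / len(sampled)

    data = [float(i % 100) for i in range(1_000_000)]
    exact, approx = mean_exact(data), mean_perforated(data)
    print(f"exact={exact:.3f} approx={approx:.3f} "
          f"error={abs(approx - exact) / exact:.2%}")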

    High-Density Solid-State Memory Devices and Technologies

    This Special Issue examines high-density solid-state memory devices and technologies from various standpoints in an attempt to foster their continued success in the future. Considering that a broadening range of applications will likely offer different types of solid-state memories their chance in the spotlight, the Special Issue is not focused on a specific storage solution but rather embraces all of the most relevant solid-state memory devices and technologies currently on stage. The subjects dealt with are similarly wide-ranging, from process and design issues/innovations to experimental and theoretical analysis of operation, and from the performance and reliability of memory devices and arrays to the exploitation of solid-state memories in pursuit of new computing paradigms.

    Development of portable air quality sensor network based on IoT devices

    Air pollution has been one of the major agendas around the globe in recent years. With rising awareness among citizens, it is critically important to measure air pollution data so that interested parties can make informed decisions. Networks composed of IoT devices have been among the tools researchers rely on. Recent years have seen rapid progress in sensor technology, which in turn has grown the market for low-cost sensors, giving citizens the opportunity to measure various physical properties with affordable and portable hardware. Countless organizations have deployed wireless sensor networks (WSNs) built from IoT devices and budget-friendly sensing hardware. Statistical Analysis of Networks and Systems (SANS), a research group of the Computer Architecture Department at the Polytechnic University of Catalonia, has launched several measurement campaigns using WSNs composed of Captor devices, with researchers applying machine learning techniques to extract meaningful information from otherwise flawed data. A platform built on this stack of Captor devices and machine learning techniques has gone through several stages and is still being improved. This thesis discusses the latest iteration of the platform, introducing its hardware, software, and the machine learning methodologies used. By reviewing and comparing older iterations of Captor and similar platforms used by other researchers, the thesis aims to serve as a reference for the current and future development of air-quality-focused WSNs, where innovation is constantly needed to improve their capabilities. The result of the thesis is an autonomous IoT device, Captor4b, that is self-sufficient for at least a month and a half; its autonomy can be tuned further by adjusting the duty cycles of the Raspberry Pi and the Arduino Nano independently in software.
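
    Below is a minimal Python sketch of the kind of machine-learning correction described above: calibrating a low-cost sensor against a reference station with multiple linear regression, using temperature and humidity as covariates. The data is synthetic and the variable names are hypothetical; the thesis's actual pipeline may differ.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Synthetic co-located measurements: a raw low-cost sensor signal plus the
    # temperature and relative humidity that commonly confound such sensors.
    rng = np.random.default_rng(0)
    raw = rng.uniform(10, 80, 500)   # raw sensor reading (arbitrary units)
    temp = rng.uniform(5, 35, 500)   # degrees Celsius
    rh = rng.uniform(20, 90, 500)    # relative humidity, %
    reference = 0.9 * raw - 0.4 * temp + 0.1 * rh + rng.normal(0, 2, 500)

    # Fit a linear correction from (raw, temp, rh) to the reference value.
    X = np.column_stack([raw, temp, rh])
    model = LinearRegression().fit(X, reference)
    calibrated = model.predict(X)  # corrected estimates of the reference
    print("R^2 on training data:", model.score(X, reference))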

    HMC-Based Accelerator Design For Compressed Deep Neural Networks

    Deep Neural Networks (DNNs) offer remarkable classification and regression performance in many high-dimensional problems and have been widely deployed in real-world cognitive applications. The high computational cost of DNNs greatly hinders their deployment in resource-constrained applications, real-time systems, and edge computing platforms. Moreover, the energy and performance costs of moving data between the memory hierarchy and the computational units are higher than those of the computation itself. To overcome this memory bottleneck, accelerator designs improve data locality and temporal data reuse. In an attempt to further improve data locality, memory manufacturers have introduced 3D-stacked memory, in which multiple layers of memory arrays are stacked on top of each other. Inheriting the concept of Processing-In-Memory (PIM), some 3D-stacked memory architectures also include a logic layer that integrates general-purpose computational logic directly within main memory to take advantage of the high internal bandwidth during computation. In this dissertation, we investigate hardware/software co-design for neural network accelerators. Specifically, we introduce a two-phase filter pruning framework for model compression and an accelerator tailored for efficient DNN execution on the Hybrid Memory Cube (HMC), which can dynamically offload primitives and functions to the PIM logic layer through a latency-aware scheduling controller. In our compression framework, we formulate the filter pruning process as an optimization problem and propose a filter selection criterion measured by conditional entropy. The key idea is to establish a quantitative connection between filters and model accuracy, defined as the conditional entropy over the filters in a convolutional layer, i.e., the distribution of entropy conditioned on network loss. Based on this definition, we compare the pruning efficiency of global and layer-wise pruning strategies and propose a two-phase pruning method. The proposed method removes 88% of the filters and reduces inference time by 46% on VGG16 with less than 2% accuracy degradation.
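
    Below is a hedged Python sketch of entropy-guided filter ranking in the spirit of the pruning framework described above. The score uses a plain entropy estimate over per-filter activation summaries as a simplified stand-in for the dissertation's conditional-entropy criterion, and the pruning ratio is an arbitrary illustration.

    import numpy as np

    def filter_entropy_scores(activations: np.ndarray, bins: int = 32) -> np.ndarray:
        """activations: (samples, filters) array of per-filter activation
        summaries. Returns an entropy estimate per filter; low entropy suggests
        a filter's response carries little information, making it a pruning
        candidate."""
        scores = np.empty(activations.shape[1])
        for f in range(activations.shape[1]):
            hist, _ = np.histogram(activations[:, f], bins=bins)
            p = hist / hist.sum()
            p = p[p > 0]
            scores[f] = -(p * np.log2(p)).sum()
        return scores

    def select_filters_to_prune(scores: np.ndarray, ratio: float = 0.5) -> np.ndarray:
        """Return indices of the lowest-entropy filters to remove."""
        k = int(len(scores) * ratio)
        return np.argsort(scores)[:k]

    acts = np.random.rand(1000, 64)  # hypothetical per-filter activation summaries
    prune_idx = select_filters_to_prune(filter_entropy_scores(acts))
    print(f"pruning {len(prune_idx)} of {acts.shape[1]} filters")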