9,910 research outputs found
Dynamic Virtual Page-based Flash Translation Layer with Novel Hot Data Identification and Adaptive Parallelism Management
Solid-state disks (SSDs) have tended to replace traditional motor-driven hard disks in high-end storage devices over the past few decades. However, the effects of various inherent features, such as out-of-place update [resorting to garbage collection (GC)] and limited endurance (resorting to wear leveling), need to be reduced to a large extent before that day comes. Both GC and wear leveling fundamentally depend on hot data identification (HDI). In this paper, we propose a hot data-aware flash translation layer architecture based on a dynamic virtual page (DVPFTL) so as to improve the performance and lifetime of NAND flash devices. First, we develop a generalized dual layer HDI (DL-HDI) framework, which is composed of a cold data pre-classifier and a hot data post-identifier. Together, these can efficiently track the frequency and recency of information access. Then, we design an adaptive parallelism manager (APM) to assign the clustered data chunks to distinct resident blocks in the SSD so as to prolong its endurance. Finally, the experimental results from our realized SSD prototype indicate that the DVPFTL scheme has reliably improved the parallelizability and endurance of NAND flash devices with improved GC costs, compared with related works. Peer reviewe
Hardware and software status of QCDOC
QCDOC is a massively parallel supercomputer whose processing nodes are based
on an application-specific integrated circuit (ASIC). This ASIC was
custom-designed so that crucial lattice QCD kernels achieve an overall
sustained performance of 50% on machines with several 10,000 nodes. This strong
scalability, together with low power consumption and a price/performance ratio
of $1 per sustained MFlops, enables QCDOC to attack the most demanding lattice
QCD problems. The first ASICs became available in June of 2003, and the testing
performed so far has shown all systems functioning according to specification.
We review the hardware and software status of QCDOC and present performance
figures obtained in real hardware as well as in simulation. Comment: Lattice2003(machine), 6 pages, 5 figure
Index to NASA Tech Briefs, 1975
This index contains abstracts and four indexes--subject, personal author, originating Center, and Tech Brief number--for 1975 Tech Briefs
On board processor system study Final report
Development and characteristics of onboard processor syste
Transport or Store? Synthesizing Flow-based Microfluidic Biochips using Distributed Channel Storage
Flow-based microfluidic biochips have attracted much attention in the EDA
community due to their miniaturized size and execution efficiency. Previous
research, however, still follows the traditional computing model with a
dedicated storage unit, which actually becomes a bottleneck of the performance
of biochips. In this paper, we propose the first architectural synthesis
framework considering distributed storage constructed temporarily from
transportation channels to cache fluid samples. Since distributed storage can
be accessed more efficiently than a dedicated storage unit and channels can
switch between the roles of transportation and storage easily, biochips with
this distributed computing architecture can achieve a higher execution
efficiency even with fewer resources. Experimental results confirm that the
execution efficiency of a bioassay can be improved by up to 28% while the
number of valves in the biochip can be reduced effectively. Comment: ACM/IEEE Design Automation Conference (DAC), June 201
Autonomous Recovery Of Reconfigurable Logic Devices Using Priority Escalation Of Slack
Field Programmable Gate Array (FPGA) devices offer a suitable platform for survivable hardware architectures in mission-critical systems. In this dissertation, active dynamic redundancy-based fault-handling techniques are proposed which exploit the dynamic partial reconfiguration capability of SRAM-based FPGAs. Self-adaptation is realized by employing reconfiguration in detection, diagnosis, and recovery phases. To extend these concepts to semiconductor aging and process variation in the deep submicron era, resilient adaptable processing systems are sought to maintain quality and throughput requirements despite the vulnerabilities of the underlying computational devices. A new approach to autonomous fault-handling which addresses these goals is developed using only a uniplex hardware arrangement. It operates by observing a health metric to achieve Fault Demotion using Reconfigurable Slack (FaDReS). Here an autonomous fault isolation scheme is employed which neither requires test vectors nor suspends the computational throughput, but instead observes the value of a health metric based on runtime input. The deterministic flow of the fault isolation scheme guarantees success in a bounded number of reconfigurations of the FPGA fabric. FaDReS is then extended to the Priority Using Resource Escalation (PURE) online redundancy scheme which considers fault-isolation latency and throughput trade-offs under a dynamic spare arrangement. While deep-submicron designs introduce new challenges, the use of adaptive techniques is seen to provide several promising avenues for improving resilience. The scheme developed is demonstrated by hardware design of various signal processing circuits and their implementation on a Xilinx Virtex-4 FPGA device. These include a Discrete Cosine Transform (DCT) core, Motion Estimation (ME) engine, Finite Impulse Response (FIR) Filter, Support Vector Machine (SVM), and Advanced Encryption Standard (AES) blocks in addition to MCNC benchmark circuits.
A significant reduction in power consumption is achieved ranging from 83% for low motion-activity scenes to 12.5% for high motion-activity video scenes in a novel ME engine configuration. For a typical benchmark video sequence, PURE is shown to maintain a PSNR baseline near 32dB. The diagnosability, reconfiguration latency, and resource overhead of each approach are analyzed. Compared to previous alternatives, PURE maintains a PSNR within a difference of 4.02dB to 6.67dB from the fault-free baseline by escalating healthy resources to higher-priority signal processing functions. The results indicate the benefits of priority-aware resiliency over conventional redundancy approaches in terms of fault-recovery, power consumption, and resource-area requirements. Together, these provide a broad range of strategies to achieve autonomous recovery of reconfigurable logic devices under a variety of constraints, operating conditions, and optimization criteria