Search CORE

897 research outputs found

Optimizing Scrubbing by Netlist Analysis for FPGA Configuration Bit Classification and Floorplanning

Author: Schmidt Bernhard
Teich Jürgen
Ziener Daniel
Zöllner Christian
Publication venue: 'Elsevier BV'
Publication date: 25/07/2017
Field of study

Existing scrubbing techniques for SEU mitigation on FPGAs do not guarantee an error-free operation after SEU recovering if the affected configuration bits do belong to feedback loops of the implemented circuits. In this paper, we a) provide a netlist-based circuit analysis technique to distinguish so-called critical configuration bits from essential bits in order to identify configuration bits which will need also state-restoring actions after a recovered SEU and which not. Furthermore, b) an alternative classification approach using fault injection is developed in order to compare both classification techniques. Moreover, c) we will propose a floorplanning approach for reducing the effective number of scrubbed frames and d), experimental results will give evidence that our optimization methodology not only allows to detect errors earlier but also to minimize the Mean-Time-To-Repair (MTTR) of a circuit considerably. In particular, we show that by using our approach, the MTTR for datapath-intensive circuits can be reduced by up to 48.5% in comparison to standard approaches

arXiv.org e-Print Archive

Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review

Author: Gujarathi Hemal S
McDonald-Maier Klaus D
Qadri Muhammad Yasir
Publication venue: 'Academy Publisher'
Publication date: 01/01/2009
Field of study

The technological evolution has increased the number of transistors for a given die area significantly and increased the switching speed from few MHz to GHz range. Such inversely proportional decline in size and boost in performance consequently demands shrinking of supply voltage and effective power dissipation in chips with millions of transistors. This has triggered substantial amount of research in power reduction techniques into almost every aspect of the chip and particularly the processor cores contained in the chip. This paper presents an overview of techniques for achieving the power efficiency mainly at the processor core level but also visits related domains such as buses and memories. There are various processor parameters and features such as supply voltage, clock frequency, cache and pipelining which can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Also, emerging power efficient processor architectures are overviewed and research activities are discussed which should help reader identify how these factors in a processor contribute to power consumption. Some of these concepts have been already established whereas others are still active research areas. © 2009 ACADEMY PUBLISHER

University of Essex Research Repository

CiteSeerX

Crossref

Recommended from our members

A survey of behavioral-level partitioning systems

Author: Vahid Frank
Publication venue: eScholarship, University of California
Publication date: 30/10/1991
Field of study

Many approaches have been developed to partition a system's behavioral description before a structural implementation is synthesized. We highlight the foundations and motivations for behavioral partitioning. We survey behavioral partitioning approaches, discussing abstraction levels, goals, major steps, and key assumptions in each

eScholarship - University of California

A Survey on the Contributions of Software-Defined Networking to Traffic Engineering

Author: Astorga Burgo Jasone
Higuero Aperribay María Victoria
Jacob Taquet Eduardo Juan
Mendiola Alaitz
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 31/05/2017
Field of study

Since the appearance of OpenFlow back in 2008, software-defined networking (SDN) has gained momentum. Although there are some discrepancies between the standards developing organizations working with SDN about what SDN is and how it is defined, they all outline traffic engineering (TE) as a key application. One of the most common objectives of TE is the congestion minimization, where techniques such as traffic splitting among multiple paths or advanced reservation systems are used. In such a scenario, this manuscript surveys the role of a comprehensive list of SDN protocols in TE solutions, in order to assess how these protocols can benefit TE. The SDN protocols have been categorized using the SDN architecture proposed by the open networking foundation, which differentiates among data-controller plane interfaces, application-controller plane interfaces, and management interfaces, in order to state how the interface type in which they operate influences TE. In addition, the impact of the SDN protocols on TE has been evaluated by comparing them with the path computation element (PCE)-based architecture. The PCE-based architecture has been selected to measure the impact of SDN on TE because it is the most novel TE architecture until the date, and because it already defines a set of metrics to measure the performance of TE solutions. We conclude that using the three types of interfaces simultaneously will result in more powerful and enhanced TE solutions, since they benefit TE in complementary ways.European Commission through the Horizon 2020 Research and Innovation Programme (GN4) under Grant 691567 Spanish Ministry of Economy and Competitiveness under the Secure Deployment of Services Over SDN and NFV-based Networks Project S&NSEC under Grant TEC2013-47960-C4-3-

Archivo Digital para la Docencia y la Investigación

40.4fJ/bit/mm Low-Swing On-Chip Signaling with Self-Resetting Logic Repeaters Embedded within a Mesh NoC in 45nm SOI CMOS

Author: Chandrakasan Anantha P.
Park Sunghyun
Peh Li-Shiuan
Qazi Masood
Publication venue: 'EDAA'
Publication date: 01/03/2013
Field of study

Mesh NoCs are the most widely-used fabric in high-performance many-core chips today. They are, however, becoming increasingly power-constrained with the higher on-chip bandwidth requirements of high-performance SoCs. In particular, the physical datapath of a mesh NoC consumes significant energy. Low-swing signaling circuit techniques can substantially reduce the NoC datapath energy, but existing low-swing circuits involve huge area footprints, unreliable signaling or considerable system overheads such as an additional supply voltage, so embedding them into a mesh datapath is not attractive. In this paper, we propose a novel low-swing signaling circuit, a self-resetting logic repeater, to meet these design challenges. The SRLR enables single-ended low-swing pulses to be asynchronously repeated, and hence, consumes less energy than differential, clocked low-swing signaling. To mitigate global process variations while delivering high energy efficiency, three circuit techniques are incorporated. Fabricated in 45nm SOI CMOS, our 10mm SRLR-based low-swing datapath achieves 6.83Gb/s/µm bandwidth density with 40.4fJ/bit/mm energy at 4.1Gb/s data rate at 0.8V.United States. Defense Advanced Research Projects Agency. The Ubiquitous High-Performance Computing Progra

DSpace@MIT

Optimization of DSSS Receivers Using Hardware-in-the-Loop Simulations

Author: Dhillon Balbir Kaur
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/12/2005
Field of study

Over the years, there has been significant interest in defining a hardware abstraction layer to facilitate code reuse in software defined radio (SDR) applications. Designers are looking for a way to enable application software to specify a waveform, configure the platform, and control digital signal processing (DSP) functions in a hardware platform in a way that insulates it from the details of realization. This thesis presents a tool-based methodolgy for developing and optimizing a Direct Sequence Spread Spectrum (DSSS) transceiver deployed in custom hardware like Field Programmble Gate Arrays (FPGAs). The system model consists of a tranmitter which employs a quadrature phase shift keying (QPSK) modulation scheme, an additive white Gaussian noise (AWGN) channel, and a receiver whose main parts consist of an analog-to-digital converter (ADC), digital down converter (DDC), image rejection low-pass filter (LPF), carrier phase locked loop (PLL), tracking locked loop, down-sampler, spread spectrum correlators, and rectangular-to-polar converter. The design methodology is based on a new programming model for FPGAs developed in the industry by Xilinx Inc. The Xilinx System Generator for DSP software tool provides design portability and streamlines system development by enabling engineers to create and validate a system model in Xilinx FPGAs. By providing hierarchical modeling and automatic HDL code generation for programmable devices, designs can be easily verified through hardware-in-the-loop (HIL) simulations. HIL provides a significant increase in simulation speed which allows optimization of the receiver design with respect to the datapath size for different functional parts of the receiver. The parameterized datapath points used in the simulation are ADC resolution, DDC datapath size, LPF datapath size, correlator height, correlator datapath size, and rectangular-to-polar datapath size. These parameters are changed in the software enviornment and tested for bit error rate (BER) performance through real-time hardware simualtions. The final result presents a system design with minimum harware area occupancy relative to an acceptable BER degradation

University of Tennessee, Knoxville: Trace

Reliable Multicast in Mobile Ad Hoc Wireless Networks

Author: Klos Lawrence
Publication venue: ScholarWorks@UNO
Publication date: 20/12/2009
Field of study

A mobile wireless ad hoc network (MANET) consists of a group of mobile nodes communicating wirelessly with no fixed infrastructure. Each node acts as source or receiver, and all play a role in path discovery and packet routing. MANETs are growing in popularity due to multiple usage models, ease of deployment and recent advances in hardware with which to implement them. MANETs are a natural environment for multicasting, or group communication, where one source transmits data packets through the network to multiple receivers. Proposed applications for MANET group communication ranges from personal network apps, impromptu small scale business meetings and gatherings, to conference, academic or sports complex presentations for large crowds reflect the wide range of conditions such a protocol must handle. Other applications such as covert military operations, search and rescue, disaster recovery and emergency response operations reflect the mission critical nature of many ad hoc applications. Reliable data delivery is important for all categories, but vital for this last one. It is a feature that a MANET group communication protocol must provide. Routing protocols for MANETs are challenged with establishing and maintaining data routes through the network in the face of mobility, bandwidth constraints and power limitations. Multicast communication presents additional challenges to protocols. In this dissertation we study reliability in multicast MANET routing protocols. Several on-demand multicast protocols are discussed and their performance compared. Then a new reliability protocol, R-ODMRP is presented that runs on top of ODMRP, a well documented best effort protocol with high reliability. This protocol is evaluated against ODMRP in a standard network simulator, ns-2. Next, reliable multicast MANET protocols are discussed and compared. We then present a second new protocol, Reyes, also a reliable on-demand multicast communication protocol. Reyes is implemented in the ns-2 simulator and compared against the current standards for reliability, flooding and ODMRP. R-ODMRP is used as a comparison point as well. Performance results are comprehensively described for latency, bandwidth and reliable data delivery. The simulations show Reyes to greatly outperform the other protocols in terms of reliability, while also outperforming R-ODMRP in terms of latency and bandwidth overhead

University of New Orleans

Autonomous Recovery Of Reconfigurable Logic Devices Using Priority Escalation Of Slack

Author: Imran Naveed
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2013
Field of study

Field Programmable Gate Array (FPGA) devices offer a suitable platform for survivable hardware architectures in mission-critical systems. In this dissertation, active dynamic redundancy-based fault-handling techniques are proposed which exploit the dynamic partial reconfiguration capability of SRAM-based FPGAs. Self-adaptation is realized by employing reconfiguration in detection, diagnosis, and recovery phases. To extend these concepts to semiconductor aging and process variation in the deep submicron era, resilient adaptable processing systems are sought to maintain quality and throughput requirements despite the vulnerabilities of the underlying computational devices. A new approach to autonomous fault-handling which addresses these goals is developed using only a uniplex hardware arrangement. It operates by observing a health metric to achieve Fault Demotion using Recon- figurable Slack (FaDReS). Here an autonomous fault isolation scheme is employed which neither requires test vectors nor suspends the computational throughput, but instead observes the value of a health metric based on runtime input. The deterministic flow of the fault isolation scheme guarantees success in a bounded number of reconfigurations of the FPGA fabric. FaDReS is then extended to the Priority Using Resource Escalation (PURE) online redundancy scheme which considers fault-isolation latency and throughput trade-offs under a dynamic spare arrangement. While deep-submicron designs introduce new challenges, use of adaptive techniques are seen to provide several promising avenues for improving resilience. The scheme developed is demonstrated by hardware design of various signal processing circuits and their implementation on a Xilinx Virtex-4 FPGA device. These include a Discrete Cosine Transform (DCT) core, Motion Estimation (ME) engine, Finite Impulse Response (FIR) Filter, Support Vector Machine (SVM), and Advanced Encryption Standard (AES) blocks in addition to MCNC benchmark circuits. A iii significant reduction in power consumption is achieved ranging from 83% for low motion-activity scenes to 12.5% for high motion activity video scenes in a novel ME engine configuration. For a typical benchmark video sequence, PURE is shown to maintain a PSNR baseline near 32dB. The diagnosability, reconfiguration latency, and resource overhead of each approach is analyzed. Compared to previous alternatives, PURE maintains a PSNR within a difference of 4.02dB to 6.67dB from the fault-free baseline by escalating healthy resources to higher-priority signal processing functions. The results indicate the benefits of priority-aware resiliency over conventional redundancy approaches in terms of fault-recovery, power consumption, and resource-area requirements. Together, these provide a broad range of strategies to achieve autonomous recovery of reconfigurable logic devices under a variety of constraints, operating conditions, and optimization criteria

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Recommended from our members

High-Speed Wide-Field Time-Correlated Single-Photon Counting Fluorescence Lifetime Imaging Microscopy

Author: Field Ryan Michael
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2014
Field of study

Fluorescence microscopy is a powerful imaging technique used in the biological sciences to identify labeled components of a sample with specificity. This is usually accomplished through labeling with fluorescent dyes, isolating these dyes by their spectral signatures with optical filters, and recording the intensity of the fluorescent response. Although these techniques are widely used, fluorescence intensity images can be negatively affected by a variety of factors that impact the fluorescence intensity. Fluorescence lifetime imaging microscopy (FLIM) is an imaging technique that is relatively immune to intensity fluctuations and also provides the unique ability to directly monitor the microenvironment surrounding a fluorophore. Despite the benefits associated with FLIM, the applications to which it is applied are fairly limited due to long image acquisition times and high cost of traditional hardware. Recent advances in complementary metal-oxide-semiconductor (CMOS) single-photon avalanche diodes (SPADs) have enabled the design of low-cost imaging arrays that are capable of recording lifetime images with acquisition times greater than one order of magnitude faster than existing systems. However, these SPAD arrays have yet to realize the full potential of the technology due to limitations in their ability to handle the vast amount of data generated during the commonly used time-correlated single-photon counting (TCSPC) lifetime imaging technique. This thesis presents the design, implementation, characterization, and demonstration of a high speed FLIM imaging system. The components of this design include a CMOS imager chip in a standard 0.13 μm technology containing a custom CMOS SPAD, a 64-by-64 array of these SPADs, pixel control circuitry, independent time-to-digital converters (TDCs), a FLIM specific datapath, and high bandwidth output buffers. In addition to the CMOS imaging array, a complete system was designed and implemented using a printed circuit board (PCB) for capturing data from the imager, creating histograms for the photon arrival data using field-programmable gate arrays, and transferring the data to a computer using a cabled PCIe interface. Finally, software is used to communicate between the imaging system and a computer.The dark count rate of the SPAD was measured to be only 231 Hz at room temperature while maintaining a photon detection probability of up to 30\%. TDCs included on the array have a 62.5 ps resolution and a 64 ns range, which is suitable for measuring the lifetime of most biological fluorophores. Additionally, the on-chip datapath was designed to handle continuous data transfers at rates capable of supporting TCSPC-based lifetime imaging at 100 frames per second. The system level implementation also provides sufficient data throughput for transferring up to 750 frames per second from the imaging system to a computer. The lifetime imaging system was characterized using standard techniques for evaluating SPAD performance and an electrical delay signal for measuring the TDC performance. This thesis concludes with a demonstration of TCSPC-FLIM imaging at 100 frames per second -- the fastest 64-by-64 TCSPC FLIM that has been demonstrated. This system overcomes some of the limitations of existing FLIM systems and has the potential to enable new application domains in dynamic FLIM imaging

Columbia University Academic Commons