Search CORE

114,025 research outputs found

Parametric Macromodels of Digital I/O Ports

Author: Canavero Flavio
Maio I.A
Stievano I.S
Publication venue: Piscataway, N.J. : IEEE
Publication date: 01/01/2002
Field of study

This paper addresses the development of macromodels for input and output ports of a digital device. The proposed macromodels consist of parametric representations that can be obtained from port transient waveforms at the device ports via a well established procedure. The models are implementable as SPICE subcircuits and their accuracy and efficiency are verified by applying the approach to the characterization of transistor-level models of commercial devices

Crossref

PORTO Publications Open Repository TOrino

Comprehensive Evaluation of OpenCL-based Convolutional Neural Network Accelerators in Xilinx and Altera FPGAs

Author: Kadetotad Deepak
Kim Minkyu
Linares Barranco Alejandro
Ríos Navarro José Antonio
Seo Jae-Sun
Tapiador Morales Ricardo
Publication venue: 'SAGE Publications'
Publication date: 01/01/2016
Field of study

Deep learning has significantly advanced the state of the art in artificial intelligence, gaining wide popularity from both industry and academia. Special interest is around Convolutional Neural Networks (CNN), which take inspiration from the hierarchical structure of the visual cortex, to form deep layers of convolutional operations, along with fully connected classifiers. Hardware implementations of these deep CNN architectures are challenged with memory bottlenecks that require many convolution and fully-connected layers demanding large amount of communication for parallel computation. Multi-core CPU based solutions have demonstrated their inadequacy for this problem due to the memory wall and low parallelism. Many-core GPU architectures show superior performance but they consume high power and also have memory constraints due to inconsistencies between cache and main memory. FPGA design solutions are also actively being explored, which allow implementing the memory hierarchy using embedded BlockRAM. This boosts the parallel use of shared memory elements between multiple processing units, avoiding data replicability and inconsistencies. This makes FPGAs potentially powerful solutions for real-time classification of CNNs. Both Altera and Xilinx have adopted OpenCL co-design framework from GPU for FPGA designs as a pseudo-automatic development solution. In this paper, a comprehensive evaluation and comparison of Altera and Xilinx OpenCL frameworks for a 5-layer deep CNN is presented. Hardware resources, temporal performance and the OpenCL architecture for CNNs are discussed. Xilinx demonstrates faster synthesis, better FPGA resource utilization and more compact boards. Altera provides multi-platforms tools, mature design community and better execution times

idUS. Depósito de Investigación Universidad de Sevilla

A 90 nm CMOS 16 Gb/s Transceiver for Optical Interconnects

Author: Emami-Neyestanak Azita
Horowitz Mark
Palermo Samuel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/02/2007
Field of study

Interconnect architectures which leverage high-bandwidth optical channels offer a promising solution to address the increasing chip-to-chip I/O bandwidth demands. This paper describes a dense, high-speed, and low-power CMOS optical interconnect transceiver architecture. Vertical-cavity surface-emitting laser (VCSEL) data rate is extended for a given average current and corresponding reliability level with a four-tap current summing FIR transmitter. A low-voltage integrating and double-sampling optical receiver front-end provides adequate sensitivity in a power efficient manner by avoiding linear high-gain elements common in conventional transimpedance-amplifier (TIA) receivers. Clock recovery is performed with a dual-loop architecture which employs baud-rate phase detection and feedback interpolation to achieve reduced power consumption, while high-precision phase spacing is ensured at both the transmitter and receiver through adjustable delay clock buffers. A prototype chip fabricated in 1 V 90 nm CMOS achieves 16 Gb/s operation while consuming 129 mW and occupying 0.105 mm^2

Crossref

Caltech Authors

Comprehensive Evaluation of OpenCL-Based CNN Implementations for FPGAs

Author: Kadetotad Deepak
Kim Minkyu
Linares Barranco Alejandro
Ríos Navarro José Antonio
Seo Jae-sun
Tapiador Morales Ricardo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017
Field of study

Deep learning has significantly advanced the state of the art in artificial intelligence, gaining wide popularity from both industry and academia. Special interest is around Convolutional Neural Networks (CNN), which take inspiration from the hierarchical structure of the visual cortex, to form deep layers of convolutional operations, along with fully connected classifiers. Hardware implementations of these deep CNN architectures are challenged with memory bottlenecks that require many convolution and fully-connected layers demanding large amount of communication for parallel computation. Multi-core CPU based solutions have demonstrated their inadequacy for this problem due to the memory wall and low parallelism. Many-core GPU architectures show superior performance but they consume high power and also have memory constraints due to inconsistencies between cache and main memory. OpenCL is commonly used to describe these architectures for their execution on GPGPUs or FPGAs. FPGA design solutions are also actively being explored, which allow implementing the memory hierarchy using embedded parallel BlockRAMs. This boosts the parallel use of shared memory elements between multiple processing units, avoiding data replicability and inconsistencies. This makes FPGAs potentially powerful solutions for real-time classification of CNNs. In this paper both Altera and Xilinx adopted OpenCL co-design frameworks for pseudo-automatic development solutions are evaluated. A comprehensive evaluation and comparison for a 5-layer deep CNN is presented. Hardware resources, temporal performance and the OpenCL architecture for CNNs are discussed. Xilinx demonstrates faster synthesis, better FPGA resource utilization and more compact boards. Altera provides multi-platforms tools, mature design community and better execution times.Ministerio de Economía y Competitividad TEC2016-77785-

idUS. Depósito de Investigación Universidad de Sevilla

Partially reconfigurable TVWS transceiver for use in UK and US markets

Author: Crockett Louise Helen
Darbari Faisal
Elliot Ross Andrew
Enderwitz Martin
He Ke
Stewart Robert
Weiss Stephan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012
Field of study

With more and more countries opening up sections of unlicensed spectrum for use by TV White Space (TVWS) devices, the prospect of building a device capable of operating in more than one world region is appealing. The difficulty is that the locations of TVWS bands within the radio spectrum are not globally harmonised. With this problem in mind, the purpose of this paper is to present a TVWS transceiver design which is capable of being reconfigured to operate in both the UK and US spectrum. We present three different configurations: one covering the UK TVWS spectrum and the remaining two covering the various locations of the US TVWS bands

Crossref

University of Strathclyde Institutional Repository

An FPGA-based real-time event sampler

Author: De Sutter Bjorn
Penneman Niels
Perneel Luc
Timmerman Martin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010
Field of study

This paper presents the design and FPGA-implementation of a sampler that is suited for sampling real-time events in embedded systems. Such sampling is useful, for example, to test whether real-time events are handled in time on such systems. By designing and implementing the sampler as a logic analyzer on an FPGA, several design parameters can be explored and easily modiﬁed to match the behavior of diﬀerent kinds of embedded systems. Moreover, the trade-off between price and performance becomes easy, as it mainly exists of choosing the appropriate type and speed grade of an FPGA family

Crossref

Ghent University Academic Bibliography

Integrated chaos generators

Author: Delgado Restituto Manuel
Rodríguez Vázquez Ángel Benito
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2002
Field of study

This paper surveys the different design issues, from mathematical model to silicon, involved on the design of integrated circuits for the generation of chaotic behavior.Comisión Interministerial de Ciencia y Tecnología 1FD97-1611(TIC)European Commission ESPRIT 3110

idUS. Depósito de Investigación Universidad de Sevilla