Search CORE

646 research outputs found

A 64-point Fourier transform chip for high-speed wireless LAN application using OFDM

Author: Grass Eckhard
Jagdhold Ulrich
Maharatna Koushik
Publication venue
Publication date: 01/03/2004
Field of study

In this article, we present a novel fixed-point 16-bit word-width 64-point FFT/IFFT processor developed primarily for the application in the OFDM based IEEE 802.11a Wireless LAN (WLAN) baseband processor. The 64-point FFT is realized by decomposing it into a 2-D structure of 8-point FFTs. This approach reduces the number of required complex multiplications compared to the conventional radix-2 64-point FFT algorithm. The complex multiplication operations are realized using shift-and-add operations. Thus, the processor does not use any 2-input digital multiplier. It also does not need any RAM or ROM for internal storage of coefficients. The proposed 64-point FFT/IFFT processor has been fabricated and tested successfully using our in-house 0.25 ?m BiCMOS technology. The core area of this chip is 6.8 mm2. The average dynamic power consumption is 41 mW @ 20 MHz operating frequency and 1.8 V supply voltage. The processor completes one parallel-to-parallel (i. e., when all input data are available in parallel and all output data are generated in parallel) 64-point FFT computation in 23 cycles. These features show that though it has been developed primarily for application in the IEEE 802.11a standard, it can be used for any application that requires fast operation as well as low power consumption

Southampton (e-Prints Soton)

Explore Bristol Research

Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine

Author: Andri Renzo
Benini Luca
Cavigelli Lukas
Rossi Davide
Publication venue
Publication date: 01/01/2019
Field of study

Deep neural networks have achieved impressive results in computer vision and machine learning. Unfortunately, state-of-the-art networks are extremely compute and memory intensive which makes them unsuitable for mW-devices such as IoT end-nodes. Aggressive quantization of these networks dramatically reduces the computation and memory footprint. Binary-weight neural networks (BWNs) follow this trend, pushing weight quantization to the limit. Hardware accelerators for BWNs presented up to now have focused on core efficiency, disregarding I/O bandwidth and system-level efficiency that are crucial for deployment of accelerators in ultra-low power devices. We present Hyperdrive: a BWN accelerator dramatically reducing the I/O bandwidth exploiting a novel binary-weight streaming approach, which can be used for arbitrarily sized convolutional neural network architecture and input resolution by exploiting the natural scalability of the compute units both at chip-level and system-level by arranging Hyperdrive chips systolically in a 2D mesh while processing the entire feature map together in parallel. Hyperdrive achieves 4.3 TOp/s/W system-level efficiency (i.e., including I/Os)---3.1x higher than state-of-the-art BWN accelerators, even if its core uses resource-intensive FP16 arithmetic for increased robustness

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Digital Signal Processing on FPGA for Short-Range Optical Communications Systems over Plastic Optical Fiber

Author: Ramírez Molina Julio César
Publication venue
Publication date: 01/01/2012
Field of study

Nowadays bandwidth requirements are increasing vertiginously. As new ways and concepts of how to share information emerge, new ways of how to access the web enter the market. Computers and mobile devices are only the beginning, the spectrum of web products and services such as IPTV, VoIP, on-line gaming, etc has been augmented by the possibility to share, store data, interact and work on the Cloud. The rush for bandwidth has led researchers from all over the world to enquire themselves on how to achieve higher data rates, and it is thanks to their efforts, that both long-haul and short-range communications systems have experienced a huge development during the last few years. However, as the demand for higher information throughput increases traditional short-range solutions reach their lim- its. As a result, optical solutions are now migrating from long-haul to short-range communication systems. As part of this trend, plastic optical fiber (POF) systems have arisen as promising candidates for applications where traditional glass optical fibers (GOF) are unsuitable. POF systems feature a series of characteristics that make them very suitable for the market requirements. More in detail, these systems are low cost, robust, easy to handle and to install, flexible and yet tolerant to bendings. Nonetheless, these features come at the expense of a considerable higher bandwidth limitation when compared to GOF systems. This thesis is aimed to the investigate the use of digital signal processing (DSP) algorithms to overcome the bandwidth limitation in short-range optical communications system based on POF. In particular, this dissertation presents the design and development of DSP algorithms on field programmable gate arrays (FPGAs) with the ultimate purpose of implementing a fully engineered 1Gbit/s Ethernet Media Converter capable of establishing data links over 50+ meters of PMMA-SI POF using an RC-LED as transmitte

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Gallium arsenide design methodology and testing of a systolic floating point processing element

Author: Beaumont-Smith Andrew J.
Publication venue
Publication date: 01/01/1995
Field of study

Thesis (M.E.Sc.) -- University of Adelaide, Dept. of Electrical and Electronic Engineering, 199

Adelaide Research & Scholarship

Implementation analysis of convolutional neural networks on FPGAs

Author: Ivanovs I.
Publication venue
Publication date: 29/04/2019
Field of study

Pure OAI Repository

Software controlled proportional valve for NIBP measurements

Author: Heesakkers Jeroen
Publication venue
Publication date: 01/01/1997
Field of study

Repository TU/e

Pure OAI Repository

Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine

Author: Andri R.
Benini L.
Cavigelli L.
Rossi D.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019
Field of study

Deep neural networks have achieved impressive results in computer vision and machine learning. Unfortunately, state-of-the-art networks are extremely compute and memory intensive, which makes them unsuitable for mW-devices such as loT end-nodes. Aggressive quantization of these networks dramatically reduces the computation and memory footprint. Binary-weight neural networks (BWNs) follow this trend, pushing weight quantization to the limit. Hardware accelerators for BWNs presented up to now have focused on core efficiency, disregarding I/O bandwidth, and system-level efficiency that are crucial for the deployment of accelerators in ultra-low power devices. We present Hyperdrive: a BWN accelerator dramatically reducing the I/O bandwidth exploiting a novel binary-weight streaming approach, which can he used for an arbitrarily sized convolutional neural network architecture and input resolution by exploiting the natural scalability of the compute units both at chip-level and system-level by arranging Hyperdrive chips systolically in a 2D mesh while processing the entire feature map together in parallel. Hyperdrive achieves 4.3 TOp/s/W system-level efficiency (i.e., including I/Os)-3.1 x higher than state-of-the-art BWN accelerators, even if its core uses resource-intensive FP16 arithmetic for increased robustness

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

AI/ML Algorithms and Applications in VLSI Design and Technology

Author: Abbas Zia
Ahmad Amir
Amuru Deepthi
Cherupally Pavan K.
Gurram Sushanth R.
Vudumula Harsha V.
Zahra Andleeb
Publication venue
Publication date: 15/02/2023
Field of study

An evident challenge ahead for the integrated circuit (IC) industry in the nanometer regime is the investigation and development of methods that can reduce the design complexity ensuing from growing process variations and curtail the turnaround time of chip manufacturing. Conventional methodologies employed for such tasks are largely manual; thus, time-consuming and resource-intensive. In contrast, the unique learning strategies of artificial intelligence (AI) provide numerous exciting automated approaches for handling complex and data-intensive tasks in very-large-scale integration (VLSI) design and testing. Employing AI and machine learning (ML) algorithms in VLSI design and manufacturing reduces the time and effort for understanding and processing the data within and across different abstraction levels via automated learning algorithms. It, in turn, improves the IC yield and reduces the manufacturing turnaround time. This paper thoroughly reviews the AI/ML automated approaches introduced in the past towards VLSI design and manufacturing. Moreover, we discuss the scope of AI/ML applications in the future at various abstraction levels to revolutionize the field of VLSI design, aiming for high-speed, highly intelligent, and efficient implementations

arXiv.org e-Print Archive

Pulse patency and oxygenation sensing system development to detect g-induced loss of consciousness

Author: Pretorius Herman
Publication venue: University of New Hampshire Scholars\u27 Repository
Publication date: 01/01/2013
Field of study

A fighter pilots greatest strength is the weakness of his or her opponent. Commonly, this strength comes down to the maneuverability of the aircraft, particularly the ability to out-climb. Since the 1980\u27s, the thrust produced by these engines have the ability to drain the pilots head of blood causing a state of unconsciousness due to the overwhelming forces of gravity for upwards of 30 seconds; often times having fatal outcomes. This thesis explores the feasibility of detecting of blood flow by means of arterial wall expansion (pulse patency) and blood oxygenation using a microprocessor to continually monitor the signals from this two part sensor where by insight into the development of a g-induced loss of consciousness sensing system can be developed. Results indicate greater than 90% accuracy pulse patency detection using an accelerometer. Simulation and physical models were used as well as human testing to develop a blood oxygenation and pulse patency sensor, or BOPS

UNH Scholars' Repository