
    Reconfigurable Architectures and Systems for IoT Applications

    The Internet of Things (IoT) has become a popular topic in industry in recent years; it describes an ecosystem of internet-connected devices, or things, that enrich everyday life by improving our productivity and efficiency. The primary components of the IoT ecosystem are hardware, software, and services. While the software and services of an IoT system focus on collecting and processing data to make decisions, the underlying hardware is responsible for sensing information, preprocessing it, and transmitting it to the servers. Since the IoT ecosystem is still in its infancy, there is a great need for rapid prototyping platforms that would help accelerate the hardware design process. Depending on the target IoT application, different sensors are required to sense signals such as heart rate, temperature, pressure, and acceleration, so there is a great need for reconfigurable platforms that can prototype different sensor interfacing circuits. This thesis primarily focuses on two important hardware aspects of an IoT system: (a) an FPAA-based reconfigurable sensing front-end system and (b) an FPGA-based reconfigurable processing system. To enable reconfiguration for any sensor type, the Programmable ANalog Device Array (PANDA), a transistor-level analog reconfigurable platform, is proposed. The CAD tools required to implement front-end circuits on the platform are also developed. To demonstrate the capability of the platform on silicon, a small-scale array of 24×25 PANDA cells is fabricated in 65 nm technology. Several analog circuit building blocks, including amplifiers, bias circuits, and filters, are prototyped on the platform, demonstrating its effectiveness for rapid prototyping of IoT sensor interfaces. IoT systems typically use machine learning algorithms that run on servers to process the data and make decisions. Recently, embedded processors have been used to preprocess the data at the energy-constrained sensor node or at the IoT gateway, which saves considerable transmission energy and bandwidth. Using conventional CPU-based systems to implement these machine learning algorithms is not energy-efficient. Hence, an FPGA-based hardware accelerator is proposed, and an optimization methodology is developed to maximize the throughput of any convolutional neural network (CNN) based machine learning algorithm on a resource-constrained FPGA.
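
    The abstract does not spell out the throughput-maximization methodology, but the flavor of such an optimization can be illustrated with a small design-space exploration: choose parallelism (tiling) factors for the convolution layers that maximize estimated throughput under an FPGA resource budget. The Python sketch below is only an illustration; the layer shapes, DSP budget, clock frequency, and one-MAC-per-DSP-per-cycle cost model are all assumptions, not taken from the thesis.

        # Hypothetical design-space exploration for a CNN pipeline on a
        # resource-constrained FPGA: search output/input-channel parallelism
        # (Tm, Tn) under a DSP budget and keep the fastest configuration.
        from itertools import product

        def conv_cycles(layer, tm, tn):
            # Rough cycle estimate: total MACs divided by tm*tn parallel MACs.
            m, n, r, c, k = layer   # out ch, in ch, out rows, out cols, kernel
            return m * n * r * c * k * k / (tm * tn)

        def explore(layers, dsp_budget, freq_mhz=150):
            best = None
            for tm, tn in product(range(1, 65), repeat=2):
                if tm * tn > dsp_budget:   # assume one DSP slice per parallel MAC
                    continue
                cycles = sum(conv_cycles(l, tm, tn) for l in layers)
                latency_ms = cycles / (freq_mhz * 1e6) * 1e3
                if best is None or latency_ms < best[2]:
                    best = (tm, tn, latency_ms)
            return best

        if __name__ == "__main__":
            layers = [(32, 3, 112, 112, 3), (64, 32, 56, 56, 3)]   # toy network
            tm, tn, ms = explore(layers, dsp_budget=220)
            print(f"best parallelism Tm={tm}, Tn={tn}: ~{ms:.2f} ms per frame")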

    Design of a Field-Programmable Mixed-Signal Integrated Circuit Based on Reconfigurable Analog Blocks

    Thesis (Ph.D.) -- Seoul National University, Graduate School, College of Engineering, Department of Electrical and Computer Engineering, August 2017. Advisor: Jaeha Kim. Fast-emerging electronic device applications demand a variety of new mixed-signal ICs to be developed in fast cycles and at low cost. While field-programmable gate arrays (FPGAs) are established solutions for timely and low-cost prototyping of digital systems, their counterpart for mixed-signal circuits is still an active area of research. This thesis presents the design of a field-programmable IC for analog/mixed-signal circuits, which solves many of the challenges of previous works by performing analog functions in the time domain. To realize the field-programmable analog functionality, a time-domain configurable analog block (TCAB) is proposed. A single TCAB can be programmed to implement various analog circuits, including a time-to-digital converter, a digitally-controlled oscillator, a digitally-controlled delay cell, a digital pulse-width modulator, and a phase interpolator. In addition, the TCABs convey and process analog information using the frequency, pulse width, delay, or phase of digital pulses or pulse sequences, rather than using analog voltage or current signals, for less susceptibility to attenuation and noise. Expressing the analog information in digital pulses makes it easy to implement scalable programmable interconnects among the TCABs. The architecture of a field-programmable IC capable of emulating today's diverse mixed-signal systems is also introduced. In addition to the TCABs, the proposed IC includes arrays of configurable logic blocks (CLBs) and programmable arithmetic logic units (ALUs) for programmable digital functions. By programming the functionality of the TCAB, CLB, and ALU arrays and configuring the interconnects, the chip can implement various mixed-signal systems. A prototype IC fabricated in 65-nm CMOS technology demonstrates the versatile programmability of the proposed TCAB and the IC by operating successfully as a 1-GHz phase-locked loop with 12.3 ps rms integrated jitter, as a 50-MS/s analog-to-digital converter with a 32.5-dB SNDR, and as a 1.2-to-0.7-V DC–DC converter with 95.5% efficiency.
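
    As a minimal sketch of the time-domain signal processing idea described above (the converter gains and the reference clock rate below are assumptions for illustration, not figures from the thesis), an analog value can be carried as the width of a digital pulse and recovered by counting reference-clock cycles, the way a time-to-digital converter would:

        # Illustrative model: encode an analog voltage as a pulse width, then
        # digitize that pulse width by counting reference-clock edges.
        def voltage_to_pulse_width(v, v_fs=1.0, t_fs=1e-6):
            # Hypothetical linear voltage-to-time converter: 0..v_fs -> 0..t_fs.
            return (v / v_fs) * t_fs

        def pulse_width_to_code(t_pulse, f_ref=1e9):
            # Time-to-digital conversion: count 1-GHz reference clock cycles.
            return int(round(t_pulse * f_ref))

        if __name__ == "__main__":
            for v in (0.125, 0.5, 0.9):
                t = voltage_to_pulse_width(v)       # analog value -> pulse width
                code = pulse_width_to_code(t)       # pulse width -> digital code
                v_back = code / (1e9 * 1e-6)        # normalized value recovered
                print(f"v={v:.3f} V -> pulse {t * 1e9:.0f} ns -> code {code} "
                      f"(recovered {v_back:.3f} V)")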

    FPGA-Based Relaxation D/A Converters With Parasitics-Induced Error Suppression and Digital Self-Calibration

    In this paper, the implementation on a Field-Programmable Gate Array (FPGA) of Relaxation Digital-to-Analog Converters (ReDACs), which take advantage of the impulse response of a first-order RC network to generate and combine binary-weighted voltages, is addressed. For this purpose, the dominant ReDAC nonlinearity limitation, related to the parasitics of the RC network, is analyzed, and a simple and robust technique for its effective suppression is proposed. Moreover, a ReDAC foreground digital calibration strategy suitable for FPGA implementation is introduced to tune the clock frequency of the converter, as required for ReDAC operation. The novel error suppression technique and calibration strategy are finally implemented on a 13-bit, 514 S/s prototype (ReDAC1) and on an 11-bit, 10.5 kS/s prototype (ReDAC2), which are experimentally characterized under static and dynamic conditions. Measured results on ReDAC1 (ReDAC2) reveal 1.68 LSB (1.53 LSB) maximum INL, 1.54 LSB (1.0 LSB) maximum DNL, 76.4 dB (67.9 dB) THD, 79.7 dB (71.4 dB) SFDR, and 71.3 dB (63.3 dB) SNDR, corresponding to 11.6 (10.2) effective bits (ENOB).
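
    The ReDAC principle, as described in the relaxation-DAC literature, is that the digital code is shifted serially (LSB first) into a first-order RC network, one bit per clock period; when the clock period equals RC·ln 2, the per-period decay factor is exactly 1/2, so the settled voltage becomes the binary-weighted sum Vref·code/2^N, which is why the clock frequency must be tuned by calibration. The Python sketch below is a numerical check of this behaviour under assumed R, C, and resolution, not a model of the actual FPGA prototypes:

        import math

        def redac_output(code, n_bits, t_clk, r=10e3, c=1e-9, vref=1.0):
            # Drive an ideal first-order RC network one bit per clock period,
            # LSB first, and return the capacitor voltage after the last bit.
            a = math.exp(-t_clk / (r * c))         # decay factor per period
            v = 0.0
            for i in range(n_bits):                # bit 0 (LSB) is applied first
                bit = (code >> i) & 1
                v = a * v + (1.0 - a) * vref * bit
            return v

        if __name__ == "__main__":
            n, r, c = 8, 10e3, 1e-9
            t_ideal = r * c * math.log(2)          # makes the decay factor exactly 0.5
            for code in (1, 85, 170, 255):
                v = redac_output(code, n, t_ideal, r, c)
                print(f"code {code:3d}: v = {v:.6f} V, ideal {code / 2**n:.6f} V")
            # A mistuned clock period breaks the binary weighting, which is
            # what the digital self-calibration of the clock frequency fixes.
            print(f"code 170, +10% clock period: "
                  f"v = {redac_output(170, n, 1.1 * t_ideal, r, c):.6f} V")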

    KAVUAKA: a low-power application-specific processor architecture for digital hearing aids

    The power consumption of digital hearing aids is severely restricted by their small physical size, and the hardware resources available for signal processing are limited. However, there is a demand for more processing performance to make future hearing aids more useful and smarter. Future hearing aids should be able to detect, localize, and recognize target speakers in complex acoustic environments to further improve the speech intelligibility of the individual hearing aid user. Computationally intensive algorithms are required for this task. To maintain acceptable battery life, the hearing aid processing architecture must be highly optimized for extremely low power consumption and high processing performance. The integration of application-specific instruction-set processors (ASIPs) into hearing aids enables a wide range of architectural customizations to meet the stringent power consumption and performance requirements. In this thesis, the application-specific hearing aid processor KAVUAKA is presented, which is customized and optimized for state-of-the-art hearing aid algorithms such as speaker localization, noise reduction, beamforming, and speech recognition. Specialized and application-specific instructions are designed and added to the baseline instruction set architecture (ISA). Among the major contributions are a multiply-accumulate (MAC) unit for real- and complex-valued numbers, architectures for power reduction during register accesses, co-processors, and a low-latency audio interface. With the proposed MAC architecture, the KAVUAKA processor requires 16% fewer cycles for the computation of a 128-point fast Fourier transform (FFT) compared to related programmable digital signal processors. The power consumption during register file accesses is decreased by 6% to 17% with isolation and bypass techniques. The hardware-induced audio latency is 34% lower compared to related audio interfaces for a frame size of 64 samples. The final hearing aid system-on-chip (SoC), with four KAVUAKA processor cores and ten co-processors, is integrated as an application-specific integrated circuit (ASIC) in a 40 nm low-power technology. The die size is 3.6 mm^2. Each of the processors and co-processors contains individual customizations and hardware features, with datapath widths ranging from 24 bits to 64 bits. The core area of the 64-bit processor configuration is 0.134 mm^2. The processors are organized in two clusters that share memory, an audio interface, co-processors, and serial interfaces. The average power consumption at a clock speed of 10 MHz is 2.4 mW for the SoC and 0.6 mW for the 64-bit processor. Case studies with four reference hearing aid algorithms are used to present and evaluate the proposed hardware architectures and optimizations. The program code for each processor and co-processor is generated and optimized with evolutionary algorithms for operation merging, instruction scheduling, and register allocation. The KAVUAKA processor architecture is compared to related processor architectures in terms of processing performance, average power consumption, and silicon area requirements.
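
    To illustrate why a fused real/complex MAC shortens FFT inner loops, the sketch below writes a radix-2 butterfly around a complex multiply-accumulate and uses the three-multiplier (Gauss) complex product that such a fused datapath could employ. Whether KAVUAKA implements exactly this decomposition is not stated in the abstract; the example only shows the operation that a dedicated complex MAC unit accelerates:

        def cmul3(a, b):
            # Complex multiplication with 3 real multiplications instead of 4.
            ar, ai, br, bi = a.real, a.imag, b.real, b.imag
            k1 = ar * (br + bi)
            k2 = bi * (ar + ai)
            k3 = br * (ai - ar)
            return complex(k1 - k2, k1 + k3)

        def butterfly(x0, x1, w):
            # Radix-2 decimation-in-time FFT butterfly: one complex MAC per output.
            t = cmul3(w, x1)
            return x0 + t, x0 - t

        if __name__ == "__main__":
            a, b, w = 1 + 2j, 3 - 1j, complex(0.7071, -0.7071)
            assert abs(cmul3(a, b) - a * b) < 1e-12   # sanity check against a*b
            print(butterfly(a, b, w))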

    Two Dimensional Dynamic Synapse With Programmable Spatio-Temporal Dynamics For Neuromorphic Computing

    In today's era of big data, a new computing paradigm beyond the von Neumann architecture is needed to process large-scale datasets efficiently. In response to this need, the field of neuromorphic computing has recently emerged. Inspired by the brain, neuromorphic approaches handle certain complex tasks better than even supercomputers and with much better efficiency. This is because, unlike modern computers that use digital '0' and '1' for computation, biological neural networks exhibit analog changes in synaptic connections during the decision-making and learning processes. However, the existing approaches of using digital complementary metal-oxide-semiconductor (CMOS) devices to emulate gradual/analog behaviors in neural networks are energy-intensive and unsustainable; furthermore, emerging memristor devices still face challenges such as non-linearities and large write noise. Here, we propose a novel artificial synaptic device: an electrochemical dynamic synapse based on two-dimensional (2D) materials. The synaptic weight (channel conductance) of these dynamic synapses can be tuned via both a long-term doping effect from electrochemical intercalation and a short-term doping effect from ionic gating, thereby providing programmable spatio-temporal dynamics, an essential feature for implementing spiking neural networks (SNNs). The electrical conductance of the channel is reversibly modulated by the concentration of Li ions between the layers of the 2D materials. This fundamentally different mechanism allows us to achieve good energy efficiency, a large number of non-volatile states (5000), good endurance and retention performance, and a linear and symmetric resistance response. We demonstrate essential neuronal functions such as excitatory and inhibitory synapses, short-term and long-term plasticity, paired-pulse facilitation (PPF), spike-timing-dependent plasticity (STDP), and spike-rate-dependent plasticity (SRDP), with good repeatability. Our scaling study suggests that this simple 2D synapse is scalable in terms of switching energy and speed. This work can lead to low-power hardware implementations of neural networks for neuromorphic computing.
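
    A purely phenomenological sketch of the programmable spatio-temporal dynamics described above: the conductance is modeled as a non-volatile long-term part (intercalation) plus a volatile short-term part (ionic gating) that decays between pulses, which is enough to reproduce paired-pulse facilitation. All constants are assumed for illustration and do not represent the measured device:

        import math

        class DynamicSynapse:
            # Toy model: conductance = long-term (non-volatile) + short-term
            # (volatile, exponentially decaying) components.
            def __init__(self, g_long=1.0, dg_short=0.2, tau=50e-3):
                self.g_long = g_long      # long-term component (arb. units)
                self.g_short = 0.0        # short-term component
                self.dg_short = dg_short  # increment per presynaptic pulse
                self.tau = tau            # short-term decay time constant (s)
                self.t_last = 0.0

            def pulse(self, t):
                # Apply a presynaptic pulse at time t; return the conductance seen.
                self.g_short *= math.exp(-(t - self.t_last) / self.tau)
                self.g_short += self.dg_short
                self.t_last = t
                return self.g_long + self.g_short

        if __name__ == "__main__":
            for interval in (10e-3, 200e-3):       # paired pulses, two spacings
                syn = DynamicSynapse()
                g1, g2 = syn.pulse(0.0), syn.pulse(interval)
                print(f"interval {interval * 1e3:5.0f} ms: PPF ratio = {g2 / g1:.3f}")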

    Bio-inspired learning and hardware acceleration with emerging memories

    Get PDF
    Machine learning has permeated many aspects of engineering, ranging from Internet of Things (IoT) applications to big-data analytics. While the computing resources available to implement these algorithms have become more powerful, both in terms of the complexity of problems that can be solved and the overall computing speed, the huge energy costs involved remain a significant challenge. The human brain, which has evolved over millions of years, is widely accepted as the most efficient control and cognitive processing platform. Neuro-biological studies have established that information processing in the human brain relies on impulse-like signals emitted by neurons, called action potentials. Motivated by these facts, Spiking Neural Networks (SNNs), a bio-plausible version of neural networks, have been proposed as an alternative computing paradigm in which the timing of spikes generated by artificial neurons is central to the learning and inference capabilities. This dissertation demonstrates the computational power of SNNs using conventional CMOS and emerging nanoscale hardware platforms. The first half of this dissertation presents an SNN architecture that is trained using a supervised spike-based learning algorithm for the handwritten digit classification problem. This network achieves an accuracy of 98.17% on the MNIST test data-set, with about 4X fewer parameters compared to the state-of-the-art neural networks achieving over 99% accuracy. In addition, a scheme for parallelizing and speeding up the SNN simulation on a GPU platform is presented. The second half of this dissertation presents an optimal hardware design for accelerating SNN inference and training with SRAM (Static Random Access Memory) and nanoscale non-volatile memory (NVM) crossbar arrays. Three prominent NVM devices are studied for realizing hardware accelerators for SNNs: Phase Change Memory (PCM), Spin Transfer Torque RAM (STT-RAM), and Resistive RAM (RRAM). The analysis shows that a spike-based inference engine with crossbar arrays of STT-RAM bit-cells is 2X and 5X more efficient compared to PCM and RRAM memories, respectively. Furthermore, the STT-RAM design has nearly 6X higher throughput per unit Watt per unit area than that of an equivalent SRAM-based design. A hardware accelerator with on-chip learning on an STT-RAM memory array is also designed, requiring 16 bits of floating-point synaptic weight precision to reach the baseline SNN algorithmic performance on the MNIST dataset. The complete design with the STT-RAM crossbar array achieves nearly 20X higher throughput per unit Watt per unit mm^2 than an equivalent design with SRAM memory. In summary, this work demonstrates the potential of spike-based neuromorphic computing algorithms and their efficient realization in hardware based on conventional CMOS as well as emerging technologies. The schemes presented here can be further extended to design spike-based systems that can be ubiquitously deployed for energy- and memory-constrained edge computing applications.
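
    The crossbar idea underlying these accelerators is that the synaptic weight matrix is stored as an array of conductances, so a vector of input spikes applied to the rows yields the weighted sums as column currents in a single read. The sketch below shows that mapping feeding simple integrate-and-fire neurons; the array size, threshold, and leak are assumptions for illustration, not parameters from the dissertation:

        import numpy as np

        rng = np.random.default_rng(0)
        n_in, n_out = 16, 4
        G = rng.uniform(0.0, 1.0, size=(n_in, n_out))   # conductances = weights

        v_mem = np.zeros(n_out)        # membrane potentials
        v_th, leak = 4.0, 0.95         # firing threshold and per-step leak

        for step in range(20):
            spikes_in = (rng.random(n_in) < 0.3).astype(float)  # random input spikes
            i_col = spikes_in @ G      # column currents: one crossbar read (sum of G*V)
            v_mem = v_mem * leak + i_col
            fired = v_mem >= v_th
            v_mem[fired] = 0.0         # reset neurons that spiked
            if fired.any():
                print(f"step {step:2d}: output spikes at neurons {np.flatnonzero(fired)}")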