298 research outputs found
Mismatch-induced tradeoffs and scalability of mixed-signal vision chips
This paper explores different trade-offs associated with the design of analog VLSI chips. These trade-offs are related to the necessity of keeping the analog accuracy while taking advantage of the possibility of reducing the power consumption, increasing the operation speed, and reducing the area occupation (i.e., increasing the density of processors), as fabrication technologies scale down into deep sub-micron.DICTAM IST- 1999-19007Comisión Interministerial de Ciencia y Tecnología TIC1999-082
MorphIC: A 65-nm 738k-Synapse/mm Quad-Core Binary-Weight Digital Neuromorphic Processor with Stochastic Spike-Driven Online Learning
Recent trends in the field of neural network accelerators investigate weight
quantization as a means to increase the resource- and power-efficiency of
hardware devices. As full on-chip weight storage is necessary to avoid the high
energy cost of off-chip memory accesses, memory reduction requirements for
weight storage pushed toward the use of binary weights, which were demonstrated
to have a limited accuracy reduction on many applications when
quantization-aware training techniques are used. In parallel, spiking neural
network (SNN) architectures are explored to further reduce power when
processing sparse event-based data streams, while on-chip spike-based online
learning appears as a key feature for applications constrained in power and
resources during the training phase. However, designing power- and
area-efficient spiking neural networks still requires the development of
specific techniques in order to leverage on-chip online learning on binary
weights without compromising the synapse density. In this work, we demonstrate
MorphIC, a quad-core binary-weight digital neuromorphic processor embedding a
stochastic version of the spike-driven synaptic plasticity (S-SDSP) learning
rule and a hierarchical routing fabric for large-scale chip interconnection.
The MorphIC SNN processor embeds a total of 2k leaky integrate-and-fire (LIF)
neurons and more than two million plastic synapses for an active silicon area
of 2.86mm in 65nm CMOS, achieving a high density of 738k synapses/mm.
MorphIC demonstrates an order-of-magnitude improvement in the area-accuracy
tradeoff on the MNIST classification task compared to previously-proposed SNNs,
while having no penalty in the energy-accuracy tradeoff.Comment: This document is the paper as accepted for publication in the IEEE
Transactions on Biomedical Circuits and Systems journal (2019), the
fully-edited paper is available at
https://ieeexplore.ieee.org/document/876400
Principles of Neuromorphic Photonics
In an age overrun with information, the ability to process reams of data has
become crucial. The demand for data will continue to grow as smart gadgets
multiply and become increasingly integrated into our daily lives.
Next-generation industries in artificial intelligence services and
high-performance computing are so far supported by microelectronic platforms.
These data-intensive enterprises rely on continual improvements in hardware.
Their prospects are running up against a stark reality: conventional
one-size-fits-all solutions offered by digital electronics can no longer
satisfy this need, as Moore's law (exponential hardware scaling),
interconnection density, and the von Neumann architecture reach their limits.
With its superior speed and reconfigurability, analog photonics can provide
some relief to these problems; however, complex applications of analog
photonics have remained largely unexplored due to the absence of a robust
photonic integration industry. Recently, the landscape for
commercially-manufacturable photonic chips has been changing rapidly and now
promises to achieve economies of scale previously enjoyed solely by
microelectronics.
The scientific community has set out to build bridges between the domains of
photonic device physics and neural networks, giving rise to the field of
\emph{neuromorphic photonics}. This article reviews the recent progress in
integrated neuromorphic photonics. We provide an overview of neuromorphic
computing, discuss the associated technology (microelectronic and photonic)
platforms and compare their metric performance. We discuss photonic neural
network approaches and challenges for integrated neuromorphic photonic
processors while providing an in-depth description of photonic neurons and a
candidate interconnection architecture. We conclude with a future outlook of
neuro-inspired photonic processing.Comment: 28 pages, 19 figure
Power Management and SRAM for Energy-Autonomous and Low-Power Systems
We demonstrate the two first-known, complete, self-powered millimeter-scale computer systems.
These microsystems achieve zero-net-energy operation using solar energy harvesting and
ultra-low-power circuits. A medical implant for monitoring intraocular pressure (IOP) is presented
as part of a treatment for glaucoma. The 1.5mm3 IOP monitor is easily implantable because of its
small size and measures IOP with 0.5mmHg accuracy. It wirelessly transmits data to an external
wand while consuming 4.7nJ/bit. This provides rapid feedback about treatment efficacies to decrease
physician response time and potentially prevent unnecessary vision loss. A nearly-perpetual
temperature sensor is presented that processes data using a 2.1μW near-threshold ARM°R Cortex-
M3TM μP that provides a widely-used and trusted programming platform.
Energy harvesting and power management techniques for these two microsystems enable energy-autonomous
operation. The IOP monitor harvests 80nW of solar power while consuming only
5.3nW, extending lifetime indefinitely. This allows the device to provide medical information for
extended periods of time, giving doctors time to converge upon the best glaucoma treatment. The
temperature sensor uses on-demand power delivery to improve low-load dc-dc voltage conversion
efficiency by 4.75x. It also performs linear regulation to deliver power with low noise, improved
load regulation, and tight line regulation.
Low-power high-throughput SRAM techniques help millimeter-scale microsystems meet stringent
power budgets. VDD scaling in memory decreases energy per access, but also decreases stability
margins. These margins can be improved using sizing, VTH selection, and assist circuits,
as well as new bitcell designs. Adaptive Crosshairs modulation of SRAM power supplies fixes
70% of parametric failures. Half-differential SRAM design improves stability, reducing VMIN by
72mV.
The circuit techniques for energy autonomy presented in this dissertation enable millimeter-scale
microsystems for medical implants, such as blood pressure and glucose sensors, as well as
non-medical applications, such as supply chain and infrastructure monitoring. These pervasive
sensors represent the continuation of Bell’s Law, which accurately traces the evolution of computers
as they become smaller, more numerous, and more powerful. The development of
millimeter-scale massively-deployed ubiquitous computers ensures the continued expansion and
profitability of the semiconductor industry. NanoWatt circuit techniques will allow us to meet this
next frontier in IC design.Ph.D.Electrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/86387/1/grgkchen_1.pd
Front-end receiver for miniaturised ultrasound imaging
Point of care ultrasonography has been the focus of extensive research over the past few decades. Miniaturised, wireless systems have been envisaged for new application areas, such as capsule endoscopy, implantable ultrasound and wearable ultrasound. The hardware constraints of such small-scale systems are severe, and tradeoffs between power consumption, size, data bandwidth and cost must be carefully balanced. To address these challenges, two synthetic aperture receiver architectures are proposed and compared. The architectures target highly miniaturised, low cost, B-mode ultrasound imaging systems. The first architecture utilises quadrature (I/Q) sampling to minimise the signal bandwidth and computational load. Synthetic aperture beamforming is carried out using a single-channel, pipelined protocol in order to minimise system complexity and power consumption. A digital beamformer dynamically apodises and focuses the data by interpolating and applying complex phase rotations to the I/Q samples. The beamformer is implemented on a Spartan-6 FPGA and consumes 296mW for a frame rate of 7Hz. The second architecture employs compressive sensing within the finite rate of innovation (FRI) framework to further reduce the data bandwidth. Signals are sampled below the Nyquist frequency, and then transmitted to a digital back-end processor, which reconstructs I/Q components non-linearly, and then carries out synthetic aperture beamforming. Both architectures were tested in hardware using a single-channel analogue front-end (AFE) that was designed and fabricated in AMS 0.35μm CMOS. The AFE demodulates RF ultrasound signals sequentially into I/Q components, and comprises a low-noise preamplifier, mixer, programmable gain amplifier (PGA) and lowpass filter. A variable gain low noise preamplifier topology is used to enable quasi-exponential time-gain control (TGC). The PGA enables digital selection of three gain values (15dB, 22dB and 25.5dB). The bandwidth of the lowpass filter is also selectable between 1.85MHz, 510kHz and 195kHz to allow for testing of both architectural frameworks. The entire AFE consumes 7.8 mW and occupies an area of 1.5×1.5 mm. In addition to the AFE, this thesis also presents the design of a pseudodifferential, log-domain multiplier-filter or “multer” which demodulates low-RF signals in the current-domain. This circuit targets high impedance transducers such as capacitive micromachined ultrasound transducers (CMUTs) and offers a 20dB improvement in dynamic range over the voltage-mode AFE. The bandwidth is also electronically tunable. The circuit was implemented in 0.35μm BiCMOS and was simulated in Cadence; however, no fabrication results were obtained for this circuit. B-mode images were obtained for both architectures. The quadrature SAB method yields a higher image SNR and 9% lower root mean squared error with respect to the RF-beamformed reference image than the compressive SAB method. Thus, while both architectures achieve a significant reduction in sampling rate, system complexity and area, the quadrature SAB method achieves better image quality. Future work may involve the addition of multiple receiver channels and the development of an integrated system-on-chip.Open Acces
Driving the Network-on-Chip Revolution to Remove the Interconnect Bottleneck in Nanoscale Multi-Processor Systems-on-Chip
The sustained demand for faster, more powerful chips has been met by the
availability of chip manufacturing processes allowing for the integration of increasing
numbers of computation units onto a single die. The resulting outcome,
especially in the embedded domain, has often been called SYSTEM-ON-CHIP
(SoC) or MULTI-PROCESSOR SYSTEM-ON-CHIP (MP-SoC).
MPSoC design brings to the foreground a large number of challenges, one of
the most prominent of which is the design of the chip interconnection. With a
number of on-chip blocks presently ranging in the tens, and quickly approaching
the hundreds, the novel issue of how to best provide on-chip communication
resources is clearly felt.
NETWORKS-ON-CHIPS (NoCs) are the most comprehensive and scalable
answer to this design concern. By bringing large-scale networking concepts to
the on-chip domain, they guarantee a structured answer to present and future
communication requirements. The point-to-point connection and packet switching
paradigms they involve are also of great help in minimizing wiring overhead
and physical routing issues. However, as with any technology of recent inception,
NoC design is still an evolving discipline. Several main areas of interest
require deep investigation for NoCs to become viable solutions:
• The design of the NoC architecture needs to strike the best tradeoff among
performance, features and the tight area and power constraints of the onchip
domain.
• Simulation and verification infrastructure must be put in place to explore,
validate and optimize the NoC performance.
• NoCs offer a huge design space, thanks to their extreme customizability in
terms of topology and architectural parameters. Design tools are needed
to prune this space and pick the best solutions.
• Even more so given their global, distributed nature, it is essential to evaluate
the physical implementation of NoCs to evaluate their suitability for
next-generation designs and their area and power costs.
This dissertation performs a design space exploration of network-on-chip architectures,
in order to point-out the trade-offs associated with the design of
each individual network building blocks and with the design of network topology
overall. The design space exploration is preceded by a comparative analysis
of state-of-the-art interconnect fabrics with themselves and with early networkon-
chip prototypes. The ultimate objective is to point out the key advantages
that NoC realizations provide with respect to state-of-the-art communication
infrastructures and to point out the challenges that lie ahead in order to make
this new interconnect technology come true. Among these latter, technologyrelated
challenges are emerging that call for dedicated design techniques at all
levels of the design hierarchy. In particular, leakage power dissipation, containment
of process variations and of their effects. The achievement of the above
objectives was enabled by means of a NoC simulation environment for cycleaccurate
modelling and simulation and by means of a back-end facility for the
study of NoC physical implementation effects. Overall, all the results provided
by this work have been validated on actual silicon layout
- …