6 research outputs found
In-memory Realization of In-situ Few-shot Continual Learning with a Dynamically Evolving Explicit Memory
Continually learning new classes from a few training examples without
forgetting previous old classes demands a flexible architecture with an
inevitably growing portion of storage, in which new examples and classes can be
incrementally stored and efficiently retrieved. One viable architectural
solution is to tightly couple a stationary deep neural network to a dynamically
evolving explicit memory (EM). As the centerpiece of this architecture, we
propose an EM unit that leverages energy-efficient in-memory compute (IMC)
cores during the course of continual learning operations. We demonstrate for
the first time how the EM unit can physically superpose multiple training
examples, expand to accommodate unseen classes, and perform similarity search
during inference, using operations on an IMC core based on phase-change memory
(PCM). Specifically, the physical superposition of a few encoded training
examples is realized via in-situ progressive crystallization of PCM devices.
The classification accuracy achieved on the IMC core remains within a range of
1.28%--2.5% compared to that of the state-of-the-art full-precision baseline
software model on both the CIFAR-100 and miniImageNet datasets when continually
learning 40 novel classes (from only five examples per class) on top of 60 old
classes.Comment: Accepted at the European Solid-state Devices and Circuits Conference
(ESSDERC), September 202
Recommended from our members
A study on electrical and material characteristics of hafnium oxide with silicon interface passivation on III-V substrate for future scaled CMOS technology
textThe continuous improvement in the semiconductor industry has been successfully achieved by the reducing dimensions of CMOS (complementary metal oxide semiconductor) technology. For the last four decades, the scaling down of physical thickness of SiO₂ gate dielectrics has improved the speed of output drive current by shrinking of transistor area in front-end-process of integrated circuits. A higher number of transistors on chip resulting in faster speed and lower cost can be allowable by the scaling down and these fruitful achievements have been mainly made by the thinning thickness of one key component - Gate Dielectric - at Si based MOSFET (metal-oxide-semiconductor field effect transistor) devices. So far, SiO₂ (silicon dioxide) gate dielectric having the excellent material and electrical properties such as good interface (i.e., Dit ~ 2x10¹⁰ eV⁻¹cm⁻²), low gate leakage current, higher dielectric breakdown immunity (≥10MV/cm) and excellent thermal stability at typical Si processing temperature has been popularly used as the leading gate oxide material. The next generation Si based MOSFETs will require more aggressive gate oxide scaling to meet the required specifications. Since high-k dielectrics provide the same capacitance with a thicker film, the leakage current reduction, therefore, less the standby power consumption is one of the huge advantages. Also, it is easier to fabricate during the process because the control of film thickness is still not in the critical range compared to the same leakage current characteristic of SiO₂ film. HfO₂ based gate dielectric is considered as the most promising candidate among materials being studied since it shows good characteristics with conventional Si technology and good device performance has been reported. However, it has still many problems like insufficient thermals stability on silicon such as low crystallization temperature, low k interfacial regrowth, charge trapping and so on. The integration of hafnium based high-k dielectric into CMOS technology is also limited by major issues such as degraded channel mobility and charge trapping. One approach to overcome these obstacles is using alternative substrate materials such as SiGe, GaAs, InGaAs, and InP to improve channel mobility.Electrical and Computer Engineerin
Analysis of interface-trap effects in inversion-type InGaAs/ZrO2 MOSFETs
Interface-trap effects are analyzed in inversion-type, self-aligned In0.53Ga0.47As and In0.53Ga0.47As/In0.2Ga0.8As MOSFETs with ALD ZrO2 gate dielectric. Interface-trap densities in the order of 1e13 cm-2 eV-1 are required to explain the measured subthreshold slopes. For these Dit values, donor-like interface traps are compatible with threshold-voltage values in the 0-0.15 V range as those observed in these devices. Moreover, the presence of donor-like interface traps can explain the negative threshold-voltage shift induced by the inclusion of the In0.2Ga0.8As cap layer, as the result of the influence of interface traps located at the In0.2Ga0.8As/ZrO2 interface on the inversion channel forming at the In0.53Ga0.47As/In0.2Ga0.8As interface
A 64-core mixed-signal in-memory compute chip based on phase-change memory for deep neural network inference
Analogue in-memory computing (AIMC) with resistive memory devices could reduce the latency and energy consumption of deep neural network inference tasks by directly performing computations within memory. However, to achieve end-to-end improvements in latency and energy consumption, AIMC must be combined with on-chip digital operations and on-chip communication. Here we report a multicore AIMC chip designed and fabricated in 14 nm complementary metal–oxide–semiconductor technology with backend-integrated phase-change memory. The fully integrated chip features 64 AIMC cores interconnected via an on-chip communication network. It also implements the digital activation functions and additional processing involved in individual convolutional layers and long short-term memory units. With this approach, we demonstrate near-software-equivalent inference accuracy with ResNet and long short-term memory networks, while implementing all the computations associated with the weight layers and the activation functions on the chip. For 8-bit input/output matrix–vector multiplications, in the four-phase (high-precision) or one-phase (low-precision) operational read mode, the chip can achieve a maximum throughput of 16.1 or 63.1 tera-operations per second at an energy efficiency of 2.48 or 9.76 tera-operations per second per watt, respectively