94 research outputs found

    COMPUTE-IN-MEMORY WITH EMERGING NON-VOLATILE MEMORIES FOR ACCELERATING DEEP NEURAL NETWORKS

    The objective of this research is to accelerate deep neural networks (DNNs) with a compute-in-memory (CIM) architecture based on emerging non-volatile memories (eNVMs). The research first focuses on inference acceleration and proposes a resistive random access memory (RRAM) based CIM architecture. Two generations of RRAM test chips, which monolithically integrate the RRAM memory array and CMOS peripheral circuits, are designed and fabricated using the Winbond 90 nm and TSMC 40 nm commercial embedded RRAM processes, respectively. The first-generation test chip, named XNOR-RRAM, is dedicated to binary neural networks (BNNs); the second generation, named Flex-RRAM, features 1-bit to 8-bit run-time configurable precision and leverages the input sparsity of the DNN model to improve throughput and energy efficiency. However, the non-ideal characteristics of eNVM devices, especially when they are used as multi-level analog synaptic weights, may incur notable accuracy degradation in both training and inference. This research develops a PyTorch-based framework that incorporates the device characteristics into the DNN model to evaluate the impact of eNVM non-idealities on training and inference accuracy. The results suggest that it is challenging to use eNVMs directly for in-situ training, and that resistance drift remains a critical challenge for maintaining high inference accuracy. Furthermore, the asymmetric conductance tuning behavior of typical eNVMs is found to be the most critical non-ideality preventing the model from achieving software-equivalent training accuracy. To overcome it, this research proposes a novel 2-transistor-1-FeFET (ferroelectric field-effect transistor) synaptic weight cell that exploits hybrid precision for in-situ training and inference and achieves near-software classification accuracy on the MNIST and CIFAR-10 datasets. Ph.D.
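The effect of limited weight precision and device variation on a CIM dot product can be illustrated with a minimal sketch. This is not the PyTorch framework described in the abstract. The function names, the uniform quantizer, and the multiplicative Gaussian variation model are all assumptions chosen for illustration.

```python
import random

def quantize(w, bits, w_max=1.0):
    """Snap a weight in [-w_max, w_max] to the nearest of 2**bits evenly spaced levels."""
    steps = 2 ** bits - 1                  # number of intervals between levels
    w = max(-w_max, min(w_max, w))         # clip to the representable range
    step = 2 * w_max / steps
    return round((w + w_max) / step) * step - w_max

def crossbar_mvm(weights, inputs, bits=4, sigma=0.0, rng=None):
    """Matrix-vector product using quantized weights with device variation.

    sigma is the relative standard deviation of a hypothetical
    multiplicative Gaussian conductance variation (an assumption).
    """
    rng = rng or random.Random(0)
    outputs = []
    for row in weights:
        acc = 0.0
        for w, x in zip(row, inputs):
            if x == 0:                     # zero inputs contribute nothing, so skip them
                continue
            g = quantize(w, bits) * (1.0 + rng.gauss(0.0, sigma))
            acc += g * x
        outputs.append(acc)
    return outputs

# Ideal (noise-free) 1-bit weights versus the same weights with 10% variation.
ideal = crossbar_mvm([[0.5, -0.5, 0.5]], [1, 1, 1], bits=1, sigma=0.0)
noisy = crossbar_mvm([[0.5, -0.5, 0.5]], [1, 1, 1], bits=1, sigma=0.1)
```

With `bits=1` every weight collapses to -1 or +1, which loosely mirrors the binary case. Raising `bits` toward 8 narrows the quantization step, while `sigma` controls how far the accumulated sum can drift from the ideal value.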

    Design of Logic-Compatible Embedded Flash Memories for Moderate Density On-Chip Non-Volatile Memory Applications

    University of Minnesota Ph.D. dissertation. December 2013. Major: Electrical Engineering. Advisor: Chris H. Kim. 1 computer file (PDF); xx, 129 pages. An on-chip embedded NVM (eNVM) enables a zero-standby-power system-on-a-chip with a smaller form factor, faster access speed, lower access power, and higher security than an off-chip NVM. Unlike high-density eNVM technologies such as dual-poly eflash, FeRAM, STT-MRAM, and RRAM, which typically require process overhead beyond the standard logic process, moderate-density eNVM technologies such as e-fuse, anti-fuse, and single-poly embedded flash (eflash) can be fabricated in a standard logic process with no process overhead. Among them, single-poly eflash is a unique multiple-time programmable moderate-density eNVM, and it is expected to play a key role in mitigating the variability and reliability issues of future VLSI technologies. However, challenges such as high-voltage disturbance, the implementation of a logic-compatible High Voltage Switch (HVS), and a limited sensing margin must be solved before it can be implemented with standard I/O devices. This thesis focuses on alleviating these challenges with three single-poly eflash designs proposed in a generic logic process for moderate-density eNVM applications. First, the proposed 5T eflash features a WL-by-WL accessible architecture with no disturbance of the unselected WL cells, an overstress-free multi-story HVS that expands the cell sensing margin, and a selective WL refresh scheme for higher cell endurance. The most favorable eflash cell configuration is also studied, with performance, endurance, retention, and disturbance characteristics all considered. Second, the proposed 6T eflash features bit-by-bit re-write capability for higher overall cell endurance without disturbing the unselected WL cells.
The logic-compatible on-chip charge pump that provides the appropriate high voltages for the proposed eflash operations is also discussed. Finally, the proposed 10T eflash features a multi-configurable HVS that does not require boosted read supplies and a differential cell architecture with improved retention time. All of the proposed eflash memories were implemented in a 65 nm standard logic process, and test chip measurements confirmed the functionality of the proposed designs with a reasonable retention margin, showing that the proposed eflash memories are competitive with other moderate-density eNVM candidates.

    Gestión de jerarquías de memoria híbridas a nivel de sistema (System-level management of hybrid memory hierarchies)

    Unpublished thesis of the Universidad Complutense de Madrid, Facultad de Informática, Departamento de Arquitectura de Computadoras y Automática, and of KU Leuven, Arenberg Doctoral School, Faculty of Engineering Science, defended on 11 May 2017. In electronics and computer science, the term ‘memory’ generally refers to devices that store information used in appliances ranging from PCs to hand-held devices and smart appliances. Primary (main) memory refers to storage that operates at high speed (i.e. RAM). It is usually implemented as addressable semiconductor memory, i.e. integrated circuits built from silicon-based transistors, which serve as primary memory but also have other uses in computers and other digital electronic devices. Secondary (auxiliary) memory, by comparison, provides program and data storage that is slower to access but offers larger capacity. Examples include external hard drives, portable flash drives, CDs, and DVDs. These devices and media must be plugged in or inserted into a computer before the system can access them. Because secondary storage is not always connected to the computer, it is commonly used for backing up data. The term ‘storage’ is often used for secondary memory. Secondary memory stores large amounts of data at a lower cost per byte than primary memory, which makes it about two orders of magnitude less expensive. There are two main types of semiconductor memory: volatile and non-volatile. Examples of non-volatile memory are ‘Flash’ memory (sometimes used as secondary and sometimes as primary computer memory) and ROM/PROM/EPROM/EEPROM memory (used for firmware such as boot programs). Examples of volatile memory are primary memory (typically dynamic RAM, DRAM) and fast CPU cache memory (typically static RAM, SRAM, which is fast but energy-consuming and offers lower memory capacity per unit area than DRAM).
Non-volatile memory technologies in Si-based electronics date back to the 1990s. Flash memory is widely used in consumer electronic products such as cellphones and music players, and NAND Flash-based solid-state disks (SSDs) are increasingly displacing hard disk drives as the primary storage device in laptops, desktops, and even data centers. Flash memories are approaching their integration limit, and many new types of memory have been proposed to replace them. The rapid increase of leakage currents in silicon CMOS transistors with scaling poses a big challenge for the integration of SRAM memories, which are also susceptible to read/write failures in low-power schemes. As a result, over the past decade extensive time, resources, and effort have gone into developing emerging memory technologies such as Resistive RAM (ReRAM/RRAM), STT-MRAM, Domain Wall Memory, and Phase Change Memory (PRAM). Emerging non-volatile memory technologies promise to store more data at lower cost than the expensive-to-build silicon chips used by popular consumer gadgets, including digital cameras, cell phones, and portable music players. These new memory technologies combine the speed of static random-access memory (SRAM), the density of dynamic random-access memory (DRAM), and the non-volatility of Flash memory, which makes them very attractive for future memory hierarchies. Research on these non-volatile memory (NVM) technologies has matured over the last decade, and NVMs are now being explored thoroughly as viable replacements for conventional SRAM-based memories, even at the higher levels of the memory hierarchy.
Many other new classes of emerging memory technologies, such as transparent and plastic, three-dimensional (3-D), and quantum dot memory technologies, have also gained tremendous popularity in recent years...
In computing, the term ‘memory’ generally refers to devices used to store information that will later be used by various devices, from personal computers (PCs) to mobile phones, smart devices, etc. The system's main memory stores the data and instructions of running processes, so it must operate at high speed (for example, DRAM). Main memory is usually implemented with addressable semiconductor memories, with DRAM and SRAM as the main examples. Auxiliary or secondary memory, on the other hand, provides storage (for files, for example); it is slower but offers greater capacity. Typical examples of secondary memory are hard disks, portable flash drives, CDs, and DVDs. Because these devices do not need to be permanently connected to the computer, they are widely used for storing backups. Secondary memory stores a large amount of data at a lower cost per bit than main memory and is usually two orders of magnitude cheaper than primary memory. There are two types of semiconductor memory: volatile and non-volatile. Examples of non-volatile memory are Flash memory (sometimes used as secondary memory and sometimes as main memory) and ROM/PROM/EPROM/EEPROM memory (used for firmware such as boot programs). Examples of volatile memory are DRAM (dynamic RAM), currently the predominant choice for implementing main memory, and SRAM (static RAM), which is faster and more expensive and is used for the various cache levels.
Non-volatile memory technologies based on silicon electronics date back to the 1990s. A variant of charge-storage memory known as Flash memory is used worldwide in consumer electronics such as mobile phones and music players, while NAND Flash solid-state disks (SSDs) are progressively displacing hard disk drives as the main storage unit in laptops, desktops, and even data centers. Several factors currently threaten the predominance of charge-based (capacitive) semiconductor memories. On the one hand, Flash memory is reaching its integration limit, which compromises its scaling in the medium term. On the other hand, the sharp increase in leakage currents in current silicon CMOS transistors poses an enormous challenge for the integration of SRAM memories. These memories are also increasingly susceptible to read/write failures in low-power designs. As a result of these problems, which worsen with each new technology generation, efforts to develop new technologies that replace or at least complement current ones have intensified in recent years. Ferroelectric field-effect transistors (FeFETs) are considered one of the most promising alternatives to replace both Flash (because of their higher density) and DRAM (because of their higher speed), but they are still at a very early stage of development. Other, somewhat more mature technologies exist in the field of resistive RAM memories, notably ReRAM (or RRAM), STT-RAM, Domain Wall Memory, and Phase Change Memory (PRAM)... Depto. de Arquitectura de Computadores y Automática, Fac. de Informática.

    Politecast - a new communication primitive for wireless sensor networks

    Wireless sensor networks have the potential to become a huge market. Ericsson predicts 50 billion devices connected to the Internet by the year 2020. Before that can happen, the devices must be able to withstand years of use without a change of power source, since replacing it would be too costly. These devices are typically small, inexpensive, and severely resource constrained. Communication is mainly wireless, and the wireless transceiver is typically the most power-hungry component on the node. Reducing radio usage is therefore key to a long lifetime. In this thesis I identify four problems with the conventional broadcast primitive. Based on those problems, I implement a new communication primitive called Politecast. I evaluate Politecast in three case studies: the Steal the Light toy example, a Neighbor Discovery simulation, and a full two-month deployment of the Lega system in the art gallery Liljevalchs. The evaluations show that Politecast can massively reduce the amount of traffic transmitted, thereby reducing congestion and increasing application performance. It also prolongs node lifetime by reducing the overhearing that comes from waking up neighbors.

    Emerging Run-Time Reconfigurable FPGA and CAD Tools

    A field-programmable gate array (FPGA) is a post-fabrication reconfigurable device used to accelerate domain-specific computing systems. It offers high operation speed and low power consumption. However, the design flexibility and performance of FPGAs are severely constrained by costly on-chip memories, e.g. static random access memory (SRAM) and Flash memory. The objective of my dissertation is to explore and enable the use of emerging resistive random access memory (ReRAM) in FPGA design. ReRAM technology features high storage density, low access power consumption, and CMOS compatibility, making it a promising candidate for FPGA implementation. In particular, ReRAM offers fast access and non-volatility, enabling on-chip storage of and access to configuration data. In this dissertation, I first propose a novel three-dimensional stacking scheme, namely high-density interleaved memory (HIM). The structure improves the density of ReRAM while effectively reducing the signal interference induced by sneak paths in crossbar arrays. To further enhance access speed and design reliability, a fast sensing circuit is also presented, which includes a new sense amplifier scheme and reference cell configuration. The proposed ReRAM FPGA uses an architecture similar to that of conventional SRAM-based FPGAs but applies ReRAM technology in all component designs. First, HIM is used to implement the look-up tables (LUTs) and block random access memories (BRAMs) that carry out logic functions. Second, a 2R1T (two ReRAM cells and one transistor) non-volatile switch design is used to construct the connection blocks (CBs) and switch blocks (SBs) that route signals. Furthermore, a unified BRAM (uBRAM) based on the current BRAM architecture is introduced, offering both configuration and temporary data storage.
The uBRAM provides extremely high density and effectively enlarges the FPGA capacity, potentially storing multiple configuration contexts. The fast configuration scheme from uBRAM to logic and routing components also makes fast run-time partial reconfiguration (PR) much easier, improving the flexibility and performance of the entire FPGA system. Finally, modern place and route tools are designed for a homogeneous FPGA fabric. The PR feature, however, requires support for heterogeneous logic modules in order to differentiate PR modules from static ones and thereby maintain signal integrity. Existing approaches still rely on designers' manual effort, which significantly prolongs design time and lowers design efficiency. In this dissertation, I integrate PR support into VPR, an academic place and route tool, by introducing a B*-tree modular placer (BMP) and a PR-aware router. As a result, users are able to explore new architectures or map PR applications to a variety of FPGAs. More importantly, this enhanced feature can also support fast design automation, e.g. mapping IP cores, loading pre-synthesized logic modules, etc.
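The role of the look-up table as a small configuration memory can be sketched as follows. This models only the logical behavior of a k-input LUT: the configuration bits form a truth table, and reconfiguration amounts to rewriting those bits. It says nothing about the HIM circuit or ReRAM storage itself, and the function names are hypothetical.

```python
def make_lut(func, k):
    """Return the 2**k configuration bits (truth table) of a k-input Boolean function."""
    table = []
    for addr in range(2 ** k):
        bits = [(addr >> i) & 1 for i in range(k)]   # bit i of addr drives input i
        table.append(func(*bits) & 1)
    return table

def lut_read(table, inputs):
    """Evaluate the LUT: the input bits form the address of the stored output bit."""
    addr = 0
    for i, bit in enumerate(inputs):
        addr |= (bit & 1) << i
    return table[addr]

# Configure a 3-input LUT as XOR, then "reconfigure" the same structure as AND.
xor3 = make_lut(lambda a, b, c: a ^ b ^ c, 3)
and3 = make_lut(lambda a, b, c: a & b & c, 3)
```

Because the function lives entirely in the stored bits, swapping `xor3` for `and3` changes the logic without touching the read path. This is the property that makes fast loading of configuration data relevant to run-time partial reconfiguration.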

    Release and Verification of an Operating System for Testing e-Flash on Microcontrollers for Automotive Applications based on Multicore Architecture

    Cars contain an increasing number of electronic devices for active driving assistance, safety controls, energy efficiency, passenger comfort, and entertainment. Safety is the keyword, and it requires highly reliable electronic components. The Infineon microcontroller division works to improve the reliability and guarantee the quality of microcontroller flash memories. The goal of the thesis is to verify the operating system used to test the microcontrollers' flash memories.

    Development of prototype components for the Silicon Tracking System of the CBM experiment at FAIR

    The CBM experiment at the future accelerator facility FAIR will investigate the properties of nuclear matter under extreme conditions. 
The experimental programme differs from the heavy-ion experiments at RHIC (BNL) and LHC (CERN), which create nuclear matter at high temperatures. In contrast, the QCD phase diagram will be studied with high precision in the weakly explored region of the highest net baryon densities and moderate temperatures. For this, collisions of different heavy-ion beams with nuclear targets will be measured at energies of 10–45 GeV/nucleon. The physics programme of the CBM experiment includes the measurement of both rare probes and bulk observables that originate from various phases of a nucleus-nucleus collision. In particular, the decay of particles with charm quarks can be registered by reconstructing the decay vertex, which is detached from the primary interaction point by several hundred micrometers (e.g., decay length cτ = 123 µm for the D0 meson). For this, precise tracking and full event reconstruction with up to 600 charged-particle tracks per event within the acceptance are required. Other rare probes require operation at interaction rates of up to 10 MHz. The detector system that performs the tracking has to provide a high position resolution on the order of 10 µm, operate at high rates, and have a radiation-tolerant design with a low material budget. The Silicon Tracking System (STS) is being designed for charged-particle tracking in a magnetic field. The system consists of eight tracking stations located in the aperture of a dipole magnet with a 1 T field. For tracks with momentum above 1 GeV, the momentum resolution of such a system is expected to be about 1%. To fulfill this task, thorough optimization of the detector design is required. In particular, a minimal material budget has to be achieved. Production of a detector module requires research and development activities with respect to the module components and their integration. 
A detector module is a basic functional unit that includes a sensor, an analogue microcable, and front-end electronics mounted on a support structure. The objective of the thesis is to perform quality assurance tests of the prototype module components in order to validate the concept of the detector module and to demonstrate its operation using radioactive sources and particle beams. Double-sided silicon microstrip detectors have been chosen as the sensor technology for the STS because they combine good spatial resolution, two-dimensional coordinate measurement within a low material budget (0.3% X0), high readout speed, and sufficient radiation tolerance. Several generations of double-sided silicon microstrip sensors have been manufactured in order to explore radiation-hard design features and the concept of a large-area sensor compatible with the ladder-type structure of the detector module. In particular, sensors with a double metal layer on both sides and an active area of 62 × 62 mm² have been produced. Electrical characterization of the sensors has been performed in order to establish their overall operability and to extract the device parameters. Current-voltage and capacitance-voltage characteristics and interstrip parameters have been measured. The sensors have been read out using self-triggering front-end electronics. A front-end board has been developed based on the n-XYTER readout chip, which has a data-driven architecture and is capable of operating at a 32 MHz readout rate. The front-end board includes an external analog-to-digital converter (ADC). The ADC has been calibrated using both a ²⁴¹Am X-ray source and an external pulse generator. Threshold calibration and an investigation of the temperature dependence of the chip parameters have been carried out. Low-mass support structures have been developed using carbon fibre, which has the rigidity to hold the detector modules while introducing minimal Coulomb scattering of the particle tracks. 
Analogue microcables have been produced with aluminium traces on a polyimide substrate, thus combining good electrical connection with a low material budget. The microcable structure includes several layers optimized for low trace capacitance and thus low-noise performance. A demonstrator tracking telescope has been constructed and operated in several beam tests, including a 2.5 GeV proton beam at the COSY synchrotron (Jülich). Three tracking stations have been complemented with several beam hodoscopes. Analysis of the beam data has yielded information on the analogue and timing response and on the beam profile. Tracking and alignment information has been obtained. Beam stability has been evaluated using specially developed monitoring tools. As a result of these studies, the performance of the module components has been evaluated and the requirements for the detector module have been formulated. Practical suggestions have been made with respect to the structure of the detector module, whereas a precise definition of the final detector module design was outside the scope of this thesis.
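The link between the quoted decay length cτ = 123 µm and a vertex displacement of several hundred micrometers can be made concrete with the standard relation L = βγ·cτ = (p/m)·cτ for the mean decay length in the laboratory frame. The sketch below is illustrative only. The D0 mass of about 1.865 GeV/c² is an assumed value that does not appear in the abstract, and the function name is hypothetical.

```python
D0_MASS_GEV = 1.86484   # assumed D0 mass in GeV/c^2 (not stated in the abstract)
D0_CTAU_UM = 123.0      # c*tau of the D0 meson in micrometers, as quoted in the abstract

def mean_decay_length_um(p_gev, mass_gev=D0_MASS_GEV, ctau_um=D0_CTAU_UM):
    """Mean lab-frame decay length L = (p / m) * c*tau, in micrometers.

    p_gev is the particle momentum in GeV/c; beta*gamma equals p / m.
    """
    if p_gev < 0 or mass_gev <= 0:
        raise ValueError("momentum must be non-negative and mass positive")
    return (p_gev / mass_gev) * ctau_um

# Mean displacement for a few example momenta (values chosen for illustration).
for p in (2.0, 4.0, 6.0):
    print(f"p = {p} GeV/c -> L = {mean_decay_length_um(p):.0f} um")
```

Under these assumptions a D0 with a momentum of a few GeV/c travels a few hundred micrometers on average, which is why a position resolution on the order of 10 µm is needed to separate the decay vertex from the primary interaction point.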

    Reliability and Security of Compute-In-Memory Based Deep Neural Network Accelerators

    Compute-In-Memory (CIM), which relies on mixed-signal computation, is a promising solution for accelerating deep neural networks (DNNs) on edge devices. However, because it behaves differently from a purely digital system, it requires more cross-layer design, from the algorithm level down to the hardware implementation. On the one hand, the mixed-signal computations of CIM face non-negligible variations, which could hamper software performance. On the other hand, CIM accelerators have potential software and hardware security vulnerabilities. This research aims to solve the reliability and security issues in CIM design for accelerating DNN algorithms, since these issues prevent the real-life use of CIM-based accelerators. Several non-ideal effects in CIM accelerators that could cause reliability issues are explored and addressed with software-hardware co-design methods. In addition, different security vulnerabilities of SRAM-based and eNVM-based CIM inference engines are defined, and corresponding countermeasures are proposed. Ph.D.