630 research outputs found

    Mixed-precision deep learning based on computational memory

    Full text link
    Deep neural networks (DNNs) have revolutionized the field of artificial intelligence and have achieved unprecedented success in cognitive tasks such as image and speech recognition. Training of large DNNs, however, is computationally intensive and this has motivated the search for novel computing architectures targeting this application. A computational memory unit with nanoscale resistive memory devices organized in crossbar arrays could store the synaptic weights in their conductance states and perform the expensive weighted summations in place in a non-von Neumann manner. However, updating the conductance states in a reliable manner during the weight update process is a fundamental challenge that limits the training accuracy of such an implementation. Here, we propose a mixed-precision architecture that combines a computational memory unit performing the weighted summations and imprecise conductance updates with a digital processing unit that accumulates the weight updates in high precision. A combined hardware/software training experiment of a multilayer perceptron based on the proposed architecture using a phase-change memory (PCM) array achieves 97.73% test accuracy on the task of classifying handwritten digits (based on the MNIST dataset), within 0.6% of the software baseline. The architecture is further evaluated using accurate behavioral models of PCM on a wide class of networks, namely convolutional neural networks, long-short-term-memory networks, and generative-adversarial networks. Accuracies comparable to those of floating-point implementations are achieved without being constrained by the non-idealities associated with the PCM devices. A system-level study demonstrates 173x improvement in energy efficiency of the architecture when used for training a multilayer perceptron compared with a dedicated fully digital 32-bit implementation

    Combined HW/SW Drift and Variability Mitigation for PCM-based Analog In-memory Computing for Neural Network Applications

    Get PDF
    Matrix-Vector Multiplications (MVMs) represent a heavy workload for both training and inference in Deep Neural Networks (DNNs) applications. Analog In-memory Computing (AIMC) systems based on Phase Change Memory (PCM) has been shown to be a valid competitor to enhance the energy efficiency of DNN accelerators. Although DNNs are quite resilient to computation inaccuracies, PCM non-idealities could strongly affect MVM operations precision, and thus the accuracy of DNNs. In this paper, a combined hardware and software solution to mitigate the impact of PCM non-idealities is presented. The drift of PCM cells conductance is compensated at the circuit level through the introduction of a conductance ratio at the core of the MVM computation. A model of the behaviour of PCM cells is employed to develop a device-aware training for DNNs and the accuracy is estimated in a CIFAR-10 classification task. This work is supported by a PCM-based AIMC prototype, designed in a 90-nm STMicroelectronics technology, and conceived to perform Multiply-and-Accumulate (MAC) computations, which are the kernel of MVMs. Results show that the MAC computation accuracy is around 95% even under the effect of cells drift. The use of a device-aware DNN training makes the networks less sensitive to weight variability, with a 15% increase in classification accuracy over a conventionally-trained Lenet-5 DNN, and a 36% gain when drift compensation is applied

    Real-time data processing in the ALICE High Level Trigger at the LHC

    Get PDF
    At the Large Hadron Collider at CERN in Geneva, Switzerland, atomic nuclei are collided at ultra-relativistic energies. Many final-state particles are produced in each collision and their properties are measured by the ALICE detector. The detector signals induced by the produced particles are digitized leading to data rates that are in excess of 48 GB/s. The ALICE High Level Trigger (HLT) system pioneered the use of FPGA- and GPU-based algorithms to reconstruct charged-particle trajectories and reduce the data size in real time. The results of the reconstruction of the collision events, available online, are used for high level data quality and detector-performance monitoring and real-time time-dependent detector calibration. The online data compression techniques developed and used in the ALICE HLT have more than quadrupled the amount of data that can be stored for offline event processing

    Thermal Heating in ReRAM Crossbar Arrays: Challenges and Solutions

    Full text link
    Increasing popularity of deep-learning-powered applications raises the issue of vulnerability of neural networks to adversarial attacks. In other words, hardly perceptible changes in input data lead to the output error in neural network hindering their utilization in applications that involve decisions with security risks. A number of previous works have already thoroughly evaluated the most commonly used configuration - Convolutional Neural Networks (CNNs) against different types of adversarial attacks. Moreover, recent works demonstrated transferability of the some adversarial examples across different neural network models. This paper studied robustness of the new emerging models such as SpinalNet-based neural networks and Compact Convolutional Transformers (CCT) on image classification problem of CIFAR-10 dataset. Each architecture was tested against four White-box attacks and three Black-box attacks. Unlike VGG and SpinalNet models, attention-based CCT configuration demonstrated large span between strong robustness and vulnerability to adversarial examples. Eventually, the study of transferability between VGG, VGG-inspired SpinalNet and pretrained CCT 7/3x1 models was conducted. It was shown that despite high effectiveness of the attack on the certain individual model, this does not guarantee the transferability to other models.Comment: 18 page

    Development of the readout electronics for the high luminosity upgrade of the CMS outer strip tracker

    Get PDF
    The High-luminosity upgrade of the LHC will deliver the dramatic increase in luminosity required for precision measurements and to probe Beyond the Standard Model theories. At the same time, it will present unprecedented challenges in terms of pileup and radiation degradation. The CMS experiment is set for an extensive upgrade campaign, which includes the replacement of the current Tracker with another all-silicon detector with improved performance and reduced mass. One of the most ambitious aspects of the future Tracker will be the ability to identify high transverse momentum track candidates at every bunch crossing and with very low latency, in order to include tracking information at the L1 hardware trigger stage, a critical and effective step to achieve triggers with high purity and low threshold. This thesis presents the development and the testing of the CMS Binary Chip 2 (CBC2), a prototype Application Specific Integrated Circuit (ASIC) for the binary front-end readout of silicon strip detectors modules in the Outer Tracker, which also integrates the logic necessary to identify high transverse momentum candidates by correlating hits from two silicon strip detectors, separated by a few millimetres. The design exploits the relation between the transverse momentum and the curvature in the trajectory of charged particles subject to the large magnetic field of CMS. The logic which follows the analogue amplification and binary conversion rejects clusters wider than a programmable maximum number of adjacent strips, compensates for the geometrical offset in the alignment of the module, and correlates the hits between the two sensor layers. Data are stored in a memory buffer before being transferred to an additional buffer stage and being serially read-out upon receipt of a Level 1 trigger. The CBC2 has been subject to extensive testing since its production in January 2013: this work reports the results of electrical characterization, of the total ionizing dose irradiation tests, and the performance of a prototype module instrumented with CBC2 in realistic conditions in a beam test. The latter is the first experimental demonstration of the Pt-selection principle central to the future of CMS. Several total-ionizing-dose tests highlighted no functional issue, but observed significant excess static current for doses <1 Mrad. The source of the excess was traced to static leakage current in the memory pipeline, and is believed to be a consequence of the high instantaneous dose delivered by the x-ray setup. Nevertheless, a new SRAM layout aimed at removing the leakage path was proposed for the CBC3. The results of single event upset testing of the chip are also reported, two of the three distinct memory circuits used in the chip were proven to meet the expected robustness, while the third will be replaced in the next iteration of the chip. Finally, the next version of the ASIC is presented, highlighting the additional features of the final prototype, such as half-strip resolution, additional trigger logic functionality, longer trigger latency and higher rate, and fully synchronous stub readout.Open Acces

    High-fidelity, near-field microwave gates in a cryogenic surface trap

    Get PDF
    We present a novel dynamical decoupling strategy for near field microwave gradient driven, Mølmer-Sørensen style, two-ion quantum logic gates, which suppresses errors from both fluctuations in the qubit frequency and imperfection in the decoupling drive itself. Using a microwave-integrated surface-trap which is operated cryogenically at 25 K and a magnetically insensitive 43-Ca+ qubit at 288 G, we demonstrate a 331 us two-ion quantum logic gates, with 4.9(11)e-3 logic error probability. This is below the 1% error threshold required for quantum error correction and represents a ~10x gate time reduction when compared to previously demonstrated near field gradient driven microwave gates below the 1% error probability threshold. Additionally, two faster gates were demonstrated without the use of dynamical decoupling. Respectively, these two gates had gate operation durations of 216.8 us & 153.8 us and measured gate error probabilities of 8.5(20)e-3 & 9.8(21)e-3. Further, we develop a method for rapid calculation of ion transport operations. We successfully demonstrate ion transport as well as crystal splitting and merging operations within two different ion traps using the waveforms calculated by this ion transport toolbox

    Full stack development toward a trapped ion logical qubit

    Get PDF
    Quantum error correction is a key step toward the construction of a large-scale quantum computer, by preventing small infidelities in quantum gates from accumulating over the course of an algorithm. Detecting and correcting errors is achieved by using multiple physical qubits to form a smaller number of robust logical qubits. The physical implementation of a logical qubit requires multiple qubits, on which high fidelity gates can be performed. The project aims to realize a logical qubit based on ions confined on a microfabricated surface trap. Each physical qubit will be a microwave dressed state qubit based on 171Yb+ ions. Gates are intended to be realized through RF and microwave radiation in combination with magnetic field gradients. The project vertically integrates software down to hardware compilation layers in order to deliver, in the near future, a fully functional small device demonstrator. This thesis presents novel results on multiple layers of a full stack quantum computer model. On the hardware level a robust quantum gate is studied and ion displacement over the X-junction geometry is demonstrated. The experimental organization is optimized through automation and compressed waveform data transmission. A new quantum assembly language purely dedicated to trapped ion quantum computers is introduced. The demonstrator is aimed at testing implementation of quantum error correction codes while preparing for larger scale iterations.Open Acces
    • …
    corecore