
    Suns-$V_{OC}$ characteristics of high performance kesterite solar cells

    Low open-circuit voltage ($V_{OC}$) has been recognized as the number one problem in the current generation of Cu$_2$ZnSn(Se,S)$_4$ (CZTSSe) solar cells. We report high light intensity and low temperature Suns-$V_{OC}$ measurements in high performance CZTSSe devices. The Suns-$V_{OC}$ curves exhibit bending at high light intensity, which points to several prospective $V_{OC}$-limiting mechanisms that could impact the $V_{OC}$, even at 1 sun for lower performing samples. These $V_{OC}$-limiting mechanisms include low bulk conductivity (because of low hole density or low mobility), bulk or interface defects including tail states, and a non-ohmic back contact for low carrier density CZTSSe. The non-ohmic back contact problem can be detected by Suns-$V_{OC}$ measurements with different monochromatic illumination. These limiting factors may also contribute to an artificially lower $J_{SC}$-$V_{OC}$ diode ideality factor. Comment: 9 pages, 9 figures, 1 supplementary material
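
    For orientation, the standard one-diode relation behind a Suns-$V_{OC}$ measurement (a textbook expression, not a result of the paper) shows why the curve is expected to be linear in the logarithm of the light intensity and why bending signals additional loss mechanisms. At open circuit no current flows, so series resistance drops out and

```latex
% Textbook Suns-V_OC relation (assumed background, not taken from the paper):
% X is the illumination intensity in suns, n the diode ideality factor,
% and J_0 the saturation current density.
\[
  V_{OC}(X) \;\approx\; \frac{n\,k_B T}{q}\,
  \ln\!\left(\frac{X\,J_{SC}^{\,1\,\mathrm{sun}}}{J_0}\right)
\]
```

    A plot of $V_{OC}$ versus $\ln X$ is therefore a straight line of slope $n k_B T/q$; deviations from that line at high intensity or low temperature are what expose the bulk-conductivity, defect, and back-contact limitations listed above.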

    Efficient ConvNets for Analog Arrays

    Analog arrays are a promising upcoming hardware technology with the potential to drastically speed up deep learning. Their main advantage is that they compute matrix-vector products in constant time, irrespective of the size of the matrix. However, early convolution layers in ConvNets map very unfavorably onto analog arrays, because kernel matrices are typically small and the constant-time operation must be sequentially iterated a large number of times, reducing the speed-up advantage for ConvNets. Here, we propose to replicate the kernel matrix of a convolution layer on distinct analog arrays and to randomly divide parts of the compute among them, so that multiple kernel matrices are trained in parallel. With this modification, analog arrays execute ConvNets with an acceleration factor that is proportional to the number of kernel matrices used per layer (16-128 tested here). Despite having more free parameters, we show analytically and in numerical experiments that this convolution architecture is self-regularizing and implicitly learns similar filters across arrays. We also report superior performance on a number of datasets and increased robustness to adversarial attacks. Our investigation suggests revising the notion that mixed analog-digital hardware is not suitable for ConvNets.
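
    As an illustration of the replication idea, here is a minimal PyTorch sketch (my own, not the authors' code, with the replica count and initialization chosen arbitrarily): the kernel tensor is copied onto several simulated arrays and every output position is randomly assigned to one replica, so the replicas are trained in parallel on disjoint parts of the compute.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReplicatedConv2d(nn.Module):
    """Convolution whose kernel matrix is replicated across n_replicas 'arrays'."""

    def __init__(self, in_ch, out_ch, kernel_size, n_replicas=4):
        super().__init__()
        # One kernel tensor per simulated analog array.
        self.weights = nn.Parameter(
            0.05 * torch.randn(n_replicas, out_ch, in_ch, kernel_size, kernel_size)
        )
        self.padding = kernel_size // 2

    def forward(self, x):
        # Convolve with every replica: (n_replicas, batch, out_ch, H, W).
        outs = torch.stack([F.conv2d(x, w, padding=self.padding) for w in self.weights])
        n, b, c, h, w = outs.shape
        # Randomly divide the compute: each output position uses one replica,
        # so gradients flow back only to the replica that produced it.
        choice = torch.randint(n, (b, 1, h, w), device=x.device)
        mask = F.one_hot(choice, n).permute(4, 0, 1, 2, 3).to(outs.dtype)
        return (outs * mask).sum(dim=0)

# Example: a batch of RGB images through a replicated 3x3 convolution.
layer = ReplicatedConv2d(3, 8, 3, n_replicas=4)
print(layer(torch.randn(2, 3, 16, 16)).shape)  # torch.Size([2, 8, 16, 16])
```

    At test time the replicas could simply be averaged; the paper's observation is that the random assignment acts as a regularizer, so the replicas end up learning similar filters.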

    Training LSTM Networks with Resistive Cross-Point Devices

    In our previous work we have shown that resistive cross-point devices, so-called Resistive Processing Unit (RPU) devices, can provide significant power and speed benefits when training deep fully connected networks as well as convolutional neural networks. In this work, we further extend the RPU concept to training recurrent neural networks (RNNs), namely long short-term memory (LSTM) networks. We show that the mapping of recurrent layers is very similar to the mapping of fully connected layers, and therefore the RPU concept can potentially provide large acceleration factors for RNNs as well. In addition, we study the effect of various device imperfections and system parameters on training performance. Symmetry of updates becomes even more crucial for RNNs; already a few percent asymmetry results in an increase in the test error compared to the ideal case trained with floating point numbers. Furthermore, the input signal resolution to the device arrays needs to be at least 7 bits for successful training. However, we show that a stochastic rounding scheme can reduce the required input signal resolution to 5 bits. Further, we find that RPU device variations and hardware noise are enough to mitigate overfitting, so that there is less need for using dropout. We note that the models trained here are roughly 1500 times larger than the fully connected network trained on the MNIST dataset in terms of the total number of multiplication and summation operations performed per epoch. Thus, here we attempt to study the validity of the RPU approach for large-scale networks. Comment: 17 pages, 5 figures
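
    The stochastic rounding scheme mentioned above can be illustrated with a short sketch (my own, assuming a symmetric input range; not the authors' implementation): the input is quantized to a given bit resolution, and the choice between rounding up and rounding down is made with probability proportional to the residual, so the quantized signal is unbiased in expectation.

```python
import torch

def stochastic_round_quantize(x: torch.Tensor, bits: int = 5, x_max: float = 1.0):
    """Quantize x to `bits` of resolution on [-x_max, x_max] with stochastic rounding."""
    n_steps = 2 ** bits - 1
    step = 2 * x_max / n_steps
    scaled = (x.clamp(-x_max, x_max) + x_max) / step   # continuous level index
    lower = scaled.floor()
    prob_up = scaled - lower                           # fractional residual
    levels = lower + (torch.rand_like(x) < prob_up).to(x.dtype)
    return levels * step - x_max

# Averaged over many draws, the 5-bit quantized signal recovers the original.
x = 0.3 * torch.randn(4)
print(x)
print(torch.stack([stochastic_round_quantize(x) for _ in range(20000)]).mean(0))
```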

    Training large-scale ANNs on simulated resistive crossbar arrays

    Accelerating the training of artificial neural networks (ANNs) with analog resistive crossbar arrays is a promising idea. While the concept has been verified on very small ANNs and toy datasets (such as MNIST), more realistically sized ANNs and datasets have not yet been tackled. However, it is to be expected that device materials and hardware design constraints, such as noisy computations, a finite number of resistive states of the device materials, saturating weight and activation ranges, and limited precision of analog-to-digital converters, will pose significant challenges to the successful training of state-of-the-art ANNs. By using analog-hardware-aware ANN training simulations, we here explore a number of simple algorithmic compensatory measures to cope with analog noise and limited weight and output ranges and resolutions, which dramatically improve the simulated training performance on RPU arrays for intermediate- to large-scale ANNs.
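
    One simple compensatory measure of this kind is input "noise management": every activation vector is rescaled to the full input range of the array before the analog matrix-vector product and scaled back afterwards, so the fixed analog output noise and output bound hurt less. The sketch below is my own toy model of this idea, with the noise level and output bound chosen arbitrarily.

```python
import torch

def analog_matvec(w, x, out_bound=12.0, out_noise=0.06):
    """Toy analog matrix-vector product: additive output noise plus saturation."""
    y = w @ x + out_noise * torch.randn(w.shape[0])
    return y.clamp(-out_bound, out_bound)

def managed_matvec(w, x, out_bound=12.0, out_noise=0.06):
    """Noise management: scale x to full range, compute, undo the scaling."""
    alpha = x.abs().max().clamp(min=1e-12)
    return alpha * analog_matvec(w, x / alpha, out_bound, out_noise)

w = torch.randn(256, 256) / 16
x = 1e-3 * torch.randn(256)      # small activations overwhelmed by fixed output noise
exact = w @ x
print('plain error  :', (analog_matvec(w, x) - exact).norm().item())
print('managed error:', (managed_matvec(w, x) - exact).norm().item())
```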

    A flexible and fast PyTorch toolkit for simulating training and inference on analog crossbar arrays

    We introduce the IBM Analog Hardware Acceleration Kit, a new and first-of-its-kind open-source toolkit for simulating analog crossbar arrays conveniently from within PyTorch (freely available at https://github.com/IBM/aihwkit). The toolkit is under active development and is centered around the concept of an "analog tile", which captures the computations performed on a crossbar array. Analog tiles are building blocks that can be used to extend existing network modules with analog components and to compose arbitrary artificial neural networks (ANNs) using the flexibility of the PyTorch framework. Analog tiles can be conveniently configured to emulate a plethora of different analog hardware characteristics and their non-idealities, such as device-to-device and cycle-to-cycle variations, resistive device response curves, and weight and output noise. Additionally, the toolkit makes it possible to design custom unit cell configurations and to use advanced analog optimization algorithms such as Tiki-Taka. Moreover, the backward and update behavior can be set to "ideal" to enable hardware-aware training features for chips that target inference acceleration only. To evaluate the inference accuracy of such chips over time, we provide statistical programming noise and drift models calibrated on phase-change memory hardware. Our new toolkit is fully GPU accelerated and can be used to conveniently estimate the impact of material properties and non-idealities of future analog technology on the accuracy of arbitrary ANNs. Comment: Submitted to AICAS202
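
    A minimal training sketch in the spirit of the toolkit's basic usage example (adapted from the project's documentation at the URL above; exact module paths, device classes, and defaults may differ between releases):

```python
from torch import Tensor
from torch.nn.functional import mse_loss

from aihwkit.nn import AnalogLinear
from aihwkit.optim import AnalogSGD
from aihwkit.simulator.configs import SingleRPUConfig
from aihwkit.simulator.configs.devices import ConstantStepDevice

# The RPU configuration selects the simulated device model behind the
# "analog tile", including its non-idealities.
rpu_config = SingleRPUConfig(device=ConstantStepDevice())

# A fully connected layer whose weights live on a simulated crossbar array.
model = AnalogLinear(4, 2, rpu_config=rpu_config)

# AnalogSGD performs the (noisy, device-limited) updates on the analog tile.
optimizer = AnalogSGD(model.parameters(), lr=0.1)
optimizer.regroup_param_groups(model)

x = Tensor([[0.1, 0.2, 0.4, 0.3], [0.2, 0.1, 0.1, 0.3]])
y = Tensor([[1.0, 0.5], [0.7, 0.3]])

for epoch in range(100):
    optimizer.zero_grad()
    loss = mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
```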

    Spin susceptibility and effective mass of two-dimensional electrons in Mg$_x$Zn$_{1-x}$O/ZnO heterostructures

    We report measurements of the spin susceptibility and the electron effective mass for two-dimensional electrons confined at the interfaces of Mg$_x$Zn$_{1-x}$O/ZnO single heterostructures ($x$ = 0.05, 0.08, and 0.11), grown by molecular-beam epitaxy on (0001) ZnO substrates. By tuning the built-in polarization through control of the barrier composition, the electron density was systematically varied in the range of $5.6 \times 10^{11}$ to $1.6 \times 10^{12}$ cm$^{-2}$, corresponding to a range of $3.1 < r_s < 5.2$, where $r_s$ is the average electron spacing measured in units of the effective Bohr radius. We used the coincidence technique, in which crossings of the spin-split Landau levels occur at critical tilt angles of the magnetic field, to evaluate the spin susceptibility. In addition, we determined the effective mass from the temperature dependence of the Shubnikov-de Haas oscillations measured at the coincidence conditions. The susceptibility and the effective mass both gradually increase with decreasing electron density, reflecting the role of electron-electron interactions. Comment: 4 pages, 4 figures, accepted for publication in Phys. Rev.
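
    For context, the textbook relations behind the coincidence technique and the mass extraction (assumed background, not results of the paper): a level crossing of index $i$ occurs when the Zeeman splitting equals $i$ cyclotron gaps, and the effective mass follows from the Lifshitz-Kosevich temperature damping of the Shubnikov-de Haas amplitude.

```latex
% Coincidence condition at tilt angle \theta_i
% (B_tot is the total field, B_perp = B_tot cos\theta_i the perpendicular component):
\[
  g^{*}\mu_{B}B_{\mathrm{tot}} \;=\; i\,\hbar\omega_{c}
  \;=\; i\,\frac{\hbar e\,B_{\mathrm{tot}}\cos\theta_{i}}{m^{*}}
  \quad\Longrightarrow\quad
  \frac{g^{*}m^{*}}{m_{e}} \;=\; 2\,i\cos\theta_{i},
\]
% so the spin susceptibility \chi^{*} \propto g^{*}m^{*} follows directly from \theta_i.
% Effective mass from the temperature dependence of the SdH amplitude:
\[
  A(T) \;\propto\; \frac{X}{\sinh X}, \qquad
  X \;=\; \frac{2\pi^{2}k_{B}T}{\hbar\omega_{c}}.
\]
```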