
    Accurate deep neural network inference using computational phase-change memory

    In-memory computing is a promising non-von Neumann approach for building energy-efficient deep learning inference hardware. Crossbar arrays of resistive memory devices can encode the network weights and perform efficient analog matrix-vector multiplications without intermediate movement of data. However, due to device variability and noise, the network must be trained in a specific way so that transferring the digitally trained weights to the analog resistive memory devices does not result in a significant loss of accuracy. Here, we introduce a methodology for training ResNet-type convolutional neural networks that results in no appreciable accuracy loss when the weights are transferred to in-memory computing hardware based on phase-change memory (PCM). We also propose a compensation technique that exploits the batch normalization parameters to improve accuracy retention over time. We achieve a classification accuracy of 93.7% on the CIFAR-10 dataset and a top-1 accuracy of 71.6% on the ImageNet benchmark after mapping the trained weights to PCM. Our hardware results on CIFAR-10 with ResNet-32 demonstrate an accuracy above 93.5% retained over a one-day period, where each of the 361,722 synaptic weights of the network is programmed onto just two PCM devices organized in a differential configuration.
    Comment: This is a pre-print of an article accepted for publication in Nature Communications.
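    The differential weight mapping described above lends itself to a brief illustration. The following is a minimal sketch, not the authors' code: the conductance limit and the Gaussian programming/read noise magnitudes are assumed placeholders. Each weight is encoded as the difference of two device conductances, and a noisy analog matrix-vector product is checked against the digital one.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    G_MAX = 25.0  # assumed maximum device conductance, arbitrary units

    def program_differential(W, sigma_prog=0.05):
        """Encode each weight as G_plus - G_minus on two noisy PCM devices."""
        scale = G_MAX / np.max(np.abs(W))
        G_plus = np.clip(W, 0.0, None) * scale
        G_minus = np.clip(-W, 0.0, None) * scale
        # Programming noise: each device lands near, not exactly on, its target.
        G_plus += rng.normal(0.0, sigma_prog * G_MAX, W.shape)
        G_minus += rng.normal(0.0, sigma_prog * G_MAX, W.shape)
        return np.clip(G_plus, 0.0, G_MAX), np.clip(G_minus, 0.0, G_MAX), scale

    def analog_matvec(G_plus, G_minus, x, sigma_read=0.01):
        """Crossbar matrix-vector product; the two column currents subtract."""
        read_noise = rng.normal(0.0, sigma_read * G_MAX, G_plus.shape)
        return (G_plus - G_minus + read_noise) @ x

    W = rng.normal(0.0, 0.1, (64, 128))   # a digitally trained layer
    Gp, Gm, scale = program_differential(W)
    x = rng.normal(size=128)
    y = analog_matvec(Gp, Gm, x) / scale  # rescale back to weight units
    print(np.corrcoef(y, W @ x)[0, 1])    # stays close to 1 despite the noise
    ```

    In the paper the noise model is derived from device characterization; here the sigmas are arbitrary stand-ins.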

    Interpreting Neural Networks for Flexible Execution on the SKT AIX Accelerator

    Master's thesis (MS), Department of Computer Science and Engineering, College of Engineering, Seoul National University, August 2020. Advisor: Bernhard Egger.
    Computer vision problems are best addressed by convolutional neural networks (CNNs). These resource-intensive algorithms require the use of specialised accelerators when the application is particularly time-critical or when power efficiency plays an important role. Such accelerators reach their performance gains at the cost of generality. In this thesis, our target hardware is SKT's AIX accelerator, designed for the execution of Darknet CNNs. The presented work enables the flexible execution of ONNX networks on AIX, extending the accelerator's support to an additional framework. We describe the steps required to map ONNX's very different graph structure onto that of the accelerator, and how partial acceleration is still achieved for networks containing unsupported operations. Running ONNX networks on AIX through our project yields results very close to native execution on ONNXRuntime: we reach a 98% top-1 prediction match on a CIFAR10 sample.
    Contents: Chapter 1, Introduction. Chapter 2, Background (2.1 Convolutional Neural Networks for Computer Vision: 2.1.1 Feature Maps, 2.1.2 Convolutions, 2.1.3 Batch Normalisation, 2.1.4 Activation Functions, 2.1.5 Pooling and Subsampling; 2.2 Open Neural Network Exchange; 2.3 SKT's Custom Accelerator: AIX). Chapter 3, Design (3.1 Overview; 3.2 Operator Support and Graph Division). Chapter 4, Implementation (4.1 Project Architecture; 4.2 Graph Division: Ready Queue Exploration; 4.3 Object-Oriented Approach for a Common Subgraph Interface; 4.4 Handling the Data Flow; 4.5 Working with a Custom Tool under Development). Chapter 5, Test Setup (5.1 CIFAR10; 5.2 Prototyping and Converting Test Networks). Chapter 6, Results. Chapter 7, Conclusion. Bibliography. Abstract in Korean.
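    The graph division via ready-queue exploration (Section 4.2 above) can be sketched in a few lines. The following is our hedged reading, not the thesis code: nodes are visited in topological order via a queue of ready nodes, and maximal runs of accelerator-supported operators become AIX subgraphs while everything else falls back to the host runtime. The node format and the SUPPORTED set are invented for illustration.

    ```python
    from collections import deque

    SUPPORTED = {"Conv", "BatchNormalization", "Relu", "MaxPool", "Add"}

    def divide(nodes, edges):
        """nodes: {name: op_type}; edges: list of (src, dst) pairs."""
        preds = {n: set() for n in nodes}
        succs = {n: set() for n in nodes}
        for s, d in edges:
            preds[d].add(s)
            succs[s].add(d)
        ready = deque(n for n in nodes if not preds[n])
        partitions, current, current_kind = [], [], None
        while ready:
            n = ready.popleft()
            kind = nodes[n] in SUPPORTED  # True: run on AIX, False: host
            if kind != current_kind and current:
                partitions.append((current_kind, current))
                current = []
            current, current_kind = current + [n], kind
            for d in succs[n]:            # release nodes whose inputs are done
                preds[d].discard(n)
                if not preds[d]:
                    ready.append(d)
        if current:
            partitions.append((current_kind, current))
        return partitions  # [(accelerated?, [node names]), ...]

    nodes = {"conv1": "Conv", "relu1": "Relu", "topk": "TopK", "out": "Relu"}
    edges = [("conv1", "relu1"), ("relu1", "topk"), ("topk", "out")]
    print(divide(nodes, edges))
    # [(True, ['conv1', 'relu1']), (False, ['topk']), (True, ['out'])]
    ```

    Because the walk is topological, every partition's inputs are produced by earlier partitions, so the subgraphs can be dispatched in order.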

    A General Method for Calibrating Stochastic Radio Channel Models with Kernels

    Calibrating stochastic radio channel models to new measurement data is challenging when the likelihood function is intractable. The standard approach to this problem involves sophisticated algorithms for the extraction and clustering of multipath components, after which point estimates of the model parameters can be obtained using specialized estimators. We propose a likelihood-free calibration method using approximate Bayesian computation. The method is based on the maximum mean discrepancy, a notion of distance between probability distributions. Our method not only bypasses the need to implement any high-resolution or clustering algorithm, but is also automatic in that it requires no additional input or manual pre-processing from the user. It also has the advantage of returning an entire posterior distribution over the parameter values rather than a single point estimate. We evaluate the performance of the proposed method by fitting two different stochastic channel models, namely the Saleh-Valenzuela model and the propagation graph model, to both simulated and measured data. The proposed method estimates the parameters of both models accurately in simulations, as well as when applied to 60 GHz indoor measurement data.
    Peer reviewed
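    A minimal sketch of the calibration loop may help fix ideas. It is not the authors' implementation: a toy exponential "channel model" stands in for the Saleh-Valenzuela or propagation graph simulators, the distance is a Gaussian-kernel estimate of the squared maximum mean discrepancy, and plain ABC rejection sampling is used; the bandwidth, prior, and acceptance quantile are arbitrary choices.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def mmd2(x, y, bw=1.0):
        """Biased empirical MMD^2 between 1-D samples, Gaussian kernel."""
        k = lambda a, b: np.exp(-((a[:, None] - b[None, :]) ** 2) / (2 * bw**2))
        return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

    def simulate(theta, n=200):
        """Toy stand-in for the stochastic channel simulator."""
        return rng.exponential(theta, n)

    def abc_rejection(observed, prior_draws, quantile=0.05):
        """Keep the prior draws whose simulated output is closest in MMD."""
        dists = np.array([mmd2(simulate(t), observed) for t in prior_draws])
        return prior_draws[dists <= np.quantile(dists, quantile)]

    observed = simulate(2.0)                    # pretend this was measured
    prior = rng.uniform(0.1, 5.0, 500)          # uniform prior on theta
    posterior = abc_rejection(observed, prior)  # approximate posterior samples
    print(posterior.mean(), posterior.std())    # concentrates near theta = 2
    ```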

    Probabilistic Safety Regions Via Finite Families of Scalable Classifiers

    Supervised classification recognizes patterns in the data to separate classes of behaviours. Canonical solutions contain misclassification errors that are intrinsic to the numerical, approximating nature of machine learning. The data analyst may minimize the classification error on one class at the expense of increasing the error on the other classes, and the error control in such a design phase is often done heuristically. In this context, it is key to develop theoretical foundations capable of providing probabilistic certifications for the obtained classifiers. With this perspective, we introduce the concept of a probabilistic safety region to describe a subset of the input space in which the number of misclassified instances is probabilistically controlled. The notion of scalable classifiers is then exploited to link the tuning of machine learning with error control. Several tests corroborate the approach: experiments on synthetic data highlight all the steps involved, and a smart mobility application demonstrates practical use.
    Comment: 13 pages, 4 figures, 1 table, submitted to IEEE TNNLS
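    One way to read the scalable-classifier idea is as a scalar shift of a decision function, calibrated on held-out data so that the region {x : f(x) − ρ ≥ 0} admits at most a target fraction of the wrong class. The sketch below is our simplified interpretation under that assumption, with a toy dataset and logistic regression standing in for the paper's classifiers.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    X = rng.normal(size=(2000, 2))
    y = (X[:, 0] + 0.5 * rng.normal(size=2000) > 0).astype(int)  # toy labels

    clf = LogisticRegression().fit(X[:1000], y[:1000])  # design phase
    scores = clf.decision_function(X[1000:])            # calibration scores
    y_cal = y[1000:]

    eps = 0.05  # target probability that a class-0 point enters the region
    # Scale the classifier: pick rho so only ~eps of class-0 scores exceed it.
    rho = np.quantile(scores[y_cal == 0], 1.0 - eps)

    def in_safety_region(x):
        """S(rho) = {x : f(x) - rho >= 0}, calibrated for class-0 error eps."""
        return clf.decision_function(x) - rho >= 0

    print(in_safety_region(rng.normal(size=(5, 2))))
    ```

    The paper's contribution is the probabilistic certificate attached to such a region; the quantile rule above is only an illustrative calibration.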

    Nonlinear wavefront reconstruction with convolutional neural networks for Fourier-based wavefront sensors

    Fourier-based wavefront sensors, such as the Pyramid Wavefront Sensor (PWFS), are the current preference for high-contrast imaging due to their high sensitivity. However, these wavefront sensors have intrinsic nonlinearities that constrain the range over which conventional linear reconstruction methods can accurately estimate the incoming wavefront aberrations. We propose to use Convolutional Neural Networks (CNNs) for the nonlinear reconstruction of the wavefront sensor measurements. We demonstrate that a CNN can accurately reconstruct the nonlinearities in both simulations and a lab implementation. We show that using a CNN alone for the reconstruction leads to suboptimal closed-loop performance under simulated atmospheric turbulence. However, using a CNN to estimate the nonlinear error term on top of a linear model results in an improved effective dynamic range of a simulated adaptive optics system. The larger effective dynamic range yields a higher Strehl ratio under conditions where the nonlinear error is relevant. This will allow the current and future generation of large astronomical telescopes to work in a wider range of atmospheric conditions and therefore reduce costly downtime of such facilities.
    Comment: 14 pages, 7 figures
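    The hybrid linear-plus-CNN reconstructor can be sketched as follows. This is an illustrative assumption of the setup, not the paper's architecture: a fixed linear reconstructor maps the flattened sensor image to modal coefficients, and a small CNN adds the nonlinear correction. The modal basis size, image shape, and network are placeholders.

    ```python
    import torch
    import torch.nn as nn

    N_MODES = 50          # number of reconstructed wavefront modes (assumed)
    SENSOR = (1, 64, 64)  # PWFS measurement treated as a 1-channel image

    class ResidualCNN(nn.Module):
        """Predicts the nonlinear error term of the linear reconstructor."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Flatten(),
                nn.Linear(16 * 32 * 32, N_MODES),
            )

        def forward(self, s):
            return self.net(s)

    linear_R = torch.randn(N_MODES, 64 * 64) * 1e-3  # calibrated linear model
    cnn = ResidualCNN()

    def reconstruct(s):
        """Wavefront modes = linear estimate + CNN nonlinear correction."""
        lin = s.flatten(1) @ linear_R.T
        return lin + cnn(s)

    s = torch.randn(8, *SENSOR)   # a batch of sensor measurements
    phi_hat = reconstruct(s)      # (8, N_MODES) modal coefficients
    print(phi_hat.shape)
    ```

    Keeping the linear term intact is the point: the CNN only needs to learn the (small) residual, which preserves closed-loop behaviour where the sensor is already linear.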

    ePointDA: An End-to-End Simulation-to-Real Domain Adaptation Framework for LiDAR Point Cloud Segmentation

    Due to its robust and precise distance measurements, LiDAR plays an important role in scene understanding for autonomous driving. Training deep neural networks (DNNs) on LiDAR data requires large-scale point-wise annotations, which are time-consuming and expensive to obtain. Instead, simulation-to-real domain adaptation (SRDA) trains a DNN using unlimited synthetic data with automatically generated labels and transfers the learned model to real scenarios. Existing SRDA methods for LiDAR point cloud segmentation mainly employ a multi-stage pipeline and focus on feature-level alignment. They require prior knowledge of real-world statistics and ignore the pixel-level dropout noise gap and the spatial feature gap between domains. In this paper, we propose a novel end-to-end framework, named ePointDA, to address these issues. Specifically, ePointDA consists of three modules: self-supervised dropout noise rendering, statistics-invariant and spatially-adaptive feature alignment, and transferable segmentation learning. The joint optimization enables ePointDA to bridge the domain shift at the pixel level, by explicitly rendering dropout noise for synthetic LiDAR, and at the feature level, by spatially aligning the features between domains, without requiring real-world statistics. Extensive experiments adapting from synthetic GTA-LiDAR to real KITTI and SemanticKITTI demonstrate the superiority of ePointDA for LiDAR point cloud segmentation.
    Comment: Accepted by AAAI 2021
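    The self-supervised dropout-noise-rendering module lends itself to a brief sketch. Real range images supply their own labels, since missing returns appear as zero pixels; a small CNN learns to predict that mask and is then used to corrupt clean synthetic range images. Everything below (architecture, shapes, the crude depth completion of the input) is an assumption for illustration, not ePointDA's actual network.

    ```python
    import torch
    import torch.nn as nn

    class NoiseRenderer(nn.Module):
        """Predicts a per-pixel keep/drop logit from a range image."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 1, 3, padding=1),
            )

        def forward(self, depth):
            return self.net(depth)

    renderer = NoiseRenderer()
    bce = nn.BCEWithLogitsLoss()
    opt = torch.optim.Adam(renderer.parameters(), lr=1e-3)

    real = torch.rand(4, 1, 64, 512)             # real range images (toy data)
    real[real < 0.1] = 0.0                       # pretend these returns were lost
    labels = (real > 0).float()                  # self-supervised mask labels
    filled = real.clone()
    filled[filled == 0] = real[real > 0].mean()  # crude stand-in for completion

    loss = bce(renderer(filled), labels)         # one training step
    opt.zero_grad(); loss.backward(); opt.step()

    synthetic = torch.rand(4, 1, 64, 512)        # clean simulated range images
    with torch.no_grad():
        keep = torch.sigmoid(renderer(synthetic)) > 0.5
    noisy_synthetic = synthetic * keep           # dropout noise rendered onto sim
    ```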

    On quantifying the value of simulation for training and evaluating robotic agents

    A common problem in robotics is reproducing the results and validating the claims made by researchers. Experiments done in robotics laboratories typically yield results that are specific to a complex setup and are difficult or costly to reproduce and validate in other contexts. For this reason, it is arduous to compare the performance and robustness of various robotic controllers. Low-cost reproductions of physical environments are popular, but they induce a performance reduction when the target environment is finally used. This thesis presents the results of our work toward improving benchmarking in robotics, specifically for autonomous driving. We build a new platform, the Duckietown Autolabs, which allows researchers to evaluate autonomous driving algorithms on standardized tasks, hardware, and environments at low cost. The platform also offers a simulated environment for easy access to unlimited annotated data and for parallel evaluation of driving solutions in customizable environments. We use the platform to analyze the discrepancy between simulation and reality in terms of the predictivity of the simulation and the quality of the generated data. We supply two metrics to quantify the usefulness of a simulation and demonstrate how they can be used to optimize the value of a proxy environment.
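    As a concrete (and deliberately simple) reading of a predictivity metric, one can rank a set of controllers in simulation and in reality and compare the orderings: a simulator whose ranking matches the real one is useful for model selection even if absolute scores differ. The scores below are made-up placeholders, and the rank correlation is our stand-in, not necessarily the thesis's exact metric.

    ```python
    import numpy as np
    from scipy.stats import spearmanr

    # Per-controller performance scores (placeholders, not measured data).
    sim_scores = np.array([0.91, 0.74, 0.65, 0.88, 0.52])   # in simulation
    real_scores = np.array([0.85, 0.70, 0.65, 0.79, 0.58])  # same controllers, real

    rho, _ = spearmanr(sim_scores, real_scores)
    print(f"predictivity (rank correlation): {rho:.2f}")    # 1.0 = perfect ordering
    ```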