Search CORE

12,252 research outputs found

Developement of real time diagnostics and feedback algorithms for JET in view of the next step

Author: A Murari
Albanese R
Ambrosino G
Barana O
Bosak K
D Mazon
D Moreau
E de la Luna
E Joffrin
EFDA-JET Contributors
F Crisanti
F Piccolo
F Sartori
Felton R
Felton R
G Ambrosino
Giannella R
J Sanchez
Joffrin E
L Laborde
L Zabeo
Laborde L
Lao L
M Ariola
M Bruno
Moreau D
O Barana
P Arena
R Albanese
R Felton
Riva M
Watkins M
Zabeo L
Publication venue: 'IOP Publishing'
Publication date: 15/10/2004
Field of study

Real time control of many plasma parameters will be an essential aspect in the development of reliable high performance operation of Next Step Tokamaks. The main prerequisites for any feedback scheme are the precise real-time determination of the quantities to be controlled, requiring top quality and highly reliable diagnostics, and the availability of robust control algorithms. A new set of real time diagnostics was recently implemented on JET to prove the feasibility of determining, with high accuracy and time resolution, the most important plasma quantities. With regard to feedback algorithms, new model–based controllers were developed to allow a more robust control of several plasma parameters. Both diagnostics and algorithms were successfully used in several experiments, ranging from H-mode plasmas to configuration with ITBs. Since elaboration of computationally heavy measurements is often required, significant attention was devoted to non-algorithmic methods like Digital or Cellular Neural/Nonlinear Networks. The real time hardware and software adopted architectures are also described with particular attention to their relevance to ITER.Comment: 12th International Congress on Plasma Physics, 25-29 October 2004, Nice (France

arXiv.org e-Print Archive

Archivio della ricerca - Università degli studi di Napoli "Parthenope"

Archivio della ricerca - Università degli studi di Napoli Federico II

PZnet: Efficient 3D ConvNet Inference on Manycore CPUs

Author: Buniatyan Davit
Li Kai
Popovych Sergiy
Seung H. Sebastian
Zlateski Aleksandar
Publication venue
Publication date: 18/03/2019
Field of study

Convolutional nets have been shown to achieve state-of-the-art accuracy in many biomedical image analysis tasks. Many tasks within biomedical analysis domain involve analyzing volumetric (3D) data acquired by CT, MRI and Microscopy acquisition methods. To deploy convolutional nets in practical working systems, it is important to solve the efficient inference problem. Namely, one should be able to apply an already-trained convolutional network to many large images using limited computational resources. In this paper we present PZnet, a CPU-only engine that can be used to perform inference for a variety of 3D convolutional net architectures. PZNet outperforms MKL-based CPU implementations of PyTorch and Tensorflow by more than 3.5x for the popular U-net architecture. Moreover, for 3D convolutions with low featuremap numbers, cloud CPU inference with PZnet outperfroms cloud GPU inference in terms of cost efficiency

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

Large-Scale Optical Neural Networks based on Photoelectric Multiplication

Author: Bernstein Liane
Englund Dirk
Hamerly Ryan
Sludds Alexander
Soljačić Marin
Publication venue: 'American Physical Society (APS)'
Publication date: 01/05/2019
Field of study

Recent success in deep neural networks has generated strong interest in hardware accelerators to improve speed and energy consumption. This paper presents a new type of photonic accelerator based on coherent detection that is scalable to large (

N \gtrsim 10^6

) networks and can be operated at high (GHz) speeds and very low (sub-aJ) energies per multiply-and-accumulate (MAC), using the massive spatial multiplexing enabled by standard free-space optical components. In contrast to previous approaches, both weights and inputs are optically encoded so that the network can be reprogrammed and trained on the fly. Simulations of the network using models for digit- and image-classification reveal a "standard quantum limit" for optical neural networks, set by photodetector shot noise. This bound, which can be as low as 50 zJ/MAC, suggests performance below the thermodynamic (Landauer) limit for digital irreversible computation is theoretically possible in this device. The proposed accelerator can implement both fully-connected and convolutional networks. We also present a scheme for back-propagation and training that can be performed in the same hardware. This architecture will enable a new class of ultra-low-energy processors for deep learning.Comment: Text: 10 pages, 5 figures, 1 table. Supplementary: 8 pages, 5, figures, 2 table

arXiv.org e-Print Archive

Directory of Open Access Journals

Recommended from our members

Versatile stochastic dot product circuits based on nonvolatile memories for high performance neurocomputing and neurooptimization.

Author: Mahmoodi MR
Prezioso M
Strukov DB
Publication venue: eScholarship, University of California
Publication date: 01/11/2019
Field of study

The key operation in stochastic neural networks, which have become the state-of-the-art approach for solving problems in machine learning, information theory, and statistics, is a stochastic dot-product. While there have been many demonstrations of dot-product circuits and, separately, of stochastic neurons, the efficient hardware implementation combining both functionalities is still missing. Here we report compact, fast, energy-efficient, and scalable stochastic dot-product circuits based on either passively integrated metal-oxide memristors or embedded floating-gate memories. The circuit's high performance is due to mixed-signal implementation, while the efficient stochastic operation is achieved by utilizing circuit's noise, intrinsic and/or extrinsic to the memory cell array. The dynamic scaling of weights, enabled by analog memory devices, allows for efficient realization of different annealing approaches to improve functionality. The proposed approach is experimentally verified for two representative applications, namely by implementing neural network for solving a four-node graph-partitioning problem, and a Boltzmann machine with 10-input and 8-hidden neurons

eScholarship - University of California

Bio-Inspired Stereo Vision Calibration for Dynamic Vision Sensors

Author: Cabello Enrique
Conde Cristina
Domínguez Morales Manuel Jesús
Jiménez Fernández Ángel Francisco
Jiménez Moreno Gabriel
Linares Barranco Alejandro
Publication venue: IEEE Computer Society
Publication date: 01/01/2019
Field of study

Many advances have been made in the eld of computer vision. Several recent research trends have focused on mimicking human vision by using a stereo vision system. In multi-camera systems, a calibration process is usually implemented to improve the results accuracy. However, these systems generate a large amount of data to be processed; therefore, a powerful computer is required and, in many cases, this cannot be done in real time. Neuromorphic Engineering attempts to create bio-inspired systems that mimic the information processing that takes place in the human brain. This information is encoded using pulses (or spikes) and the generated systems are much simpler (in computational operations and resources), which allows them to perform similar tasks with much lower power consumption, thus these processes can be developed over specialized hardware with real-time processing. In this work, a bio-inspired stereovision system is presented, where a calibration mechanism for this system is implemented and evaluated using several tests. The result is a novel calibration technique for a neuromorphic stereo vision system, implemented over specialized hardware (FPGA - Field-Programmable Gate Array), which allows obtaining reduced latencies on hardware implementation for stand-alone systems, and working in real time.Ministerio de Economía y Competitividad TEC2016-77785-PMinisterio de Economía y Competitividad TIN2016-80644-

idUS. Depósito de Investigación Universidad de Sevilla

Scalable Emulation of Sign-Problem $-$ Free Hamiltonians with Room Temperature p-bits

Author: Camsari Kerem Y.
Chowdhury Shuvro
Datta Supriyo
Publication venue: 'American Physical Society (APS)'
Publication date: 30/09/2019
Field of study

The growing field of quantum computing is based on the concept of a q-bit which is a delicate superposition of 0 and 1, requiring cryogenic temperatures for its physical realization along with challenging coherent coupling techniques for entangling them. By contrast, a probabilistic bit or a p-bit is a robust classical entity that fluctuates between 0 and 1, and can be implemented at room temperature using present-day technology. Here, we show that a probabilistic coprocessor built out of room temperature p-bits can be used to accelerate simulations of a special class of quantum many-body systems that are sign-problem

-

free or stoquastic, leveraging the well-known Suzuki-Trotter decomposition that maps a

d

-dimensional quantum many body Hamiltonian to a

d

+1-dimensional classical Hamiltonian. This mapping allows an efficient emulation of a quantum system by classical computers and is commonly used in software to perform Quantum Monte Carlo (QMC) algorithms. By contrast, we show that a compact, embedded MTJ-based coprocessor can serve as a highly efficient hardware-accelerator for such QMC algorithms providing several orders of magnitude improvement in speed compared to optimized CPU implementations. Using realistic device-level SPICE simulations we demonstrate that the correct quantum correlations can be obtained using a classical p-circuit built with existing technology and operating at room temperature. The proposed coprocessor can serve as a tool to study stoquastic quantum many-body systems, overcoming challenges associated with physical quantum annealers.Comment: Fixed minor typos and expanded Appendi

arXiv.org e-Print Archive

eScholarship - University of California

FINN: A Framework for Fast, Scalable Binarized Neural Network Inference

Author: Blott Michaela
Fraser Nicholas J.
Gambardella Giulio
Jahre Magnus
Leong Philip
Umuroglu Yaman
Vissers Kees
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/12/2016
Field of study

Research has shown that convolutional neural networks contain significant redundancy, and high classification accuracy can be obtained even when weights and activations are reduced from floating point to binary values. In this paper, we present FINN, a framework for building fast and flexible FPGA accelerators using a flexible heterogeneous streaming architecture. By utilizing a novel set of optimizations that enable efficient mapping of binarized neural networks to hardware, we implement fully connected, convolutional and pooling layers, with per-layer compute resources being tailored to user-provided throughput requirements. On a ZC706 embedded FPGA platform drawing less than 25 W total system power, we demonstrate up to 12.3 million image classifications per second with 0.31 {\mu}s latency on the MNIST dataset with 95.8% accuracy, and 21906 image classifications per second with 283 {\mu}s latency on the CIFAR-10 and SVHN datasets with respectively 80.1% and 94.9% accuracy. To the best of our knowledge, ours are the fastest classification rates reported to date on these benchmarks.Comment: To appear in the 25th International Symposium on Field-Programmable Gate Arrays, February 201

arXiv.org e-Print Archive

Crossref

NORA - Norwegian Open Research Archives