29,780 research outputs found

    CuPit - a parallel language for neural algorithms: language reference and tutorial

    Get PDF
    CuPit is a parallel programming language with two main design goals: 1. to allow the simple, problem-adequate formulation of learning algorithms for neural networks with focus on algorithms that change the topology of the underlying neural network during the learning process and 2. to allow the generation of efficient code for massively parallel machines from a completely machine-independent program description, in particular to maximize both data locality and load balancing even for irregular neural networks. The idea to achieve these goals lies in the programming model: CuPit programs are object-centered, with connections and nodes of a graph (which is the neural network) being the objects. Algorithms are based on parallel local computations in the nodes and connections and communication along the connections (plus broadcast and reduction operations). This report describes the design considerations and the resulting language definition and discusses in detail a tutorial example program

    CuPit-2 - a parallel language for neural algorithms: language reference and tutorial

    Get PDF
    CuPit-2 is a parallel programming language with two main design goals: 1. to allow the simple, problem-adequate formulation of learning algorithms for neural networks with focus on algorithms that change the topology of the underlying neural network during the learning process and 2. to allow the generation of efficient code for massively parallel machines from a completely machine-independent program description, in particular to maximize both data locality and load balancing even for irregular neural networks. The idea to achieve these goals lies in the programming model: CuPit-2 programs are object-centered, with connections and nodes of a graph (which is the neural network) being the objects. Algorithms are based on parallel local computations in the nodes and connections and communication along the connections (plus broadcast and reduction operations). This report describes the design considerations and the resulting language definition and discusses in detail a tutorial example program. This CuPit-2 language manual and tutorial is an updated version of the original CuPit language manual [Technical Report 1994-04]. The new language CuPit-2 differs from the original CuPit in several ways. All language changes from CuPit to CuPit-2 are listed in the appendix

    Accelerated Neural Networks on OpenCL Devices Using SYCL-DNN

    Full text link
    Over the past few years machine learning has seen a renewed explosion of interest, following a number of studies showing the effectiveness of neural networks in a range of tasks which had previously been considered incredibly hard. Neural networks' effectiveness in the fields of image recognition and natural language processing stems primarily from the vast amounts of data available to companies and researchers, coupled with the huge amounts of compute power available in modern accelerators such as GPUs, FPGAs and ASICs. There are a number of approaches available to developers for utilizing GPGPU technologies such as SYCL, OpenCL and CUDA, however many applications require the same low level mathematical routines. Libraries dedicated to accelerating these common routines allow developers to easily make full use of the available hardware without requiring low level knowledge of the hardware themselves, however such libraries are often provided by hardware manufacturers for specific hardware such as cuDNN for Nvidia hardware or MIOpen for AMD hardware. SYCL-DNN is a new open-source library dedicated to providing accelerated routines for neural network operations which are hardware and vendor agnostic. Built on top of the SYCL open standard and written entirely in standard C++, SYCL-DNN allows a user to easily accelerate neural network code for a wide range of hardware using a modern C++ interface. The library is tested on AMD's OpenCL for GPU, Intel's OpenCL for CPU and GPU, ARM's OpenCL for Mali GPUs as well as ComputeAorta's OpenCL for R-Car CV engine and host CPU. In this talk we will present performance figures for SYCL-DNN on this range of hardware, and discuss how high performance was achieved on such a varied set of accelerators with such different hardware features.Comment: 4 pages, 3 figures. In International Workshop on OpenCL (IWOCL '19), May 13-15, 2019, Bosto
    • …
    corecore