A Logic-Independent IDE
The author's MMT system provides a framework for defining and implementing
logical systems. By combining MMT with the jEdit text editor, we obtain a
logic-independent IDE. The IDE functionality includes advanced features such as
context-sensitive auto-completion, search, and change management.
Comment: In Proceedings UITP 2014, arXiv:1410.785
PyCUDA and PyOpenCL: A Scripting-Based Approach to GPU Run-Time Code Generation
High-performance computing has recently seen a surge of interest in
heterogeneous systems, with an emphasis on modern Graphics Processing Units
(GPUs). These devices offer tremendous potential for performance and efficiency
in important large-scale applications of computational science. However,
exploiting this potential can be challenging, as one must adapt to the
specialized and rapidly evolving computing environment that GPUs present. One
way of addressing this challenge is to embrace better techniques and develop
tools tailored to these devices. This article presents one simple
technique, GPU run-time code generation (RTCG), along with PyCUDA and PyOpenCL,
two open-source toolkits that support this technique.
In introducing PyCUDA and PyOpenCL, this article proposes the combination of
a dynamic, high-level scripting language with the massive performance of a GPU
as a compelling two-tiered computing platform, potentially offering significant
performance and productivity advantages over conventional single-tier, static
systems. The concept of RTCG is simple and easily implemented using existing,
robust infrastructure. Nonetheless it is powerful enough to support (and
encourage) the creation of custom application-specific tools by its users. The
premise of the paper is illustrated by a wide range of examples where the
technique has been applied with considerable success.
Comment: Submitted to Parallel Computing, Elsevier
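The RTCG idea above can be sketched without a GPU: a real PyCUDA workflow would hand the generated source to `pycuda.compiler.SourceModule`, but the same generate-specialize-compile loop can be illustrated in pure Python. This is a hypothetical stand-alone sketch (the template and function names are invented for illustration), not PyCUDA's API.

```python
# Run-time code generation (RTCG): a kernel template is specialized with
# concrete parameters at run time, then compiled and executed. PyCUDA does
# this with CUDA C source; here exec() stands in for the GPU compiler so
# the sketch runs anywhere.

KERNEL_TEMPLATE = """
def saxpy_{n}(a, x, y):
    # The loop bound {n} is baked in at generation time, so the compiler
    # can treat it as a constant (the payoff RTCG is after).
    out = [0.0] * {n}
    for i in range({n}):
        out[i] = a * x[i] + y[i]
    return out
"""

def generate_saxpy(n):
    """Generate, compile, and return a saxpy kernel specialized for length n."""
    source = KERNEL_TEMPLATE.format(n=n)
    namespace = {}
    exec(compile(source, f"<rtcg-saxpy-{n}>", "exec"), namespace)
    return namespace[f"saxpy_{n}"]

saxpy4 = generate_saxpy(4)
result = saxpy4(2.0, [1, 2, 3, 4], [10, 20, 30, 40])
print(result)  # [12.0, 24.0, 36.0, 48.0]
```

The two-tier structure is visible even in this toy: the high-level language builds and caches specialized low-level routines on demand, driven by values only known at run time.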
Efficiently Retrieving Function Dependencies in the Linux Kernel Using XSB
In this paper we investigate XSB-Prolog as a static analysis engine for data
represented by medium-sized graphs. We use XSB-Prolog to automatically identify
function dependencies in the Linux Kernel---queries that are difficult to
implement efficiently in a commodity database and that developers often have to
identify manually. This project illustrates that Prolog systems are ideal for
building tools for use in other disciplines that require sophisticated
inferences, because Prolog is both declarative and can efficiently implement
complex problem specifications through tabling and indexing.
Comment: Part of WLPE 2013 proceedings (arXiv:1308.2055)
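The paper's queries are written in XSB-Prolog; as a loose analogy only, the same transitive-dependency question can be phrased in Python over a toy call graph, with memoization playing a role roughly comparable to XSB's tabling (each subquery is solved once and reused). The graph below is invented for illustration and is not the Linux kernel data from the paper; note also that real call graphs contain cycles, which tabling handles and this naive recursion would not.

```python
# Transitive function dependencies over a small, acyclic call graph.
from functools import lru_cache

# Hypothetical call graph: caller -> set of direct callees.
CALLS = {
    "sys_open": {"do_sys_open"},
    "do_sys_open": {"getname", "fd_install"},
    "getname": {"kmem_cache_alloc"},
    "fd_install": set(),
    "kmem_cache_alloc": set(),
}

@lru_cache(maxsize=None)
def depends_on(fn):
    """All functions reachable from fn: direct and indirect callees."""
    result = set()
    for callee in CALLS.get(fn, ()):
        result |= {callee} | depends_on(callee)
    return frozenset(result)

print(sorted(depends_on("sys_open")))
# ['do_sys_open', 'fd_install', 'getname', 'kmem_cache_alloc']
```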
Experiences from Exporting Major Proof Assistant Libraries
The interoperability of proof assistants and the integration of their
libraries is a highly valued but elusive goal in the field of theorem proving.
As a preparatory step, in previous work, we translated the libraries of
multiple proof assistants, specifically the ones of Coq, HOL Light, IMPS,
Isabelle, Mizar, and PVS into a universal format: OMDoc/MMT.
Each translation presented tremendous theoretical, technical, and social
challenges, some universal and some system-specific, some solvable and some
still open. In this paper, we survey these challenges and compare and evaluate
the solutions we chose.
We believe similar library translations will be an essential part of any
future system interoperability solution, and that our experiences will prove
valuable to others undertaking such efforts.
LightNet: A Versatile, Standalone Matlab-based Environment for Deep Learning
LightNet is a lightweight, versatile and purely Matlab-based deep learning
framework. The idea underlying its design is to provide an easy-to-understand,
easy-to-use and efficient computational platform for deep learning research.
The implemented framework supports major deep learning architectures such as
Multilayer Perceptron Networks (MLP), Convolutional Neural Networks (CNN) and
Recurrent Neural Networks (RNN). The framework also supports both CPU and GPU
computation, and the switch between them is straightforward. Different
applications in computer vision, natural language processing and robotics are
demonstrated as experiments.
Comment: Accepted to ACM MULTIMEDIA 2016 Open Source Software Competition
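LightNet itself is Matlab; as a language-neutral illustration of the MLP forward pass that such frameworks implement, here is a minimal two-layer perceptron in pure Python. The layer sizes and names are made up for the example and do not reflect LightNet's actual interface.

```python
# A two-layer MLP forward pass: linear -> ReLU -> linear.
import random

random.seed(0)

def relu(v):
    return [max(0.0, x) for x in v]

def linear(W, b, x):
    # y = W x + b, with W given as a list of rows.
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

def mlp_forward(x, W1, b1, W2, b2):
    """Forward pass of a 2-layer perceptron."""
    return linear(W2, b2, relu(linear(W1, b1, x)))

# Tiny example: 2 inputs -> 3 hidden units -> 1 output.
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
b1 = [0.0] * 3
W2 = [[random.uniform(-1, 1) for _ in range(3)]]
b2 = [0.0]

y = mlp_forward([1.0, -1.0], W1, b1, W2, b2)
print(y)  # a single-element output list
```

A CPU/GPU switch of the kind LightNet provides amounts to swapping the array type these operations run on while keeping the layer definitions unchanged.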
Parallel Programming Models for Heterogeneous Many-Cores : A Survey
Heterogeneous many-cores are now an integral part of modern computing systems
ranging from embedded systems to supercomputers. While heterogeneous many-core
design offers the potential for energy-efficient high performance, such
potential can only be unlocked if the application programs are suitably
parallel and can be made to match the underlying heterogeneous platform. In
this article, we provide a comprehensive survey of parallel programming models
for heterogeneous many-core architectures and review compiler techniques for
improving programmability and portability. We examine various software
optimization techniques for minimizing the communication overhead between
heterogeneous computing devices. We provide a road map for a wide variety of
different research areas. We conclude with a discussion on open issues in the
area and potential research directions. This article provides both an
accessible introduction to the fast-moving area of heterogeneous programming
and a detailed bibliography of its main achievements.
Comment: Accepted to be published at CCF Transactions on High Performance Computing
FPGA-based Accelerators of Deep Learning Networks for Learning and Classification: A Review
Due to recent advances in digital technologies and the availability of
credible data, deep learning, an area of artificial intelligence, has emerged
and demonstrated its ability and effectiveness in solving complex learning problems
not possible before. In particular, convolutional neural networks (CNNs) have
demonstrated their effectiveness in image detection and recognition
applications. However, they demand intensive computation and memory bandwidth
that general-purpose CPUs cannot provide at the desired performance levels.
Consequently, hardware accelerators that use application specific integrated
circuits (ASICs), field programmable gate arrays (FPGAs), and graphic
processing units (GPUs) have been employed to improve the throughput of CNNs.
More precisely, FPGAs have been recently adopted for accelerating the
implementation of deep learning networks due to their ability to maximize
parallelism as well as their energy efficiency. In this paper, we review
recent existing techniques for accelerating deep learning networks on FPGAs. We
highlight the key features employed by the various techniques for improving the
acceleration performance. In addition, we provide recommendations for enhancing
the utilization of FPGAs for CNN acceleration. The techniques investigated in
this paper represent the recent trends in FPGA-based accelerators of deep
learning networks. Thus, this review is expected to guide future advances in
efficient hardware accelerators and to be useful for deep learning
researchers.
Comment: This article has been accepted for publication in IEEE Access (December 2018)
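The dominant cost in CNN inference is the convolution layer's deeply nested loop structure, which is precisely what FPGA accelerators unroll and pipeline. A naive pure-Python 2D convolution (single channel, no padding, stride 1; the example data is invented) makes that loop nest explicit:

```python
# Naive 2D convolution: four nested loops over output pixels and kernel taps.
def conv2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1
    ow = len(image[0]) - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):               # output rows
        for j in range(ow):           # output cols
            for di in range(kh):      # kernel rows
                for dj in range(kw):  # kernel cols
                    out[i][j] += image[i + di][j + dj] * kernel[di][dj]
    return out

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
edge = [[1, -1]]  # horizontal difference filter
print(conv2d(img, edge))  # [[-1.0, -1.0], [-1.0, -1.0], [-1.0, -1.0]]
```

Each output pixel reuses overlapping input windows, which is why on-chip buffering of inputs and weights, rather than raw compute, often decides an FPGA accelerator's throughput.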
The Python user interface of the elsA CFD software: a coupling framework for external steering layers
The Python--elsA user interface of the elsA CFD (Computational Fluid
Dynamics) software has been developed to allow users to specify simulations
with confidence, through a global context of description objects grouped inside
scripts. The software's main features are generated documentation, context
checking and completion, and helpful error management. Further developments
have used this foundation as a coupling framework, allowing (thanks to the
descriptive approach) the coupling of external algorithms with the CFD solver
in a simple and abstract way, leading to more success in complex simulations.
Along with the description of the technical part of the interface, we try to
gather the salient points pertaining to the psychological viewpoint of user
experience (UX). We point out the differences between user interfaces and pure
data management systems such as CGNS.
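The descriptive style the abstract refers to can be sketched as description objects validated against a schema before any solver runs. This is a hypothetical illustration (the class, keys, and error handling are invented and are not elsA's real API): users state *what* the simulation is, and the layer checks it at description time.

```python
# A minimal "description object" with context checking.
class Description:
    """A named bag of settings validated against an allowed-keys schema."""
    SCHEMA = {}  # subclasses list allowed keys and their expected types

    def __init__(self, name, **settings):
        self.name = name
        for key, value in settings.items():
            if key not in self.SCHEMA:
                raise KeyError(f"{self.name}: unknown setting {key!r}")
            if not isinstance(value, self.SCHEMA[key]):
                raise TypeError(
                    f"{self.name}: {key!r} must be {self.SCHEMA[key].__name__}")
        self.settings = settings

class NumericsDescription(Description):
    SCHEMA = {"scheme": str, "cfl": float}

ok = NumericsDescription("num1", scheme="jameson", cfl=2.5)
print(ok.settings)

try:
    NumericsDescription("num2", schem="jameson")  # typo caught before any run
except KeyError as e:
    print("error:", e)
```

Catching a misspelled setting at description time, with a helpful message, is exactly the error-management benefit the abstract claims over raw data files.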
Kernel methods on spike train space for neuroscience: a tutorial
Over the last decade several positive definite kernels have been proposed to
treat spike trains as objects in Hilbert space. However, for the most part,
such attempts still remain a mere curiosity for both computational
neuroscientists and signal processing experts. This tutorial illustrates why
kernel methods can, and have already started to, change the way spike trains
are analyzed and processed. The presentation incorporates simple mathematical
analogies and convincing practical examples in an attempt to show the yet
unexplored potential of positive definite functions to quantify point
processes. It also provides a detailed overview of the current state of the art
and future challenges with the hope of engaging the readers in active
participation.
Comment: 12 pages, 8 figures, accepted in IEEE Signal Processing Magazine
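One of the simplest positive definite kernels on spike trains treats each train as a set of spike times and sums a Laplacian kernel over all spike pairs, k(S, T) = Σᵢⱼ exp(−|sᵢ − tⱼ|/τ). The sketch below (spike times in seconds; τ is a smoothing parameter, values invented for illustration) computes it for two short trains:

```python
# A pairwise-exponential kernel between spike trains, turning point
# processes into objects with an inner product in a Hilbert space.
import math

def spike_train_kernel(s, t, tau=0.01):
    """Sum of exp(-|si - tj| / tau) over all spike pairs of trains s and t."""
    return sum(math.exp(-abs(si - tj) / tau) for si in s for tj in t)

a = [0.010, 0.025, 0.040]
b = [0.012, 0.041]

print(spike_train_kernel(a, b))  # larger when the trains fire at similar times
```

With such a kernel in hand, any kernel machine (SVMs, kernel regression, kernel PCA) can operate on spike trains directly, without first binning them into vectors, which is the point the tutorial develops.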
ClangJIT: Enhancing C++ with Just-in-Time Compilation
The C++ programming language is not only a keystone of the
high-performance-computing ecosystem but has proven to be a successful base for
portable parallel-programming frameworks. As is well known, C++ programmers use
templates to specialize algorithms, thus allowing the compiler to generate
highly-efficient code for specific parameters, data structures, and so on. This
capability has been limited to those specializations that can be identified
when the application is compiled, and in many critical cases, compiling all
potentially-relevant specializations is not practical. ClangJIT provides a
well-integrated C++ language extension allowing template-based specialization
to occur during program execution. This capability has been implemented for use
in large-scale applications, and we demonstrate that
just-in-time-compilation-based dynamic specialization can be integrated into
applications, often requiring minimal changes (or no changes) to the
applications themselves, providing significant performance improvements,
programmer-productivity improvements, and decreased compilation time.
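ClangJIT moves C++ template instantiation to run time; a loose Python analogy (the function names and caching scheme here are invented for illustration, and this is not ClangJIT's mechanism) is run-time partial evaluation: once a parameter becomes known, generate a straight-line specialization and cache it, much as a template instantiation is cached per argument set.

```python
# Run-time specialization by exponent: the classic partial-evaluation example.
_specialized = {}  # cache of already-instantiated specializations

def power_spec(n):
    """Return a power function with the loop fully unrolled for exponent n."""
    if n not in _specialized:
        body = " * ".join(["x"] * n) if n > 0 else "1"
        source = f"def pow_{n}(x):\n    return {body}\n"
        ns = {}
        exec(source, ns)
        _specialized[n] = ns[f"pow_{n}"]
    return _specialized[n]

cube = power_spec(3)
print(cube(2))   # 8
print(cube(10))  # 1000
```

The cache mirrors the paper's key claim: only the specializations a run actually needs are ever compiled, instead of compiling every potentially relevant instantiation ahead of time.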