A Logic-Independent IDE
The author's MMT system provides a framework for defining and implementing
logical systems. By combining MMT with the jEdit text editor, we obtain a
logic-independent IDE. The IDE functionality includes advanced features such as
context-sensitive auto-completion, search, and change management.
Comment: In Proceedings UITP 2014, arXiv:1410.785
PyCUDA and PyOpenCL: A Scripting-Based Approach to GPU Run-Time Code Generation
High-performance computing has recently seen a surge of interest in
heterogeneous systems, with an emphasis on modern Graphics Processing Units
(GPUs). These devices offer tremendous potential for performance and efficiency
in important large-scale applications of computational science. However,
exploiting this potential can be challenging, as one must adapt to the
specialized and rapidly evolving computing environment that GPUs present. One
way of addressing this challenge is to embrace better techniques and develop
tools tailored to these devices. This article presents one simple
technique, GPU run-time code generation (RTCG), along with PyCUDA and PyOpenCL,
two open-source toolkits that support this technique.
In introducing PyCUDA and PyOpenCL, this article proposes the combination of
a dynamic, high-level scripting language with the massive performance of a GPU
as a compelling two-tiered computing platform, potentially offering significant
performance and productivity advantages over conventional single-tier, static
systems. The concept of RTCG is simple and easily implemented using existing,
robust infrastructure. Nonetheless it is powerful enough to support (and
encourage) the creation of custom application-specific tools by its users. The
premise of the paper is illustrated by a wide range of examples where the
technique has been applied with considerable success.
Comment: Submitted to Parallel Computing, Elsevier
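The RTCG idea above can be sketched without a GPU: a real PyCUDA workflow would hand the generated source to `pycuda.compiler.SourceModule`, but the same generate-specialize-compile loop can be illustrated in pure Python. This is a hypothetical stand-alone sketch (the template and function names are invented for illustration), not PyCUDA's API.

```python
# Run-time code generation (RTCG): a kernel template is specialized with
# concrete parameters at run time, then compiled and executed. PyCUDA does
# this with CUDA C source; here exec() stands in for the GPU compiler so
# the sketch runs anywhere.

KERNEL_TEMPLATE = """
def saxpy_{n}(a, x, y):
    # The loop bound {n} is baked in at generation time, so the compiler
    # can treat it as a constant (the payoff RTCG is after).
    out = [0.0] * {n}
    for i in range({n}):
        out[i] = a * x[i] + y[i]
    return out
"""

def generate_saxpy(n):
    """Generate, compile, and return a saxpy kernel specialized for length n."""
    source = KERNEL_TEMPLATE.format(n=n)
    namespace = {}
    exec(compile(source, f"<rtcg-saxpy-{n}>", "exec"), namespace)
    return namespace[f"saxpy_{n}"]

saxpy4 = generate_saxpy(4)
result = saxpy4(2.0, [1, 2, 3, 4], [10, 20, 30, 40])
print(result)  # [12.0, 24.0, 36.0, 48.0]
```

The two-tier structure is visible even in this toy: the high-level language builds and caches specialized low-level routines on demand, driven by values only known at run time.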
Efficiently Retrieving Function Dependencies in the Linux Kernel Using XSB
In this paper we investigate XSB-Prolog as a static analysis engine for data
represented by medium-sized graphs. We use XSB-Prolog to automatically identify
function dependencies in the Linux Kernel---queries that are difficult to
implement efficiently in a commodity database and that developers often have to
identify manually. This project illustrates that Prolog systems are ideal for
building tools for use in other disciplines that require sophisticated
inferences, because Prolog is both declarative and can efficiently implement
complex problem specifications through tabling and indexing.
Comment: Part of WLPE 2013 proceedings (arXiv:1308.2055)
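The paper's queries are written in XSB-Prolog; as a loose analogy only, the same transitive-dependency question can be phrased in Python over a toy call graph, with memoization playing a role roughly comparable to XSB's tabling (each subquery is solved once and reused). The graph below is invented for illustration and is not the Linux kernel data from the paper; note also that real call graphs contain cycles, which tabling handles and this naive recursion would not.

```python
# Transitive function dependencies over a small, acyclic call graph.
from functools import lru_cache

# Hypothetical call graph: caller -> set of direct callees.
CALLS = {
    "sys_open": {"do_sys_open"},
    "do_sys_open": {"getname", "fd_install"},
    "getname": {"kmem_cache_alloc"},
    "fd_install": set(),
    "kmem_cache_alloc": set(),
}

@lru_cache(maxsize=None)
def depends_on(fn):
    """All functions reachable from fn: direct and indirect callees."""
    result = set()
    for callee in CALLS.get(fn, ()):
        result |= {callee} | depends_on(callee)
    return frozenset(result)

print(sorted(depends_on("sys_open")))
# ['do_sys_open', 'fd_install', 'getname', 'kmem_cache_alloc']
```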
Experiences from Exporting Major Proof Assistant Libraries
The interoperability of proof assistants and the integration of their
libraries is a highly valued but elusive goal in the field of theorem proving.
As a preparatory step, in previous work, we translated the libraries of
multiple proof assistants, specifically the ones of Coq, HOL Light, IMPS,
Isabelle, Mizar, and PVS into a universal format: OMDoc/MMT.
Each translation presented tremendous theoretical, technical, and social
challenges, some universal and some system-specific, some solvable and some
still open. In this paper, we survey these challenges and compare and evaluate
the solutions we chose.
We believe similar library translations will be an essential part of any
future system interoperability solution, and that our experiences will prove
valuable to others undertaking such efforts.
LightNet: A Versatile, Standalone Matlab-based Environment for Deep Learning
LightNet is a lightweight, versatile and purely Matlab-based deep learning
framework. The idea underlying its design is to provide an easy-to-understand,
easy-to-use and efficient computational platform for deep learning research.
The implemented framework supports major deep learning architectures such as
Multilayer Perceptron Networks (MLP), Convolutional Neural Networks (CNN) and
Recurrent Neural Networks (RNN). The framework also supports both CPU and GPU
computation, and the switch between them is straightforward. Different
applications in computer vision, natural language processing and robotics are
demonstrated as experiments.
Comment: Accepted to ACM MULTIMEDIA 2016 Open Source Software Competition
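LightNet itself is Matlab; as a language-neutral illustration of the MLP forward pass that such frameworks implement, here is a minimal two-layer perceptron in pure Python. The layer sizes and names are made up for the example and do not reflect LightNet's actual interface.

```python
# A two-layer MLP forward pass: linear -> ReLU -> linear.
import random

random.seed(0)

def relu(v):
    return [max(0.0, x) for x in v]

def linear(W, b, x):
    # y = W x + b, with W given as a list of rows.
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

def mlp_forward(x, W1, b1, W2, b2):
    """Forward pass of a 2-layer perceptron."""
    return linear(W2, b2, relu(linear(W1, b1, x)))

# Tiny example: 2 inputs -> 3 hidden units -> 1 output.
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
b1 = [0.0] * 3
W2 = [[random.uniform(-1, 1) for _ in range(3)]]
b2 = [0.0]

y = mlp_forward([1.0, -1.0], W1, b1, W2, b2)
print(y)  # a single-element output list
```

A CPU/GPU switch of the kind LightNet provides amounts to swapping the array type these operations run on while keeping the layer definitions unchanged.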
Parallel Programming Models for Heterogeneous Many-Cores : A Survey
Heterogeneous many-cores are now an integral part of modern computing systems
ranging from embedded systems to supercomputers. While heterogeneous many-core
design offers the potential for energy-efficient high performance, such
potential can only be unlocked if the application programs are suitably
parallel and can be made to match the underlying heterogeneous platform. In
this article, we provide a comprehensive survey of parallel programming models
for heterogeneous many-core architectures and review compiler techniques for
improving programmability and portability. We examine various software
optimization techniques for minimizing the communication overhead between
heterogeneous computing devices. We provide a road map for a wide variety of
different research areas. We conclude with a discussion on open issues in the
area and potential research directions. This article provides both an
accessible introduction to the fast-moving area of heterogeneous programming
and a detailed bibliography of its main achievements.
Comment: Accepted to be published at CCF Transactions on High Performance Computing
FPGA-based Accelerators of Deep Learning Networks for Learning and Classification: A Review
Due to recent advances in digital technologies and the availability of
credible data, deep learning, an area of artificial intelligence, has emerged
and demonstrated its ability and effectiveness in solving complex learning problems
not possible before. In particular, convolutional neural networks (CNNs) have
demonstrated their effectiveness in image detection and recognition
applications. However, they demand intensive computation and memory bandwidth
that general-purpose CPUs cannot provide at the desired performance levels.
Consequently, hardware accelerators that use application specific integrated
circuits (ASICs), field programmable gate arrays (FPGAs), and graphic
processing units (GPUs) have been employed to improve the throughput of CNNs.
More precisely, FPGAs have been recently adopted for accelerating the
implementation of deep learning networks due to their ability to maximize
parallelism as well as their energy efficiency. In this paper, we review
recent existing techniques for accelerating deep learning networks on FPGAs. We
highlight the key features employed by the various techniques for improving the
acceleration performance. In addition, we provide recommendations for enhancing
the utilization of FPGAs for CNN acceleration. The techniques investigated in
this paper represent the recent trends in FPGA-based accelerators of deep
learning networks. Thus, this review is expected to guide future advances in
efficient hardware accelerators and to be useful for deep learning
researchers.
Comment: This article has been accepted for publication in IEEE Access (December 2018)
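The dominant cost in CNN inference is the convolution layer's deeply nested loop structure, which is precisely what FPGA accelerators unroll and pipeline. A naive pure-Python 2D convolution (single channel, no padding, stride 1; the example data is invented) makes that loop nest explicit:

```python
# Naive 2D convolution: four nested loops over output pixels and kernel taps.
def conv2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1
    ow = len(image[0]) - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):               # output rows
        for j in range(ow):           # output cols
            for di in range(kh):      # kernel rows
                for dj in range(kw):  # kernel cols
                    out[i][j] += image[i + di][j + dj] * kernel[di][dj]
    return out

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
edge = [[1, -1]]  # horizontal difference filter
print(conv2d(img, edge))  # [[-1.0, -1.0], [-1.0, -1.0], [-1.0, -1.0]]
```

Each output pixel reuses overlapping input windows, which is why on-chip buffering of inputs and weights, rather than raw compute, often decides an FPGA accelerator's throughput.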
The Python user interface of the elsA CFD software: a coupling framework for external steering layers
The Python--elsA user interface of the elsA CFD (Computational Fluid
Dynamics) software has been developed to allow users to specify simulations
with confidence, through a global context of description objects grouped inside
scripts. The software's main features are generated documentation, context
checking and completion, and helpful error management. Further developments
have used this foundation as a coupling framework, allowing (thanks to the
descriptive approach) the coupling of external algorithms with the CFD solver
in a simple and abstract way, leading to more success in complex simulations.
Along with the description of the technical part of the interface, we try to
gather the salient points pertaining to the psychological viewpoint of user
experience (UX). We point out the differences between user interfaces and pure
data management systems such as CGNS.
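The descriptive style the abstract refers to can be sketched as description objects validated against a schema before any solver runs. This is a hypothetical illustration (the class, keys, and error handling are invented and are not elsA's real API): users state *what* the simulation is, and the layer checks it at description time.

```python
# A minimal "description object" with context checking.
class Description:
    """A named bag of settings validated against an allowed-keys schema."""
    SCHEMA = {}  # subclasses list allowed keys and their expected types

    def __init__(self, name, **settings):
        self.name = name
        for key, value in settings.items():
            if key not in self.SCHEMA:
                raise KeyError(f"{self.name}: unknown setting {key!r}")
            if not isinstance(value, self.SCHEMA[key]):
                raise TypeError(
                    f"{self.name}: {key!r} must be {self.SCHEMA[key].__name__}")
        self.settings = settings

class NumericsDescription(Description):
    SCHEMA = {"scheme": str, "cfl": float}

ok = NumericsDescription("num1", scheme="jameson", cfl=2.5)
print(ok.settings)

try:
    NumericsDescription("num2", schem="jameson")  # typo caught before any run
except KeyError as e:
    print("error:", e)
```

Catching a misspelled setting at description time, with a helpful message, is exactly the error-management benefit the abstract claims over raw data files.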
Kernel methods on spike train space for neuroscience: a tutorial
Over the last decade several positive definite kernels have been proposed to
treat spike trains as objects in Hilbert space. However, for the most part,
such attempts still remain a mere curiosity for both computational
neuroscientists and signal processing experts. This tutorial illustrates why
kernel methods can, and have already started to, change the way spike trains
are analyzed and processed. The presentation incorporates simple mathematical
analogies and convincing practical examples in an attempt to show the yet
unexplored potential of positive definite functions to quantify point
processes. It also provides a detailed overview of the current state of the art
and future challenges with the hope of engaging the readers in active
participation.
Comment: 12 pages, 8 figures, accepted in IEEE Signal Processing Magazine
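One of the simplest positive definite kernels on spike trains treats each train as a set of spike times and sums a Laplacian kernel over all spike pairs, k(S, T) = Σᵢⱼ exp(−|sᵢ − tⱼ|/τ). The sketch below (spike times in seconds; τ is a smoothing parameter, values invented for illustration) computes it for two short trains:

```python
# A pairwise-exponential kernel between spike trains, turning point
# processes into objects with an inner product in a Hilbert space.
import math

def spike_train_kernel(s, t, tau=0.01):
    """Sum of exp(-|si - tj| / tau) over all spike pairs of trains s and t."""
    return sum(math.exp(-abs(si - tj) / tau) for si in s for tj in t)

a = [0.010, 0.025, 0.040]
b = [0.012, 0.041]

print(spike_train_kernel(a, b))  # larger when the trains fire at similar times
```

With such a kernel in hand, any kernel machine (SVMs, kernel regression, kernel PCA) can operate on spike trains directly, without first binning them into vectors, which is the point the tutorial develops.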
ClangJIT: Enhancing C++ with Just-in-Time Compilation
The C++ programming language is not only a keystone of the
high-performance-computing ecosystem but has proven to be a successful base for
portable parallel-programming frameworks. As is well known, C++ programmers use
templates to specialize algorithms, thus allowing the compiler to generate
highly-efficient code for specific parameters, data structures, and so on. This
capability has been limited to those specializations that can be identified
when the application is compiled, and in many critical cases, compiling all
potentially-relevant specializations is not practical. ClangJIT provides a
well-integrated C++ language extension allowing template-based specialization
to occur during program execution. This capability has been implemented for use
in large-scale applications, and we demonstrate that
just-in-time-compilation-based dynamic specialization can be integrated into
applications, often requiring minimal changes (or no changes) to the
applications themselves, providing significant performance improvements,
programmer-productivity improvements, and decreased compilation time.
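ClangJIT moves C++ template instantiation to run time; a loose Python analogy (the function names and caching scheme here are invented for illustration, and this is not ClangJIT's mechanism) is run-time partial evaluation: once a parameter becomes known, generate a straight-line specialization and cache it, much as a template instantiation is cached per argument set.

```python
# Run-time specialization by exponent: the classic partial-evaluation example.
_specialized = {}  # cache of already-instantiated specializations

def power_spec(n):
    """Return a power function with the loop fully unrolled for exponent n."""
    if n not in _specialized:
        body = " * ".join(["x"] * n) if n > 0 else "1"
        source = f"def pow_{n}(x):\n    return {body}\n"
        ns = {}
        exec(source, ns)
        _specialized[n] = ns[f"pow_{n}"]
    return _specialized[n]

cube = power_spec(3)
print(cube(2))   # 8
print(cube(10))  # 1000
```

The cache mirrors the paper's key claim: only the specializations a run actually needs are ever compiled, instead of compiling every potentially relevant instantiation ahead of time.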