Evaluating Rapid Application Development with Python for Heterogeneous Processor-based FPGAs
As modern FPGAs evolve to include more heterogeneous processing elements,
such as ARM cores, it makes sense to consider these devices as processors first
and FPGA accelerators second. As such, the conventional FPGA development
environment must also adapt to support more software-like programming
functionality. While high-level synthesis tools can help reduce FPGA
development time, a large expertise gap remains in realizing
high-performance implementations. At the system level, the skill set necessary to
integrate multiple custom IP hardware cores, interconnects, memory interfaces,
and now heterogeneous processing elements is complex. Rather than drive FPGA
development from the hardware up, we consider the impact of leveraging Python
to accelerate application development. Python offers highly optimized
libraries from an incredibly large developer community, yet is limited to the
performance of the hardware system. In this work we evaluate PYNQ, a Python
development environment for application development on Xilinx Zynq devices,
along with its performance implications and associated bottlenecks.
We compare our results against existing C-based and hand-coded
implementations to better understand if Python can be the glue that binds
together software and hardware developers.
Comment: To appear in 2017 IEEE 25th Annual International Symposium on
Field-Programmable Custom Computing Machines (FCCM'17)
Lightweight Multilingual Software Analysis
Developer preferences, language capabilities and the persistence of older
languages contribute to the trend that large software codebases are often
multilingual, that is, written in more than one computer language. While
developers can leverage monolingual software development tools to build
software components, companies are faced with the problem of managing the
resultant large, multilingual codebases to address issues with security,
efficiency, and quality metrics. The key challenge is to address the opaque
nature of the language interoperability interface: one language calling
procedures in a second (which may call a third, or even back to the first),
resulting in a potentially tangled, inefficient and insecure codebase. An
architecture is proposed for lightweight static analysis of large multilingual
codebases: the MLSA architecture. Its modular and table-oriented structure
addresses the open-ended nature of multiple languages and language
interoperability APIs. As an application, we focus here on the construction of
call-graphs that capture both inter-language and intra-language calls. The
algorithms for extracting multilingual call-graphs from codebases are
presented, and several examples of multilingual software engineering analysis
are discussed. The state of the implementation and testing of MLSA is
presented, and the implications for future work are discussed.
Comment: 15 pages
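The call-graph idea in the abstract above can be illustrated with a minimal sketch. The per-language call tables and their (caller, callee, callee language) row layout are assumptions made for illustration and are not MLSA's actual table format; the function names are likewise hypothetical.

```python
from collections import defaultdict

# Hypothetical per-language call tables: (caller, callee, callee_language).
# The row layout is an assumption for illustration, not MLSA's real format.
C_CALLS = [
    ("main", "init", "c"),
    ("main", "run_script", "python"),   # C calling into Python
]
PY_CALLS = [
    ("run_script", "parse", "python"),
    ("run_script", "init", "c"),        # Python calling back into C
]

def build_call_graph(tables):
    """Merge per-language call tables into one graph keyed by (language, name)."""
    graph = defaultdict(set)
    for language, rows in tables.items():
        for caller, callee, callee_language in rows:
            graph[(language, caller)].add((callee_language, callee))
    return graph

def inter_language_edges(graph):
    """Return the edges whose caller and callee live in different languages."""
    return sorted(
        (src, dst)
        for src, targets in graph.items()
        for dst in targets
        if src[0] != dst[0]
    )

graph = build_call_graph({"c": C_CALLS, "python": PY_CALLS})
cross = inter_language_edges(graph)
```

Keying nodes by (language, name) keeps same-named procedures in different languages distinct, so intra-language and inter-language edges can share one graph.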
Towards Python-based Domain-specific Languages for Self-reconfigurable Modular Robotics Research
This paper explores the role of operating systems and high-level languages in
the development of software and domain-specific languages (DSLs) for
self-reconfigurable robotics. We review some of the current trends in
self-reconfigurable robotics and describe the development of a software system
for ATRON II which utilizes Linux and Python to significantly improve software
abstraction and portability while providing some basic features which could
prove useful when using Python, either stand-alone or via a DSL, on a
self-reconfigurable robot system. These features include transparent socket
communication, module identification, easy software transfer and reliable
module-to-module communication. The end result is a software platform for
modular robots that, where appropriate, builds on existing work in operating
systems, virtual machines, middleware and high-level languages.
Comment: Presented at DSLRob 2011 (arXiv:1212.3308)
Integrated Dynamic Facade Control with an Agent-based Architecture for Commercial Buildings
Dynamic façades have significant technical potential to minimize heating, cooling, and lighting energy use and peak electric demand in the perimeter zone of commercial buildings, but the performance of these systems relies on balancing complex trade-offs between solar control, daylight admission, comfort, and view over the life of the installation. As the context for controllable energy-efficiency technologies grows more complex with the increased use of intermittent renewable energy resources on the grid, it has become increasingly important to look ahead towards more advanced approaches to integrated systems control in order to achieve optimum life-cycle performance at a lower cost. This study examines the feasibility of a model predictive control system for low-cost autonomous dynamic façades. A system architecture designed around lightweight, simple agents is proposed. The architecture accommodates whole-building and grid-level demands through its modular, hierarchical approach. Automatically generated models for computing window heat gains, daylight illuminance, and discomfort glare are described. The open source Modelica and JModelica software tools were used to determine the optimum state of control given inputs of window heat gains and lighting loads for a 24-hour optimization horizon. Penalty functions for glare and view/daylight quality were implemented as constraints. The control system was tested on a low-power controller (1.4 GHz single core with 2 GB of RAM) to evaluate feasibility. The target platform is a low-cost ($35/unit) embedded controller with a 1.2 GHz dual-core CPU and 1 GB of RAM. Configuration and commissioning of the curtainwall unit were designed to be largely plug and play with minimal inputs required by the manufacturer through a web-based user interface. An example application was used to demonstrate optimal control of a three-zone electrochromic window for a south-facing zone.
The overall approach was deemed to be promising. Further engineering is required to enable scalable, turnkey solutions.
cphVB: A System for Automated Runtime Optimization and Parallelization of Vectorized Applications
Modern processor architectures, in addition to having ever more cores, also
require ever more attention to memory layout in order to run at full
capacity. The usefulness of most languages is diminishing, as their
abstractions, structures or objects are hard to map onto modern processor
architectures efficiently.
This paper introduces a new abstract machine framework, cphVB,
that enables vector-oriented high-level programming languages to map onto a
broad range of architectures efficiently. The idea is to close the gap between
high-level languages and hardware-optimized low-level implementations. By
translating high-level vector operations into an intermediate vector bytecode,
cphVB enables specialized vector engines to efficiently execute the vector
operations.
The primary success criteria are to maintain a complete abstraction from
low-level details and to provide efficient code execution across different
modern processors. We evaluate the presented design through a setup that
targets multi-core CPU architectures. We evaluate the performance of the
implementation using Python implementations of well-known algorithms: a Jacobi
solver, a kNN search, a shallow water simulation and a synthetic stencil
simulation. All demonstrate good performance.
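The translation of high-level vector operations into an intermediate vector bytecode, as described in the abstract above, can be sketched in a few lines of pure Python. The opcode names, the instruction tuple layout, and the classes `Program` and `LazyVector` are all hypothetical; they are not cphVB's actual bytecode or API, only an illustration of recording operations first and executing them later in a separate engine.

```python
import operator

class Program:
    """Holds recorded bytecode plus the concrete input vectors."""
    def __init__(self):
        self.code = []     # list of (opcode, out_reg, in_reg_a, in_reg_b)
        self.inputs = {}   # register index -> list of numbers
        self.count = 0     # number of registers allocated so far

    def new_register(self):
        self.count += 1
        return self.count - 1

    def vector(self, values):
        reg = self.new_register()
        self.inputs[reg] = list(values)
        return LazyVector(self, reg)

class LazyVector:
    """Records vector operations as bytecode instead of computing eagerly."""
    def __init__(self, program, register):
        self.program = program
        self.register = register

    def _emit(self, opcode, other):
        out = self.program.new_register()
        self.program.code.append((opcode, out, self.register, other.register))
        return LazyVector(self.program, out)

    def __add__(self, other):
        return self._emit("ADD", other)

    def __mul__(self, other):
        return self._emit("MUL", other)

OPS = {"ADD": operator.add, "MUL": operator.mul}

def execute(program, result):
    """A trivial stand-in for a vector engine: interpret bytecode element-wise."""
    regs = dict(program.inputs)
    for opcode, out, a, b in program.code:
        regs[out] = [OPS[opcode](x, y) for x, y in zip(regs[a], regs[b])]
    return regs[result.register]

prog = Program()
x = prog.vector([1, 2, 3])
y = prog.vector([10, 20, 30])
z = x * y + x                 # nothing is computed yet; two instructions are recorded
result = execute(prog, z)
```

Because the front end only records instructions, a different `execute` function could in principle target other hardware without changing the user-level code.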
Making an Embedded DBMS JIT-friendly
While database management systems (DBMSs) are highly optimized, interactions
across the boundary between the programming language (PL) and the DBMS are
costly, even for in-process embedded DBMSs. In this paper, we show that
programs that interact with the popular embedded DBMS SQLite can be
significantly optimized - by a factor of 3.4 in our benchmarks - by inlining
across the PL / DBMS boundary. We achieved this speed-up by replacing parts of
SQLite's C interpreter with RPython code and composing the resulting
meta-tracing virtual machine (VM) - called SQPyte - with the PyPy VM. SQPyte
does not compromise stand-alone SQL performance and is 2.2% faster than SQLite
on the widely used TPC-H benchmark suite.
Comment: 24 pages, 18 figures
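The PL / DBMS boundary mentioned in the abstract above can be made concrete with CPython's standard `sqlite3` module. This sketch is not SQPyte and measures nothing; the table name and values are hypothetical. It only contrasts a query whose rows are each handed back to the interpreter with one that is aggregated entirely inside the DBMS.

```python
import sqlite3

# In-memory database with a small hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, price REAL)")
conn.executemany("INSERT INTO items (price) VALUES (?)", [(1.5,), (2.5,), (4.0,)])

# Row-at-a-time: every fetched row crosses from SQLite's C code
# back into the Python interpreter before it is added up.
total = 0.0
for (price,) in conn.execute("SELECT price FROM items"):
    total += price

# Aggregated inside the DBMS: only a single result row crosses the boundary.
(db_total,) = conn.execute("SELECT SUM(price) FROM items").fetchone()

conn.close()
```

Both variants compute the same sum; they differ only in how often control passes between the host language and the embedded DBMS.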
Fine-grained Language Composition: A Case Study
Although run-time language composition is common, it normally takes the form
of a crude Foreign Function Interface (FFI). While useful, such compositions
tend to be coarse-grained and slow. In this paper we introduce a novel
fine-grained syntactic composition of PHP and Python which allows users to
embed each language inside the other, including referencing variables across
languages. This composition raises novel design and implementation challenges.
We show that good solutions can be found to the design challenges and that the
resulting implementation imposes an acceptable performance overhead of, at
most, 2.6x.
Comment: 27 pages, 4 tables, 5 figures
The Astrophysical Multipurpose Software Environment
We present the open source Astrophysical Multi-purpose Software Environment
(AMUSE, www.amusecode.org), a component library for performing astrophysical
simulations involving different physical domains and scales. It couples
existing codes within a Python framework based on a communication layer using
MPI. The interfaces are standardized for each domain and their implementation
based on MPI guarantees that the whole framework is well-suited for distributed
computation. It includes facilities for unit handling and data storage.
Currently it includes codes for gravitational dynamics, stellar evolution,
hydrodynamics and radiative transfer. Within each domain the interfaces to the
codes are as similar as possible. We describe the design and implementation of
AMUSE, as well as the main components and community codes currently supported
and we discuss the code interactions facilitated by the framework.
Additionally, we demonstrate how AMUSE can be used to resolve complex
astrophysical problems by presenting example applications.
Comment: 23 pages, 25 figures, accepted for A&A
PyCUDA and PyOpenCL: A Scripting-Based Approach to GPU Run-Time Code Generation
High-performance computing has recently seen a surge of interest in
heterogeneous systems, with an emphasis on modern Graphics Processing Units
(GPUs). These devices offer tremendous potential for performance and efficiency
in important large-scale applications of computational science. However,
exploiting this potential can be challenging, as one must adapt to the
specialized and rapidly evolving computing environment currently exhibited by
GPUs. One way of addressing this challenge is to embrace better techniques and
develop tools tailored to their needs. This article presents one simple
technique, GPU run-time code generation (RTCG), along with PyCUDA and PyOpenCL,
two open-source toolkits that support this technique.
In introducing PyCUDA and PyOpenCL, this article proposes the combination of
a dynamic, high-level scripting language with the massive performance of a GPU
as a compelling two-tiered computing platform, potentially offering significant
performance and productivity advantages over conventional single-tier, static
systems. The concept of RTCG is simple and easily implemented using existing,
robust infrastructure. Nonetheless it is powerful enough to support (and
encourage) the creation of custom application-specific tools by its users. The
premise of the paper is illustrated by a wide range of examples where the
technique has been applied with considerable success.
Comment: Submitted to Parallel Computing, Elsevier
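Run-time code generation itself can be shown without a GPU. The sketch below is a CPU-only analogy that uses Python's built-in `compile` and `exec`; it does not use the PyCUDA or PyOpenCL APIs, and the helper name `make_axpy` is hypothetical. It generates source text with a constant baked in at run time, compiles it, and calls the result.

```python
def make_axpy(a):
    """Generate and compile source for a*x + y with the scalar `a` baked in."""
    source = (
        "def axpy(x, y):\n"
        f"    return [{a!r} * xi + yi for xi, yi in zip(x, y)]\n"
    )
    namespace = {}
    # Compile the generated text at run time and load it into a fresh namespace.
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["axpy"], source

axpy, source = make_axpy(2.0)
result = axpy([1.0, 2.0], [10.0, 20.0])
```

The same pattern, with the generated text being GPU kernel source handed to a device compiler, is the two-tier arrangement the abstract describes.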