FASTCUDA: Open Source FPGA Accelerator & Hardware-Software Codesign Toolset for CUDA Kernels
Using FPGAs as hardware accelerators that communicate with a central CPU is becoming common practice in the embedded design world, but there is not yet a standard methodology and toolset to facilitate this path. On the other hand, languages such as CUDA and OpenCL provide standard development environments for Graphics Processing Unit (GPU) programming. FASTCUDA is a platform that provides the necessary software toolset, hardware architecture, and design methodology to efficiently adapt the CUDA approach into a new FPGA design flow. With FASTCUDA, the CUDA kernels of a CUDA-based application are partitioned into two groups with minimal user intervention: those that are compiled and executed in parallel software, and those that are synthesized and implemented in hardware. A modern low-power FPGA can provide the processing power (via numerous embedded micro-CPUs) and the logic capacity for both the software and hardware implementations of the CUDA kernels. This paper describes the system requirements and the architectural decisions behind the FASTCUDA approach.
Design Ltd.: Renovated Myths for the Development of Socially Embedded Technologies
This paper argues that traditional and mainstream mythologies, which have
been continually told within the Information Technology domain among designers
and advocators of conceptual modelling since the 1960s in different fields of
computing sciences, could now be renovated or substituted in the mould of more
recent discourses about performativity, complexity and end-user creativity that
have been constructed across different fields in the meantime. In the paper,
it is submitted that these discourses could motivate IT professionals to
undertake alternative approaches toward the co-construction of
socio-technical systems, i.e., social settings where humans cooperate to reach
common goals by means of mediating computational tools. The authors advocate
further discussion about and consolidation of some concepts in design research,
design practice and more generally Information Technology (IT) development,
like those of: task-artifact entanglement, universatility (sic) of End-User
Development (EUD) environments, bricolant/bricoleur end-user, logic of
bricolage, maieuta-designers (sic), and laissez-faire method to socio-technical
construction. Points backing these and similar concepts are made to promote
further discussion on the need to rethink the main assumptions underlying IT
design and development some fifty years after the coming of age of software and
modern IT in the organizational domain.
Comment: This is the peer-unreviewed version of a manuscript that is to appear in D.
Randall, K. Schmidt, & V. Wulf (Eds.), Designing Socially Embedded
Technologies: A European Challenge (2013, forthcoming) with the title
"Building Socially Embedded Technologies: Implications on Design" within an
EUSSET editorial initiative (www.eusset.eu/
Semantic transfer in Verbmobil
This paper is a detailed discussion of semantic transfer in the context of the Verbmobil Machine Translation project. The use of semantic transfer as a translation mechanism is introduced and justified by comparison with alternative approaches. Some criteria for evaluation of transfer frameworks are discussed and a comparison is made of three different approaches to the representation of translation rules or equivalences. This is followed by a discussion of control of application of transfer rules and interaction with a domain description and inference component.
Towards Automatic Learning of Heuristics for Mechanical Transformations of Procedural Code
The current trend in next-generation exascale systems goes towards
integrating a wide range of specialized (co-)processors into traditional
supercomputers. However, the integration of different specialized devices
increases the degree of heterogeneity and the complexity of programming such
systems. Due to the efficiency of heterogeneous systems in terms of
watts and FLOPS per surface unit, opening access to heterogeneous platforms
for a wider range of users is an important problem to be tackled. In order to
bridge the gap between heterogeneous systems and programmers, in this paper we
propose a machine learning-based approach to learn heuristics for defining
transformation strategies of a program transformation system. Our approach
proposes a novel combination of reinforcement learning and classification
methods to efficiently tackle the problems inherent to this type of system.
Preliminary results demonstrate the suitability of the approach for easing the
programmability of heterogeneous systems.
Comment: Part of the Program Transformation for Programmability in
Heterogeneous Architectures (PROHA) workshop, Barcelona, Spain, 12th March
2016, 9 pages, LaTe
A framework for automatically generating optimized digital designs from C-language loops
Reconfigurable computing has the potential for providing significant performance increases to a number of computing applications. However, realizing these benefits requires digital design experience and knowledge of hardware description languages (HDLs). While a number of tools have focused on translation of high-level languages (HLLs) to HDLs, the tools do not always create optimized digital designs that are competitive with hand-coded solutions. This work describes an automatic optimization in the C-to-HDL transformation that reorganizes operations between pipeline stages in order to reduce critical path lengths. The effects of this optimization are examined on the MD5, SHA-1, and Smith-Waterman algorithms. Results show that the optimization yields performance gains of 13%-37% and that the automatically generated implementations perform comparably to hand-coded implementations.
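To illustrate the general idea of moving operations between pipeline stages to shorten the critical path, here is a minimal sketch. It is not the method of the paper above: it assumes a simple linear chain of operations with known delays, and the function name `balance_stages`, the delay values, and the dynamic-programming formulation are all hypothetical choices made for this example.

```python
from typing import List, Tuple


def balance_stages(delays: List[float], stages: int) -> Tuple[float, List[List[float]]]:
    """Split a chain of operation delays into contiguous pipeline stages
    so that the slowest stage (the critical path) is as short as possible.

    Illustrative only: real C-to-HDL tools work on dataflow graphs,
    not on a simple linear chain as assumed here.
    """
    n = len(delays)
    if not 1 <= stages <= n:
        raise ValueError("stages must be between 1 and len(delays)")

    # prefix[i] is the total delay of the first i operations.
    prefix = [0.0]
    for d in delays:
        prefix.append(prefix[-1] + d)

    inf = float("inf")
    # best[k][i]: smallest achievable critical path when the first i
    # operations are packed into exactly k stages.
    best = [[inf] * (n + 1) for _ in range(stages + 1)]
    # cut[k][i]: index where the last of those k stages begins.
    cut = [[0] * (n + 1) for _ in range(stages + 1)]
    best[0][0] = 0.0

    for k in range(1, stages + 1):
        for i in range(k, n + 1):
            for j in range(k - 1, i):
                # The last stage holds operations j..i-1.
                cost = max(best[k - 1][j], prefix[i] - prefix[j])
                if cost < best[k][i]:
                    best[k][i] = cost
                    cut[k][i] = j

    # Walk the recorded cut points backwards to recover the stages.
    groups: List[List[float]] = []
    i = n
    for k in range(stages, 0, -1):
        j = cut[k][i]
        groups.append(delays[j:i])
        i = j
    groups.reverse()
    return best[stages][n], groups
```

For the assumed delays `[4, 1, 1, 2, 3, 1]` and three stages, an even split of two operations per stage leaves a slowest stage of 5 time units, whereas calling `balance_stages([4, 1, 1, 2, 3, 1], 3)` finds an assignment whose slowest stage takes 4.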