OpenCLIPER: an OpenCL-based C++ Framework for Overhead-Reduced Medical Image Processing and Reconstruction on Heterogeneous Devices
Medical image processing is often limited by the computational cost of the
involved algorithms. Dedicated computing devices (GPUs in particular) do provide
significant efficiency boosts, but using them carries an extra cost in
housekeeping tasks (device selection and initialization, data streaming,
synchronization with the CPU, and others), which may deter developers from
adopting them. This paper describes an OpenCL-based framework that
is capable of handling dedicated computing devices seamlessly and that allows
the developer to concentrate on image processing tasks.
The framework automatically handles device discovery and initialization, data
transfers to and from the device and the file system, and kernel loading and
compilation. Data structures need to be defined only once, independently of the
computing device; consequently, a single code base serves every device,
including the host CPU. Pinned memory/buffer mapping is used to achieve maximum performance
in data transfers.
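To make the housekeeping cost concrete, below is a minimal sketch of the device selection and pinned-memory (mapped buffer) boilerplate that raw OpenCL requires and that a framework like this automates. It uses only the standard OpenCL C API, not OpenCLIPER's own classes, and error handling is omitted for brevity.

    #include <CL/cl.h>

    int main() {
        // Device discovery and initialization: the housekeeping a framework can hide.
        cl_platform_id platform;
        cl_uint count = 0;
        clGetPlatformIDs(1, &platform, &count);
        cl_device_id device;
        clGetDeviceIDs(platform, CL_DEVICE_TYPE_DEFAULT, 1, &device, &count);
        cl_int err;
        cl_context ctx = clCreateContext(nullptr, 1, &device, nullptr, nullptr, &err);
        cl_command_queue queue = clCreateCommandQueue(ctx, device, 0, &err);

        // Pinned memory / buffer mapping: a device buffer backed by page-locked
        // host memory, mapped so the host can fill it without an extra copy.
        const size_t bytes = 512 * 512 * sizeof(float);
        cl_mem img = clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_ALLOC_HOST_PTR,
                                    bytes, nullptr, &err);
        float* host = static_cast<float*>(clEnqueueMapBuffer(
            queue, img, CL_TRUE, CL_MAP_WRITE, 0, bytes, 0, nullptr, nullptr, &err));
        for (size_t i = 0; i < 512 * 512; ++i) host[i] = 0.0f;  // load image data here
        clEnqueueUnmapMemObject(queue, img, host, 0, nullptr, nullptr);

        // ...compile kernels, set arguments, enqueue work, read results back...

        clReleaseMemObject(img);
        clReleaseCommandQueue(queue);
        clReleaseContext(ctx);
        return 0;
    }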
Code fragments included in the paper show how the computing device is almost
immediately and effortlessly available to the user's algorithms, so developers
can focus on productive work. Code required for device selection and
initialization, data loading and streaming, and kernel compilation is minimal
and systematic. Algorithms can be thought of as mathematical operators (called
processes), with input, output and parameters, and they may be chained one
after another easily and efficiently. Also for efficiency, processes can have
their initialization work split from their core workload, so process chains and
loops do not incur performance penalties. Algorithm code is independent of the
targeted device type.
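To illustrate the process abstraction just described, here is a hypothetical sketch; the class and method names below are invented for this example rather than taken from OpenCLIPER, but they show how an operator's one-time initialization can be separated from its repeatedly launched core workload so that chains and loops stay cheap.

    #include <memory>
    #include <utility>
    #include <vector>

    struct Image { std::vector<float> pixels; };

    // Hypothetical operator interface: input, output, parameters, and a setup
    // step (init) kept separate from the core workload (launch).
    class Process {
    public:
        virtual ~Process() = default;
        virtual void init() = 0;                              // one-time work
        virtual void launch(const Image& in, Image& out) = 0; // cheap to repeat
    };

    class Denoise : public Process {
    public:
        explicit Denoise(float strength) : strength_(strength) {}
        void init() override { /* allocate scratch buffers, build kernels once */ }
        void launch(const Image& in, Image& out) override {
            out.pixels = in.pixels;                           // stand-in for the real filter
        }
    private:
        float strength_;
    };

    int main() {
        std::vector<std::unique_ptr<Process>> chain;
        chain.push_back(std::make_unique<Denoise>(0.5f));
        // ...further processes appended to the chain here...

        for (auto& p : chain) p->init();                      // paid once, outside loops

        Image in{std::vector<float>(512 * 512, 0.0f)}, out;
        for (int iter = 0; iter < 10; ++iter)                 // e.g. iterative reconstruction
            for (auto& p : chain) { p->launch(in, out); std::swap(in, out); }
        return 0;
    }

Separating init() from launch() is what lets a chain sit inside an iterative loop without paying setup costs on every pass.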
C Language Extensions for Hybrid CPU/GPU Programming with StarPU
Modern platforms used for high-performance computing (HPC) include machines
with both general-purpose CPUs, and "accelerators", often in the form of
graphical processing units (GPUs). StarPU is a C library to exploit such
platforms. It provides users with ways to define "tasks" to be executed on CPUs
or GPUs, along with the dependencies among them, and it automatically schedules
those tasks over all the available processing units. In doing so, it also
relieves programmers from the need to know the underlying architecture details:
it adapts to the available CPUs and GPUs, and automatically transfers data
between main memory and GPUs as needed. While StarPU's approach is successful
at addressing run-time scheduling issues, being a C library makes for a poor
and error-prone programming interface. This paper presents an effort started in
2011 to promote some of the concepts exported by the library as C language
constructs, by means of an extension of the GCC compiler suite. Our main
contribution is the design and implementation of language extensions that map
to StarPU's task programming paradigm. We argue that the proposed extensions
make it easier to get started with StarPU, eliminate errors that can occur when
using the C library, and help diagnose possible mistakes. We conclude with a
discussion of future work.
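For comparison with the proposed extensions, the following condensed sketch shows roughly what the plain C library interface involves for a single task that scales a vector on whichever worker StarPU schedules it to. It follows common StarPU usage (codelet, data registration, task insertion), but field and function details can vary between StarPU versions, so treat it as illustrative rather than exact.

    #include <starpu.h>
    #include <stdio.h>

    /* Per-worker implementation: StarPU passes in the registered buffers. */
    static void scale_cpu(void *buffers[], void *cl_arg)
    {
        float factor;
        starpu_codelet_unpack_args(cl_arg, &factor);
        float *v = (float *) STARPU_VECTOR_GET_PTR(buffers[0]);
        unsigned n = STARPU_VECTOR_GET_NX(buffers[0]);
        for (unsigned i = 0; i < n; i++) v[i] *= factor;
    }

    int main(void)
    {
        float vec[1024];
        for (unsigned i = 0; i < 1024; i++) vec[i] = 1.0f;

        if (starpu_init(NULL) != 0) return 1;            /* worker discovery, runtime start */

        struct starpu_codelet cl = {};                   /* bundles implementations + access modes */
        cl.cpu_funcs[0] = scale_cpu;                     /* a cuda_funcs entry could target GPUs */
        cl.nbuffers = 1;
        cl.modes[0] = STARPU_RW;

        starpu_data_handle_t handle;                     /* register data so StarPU can move it */
        starpu_vector_data_register(&handle, STARPU_MAIN_RAM,
                                    (uintptr_t) vec, 1024, sizeof(float));

        float factor = 3.0f;
        starpu_task_insert(&cl, STARPU_RW, handle,
                           STARPU_VALUE, &factor, sizeof(factor), 0);

        starpu_task_wait_for_all();
        starpu_data_unregister(handle);                  /* data is back in main memory */
        starpu_shutdown();

        printf("vec[0] = %f\n", vec[0]);
        return 0;
    }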
A Language and Hardware Independent Approach to Quantum-Classical Computing
Heterogeneous high-performance computing (HPC) systems offer novel
architectures which accelerate specific workloads through judicious use of
specialized coprocessors. A promising architectural approach for future
scientific computations is provided by heterogeneous HPC systems integrating
quantum processing units (QPUs). To this end, we present XACC (eXtreme-scale
ACCelerator), a programming model and software framework that enables
quantum acceleration within standard or HPC software workflows. XACC follows a
coprocessor machine model that is independent of the underlying quantum
computing hardware, thereby enabling quantum programs to be defined and
executed on a variety of QPU types through a unified application programming
interface. Moreover, XACC defines a polymorphic low-level intermediate
representation and an extensible compiler frontend that enables
language-independent quantum programming, thus promoting integration and
interoperability across the quantum programming landscape. In this work we
define the software architecture enabling our hardware- and language-independent
approach, and demonstrate its usefulness across a range of quantum computing
models through illustrative examples involving the compilation and execution of
gate- and annealing-based quantum programs.
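To visualize the coprocessor model and polymorphic intermediate representation described above, here is a hypothetical C++ sketch; the types and names are invented for this illustration and are not XACC's actual classes. A compiler frontend lowers quantum source code to an IR, and an accelerator backend executes that IR, so the host program stays independent of both the source language and the target QPU.

    #include <iostream>
    #include <map>
    #include <memory>
    #include <string>

    struct QuantumIR {
        std::string text;                               // stand-in for a real IR tree
    };

    struct CompilerFrontend {                           // one plugin per source language
        virtual ~CompilerFrontend() = default;
        virtual QuantumIR compile(const std::string& src) = 0;
    };

    struct Accelerator {                                // one plugin per QPU or simulator
        virtual ~Accelerator() = default;
        virtual std::map<std::string, int> execute(const QuantumIR& ir, int shots) = 0;
    };

    // Trivial stand-ins so the sketch runs; real plugins would target actual hardware.
    struct EchoFrontend : CompilerFrontend {
        QuantumIR compile(const std::string& src) override { return {src}; }
    };
    struct FakeQPU : Accelerator {
        std::map<std::string, int> execute(const QuantumIR&, int shots) override {
            return {{"00", shots / 2}, {"11", shots - shots / 2}};  // pretend Bell-state counts
        }
    };

    int main() {
        EchoFrontend frontend;
        FakeQPU qpu;
        // The host treats the QPU as a coprocessor, independent of which frontend
        // produced the IR and which backend executes it.
        QuantumIR program = frontend.compile("H 0; CX 0 1; MEASURE");
        for (const auto& kv : qpu.execute(program, 1024))
            std::cout << kv.first << ": " << kv.second << "\n";
        return 0;
    }

Swapping in a different frontend or backend changes neither the IR type nor the host-side driver code, which is the interoperability argument the abstract makes.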
Using RDF to Model the Structure and Process of Systems
Many systems can be described in terms of networks of discrete elements and
their various relationships to one another. A semantic network, or
multi-relational network, is a directed labeled graph consisting of a
heterogeneous set of entities connected by a heterogeneous set of
relationships. Semantic networks serve as a promising general-purpose modeling
substrate for complex systems. Various standardized formats and tools are now
available to support practical, large-scale semantic network models. First, the
Resource Description Framework (RDF) offers a standardized semantic network
data model that can be further formalized by ontology modeling languages such
as RDF Schema (RDFS) and the Web Ontology Language (OWL). Second, the recent
introduction of highly performant triple-stores (i.e. semantic network
databases) allows very large semantic network models to be
efficiently stored and manipulated. RDF and its related technologies are
currently used extensively in the domains of computer science, digital library
science, and the biological sciences. This article will provide an introduction
to RDF/RDFS/OWL and an examination of their suitability for modeling
discrete-element complex systems.
Comment: International Conference on Complex Systems, Boston MA, October 200
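As a minimal, library-free illustration of the triple data model underlying RDF, the sketch below (in C++, with made-up example URIs) stores a small heterogeneous system as (subject, predicate, object) edges of a directed labeled graph and runs a simple pattern query over them.

    #include <iostream>
    #include <string>
    #include <vector>

    // A semantic network reduced to its essentials: a set of directed, labeled
    // edges, each one a (subject, predicate, object) triple as in RDF.
    struct Triple {
        std::string subject, predicate, object;
    };

    int main() {
        // A tiny heterogeneous system: people, software, and their relationships.
        // The URIs are invented for illustration only.
        std::vector<Triple> graph = {
            {"ex:alice",     "ex:memberOf", "ex:lab42"},
            {"ex:alice",     "ex:develops", "ex:simulator"},
            {"ex:simulator", "ex:runsOn",   "ex:cluster7"},
            {"ex:cluster7",  "rdf:type",    "ex:ComputeResource"},
        };

        // Simple pattern query: which things does ex:alice relate to, and how?
        for (const Triple& t : graph)
            if (t.subject == "ex:alice")
                std::cout << t.predicate << " -> " << t.object << "\n";
        return 0;
    }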