Search CORE

3,582 research outputs found

From a Competition for Self-Driving Miniature Cars to a Standardized Experimental Platform: Concept, Models, Architecture, and Evaluation

Author: Berger Christian
Publication venue
Publication date: 01/01/2014
Field of study

Context: Competitions for self-driving cars facilitated the development and research in the domain of autonomous vehicles towards potential solutions for the future mobility. Objective: Miniature vehicles can bridge the gap between simulation-based evaluations of algorithms relying on simplified models, and those time-consuming vehicle tests on real-scale proving grounds. Method: This article combines findings from a systematic literature review, an in-depth analysis of results and technical concepts from contestants in a competition for self-driving miniature cars, and experiences of participating in the 2013 competition for self-driving cars. Results: A simulation-based development platform for real-scale vehicles has been adapted to support the development of a self-driving miniature car. Furthermore, a standardized platform was designed and realized to enable research and experiments in the context of future mobility solutions. Conclusion: A clear separation between algorithm conceptualization and validation in a model-based simulation environment enabled efficient and riskless experiments and validation. The design of a reusable, low-cost, and energy-efficient hardware architecture utilizing a standardized software/hardware interface enables experiments, which would otherwise require resources like a large real-scale test track.Comment: 17 pages, 19 figues, 2 table

arXiv.org e-Print Archive

Chalmers Research

Conceptual roles of data in program: analyses and applications

Author: Deng Yunbo
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2003
Field of study

Program comprehension is the prerequisite for many software evolution and maintenance tasks. Currently, the research falls short in addressing how to build tools that can use domain-specific knowledge to provide powerful capabilities for extracting valuable information for facilitating program comprehension. Such capabilities are critical for working with large and complex program where program comprehension often is not possible without the help of domain-specific knowledge.;Our research advances the state-of-art in program analysis techniques based on domain-specific knowledge. The program artifacts including variables and methods are carriers of domain concepts that provide the key to understand programs. Our program analysis is directed by domain knowledge stored as domain-specific rules. Our analysis is iterative and interactive. It is based on flexible inference rules and inter-exchangeable and extensible information storage. We designed and developed a comprehensive software environment SeeCORE based on our knowledge-centric analysis methodology. The SeeCORE tool provides multiple views and abstractions to assist in understanding complex programs. The case studies demonstrate the effectiveness of our method. We demonstrate the flexibility of our approach by analyzing two legacy programs in distinct domains

Digital Repository @ Iowa State University (ISU)

Heterogeneous computing with an algorithmic skeleton framework

Author: Soldado Fábio Miguel Cardoso
Publication venue
Publication date: 01/03/2014
Field of study

The Graphics Processing Unit (GPU) is present in almost every modern day personal computer. Despite its specific purpose design, they have been increasingly used for general computations with very good results. Hence, there is a growing effort from the community to seamlessly integrate this kind of devices in everyday computing. However, to fully exploit the potential of a system comprising GPUs and CPUs, these devices should be presented to the programmer as a single platform. The efficient combination of the power of CPU and GPU devices is highly dependent on each device’s characteristics, resulting in platform specific applications that cannot be ported to different systems. Also, the most efficient work balance among devices is highly dependable on the computations to be performed and respective data sizes. In this work, we propose a solution for heterogeneous environments based on the abstraction level provided by algorithmic skeletons. Our goal is to take full advantage of the power of all CPU and GPU devices present in a system, without the need for different kernel implementations nor explicit work-distribution.To that end, we extended Marrow, an algorithmic skeleton framework for multi-GPUs, to support CPU computations and efficiently balance the work-load between devices. Our approach is based on an offline training execution that identifies the ideal work balance and platform configurations for a given application and input data size. The evaluation of this work shows that the combination of CPU and GPU devices can significantly boost the performance of our benchmarks in the tested environments, when compared to GPU-only executions

Repositório da Universidade Nova de Lisboa

Toward optimised skeletons for heterogeneous parallel architecture with performance cost model

Author: Armih Khari A.
Publication venue: Mathematical and Computer Sciences
Publication date: 01/07/2013
Field of study

High performance architectures are increasingly heterogeneous with shared and distributed memory components, and accelerators like GPUs. Programming such architectures is complicated and performance portability is a major issue as the architectures evolve. This thesis explores the potential for algorithmic skeletons integrating a dynamically parametrised static cost model, to deliver portable performance for mostly regular data parallel programs on heterogeneous archi- tectures. The rst contribution of this thesis is to address the challenges of program- ming heterogeneous architectures by providing two skeleton-based programming libraries: i.e. HWSkel for heterogeneous multicore clusters and GPU-HWSkel that enables GPUs to be exploited as general purpose multi-processor devices. Both libraries provide heterogeneous data parallel algorithmic skeletons including hMap, hMapAll, hReduce, hMapReduce, and hMapReduceAll. The second contribution is the development of cost models for workload dis- tribution. First, we construct an architectural cost model (CM1) to optimise overall processing time for HWSkel heterogeneous skeletons on a heterogeneous system composed of networks of arbitrary numbers of nodes, each with an ar- bitrary number of cores sharing arbitrary amounts of memory. The cost model characterises the components of the architecture by the number of cores, clock speed, and crucially the size of the L2 cache. Second, we extend the HWSkel cost model (CM1) to account for GPU performance. The extended cost model (CM2) is used in the GPU-HWSkel library to automatically nd a good distribution for both a single heterogeneous multicore/GPU node, and clusters of heteroge- neous multicore/GPU nodes. Experiments are carried out on three heterogeneous multicore clusters, four heterogeneous multicore/GPU clusters, and three single heterogeneous multicore/GPU nodes. The results of experimental evaluations for four data parallel benchmarks, i.e. sumEuler, Image matching, Fibonacci, and Matrix Multiplication, show that our combined heterogeneous skeletons and cost models can make good use of resources in heterogeneous systems. Moreover using cores together with a GPU in the same host can deliver good performance either on a single node or on multiple node architectures

ROS: The Research Output Service. Heriot-Watt University Edinburgh

Improving Utility of GPU in Accelerating Industrial Applications with User-centred Automatic Code Translation

Author: Anvari-Moghaddam Amjad
Codreanu Valeriu
Dong Feng
Liu Baoquan
Min Geyong
Roerdink Jos B. T. M.
Williams David
Yang Po
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2018
Field of study

SMEs (Small and medium-sized enterprises), particularly those whose business is focused on developing innovative produces, are limited by a major bottleneck on the speed of computation in many applications. The recent developments in GPUs have been the marked increase in their versatility in many computational areas. But due to the lack of specialist GPU (Graphics processing units) programming skills, the explosion of GPU power has not been fully utilized in general SME applications by inexperienced users. Also, existing automatic CPU-to-GPU code translators are mainly designed for research purposes with poor user interface design and hard-to-use. Little attentions have been paid to the applicability, usability and learnability of these tools for normal users. In this paper, we present an online automated CPU-to-GPU source translation system, (GPSME) for inexperienced users to utilize GPU capability in accelerating general SME applications. This system designs and implements a directive programming model with new kernel generation scheme and memory management hierarchy to optimize its performance. A web-service based interface is designed for inexperienced users to easily and flexibly invoke the automatic resource translator. Our experiments with non-expert GPU users in 4 SMEs reflect that GPSME system can efficiently accelerate real-world applications with at least 4x and have a better applicability, usability and learnability than existing automatic CPU-to-GPU source translators

LJMU Research Online (Liverpool John Moores University)

Crossref

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

VBN

White Rose Research Online

University of Bedfordshire Repository

Dissertations of the University of Groningen

The Role of Agent Interaction in Models of Computing: Panelist Reviews

Author: Arbab F
Goldin D
Luck M
McBurney P
Robertson D
Wegner P
Publication venue
Publication date: 01/01/2005
Field of study

Southampton (e-Prints Soton)

Elsevier - Publisher Connector

CWI's Institutional Repository

Data- and task parallel image processing on a mixed SIMD-ILP platform using skeletons and asynchronous RPC

Author: Caarls W.
Corporaal H.
Jonker P.P.
Publication venue: STW Technology Foundation
Publication date: 01/01/2004
Field of study

Pure OAI Repository

Book of Abstracts of the Minisymposium on Publicly Available Geometric/Topological Software

Author: Karavelas Menelaos I.
Teillaud Monique
Publication venue
Publication date: 01/01/2012
Field of study

ACMAC

GSWO: A Programming Model for GPU-enabled Parallelization of Sliding Window Operations in Image Processing

Author: Baoquan Liu
Bloy
Bronstein
Chen
Chen
Conotter
Dally
David Williams
Feng Dong
Gil
Gordon Clapworthy
Han
Happ
Hardie
Hasan
Jos BTM Roerdink
Min
Nie
Oten
Po Yang
Sanchez
Shivakumara
Soille
Thurley
Valeriu Codreanu
Wang
Wu
Xu
Zhikun Deng
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016
Field of study

Sliding Window Operations (SWOs) are widely used in image processing applications. They often have to be performed repeatedly across the target image, which can demand significant computing resources when processing large images with large windows. In applications in which real-time performance is essential, running these filters on a CPU often fails to deliver results within an acceptable timeframe. The emergence of sophisticated graphic processing units (GPUs) presents an opportunity to address this challenge. However, GPU programming requires a steep learning curve and is error-prone for novices, so the availability of a tool that can produce a GPU implementation automatically from the original CPU source code can provide an attractive means by which the GPU power can be harnessed effectively. This paper presents a GPUenabled programming model, called GSWO, which can assist GPU novices by converting their SWO-based image processing applications from the original C/C++ source code to CUDA code in a highly automated manner. This model includes a new set of simple SWO pragmas to generate GPU kernels and to support effective GPU memory management. We have implemented this programming model based on a CPU-to-GPU translator (C2GPU). Evaluations have been performed on a number of typical SWO image filters and applications. The experimental results show that the GSWO model is capable of efficiently accelerating these applications, with improved applicability and a speed-up of performance compared to several leading CPU-to- GPU source-to-source translators