5,481 research outputs found

High volume colour image processing with massively parallel embedded processors

Author: Bond Winston
Jacobs Jan
Pouls Roel
Smit Gerard J.M.
Publication venue: University of Twente, CTIT
Publication date: 01/01/2005

Océ currently uses FPGA technology to implement colour image processing in its high-volume colour printers. Although FPGA technology provides enough performance, its development process is rather tedious. This paper describes research on an alternative implementation technology: software-defined massively parallel processing. It is shown that this technology not only reduces development time but also adds flexibility to the design.

Juelich Shared Electronic Resources

University of Twente Research Information

Mining Dynamic Document Spaces with Massively Parallel Embedded Processors

Author: Dai Rui
Jacobs Jan W.M.
Smit Gerard J.M.
Publication venue: Springer Verlag
Publication date: 01/01/2006

Océ is currently investigating future document management services. One of these services is accessing dynamic document spaces, i.e. improving access to document spaces that are frequently updated (such as newsgroups). This process is rather computationally intensive. This paper describes research on software development for massively parallel processors. A prototype has been built that processes streams of information from specified newsgroups and transforms them into personal information maps. Although this technology does speed up the training part compared to a general-purpose processor implementation, its real benefits emerge with larger problem dimensions because of the scalable approach. It is recommended to improve the quality of the map and its visualisation, and to better profile the performance of the other parts of the pipeline, i.e. feature extraction and visualisation.

University of Twente Research Information

Memory and information processing in neuromorphic systems

Author: Indiveri Giacomo
Liu Shih-Chii
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2015

A striking difference between brain-inspired neuromorphic processors and current von Neumann processor architectures is the way in which memory and processing are organized. As Information and Communication Technologies continue to address the need for increased computational power by increasing the number of cores within a digital processor, neuromorphic engineers and scientists can complement this need by building processor architectures where memory is distributed with the processing. In this paper we present a survey of brain-inspired processor architectures that support models of cortical networks and deep neural networks. These architectures range from serial clocked implementations of multi-neuron systems to massively parallel asynchronous ones, and from purely digital systems to mixed analog/digital systems that implement more biologically realistic models of neurons and synapses together with a suite of adaptation and learning mechanisms analogous to the ones found in biological nervous systems. We describe the advantages of the different approaches being pursued and present the challenges that need to be addressed for building artificial neural processing systems that can display the richness of behaviors seen in biological systems. Comment: Submitted to Proceedings of IEEE, review of recently proposed neuromorphic computing platforms and systems.

arXiv.org e-Print Archive

ZORA

GPU-based pedestrian detection for autonomous driving

Author: Campmany Canes Victor
Universitat Autònoma de Barcelona. Escola d'Enginyeria
Publication date: 11/02/2016

Pedestrian detection has gained a lot of prominence during the last few years. Besides being one of the hardest tasks within computer vision, it involves huge computational costs. Obtaining acceptable real-time performance, measured in frames per second (fps), for the most advanced algorithms remains a hard challenge. In this work, we propose a GPU implementation of a well-known pedestrian detection system (i.e., HOGLBP-SVM) specifically designed for the Tegra X1 embedded GPU. It uses LBP and HOG as feature descriptors and an SVM as the classifier. We introduce significant algorithmic adjustments and optimizations to adapt the problem to the NVIDIA GPU architecture without sacrificing accuracy. The aim of this work is to offer a real-time system that provides reliable results.

Diposit Digital de Documents de la UAB

Towards a Scalable Hardware/Software Co-Design Platform for Real-time Pedestrian Tracking Based on a ZYNQ-7000 Device

Author: Buckley Kevan
Sillitoe Ian
Yang Shufan
Yu Zheqi
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2018

Currently, most designers working in hardware/software co-design face the daunting task of researching different design flows and learning the intricacies of specific software from various manufacturers. Creating a scalable hardware/software co-design platform has therefore become an urgent need and a key strategic element in developing hardware/software integrated systems. In this paper, we propose a new design flow for building a scalable co-design platform on an FPGA-based system-on-chip. We employ an integrated approach to implement histogram of oriented gradients (HOG) features and support vector machine (SVM) classification on a programmable device for pedestrian tracking. We report a hardware resource analysis and also analyse the precision and success rates of pedestrian tracking on nine open-access image data sets. Finally, our proposed design flow can be used for any real-time image-processing-related product on programmable ZYNQ-based embedded systems; it benefits from a reduced design time and provides a scalable solution for embedded image processing products.

Enlighten

Analysing Astronomy Algorithms for GPUs and Beyond

Author: B. R. Barsdell
C. J. Fluke
D. G. Barnes
Publication venue: Wiley
Publication date: 01/01/2010

Astronomy depends on ever-increasing computing power. Processor clock rates have plateaued, and increased performance is now appearing in the form of additional processor cores on a single chip. This poses significant challenges to the astronomy software community. Graphics Processing Units (GPUs), now capable of general-purpose computation, exemplify both the difficult learning curve and the significant speedups exhibited by massively parallel hardware architectures. We present a generalised approach to tackling this paradigm shift, based on the analysis of algorithms. We describe a small collection of foundation algorithms relevant to astronomy and explain how they may be used to ease the transition to massively parallel computing architectures. We demonstrate the effectiveness of our approach by applying it to four well-known astronomy problems: Högbom CLEAN, inverse ray-shooting for gravitational lensing, pulsar dedispersion and volume rendering. Algorithms with well-defined memory access patterns and high arithmetic intensity stand to receive the greatest performance boost from massively parallel architectures, while those that involve a significant amount of decision-making may struggle to take advantage of the available processing power. Comment: 10 pages, 3 figures, accepted for publication in MNRAS.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Swinburne Research Bank

OpenForensics: a digital forensics GPU pattern matching approach for the 21st century

Author: Bayne Ethan
Ferguson R. I.
Sampson A. T.
Publication date: 21/03/2018

Pattern matching is a crucial component of many digital forensic (DF) analysis techniques, such as file carving. The capacity of storage available on modern consumer devices has increased substantially in the past century, making the pattern matching approaches of current-generation DF tools increasingly ineffective at performing timely analyses of data seized in a DF investigation. As pattern matching is a trivially parallelisable problem, general-purpose programming on graphics processing units (GPGPU) is a natural fit for it. This paper presents a pattern matching framework, OpenForensics, that demonstrates substantial performance improvements from the use of modern parallelisable algorithms and graphics processing units (GPUs) to search for patterns within forensic images and local storage devices.

Abertay Research Portal