
205 research outputs found

A portable platform for accelerated PIC codes and its application to GPUs using OpenACC

Author: Brunner S.
Gheller G.
Hariri F.
Jocksch A.
Lanti E.
Messmer P.
Progsch J.
Tran T. M.
Villard L.
Publication venue: 'Elsevier BV'
Publication date: 09/03/2016

We present a portable platform, called PIC_ENGINE, for accelerating Particle-In-Cell (PIC) codes on heterogeneous many-core architectures such as Graphic Processing Units (GPUs). The aim of this development is efficient simulations on future exascale systems by allowing different parallelization strategies depending on the application problem and the specific architecture. To this end, this platform contains the basic steps of the PIC algorithm and has been designed as a test bed for different algorithmic options and data structures. Among the architectures that this engine can explore, particular attention is given here to systems equipped with GPUs. The study demonstrates that our portable PIC implementation based on the OpenACC programming model can achieve performance closely matching theoretical predictions. Using the Cray XC30 system, Piz Daint, at the Swiss National Supercomputing Centre (CSCS), we show that PIC_ENGINE running on an NVIDIA Kepler K20X GPU can outperform the one on an Intel Sandy Bridge 8-core CPU by a factor of 3.4.

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Repository for Publications and Research Data

Elsevier - Publisher Connector

Optimizing LBVH‐Construction and Hierarchy‐Traversal to accelerate kNN Queries on Point Clouds using the GPU

Author: Guthe Michael
Jakob Johannes
Publication venue: 'Wiley'
Publication date: 01/01/2021

Crossref

EPub Bayreuth

GPU-Accelerated nearest neighbor search for 3d registration

Author: Andreas Nüchter
Deyuan Qiu
Stefan May
Publication date: 01/01/2009

Nearest Neighbor Search (NNS) is employed by many computer vision algorithms. The computational complexity is large and constitutes a challenge for real-time capability. The basic problem is in rapidly processing a huge amount of data, which is often addressed by means of highly sophisticated search methods and parallelism. We show that NNS-based vision algorithms like the Iterative Closest Points algorithm (ICP) can achieve real-time capability while preserving the compact size and moderate energy consumption needed in robotics and many other domains. The approach exploits the concept of general purpose computation on graphics processing units (GPGPU) and is compared to parallel processing on CPU. We apply this approach to the 3D scan registration problem, for which a speed-up factor of 88 compared to a sequential CPU implementation is reported.

CiteSeerX

Crossref

Fraunhofer-ePrints

pub H-BRS - Publikationsserver der Hochschule Bonn-Rhein-Sieg

Working With Incremental Spatial Data During Parallel (GPU) Computation

Author: Chisholm Robert
Publication venue: 'University of Sheffield Conference Proceedings'
Publication date: 29/10/2019

Central to many complex systems, spatial actors require an awareness of their local environment to enable behaviours such as communication and navigation. Complex system simulations represent this behaviour with Fixed Radius Near Neighbours (FRNN) search. This algorithm allows actors to store data at spatial locations and then query the data structure to find all data stored within a fixed radius of the search origin. The work within this thesis answers the question: What techniques can be used for improving the performance of FRNN searches during complex system simulations on Graphics Processing Units (GPUs)? It is generally agreed that Uniform Spatial Partitioning (USP) is the most suitable data structure for providing FRNN search on GPUs. However, due to the architectural complexities of GPUs, the performance is constrained such that FRNN search remains one of the most expensive common stages between complex systems models. Existing innovations to USP highlight a need to take advantage of recent GPU advances, reducing the levels of divergence and limiting redundant memory accesses as viable routes to improve the performance of FRNN search. This thesis addresses these with three separate optimisations that can be used simultaneously. Experiments have assessed the impact of optimisations to the general case of FRNN search found within complex system simulations and demonstrated their impact in practice when applied to full complex system models. Results presented show the performance of the construction and query stages of FRNN search can be improved by over 2x and 1.3x respectively. These improvements allow complex system simulations to be executed faster, enabling increases in scale and model complexity
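
As an illustration of the uniform spatial partitioning idea described in this abstract, the following is a minimal CPU-side Python sketch, not the thesis's GPU implementation and not one of its three optimisations. It assumes a cell side equal to the search radius, so a query only needs to inspect the home cell and its immediate neighbours. The function names `build_grid` and `frnn_query` are hypothetical.

```python
from collections import defaultdict
from itertools import product
from math import floor


def build_grid(points, radius):
    """Bin each point index into a grid cell whose side equals the radius."""
    grid = defaultdict(list)
    for idx, p in enumerate(points):
        cell = tuple(floor(c / radius) for c in p)
        grid[cell].append(idx)
    return grid


def frnn_query(points, grid, origin, radius):
    """Return sorted indices of all points within `radius` of `origin`."""
    home = tuple(floor(c / radius) for c in origin)
    r2 = radius * radius
    hits = []
    # Visit the home cell and every adjacent cell (3^d cells in d dimensions).
    for offset in product((-1, 0, 1), repeat=len(origin)):
        cell = tuple(h + o for h, o in zip(home, offset))
        for idx in grid.get(cell, ()):
            d2 = sum((a - b) ** 2 for a, b in zip(points[idx], origin))
            if d2 <= r2:
                hits.append(idx)
    return sorted(hits)
```

Calling `build_grid(points, radius)` once and then `frnn_query(points, grid, origin, radius)` per actor mirrors the construction and query stages named in the abstract; the same radius must be used for both calls.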

White Rose E-theses Online

Large Spatial Database Indexing with aX-tree

Author: Hadeel Hadeel Jazzaa
Joan Lu
Samson Grace
Showole Aminat A.
Usman Mistura M.
Publication date: 05/04/2018

Spatial databases are optimized for the management of data stored based on their geometric space. Researchers have proposed several spatial indexing structures with a high degree of scalability to this end. Among these indexing structures is the X-tree. The existing X-tree and its variants are designed for dynamic environments, with the capability of handling insertions and deletions. Nevertheless, the X-tree's retrieval performance degrades as dimensionality increases, and its worst-case performance becomes poorer than a sequential scan. We propose a new X-tree packing technique for static spatial databases that achieves better space utilization through cautious packing. This improved structure yields two basic advantages: it reduces the space overhead of the index and produces a better response time, because the aX-tree has a higher fan-out and so the tree always ends up shorter. A new model for super-node construction and an effective method for optimal packing using an improved STR bulk-loading technique are proposed. The study reveals that the proposed system performs better than many existing spatial indexing structures.
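
The abstract refers to an improved STR bulk-loading technique without giving details. For orientation only, here is a sketch of the standard Sort-Tile-Recursive leaf-packing step for 2D points; it is not the authors' improved variant and does not build super-nodes. The function name `str_pack_leaves` and the `capacity` parameter are assumptions introduced for this example.

```python
from math import ceil, sqrt


def str_pack_leaves(points, capacity):
    """Group 2D points into leaf pages of at most `capacity` entries (standard STR)."""
    n = len(points)
    if n == 0:
        return []
    num_leaves = ceil(n / capacity)       # pages needed if every page is full
    num_slices = ceil(sqrt(num_leaves))   # number of vertical slices
    slice_size = num_slices * capacity    # points per vertical slice

    by_x = sorted(points, key=lambda p: p[0])
    leaves = []
    for i in range(0, n, slice_size):
        # Within one vertical slice, order by y and cut into full pages.
        vertical_slice = sorted(by_x[i:i + slice_size], key=lambda p: p[1])
        for j in range(0, len(vertical_slice), capacity):
            leaves.append(vertical_slice[j:j + capacity])
    return leaves
```

Because every page except possibly the last in each slice is filled to capacity, packing of this kind gives the high space utilization and fan-out that the abstract associates with a shorter tree.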

University of Huddersfield Repository

Co-design Hardware and Algorithm for Vector Search

Author: Alonso Gustavo
He Zhenhao
Hoefler Torsten
Jiang Wenqi
Li Shigang
Licht Johannes de Fine
Rekatsinas Theodoros
Renggli Cedric
Shi Runbin
Zhang Shuai
Zhu Yu
Publication date: 27/06/2023

Vector search has emerged as the foundation for large-scale information retrieval and machine learning systems, with search engines like Google and Bing processing tens of thousands of queries per second on petabyte-scale document datasets by evaluating vector similarities between encoded query texts and web documents. As performance demands for vector search systems surge, accelerated hardware offers a promising solution in the post-Moore's Law era. We introduce FANNS, an end-to-end and scalable vector search framework on FPGAs. Given a user-provided recall requirement on a dataset and a hardware resource budget, FANNS automatically co-designs hardware and algorithm, subsequently generating the corresponding accelerator. The framework also supports scale-out by incorporating a hardware TCP/IP stack in the accelerator. FANNS attains up to 23.0× and 37.2× speedup compared to FPGA and CPU baselines, respectively, and demonstrates superior scalability to GPUs, achieving 5.5× and 7.6× speedup in median and 95th percentile (P95) latency within an eight-accelerator configuration. The remarkable performance of FANNS lays a robust groundwork for future FPGA integration in data centers and AI supercomputers. Comment: 11 page

arXiv.org e-Print Archive

Efficient nearest-neighbor computation for GPU-based motion planning

Author: Christian Lauterbach
Dinesh Manocha
Jia Pan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010

We present a novel k-nearest neighbor search algorithm (KNNS) for proximity computation in motion planning algorithms that exploits the computational capabilities of many-core GPUs. Our approach uses locality sensitive hashing and cuckoo hashing to construct an efficient KNNS algorithm that has linear space and time complexity and exploits the multiple cores and data parallelism effectively. In practice, we see magnitude improvement in speed and scalability over prior GPU-based KNNS algorithms. On some benchmarks, our KNNS algorithm improves the performance of the overall planner by 20–40 times for a CPU-based planner and up to 2 times for a GPU-based planner.
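
To make the locality sensitive hashing idea in this abstract concrete, here is a small sequential Python sketch. It is not the authors' GPU algorithm: it uses a single hash table keyed by quantised random projections, ranks only the points in the query's bucket, and omits cuckoo hashing entirely. All names and parameters (`make_hash`, `build_table`, `approx_knn`, `bucket_width`) are hypothetical.

```python
import random
from collections import defaultdict
from math import floor


def make_hash(dim, num_projections, bucket_width, rng):
    """Build an LSH function from random Gaussian projections (illustrative choice)."""
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
              for _ in range(num_projections)]
    offsets = [rng.uniform(0.0, bucket_width) for _ in range(num_projections)]

    def h(p):
        # Quantise each projection so nearby points tend to share a key.
        return tuple(
            floor((sum(a * x for a, x in zip(plane, p)) + b) / bucket_width)
            for plane, b in zip(planes, offsets)
        )

    return h


def build_table(points, h):
    """Map each hash key to the indices of the points that fall in that bucket."""
    table = defaultdict(list)
    for idx, p in enumerate(points):
        table[h(p)].append(idx)
    return table


def approx_knn(points, table, h, query, k):
    """Return up to k indices from the query's bucket, nearest first (approximate)."""
    candidates = table.get(h(query), [])
    ranked = sorted(
        candidates,
        key=lambda i: sum((a - b) ** 2 for a, b in zip(points[i], query)),
    )
    return ranked[:k]
```

The result is approximate: true neighbours that hash to a different bucket are missed, which practical schemes reduce by using several tables. That trade-off is a property of this simplified sketch, not a claim about the paper's method.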

CiteSeerX

Crossref

HKU Scholars Hub

New technologies for big multimedia data treatment

Author: Barrionuevo Mercedes Deolinda
Britos Luis
Bustos Fabricio
Gil Costa Graciela Verónica
Lopresti Mariela
Mancini Virginia
Miranda Natalia Carolina
Ochoa Cesar
Piccoli María Fabiana
Printista Alicia Marcela
Reyes Nora Susana
Publication venue: Iberoamerican Science & Technology Education Consortium
Publication date: 01/12/2013

With advances in technology and the growth of the Internet, both the amount of information available online and the number of users searching it for specific data have grown. It is therefore desirable to have a search system that retrieves information efficiently and in a reasonable time. In this paper we show two computing paradigms suited to processing large amounts of data consisting of objects such as images, text, sound and video, using hybrid computing over MPI+OpenMP and GPGPU. The proposal draws on experience gained in building various indexes and subsequently searching multimedia objects through them.

Affiliations:
Barrionuevo, Mercedes Deolinda. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina
Britos, Luis. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina
Bustos, Fabricio. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina
Gil Costa, Graciela Verónica. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina
Lopresti, Mariela. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina
Mancini, Virginia. Universidad Nacional de San Luis. Facultad de Cs.fisico Matematicas y Naturales. Laboratorio de Inv.en Inteligencia Artificial; Argentina
Miranda, Natalia Carolina. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina
Ochoa, Cesar. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina
Piccoli, María Fabiana. Universidad Nacional de San Luis. Facultad de Cs.fisico Matematicas y Naturales. Laboratorio de Inv.en Inteligencia Artificial; Argentina
Printista, Alicia Marcela. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina
Reyes, Nora Susana. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

CONICET Digital

Servicio de Difusión de la Creación Intelectual

Haptic Interaction with 3D oriented point clouds on the GPU

Author: Yanes Luis
Publication date: 01/09/2015

Real-time point-based rendering and interaction with virtual objects is gaining popularity and importance as different haptic devices and technologies increasingly provide the basis for realistic interaction. Haptic Interaction is being used for a wide range of applications such as medical training, remote robot operators, tactile displays and video games. Virtual object visualization and interaction using haptic devices is the main focus; this process involves several steps such as: Data Acquisition, Graphic Rendering, Haptic Interaction and Data Modification. This work presents a framework for Haptic Interaction using the GPU as a hardware accelerator, and includes an approach for enabling the modification of data during interaction. The results demonstrate the limits and capabilities of these techniques in the context of volume rendering for haptic applications. Also, the use of dynamic parallelism as a technique to scale the number of threads needed from the accelerator according to the interaction requirements is studied, allowing the editing of data sets of up to one million points at interactive haptic frame rates.

University of East Anglia digital repository