
    Doctor of Philosophy

    As the base of the software stack, system-level software is expected to provide efficient and scalable storage, communication, security, and resource management functionality. However, many system-level functions, such as encryption, packet inspection, and error correction, are computationally expensive, and today's application workloads have entered the gigabyte and terabyte scales, demanding even more computing power. To meet this rapidly growing demand for computing power at the system level, this dissertation proposes using parallel graphics processing units (GPUs) in system software. GPUs excel at parallel computing, and their parallel performance is improving much faster than that of central processing units (CPUs). However, system-level software was originally designed to be latency-oriented, whereas GPUs are designed for long-running computation and large-scale data processing and are therefore throughput-oriented. This mismatch makes it difficult to fit system-level software to GPUs. This dissertation presents generic principles of system-level GPU computing, developed while creating our two general frameworks for integrating GPU computing into storage and network packet processing. The principles are generic design techniques and abstractions that address common system-level GPU computing challenges. They have been evaluated in concrete cases, including storage and network packet processing applications augmented with GPU computing. The significant performance improvements found in the evaluation demonstrate the effectiveness and efficiency of the proposed techniques and abstractions. The dissertation also presents a literature survey of the relatively young field of system-level GPU computing, introducing the state of the art in both applications and techniques, as well as their future potential.
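    The latency/throughput mismatch described above is commonly bridged by batching: many small system-level requests are queued and handed to the GPU as one large, throughput-friendly workload. The sketch below illustrates only this general idea, not the dissertation's frameworks; the request count, buffer size, and the toy XOR transform (standing in for work such as encryption) are assumptions for illustration.

```cuda
// Illustrative sketch (not the dissertation's framework): batching many small
// system-level requests (a toy XOR transform standing in for encryption) into
// a single transfer and kernel launch keeps the throughput-oriented GPU busy.
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

constexpr int kRequests   = 1024;   // number of queued system requests in one batch (assumption)
constexpr int kRequestLen = 4096;   // bytes per request (assumption)

// One thread block processes one request; threads stride over its bytes.
__global__ void processBatch(unsigned char* data, int requestLen, unsigned char key) {
    unsigned char* req = data + (size_t)blockIdx.x * requestLen;
    for (int i = threadIdx.x; i < requestLen; i += blockDim.x)
        req[i] ^= key;                       // placeholder for real per-byte work
}

int main() {
    std::vector<unsigned char> host(kRequests * (size_t)kRequestLen, 0xAB);
    unsigned char* dev = nullptr;
    cudaMalloc(&dev, host.size());

    // One copy plus one launch amortizes the per-call latency that would
    // dominate if requests were sent to the GPU one at a time.
    cudaMemcpy(dev, host.data(), host.size(), cudaMemcpyHostToDevice);
    processBatch<<<kRequests, 256>>>(dev, kRequestLen, 0x5C);
    cudaMemcpy(host.data(), dev, host.size(), cudaMemcpyDeviceToHost);

    printf("first byte after transform: 0x%02X\n", host[0]);
    cudaFree(dev);
    return 0;
}
```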

    Accelerated volumetric reconstruction from uncalibrated camera views

    Although both work with images, computer graphics and computer vision are inverse problems of each other: computer graphics traditionally starts with geometric models as input and produces image sequences, while computer vision starts with image sequences as input and produces geometric models. In recent years, research has converged to bridge the gap between the two fields, producing a new field called Image-Based Modeling and Rendering (IBMR). IBMR uses geometric information recovered from real images to generate new images, with the aim that the synthesized views appear photorealistic, while also reducing the time spent on model creation. This dissertation studies the capturing, geometric, and photometric aspects of an IBMR system. A versatile framework was developed that enables the reconstruction of scenes from images acquired with a handheld digital camera. The proposed system targets applications in areas such as computer gaming and virtual reality from a low-cost perspective. In the spirit of IBMR, the human operator provides the high-level information, while the underlying algorithms perform the low-level computational work. Conforming to the latest architecture trends, we propose a streaming voxel carving method that allows fast GPU-based processing on commodity hardware.
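    As a rough illustration of the data-parallel work voxel carving involves, the sketch below assigns one GPU thread per voxel, projects the voxel centre into each view, and keeps the voxel only if it falls inside every silhouette. It is a minimal silhouette-carving sketch under assumed toy camera matrices and all-foreground masks, not the streaming, photometrically aware method proposed in the dissertation.

```cuda
// Minimal silhouette-based voxel carving sketch (illustrative only).
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

constexpr int GRID = 64;                    // voxels per axis (assumption)
constexpr int CAMS = 4;                     // number of calibrated views (assumption)
constexpr int IMG_W = 320, IMG_H = 240;

// 3x4 projection matrices, one per camera, row-major.
__constant__ float cProj[CAMS][12];

__global__ void carve(const unsigned char* silhouettes, unsigned char* occupied) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= GRID * GRID * GRID) return;

    // Voxel centre inside a unit cube [0,1]^3.
    float x = (idx % GRID + 0.5f) / GRID;
    float y = (idx / GRID % GRID + 0.5f) / GRID;
    float z = (idx / (GRID * GRID) + 0.5f) / GRID;

    unsigned char keep = 1;
    for (int c = 0; c < CAMS; ++c) {
        const float* P = cProj[c];
        float u = P[0]*x + P[1]*y + P[2]*z  + P[3];
        float v = P[4]*x + P[5]*y + P[6]*z  + P[7];
        float w = P[8]*x + P[9]*y + P[10]*z + P[11];
        int px = (int)(u / w), py = (int)(v / w);
        // Carve the voxel if it projects outside any silhouette.
        if (px < 0 || px >= IMG_W || py < 0 || py >= IMG_H ||
            silhouettes[c * IMG_W * IMG_H + py * IMG_W + px] == 0) {
            keep = 0; break;
        }
    }
    occupied[idx] = keep;
}

int main() {
    // Toy setup: scaled projections (u = W*x, v = H*y, w = 1) and all-foreground masks.
    float proj[CAMS][12] = {};
    for (int c = 0; c < CAMS; ++c) {
        proj[c][0] = IMG_W; proj[c][5] = IMG_H; proj[c][11] = 1.0f;
    }
    cudaMemcpyToSymbol(cProj, proj, sizeof(proj));

    std::vector<unsigned char> sil(CAMS * IMG_W * IMG_H, 1);
    unsigned char *dSil, *dOcc;
    int n = GRID * GRID * GRID;
    cudaMalloc(&dSil, sil.size());
    cudaMalloc(&dOcc, n);
    cudaMemcpy(dSil, sil.data(), sil.size(), cudaMemcpyHostToDevice);

    carve<<<(n + 255) / 256, 256>>>(dSil, dOcc);

    std::vector<unsigned char> occ(n);
    cudaMemcpy(occ.data(), dOcc, n, cudaMemcpyDeviceToHost);
    size_t kept = 0;
    for (unsigned char o : occ) kept += o;
    printf("voxels kept: %zu of %d\n", kept, n);
    cudaFree(dSil); cudaFree(dOcc);
    return 0;
}
```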

    GPU Accelerated protocol analysis for large and long-term traffic traces

    This thesis describes the design and implementation of GPF+, a complete general packet classification system developed using Nvidia CUDA for Compute Capability 3.5+ GPUs. The system was developed to accelerate the analysis of arbitrary network protocols within network traffic traces using inexpensive, massively parallel commodity hardware. GPF+ and its supporting components are specifically intended to support the processing of large, long-term network packet traces, such as those produced by network telescopes, which are currently difficult and time-consuming to analyse. The GPF+ classifier is based on prior research in the field, which produced a prototype classifier called GPF targeted at Compute Capability 1.3 GPUs. GPF+ greatly extends the GPF model, improving runtime flexibility and scalability whilst maintaining high execution efficiency. GPF+ incorporates a compact, lightweight register-based state machine that supports massively parallel, multi-match filter predicate evaluation, as well as efficient arbitrary field extraction. GPF+ tracks packet composition during execution and adjusts processing at runtime, using warp voting to avoid redundant memory transactions and unnecessary computation. GPF+ additionally incorporates a 128-bit in-thread cache, accelerated through register shuffling, to speed up access to packet data held in slow GPU global memory. GPF+ uses a high-level DSL to simplify protocol and filter creation whilst better facilitating protocol reuse. The system is supported by a pipeline of multi-threaded, high-performance host components, which communicate asynchronously through 0MQ messaging middleware to buffer, index, and dispatch packet data on the host system. The system was evaluated using high-end Kepler (Nvidia GTX Titan) and entry-level Maxwell (Nvidia GTX 750) GPUs. The evaluation showed high system performance, limited only by device-side I/O (600 MB/s) in all tests. GPF+ maintained high occupancy and device utilisation in all tests, without significant serialisation, and showed improved scaling to more complex filter sets. Results were used to visualise captures of up to 160 GB in seconds, and to extract and pre-filter captures small enough to be easily analysed in applications such as Wireshark.
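    To make the warp-voting and register-shuffling ideas concrete, the fragment below sketches how a CUDA classifier can let a whole warp skip a protocol branch when no packet in the warp needs it, and share a header word between lanes through registers instead of re-reading global memory. It is a toy illustration, not GPF+ itself: the header layout, the IPv4 test, and the result encoding are assumptions.

```cuda
// Toy warp-cooperative classifier: __ballot_sync skips a branch warp-wide,
// __shfl_sync shares an already-loaded header word between lanes.
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

constexpr int WORDS_PER_PKT = 32;   // fixed-size header snapshot per packet (assumption)

__global__ void classify(const unsigned int* headers, int nPackets, int* ipv4Flags) {
    int pkt = blockIdx.x * blockDim.x + threadIdx.x;
    if (pkt >= nPackets) return;    // full warps assumed below (nPackets is a multiple of 256)

    // Each thread reads the first header word of its own packet.
    unsigned int w0 = headers[(size_t)pkt * WORDS_PER_PKT];

    // IPv4 if the top nibble is 4 (toy rule for illustration).
    bool looksIpv4 = ((w0 >> 28) & 0xF) == 4;

    // Warp vote: if no packet in this warp looks like IPv4, all 32 lanes skip
    // the (pretend) expensive IPv4 parsing path together, avoiding divergence.
    unsigned int ballot = __ballot_sync(0xFFFFFFFFu, looksIpv4);
    int result = 0;
    if (ballot != 0) {
        // Register sharing via shuffle: lane 0 broadcasts its word so the warp
        // can, e.g., test whether it carries a uniform protocol mix without
        // another pass over global memory.
        unsigned int lane0Word = __shfl_sync(0xFFFFFFFFu, w0, 0);
        bool uniformWarp = __all_sync(0xFFFFFFFFu, w0 == lane0Word);
        result = looksIpv4 ? (uniformWarp ? 2 : 1) : 0;
    }
    ipv4Flags[pkt] = result;
}

int main() {
    const int nPackets = 1 << 16;
    std::vector<unsigned int> h(nPackets * (size_t)WORDS_PER_PKT, 0);
    for (int p = 0; p < nPackets; ++p)
        h[(size_t)p * WORDS_PER_PKT] = (p % 2) ? 0x45000028u : 0x60000000u;  // IPv4-like / IPv6-like

    unsigned int* dHdr; int* dFlag;
    cudaMalloc(&dHdr, h.size() * sizeof(unsigned int));
    cudaMalloc(&dFlag, nPackets * sizeof(int));
    cudaMemcpy(dHdr, h.data(), h.size() * sizeof(unsigned int), cudaMemcpyHostToDevice);

    classify<<<(nPackets + 255) / 256, 256>>>(dHdr, nPackets, dFlag);

    std::vector<int> flags(nPackets);
    cudaMemcpy(flags.data(), dFlag, nPackets * sizeof(int), cudaMemcpyDeviceToHost);
    int ipv4 = 0;
    for (int f : flags) ipv4 += (f > 0);
    printf("packets flagged IPv4: %d of %d\n", ipv4, nPackets);
    cudaFree(dHdr); cudaFree(dFlag);
    return 0;
}
```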

    A Scalable Framework for Monte Carlo Simulation Using FPGA-based Hardware Accelerators with Application to SPECT Imaging

    As the number of transistors integrated onto a silicon die continues to increase, compute power is becoming a commodity. This has enabled a whole host of new applications that rely on high-throughput computation. Recently, the need for faster and cost-effective applications in form-factor-constrained environments has driven an interest in on-chip acceleration of algorithms based on Monte Carlo simulation. Furthermore, we have created a framework for further increasing parallelism by scaling our architecture across multiple compute devices; by extending our original design to a multi-FPGA system, a nearly linear increase in acceleration with logic resources was achieved.
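    The near-linear scaling reported above follows from the structure of Monte Carlo methods: samples are statistically independent, so more compute units can be added with essentially no communication until a final reduction. The toy GPU estimator below (estimating pi with CUDA and cuRAND) is only an analogy for that independence; the thesis itself targets FPGA hardware and SPECT photon transport, neither of which is modelled here.

```cuda
// Why Monte Carlo scales almost linearly with added hardware: each sample is
// independent, so work partitions freely; only one reduction is needed at the end.
#include <cuda_runtime.h>
#include <curand_kernel.h>
#include <cstdio>

__global__ void estimatePi(unsigned long long seed, int samplesPerThread,
                           unsigned long long* hits) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    curandState rng;
    curand_init(seed, tid, 0, &rng);        // independent stream per thread

    unsigned long long local = 0;
    for (int i = 0; i < samplesPerThread; ++i) {
        float x = curand_uniform(&rng);
        float y = curand_uniform(&rng);
        if (x * x + y * y <= 1.0f) ++local; // inside the quarter circle
    }
    atomicAdd(hits, local);                 // single reduction at the end
}

int main() {
    const int threads = 256, blocks = 256, perThread = 4096;
    unsigned long long* dHits;
    cudaMalloc(&dHits, sizeof(unsigned long long));
    cudaMemset(dHits, 0, sizeof(unsigned long long));

    estimatePi<<<blocks, threads>>>(1234ULL, perThread, dHits);

    unsigned long long hits = 0;
    cudaMemcpy(&hits, dHits, sizeof(hits), cudaMemcpyDeviceToHost);
    double total = (double)threads * blocks * perThread;
    printf("pi ~= %f\n", 4.0 * hits / total);
    cudaFree(dHits);
    return 0;
}
```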

    Hardware Acceleration Using Functional Languages

    The aim of this thesis is to research how the functional paradigm can be used for hardware acceleration, with an emphasis on data-parallel tasks. The level of abstraction of traditional hardware description languages, such as VHDL or Verilog, is becoming too low. High-level languages from the domains of software development and modeling, such as C/C++, SystemC, or MATLAB, are experiencing a boom for hardware description at the algorithmic or behavioral level. Functional languages are not as widely used, but they outperform imperative languages in verifiability, the ability to capture inherent parallelism, and the compactness of code. Data-parallel tasks are often accelerated on FPGAs, GPUs, and multicore processors. In this thesis, we use Accelerate, a library for general-purpose GPU programming, and extend it to produce VHDL. Accelerate is a domain-specific language embedded in Haskell with a backend for the NVIDIA CUDA platform. We reuse the language and its frontend, and create a new backend for high-level synthesis of circuits in VHDL.
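    For context on what any Accelerate backend, CUDA or VHDL, must ultimately realize, the sketch below hand-writes a CUDA kernel equivalent to a simple data-parallel collective such as Accelerate's zipWith (+). This is an illustrative analogy only; it is not code generated by Accelerate or by the VHDL backend described in the thesis.

```cuda
// Hand-written CUDA analogue of a data-parallel collective like zipWith (+):
// every output element is independent, which is what makes both GPU and
// hardware synthesis backends straightforward targets.
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

__global__ void zipWithAdd(const float* xs, const float* ys, float* zs, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) zs[i] = xs[i] + ys[i];   // one element per thread, no dependencies
}

int main() {
    const int n = 1 << 20;
    std::vector<float> xs(n, 1.0f), ys(n, 2.0f), zs(n);
    float *dx, *dy, *dz;
    cudaMalloc(&dx, n * sizeof(float));
    cudaMalloc(&dy, n * sizeof(float));
    cudaMalloc(&dz, n * sizeof(float));
    cudaMemcpy(dx, xs.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dy, ys.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    zipWithAdd<<<(n + 255) / 256, 256>>>(dx, dy, dz, n);

    cudaMemcpy(zs.data(), dz, n * sizeof(float), cudaMemcpyDeviceToHost);
    printf("zs[0] = %f\n", zs[0]);      // expect 3.0
    cudaFree(dx); cudaFree(dy); cudaFree(dz);
    return 0;
}
```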

    A Real-Time Predictive Vehicular Collision Avoidance System on an Embedded General-Purpose GPU

    Collision avoidance is an essential capability for autonomous and assisted-driving ground vehicles. In this work, we developed a novel model predictive control based intelligent collision avoidance (CA) algorithm for a multi-trailer industrial ground vehicle, implemented on a general-purpose graphics processing unit (GPGPU). The CA problem is formulated as a multi-objective optimal control problem and solved in real time using a limited look-ahead control scheme. Through hardware-in-the-loop simulations and experimental results obtained in this work, we have demonstrated that the proposed algorithm, using NVIDIA's CUDA framework and the NVIDIA Jetson TX2 development platform, is capable of dynamically assisting drivers and keeping the vehicle a safe distance from detected obstacles on the fly. We have demonstrated that a GPGPU, paired with an appropriate algorithm, can be the key enabler in relieving the computational burden commonly associated with model-based control problems, making them suitable for real-time applications.
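    One generic pattern for putting a GPGPU behind a predictive controller, shown below as a sketch rather than the thesis's actual solver, is to evaluate many candidate control sequences in parallel: each thread rolls out one candidate over the look-ahead horizon with a simple kinematic model and a toy obstacle-plus-effort cost, and the host picks the cheapest. The horizon length, vehicle model, and cost terms are all assumptions for illustration.

```cuda
// Parallel rollout evaluation for predictive collision avoidance (toy model).
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

constexpr int HORIZON = 20;      // look-ahead steps (assumption)
constexpr float DT = 0.1f;       // step length in seconds (assumption)

// Constant-speed kinematic rollout with per-candidate constant steering rate.
// Cost penalizes proximity to one circular obstacle plus control effort.
__global__ void rollouts(const float* steerCandidates, int nCandidates,
                         float obsX, float obsY, float* costs) {
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    if (c >= nCandidates) return;

    float x = 0.0f, y = 0.0f, heading = 0.0f, speed = 5.0f;
    float steer = steerCandidates[c];
    float cost = 0.1f * steer * steer;            // control-effort term

    for (int t = 0; t < HORIZON; ++t) {
        heading += steer * DT;
        x += speed * cosf(heading) * DT;
        y += speed * sinf(heading) * DT;
        float dx = x - obsX, dy = y - obsY;
        cost += 1.0f / (dx * dx + dy * dy + 1e-3f); // obstacle-proximity penalty
    }
    costs[c] = cost;
}

int main() {
    const int n = 4096;
    std::vector<float> steer(n), cost(n);
    for (int i = 0; i < n; ++i)
        steer[i] = -0.5f + (float)i / (n - 1);    // candidates in [-0.5, 0.5] rad/s

    float *dSteer, *dCost;
    cudaMalloc(&dSteer, n * sizeof(float));
    cudaMalloc(&dCost, n * sizeof(float));
    cudaMemcpy(dSteer, steer.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    rollouts<<<(n + 255) / 256, 256>>>(dSteer, n, 8.0f, 0.5f, dCost);

    cudaMemcpy(cost.data(), dCost, n * sizeof(float), cudaMemcpyDeviceToHost);
    int best = 0;
    for (int i = 1; i < n; ++i) if (cost[i] < cost[best]) best = i;
    printf("best steering candidate: %.3f rad/s (cost %.3f)\n", steer[best], cost[best]);
    cudaFree(dSteer); cudaFree(dCost);
    return 0;
}
```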

    Development of a GPGPU accelerated tool to simulate advection-reaction-diffusion phenomena in 2D

    Computational models are powerful tools for the study of environmental systems, playing a fundamental role in several fields of research (hydrological sciences, biomathematics, atmospheric sciences, and geosciences, among others). Most of these models require high computational capacity, especially when one considers high spatial resolution and application to large areas. In this context, the exponential increase in computational power brought by general-purpose graphics processing units (GPGPU) has drawn the attention of scientists and engineers to the development of low-cost, high-performance parallel implementations of environmental models. In this research, we apply GPGPU computing to the development of a model that describes the physical processes of advection, reaction, and diffusion. The dissertation is presented in the form of three self-contained articles. In the first, we present a GPGPU implementation of the solution of the 2D groundwater flow equation in unconfined aquifers for heterogeneous and anisotropic media. We implement a finite difference solution scheme based on the Crank-Nicolson method and show that the GPGPU-accelerated solution implemented using CUDA C/C++ (Compute Unified Device Architecture) greatly outperforms the corresponding serial solution implemented in C/C++. The results show that the GPGPU-accelerated implementation is capable of delivering up to 56 times acceleration in the solution process using an ordinary office computer. In the second article, we study the application of a diffusive-logistic growth (DLG) model to the problem of forest growth and regeneration. The study focuses on vegetation belonging to preservation areas, such as riparian buffer zones, and was developed in two stages: (i) a methodology based on Artificial Neural Network Ensembles (ANNE) was applied to evaluate the width of the riparian buffer required to filter 90% of the residual nitrogen; (ii) the DLG model was calibrated and validated to generate a prognostic of forest regeneration in riparian protection bands, considering the minimum widths indicated by the ANNE. The solution was implemented in GPGPU and applied to simulate the forest regeneration process over forty years along the riparian protection bands of the Ligeiro river, in Brazil. The calibration and validation results showed that the DLG model provides fairly accurate results for the modelling of forest regeneration. In the third manuscript, we present a GPGPU implementation of the solution of the advection-reaction-diffusion equation in 2D. The implementation is designed to be general and flexible, allowing the modelling of a wide range of processes, including those with heterogeneity and anisotropy of the medium. We show that simulations performed on the GPGPU allow the use of mesh grids containing more than 20 million points, corresponding to an area of 18,000 km² at the standard 30 m resolution of Landsat images.
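    To show how such grid-based solvers map onto a GPGPU, the sketch below advances the 2D advection-reaction-diffusion equation u_t = D(u_xx + u_yy) - vx u_x - vy u_y + r u with a simple explicit forward-Euler stencil, one thread per grid point. The thesis itself uses an implicit Crank-Nicolson scheme; the explicit update, grid size, and coefficients here are simplifying assumptions for illustration.

```cuda
// Explicit forward-Euler stencil for 2D advection-reaction-diffusion (sketch only;
// the thesis uses an implicit Crank-Nicolson scheme).
#include <cuda_runtime.h>
#include <cstdio>
#include <utility>
#include <vector>

constexpr int NX = 512, NY = 512;                 // grid size (assumption)
constexpr float DX = 1.0f, DT = 0.05f;            // spacing and time step (stable: D*DT/DX^2 <= 0.25)
constexpr float D = 1.0f, VX = 0.5f, VY = 0.0f, R = 0.01f;

__global__ void step(const float* u, float* uNew) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i <= 0 || j <= 0 || i >= NX - 1 || j >= NY - 1) return;  // fixed boundary values

    int id = j * NX + i;
    float c = u[id];
    float lap  = (u[id - 1] + u[id + 1] + u[id - NX] + u[id + NX] - 4.0f * c) / (DX * DX);
    float dudx = (u[id + 1]  - u[id - 1])  / (2.0f * DX);        // central differences
    float dudy = (u[id + NX] - u[id - NX]) / (2.0f * DX);
    uNew[id] = c + DT * (D * lap - VX * dudx - VY * dudy + R * c);
}

int main() {
    std::vector<float> h(NX * NY, 0.0f);
    h[(NY / 2) * NX + NX / 2] = 100.0f;           // point source in the middle

    float *dU, *dUNew;
    cudaMalloc(&dU, h.size() * sizeof(float));
    cudaMalloc(&dUNew, h.size() * sizeof(float));
    cudaMemcpy(dU, h.data(), h.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dUNew, dU, h.size() * sizeof(float), cudaMemcpyDeviceToDevice);

    dim3 block(16, 16), grid((NX + 15) / 16, (NY + 15) / 16);
    for (int t = 0; t < 100; ++t) {               // 100 explicit time steps
        step<<<grid, block>>>(dU, dUNew);
        std::swap(dU, dUNew);                     // ping-pong buffers
    }

    cudaMemcpy(h.data(), dU, h.size() * sizeof(float), cudaMemcpyDeviceToHost);
    printf("centre value after 100 steps: %f\n", h[(NY / 2) * NX + NX / 2]);
    cudaFree(dU); cudaFree(dUNew);
    return 0;
}
```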