Real-Time Implementation and Performance Optimization of Local Derivative Pattern Algorithm on GPUs

Author: Chandran Nisha
Gangodkar Durgaprasad
Mittal Ankush
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/12/2018

Pattern-based texture descriptors are widely used in Content-Based Image Retrieval (CBIR) for efficient retrieval of matching images. Local Derivative Pattern (LDP), a higher-order local pattern operator originally proposed for face recognition, encodes the distinctive spatial relationships contained in a local region of an image as the feature vector. LDP efficiently extracts finer details and provides efficient retrieval; however, it was proposed for images of limited resolution. Over time, developments in digital image sensors have paved the way for capturing images at very high resolution. The LDP algorithm, though very efficient in content-based image retrieval, does not scale well when capturing features from such high-resolution images, as it becomes computationally very expensive. This paper proposes how to efficiently extract parallelism from the LDP algorithm and strategies for optimally implementing it by exploiting some inherent General-Purpose Graphics Processing Unit (GPGPU) characteristics. By optimally configuring the GPGPU kernels, image retrieval was performed at a much faster rate. The LDP algorithm was ported to a Compute Unified Device Architecture (CUDA) supported GPGPU, and a maximum speedup of around 240x was achieved compared to its sequential counterpart.

IAES journal

Crossref

Institute of Advanced Engineering and Science

Doctor of Philosophy

Author: Ha Linh Khanh
Publication venue: University of Utah
Publication date: 15/08/2011

Stochastic methods, dense free-form mapping, atlas construction, and total variation are examples of advanced image processing techniques which are robust but computationally demanding. These algorithms often require a large amount of computational power as well as massive memory bandwidth. These requirements used to be fulfilled only by supercomputers. The development of heterogeneous parallel subsystems and computation-specialized devices such as Graphics Processing Units (GPUs) has brought the requisite power to commodity hardware, opening up opportunities for scientists to experiment and evaluate the influence of these techniques on their research and practical applications. However, harnessing the processing power from modern hardware is challenging. The differences between multicore parallel processing systems and conventional models are significant, often requiring algorithms and data structures to be redesigned significantly for efficiency. It also demands in-depth knowledge about modern hardware architectures to optimize these implementations, sometimes on a per-architecture basis. The goal of this dissertation is to introduce a solution for this problem based on a 3D image processing framework, using high performance APIs at the core level to utilize parallel processing power of the GPUs. The design of the framework facilitates an efficient application development process, which does not require scientists to have extensive knowledge about GPU systems, and encourages them to harness this power to solve their computationally challenging problems.
To present the development of this framework, four main problems are described, and the solutions are discussed and evaluated: (1) essential components of a general 3D image processing library: data structures and algorithms, as well as how to implement these building blocks on the GPU architecture for optimal performance; (2) an implementation of unbiased atlas construction algorithms, an illustration of how to solve a highly complex and computationally expensive algorithm using this framework; (3) an extension of the framework to account for geometry descriptors to solve registration challenges with large scale shape changes and high intensity-contrast differences; and (4) an out-of-core streaming model, which enables developers to implement multi-image processing techniques on commodity hardware.

The University of Utah: J. Willard Marriott Digital Library

A Real-Time Capable Software-Defined Receiver Using GPU for Adaptive Anti-Jam GPS Sensors

Author: Applebaum
Balaei
Balaei
Balaei
Borre
Datta-Barua
David S. De Lorenzo
Dennis Akos
Enge
Fante
Hobiger
Jan
Jiwon Seo
Jiyun Lee
Kalyanaraman
Kaplan
Kirk
Lee
Misra
Murphy
O’Brien
Pelletier
Per Enge
Rife
Rife
Sherman Lo
Soloviev
Yu-Hsuan Chen
Publication venue: Molecular Diversity Preservation International (MDPI)
Publication date: 01/01/2011

Due to their weak received signal power, Global Positioning System (GPS) signals are vulnerable to radio frequency interference. Adaptive beam and null steering of the gain pattern of a GPS antenna array can significantly increase the resistance of GPS sensors to signal interference and jamming. Since adaptive array processing requires intensive computational power, beamsteering GPS receivers were usually implemented using hardware such as field-programmable gate arrays (FPGAs). However, a software implementation using general-purpose processors is much more desirable because of its flexibility and cost effectiveness. This paper presents a GPS software-defined radio (SDR) with adaptive beamsteering capability for anti-jam applications. The GPS SDR design is based on an optimized desktop parallel processing architecture using a quad-core Central Processing Unit (CPU) coupled with a new-generation Graphics Processing Unit (GPU) having massively parallel processors. This GPS SDR demonstrates sufficient computational capability to support a four-element antenna array and future GPS L5 signal processing in real time. After providing the details of our design and optimization schemes for future GPU-based GPS SDR developments, the jamming resistance of our GPS SDR under synthetic wideband jamming is presented. Since the GPS SDR uses commercial-off-the-shelf hardware and processors, it can be easily adopted in civil GPS applications requiring anti-jam capabilities.

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

Statistical and Machine Learning Analysis of the Human Brain Functional Network in a Multi-Site Resting-State Functional MRI Database Framework

Author: Artiles Oswaldo
Saeed Fahad, Ed.
Publication venue: FIU Digital Commons
Publication date: 01/01/2023

The human brain has a complex network structure that is non-random and multiscale. It consists of subsystems coupled by a nonlinear dynamic, enabling it to produce complex responses to various external inputs and self-organize. To understand the physical structure and specific brain functions, it is essential to comprehend the connectivity of the hundreds of billions of neurons in the human brain. In modern neuroscience, functional connectivity (FC) refers to the statistical temporal dependencies between neuronal activation events occurring in spatially separated brain regions. Resting-state functional magnetic resonance imaging (rs-fMRI) is a non-invasive imaging technique widely used in neuroscience to understand the functional connectivity of the human brain. The studies presented in this dissertation were based on the models and methods from network neuroscience, which is an active area of research developed in the last three decades. These methods were used to model and analyze the functional human brain networks in a multi-site rs-fMRI data framework. The contributions made in this dissertation to the study of the functional connectivity of the human brain network are: 1. The GPU-based Sparse Fast Fourier Transform (SFFT) of k-sparse signals; 2. The GPU-based breadth-first search algorithm; 3. The GPU-based betweenness centrality graph metric algorithm; 4. A comprehensive approach to solving the problem of confounding effects in the machine learning classification models of rs-fMRI multi-site data; and 5. A preliminary assessment of time-varying functional connectivity in a multi-site rs-fMRI data framework. We hope that the neuroscience research community will use and improve these contributions to enhance the discovery of the functions and structure of the human brain. This will lead to a better understanding of the causes of brain disorders and the development of useful and effective biomarkers for their diagnosis.

DigitalCommons@Florida International University

Large-scale Machine Learning in High-dimensional Datasets

Author: Hansen Toke Jansen
Publication venue: Technical University of Denmark
Publication date: 01/01/2013

Online Research Database In Technology

Efficient probabilistic and geometric anatomical mapping using particle mesh approximation on GPUs

Author: Gerig Guido
Ha Linh
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2011

Deformable image registration in the presence of considerable contrast differences and large size and shape changes presents significant research challenges. First, it requires a robust registration framework that does not depend on intensity measurements and can handle large nonlinear shape variations. Second, it involves the expensive computation of nonlinear deformations with high degrees of freedom. Often it takes a significant amount of computation time and thus becomes infeasible for practical purposes. In this paper, we present a solution based on two key ideas: a new registration method that generates a mapping between anatomies represented as a multicompartment model of class posterior images and geometries, and an implementation of the algorithm using particle mesh approximation on Graphical Processing Units (GPUs) to fulfill the computational requirements. We show results on the registrations of neonatal to 2-year-old infant MRIs. Quantitative validation demonstrates that our proposed method generates registrations that better maintain the consistency of anatomical structures over time and provides transformations that better preserve structures undergoing large deformations than transformations obtained by standard intensity-only registration. We also achieve a speedup of three orders of magnitude compared to a CPU reference implementation, making it possible to use the technique in time-critical applications.

The University of Utah: J. Willard Marriott Digital Library

An Evaluation of Emerging Many-Core Parallel Programming Models

Author: Boulton Michael
Gaudin Wayne
Martineau Matt J
McIntosh-Smith Simon N
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 12/03/2016

Crossref

Explore Bristol Research

Efficient Probabilistic and Geometric Anatomical Mapping Using Particle Mesh Approximation on GPUs

Author: Gerig Guido
Gilmore John H.
Ha Linh
Joshi Sarang
Prastawa Marcel
Silva Cláudio T.
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2011

Deformable image registration in the presence of considerable contrast differences and large size and shape changes presents significant research challenges. First, it requires a robust registration framework that does not depend on intensity measurements and can handle large nonlinear shape variations. Second, it involves the expensive computation of nonlinear deformations with high degrees of freedom. Often it takes a significant amount of computation time and thus becomes infeasible for practical purposes. In this paper, we present a solution based on two key ideas: a new registration method that generates a mapping between anatomies represented as a multicompartment model of class posterior images and geometries, and an implementation of the algorithm using particle mesh approximation on Graphical Processing Units (GPUs) to fulfill the computational requirements. We show results on the registrations of neonatal to 2-year-old infant MRIs. Quantitative validation demonstrates that our proposed method generates registrations that better maintain the consistency of anatomical structures over time and provides transformations that better preserve structures undergoing large deformations than transformations obtained by standard intensity-only registration. We also achieve a speedup of three orders of magnitude compared to a CPU reference implementation, making it possible to use the technique in time-critical applications.

Crossref

Directory of Open Access Journals

PubMed Central