Configurable 3D-integrated focal-plane sensor-processor array architecture
A mixed-signal Cellular Visual Microprocessor architecture with digital processors is
described. An ASIC implementation is also demonstrated. The architecture is composed of a
regular sensor readout circuit array, prepared for 3D face-to-face type integration, and one or
several cascaded arrays of mainly identical (SIMD) processing elements. The individual array
elements are derived from the same general HDL description and can differ in size, aspect
ratio, and computing resources
GPU-Based Simulation of Cellular Neural Networks for Image Processing
The inherent massive parallelism of cellular neural networks makes them an ideal computational platform for kernel-based
algorithms and image processing. General-purpose GPUs provide similar massive parallelism, but it can be difficult to
design algorithms to make optimal use of the hardware. The presented research includes a GPU abstraction based on cellular neural networks. The abstraction offers a simplified view of massively parallel computation which remains reasonably efficient. An image processing library with visualization software has been developed to showcase the flexibility and power of cellular computation on GPUs. Benchmarks of the library indicate that
commodity GPUs can be used to significantly accelerate CNN research and offer a viable alternative to CPU-based image processing algorithms
Dynamically reconfigurable architecture for embedded computer vision systems
The objective of this research work is to design, develop and implement a new architecture which integrates on the same chip all the processing levels of a complete Computer Vision system, so that execution is efficient without compromising power consumption while keeping cost low. For this purpose, an analysis and classification of different mathematical operations and algorithms commonly used in Computer Vision are carried out, as well as an in-depth review of the image processing capabilities of current-generation hardware devices. This makes it possible to determine the requirements and the key aspects of an efficient architecture. A representative set of algorithms is employed as a benchmark to evaluate the proposed architecture, which is implemented on an FPGA-based system-on-chip. Finally, the prototype is compared to other related approaches in order to determine its advantages and weaknesses
2D operators on topographic and non-topographic architectures-implementation, efficiency analysis, and architecture selection methodology
Topographic and non-topographic image processing architectures and chips, recently developed within the CNN community, are analysed and compared. To do so, the 2D operators are grouped into classes according to their implementation methods on the different architectures, and the main implementation parameters of the different operator classes are compared. Based on the results, an efficient architecture selection methodology is formalized
GPU-Accelerated Computation of Vietoris-Rips Persistence Barcodes
The computation of Vietoris-Rips persistence barcodes is both
execution-intensive and memory-intensive. In this paper, we study the
computational structure of Vietoris-Rips persistence barcodes, and identify
several unique mathematical properties and algorithmic opportunities with
connections to the GPU. Mathematically and empirically, we look into the
properties of apparent pairs, which are independently identifiable persistence
pairs comprising up to 99% of persistence pairs. We give theoretical upper and
lower bounds of the apparent pair rate and model the average case. We also
design massively parallel algorithms to take advantage of the very large number
of simplices that can be processed independently of each other. Having
identified these opportunities, we develop GPU-accelerated software for
computing Vietoris-Rips persistence barcodes, called Ripser++. The software
achieves up to 30x speedup over the total execution time of the original Ripser
and also reduces CPU-memory usage by up to 2.0x. We believe our
GPU-acceleration based efforts open a new chapter for the advancement of
topological data analysis in the post-Moore's Law era. Comment: 36 pages, 15 figures. To be published in Symposium on Computational
Geometry 202
An investigation into adaptive power reduction techniques for neural hardware
In light of the growing applicability of Artificial Neural Networks (ANNs) in the signal processing field [1] and the present thrust of the semiconductor industry towards low-power SOCs for mobile devices [2], the power consumption of ANN hardware has become a very important implementation issue. Adaptability is a powerful and useful feature of neural networks. All current approaches for low-power ANN hardware techniques are "non-adaptive" with respect to the power consumption of the network (i.e. power reduction is not an objective of the adaptation/learning process). In the research work presented in this thesis, investigations on possible adaptive power reduction techniques have been carried out, which attempt to exploit the adaptability of neural networks in order to reduce the power consumption. Three separate approaches for such adaptive power reduction are proposed: adaptation of size, adaptation of network weights and adaptation of calculation precision. Initial case studies exhibit promising results with significant power reduction
An Optical Character Recognition Engine for Graphical Processing Units
This dissertation investigates how to build an optical character recognition (OCR) engine for a graphical processing unit (GPU). I introduce basic concepts both for building an OCR engine and for programming on the GPU. I then describe the SegRec algorithm in detail and discuss my findings