47,507 research outputs found
Optimal processor assignment for pipeline computations
The availability of large scale multitasked parallel architectures introduces the following processor assignment problem for pipelined computations. Given a set of tasks and their precedence constraints, along with their experimentally determined individual responses times for different processor sizes, find an assignment of processor to tasks. Two objectives are of interest: minimal response given a throughput requirement, and maximal throughput given a response time requirement. These assignment problems differ considerably from the classical mapping problem in which several tasks share a processor; instead, it is assumed that a large number of processors are to be assigned to a relatively small number of tasks. Efficient assignment algorithms were developed for different classes of task structures. For a p processor system and a series parallel precedence graph with n constituent tasks, an O(np2) algorithm is provided that finds the optimal assignment for the response time optimization problem; it was found that the assignment optimizing the constrained throughput in O(np2log p) time. Special cases of linear, independent, and tree graphs are also considered
Generalized Methodology for Array Processor Design of Real-time Systems
Many techniques and design tools have been developed for mapping algorithms to array processors. Linear mapping is usually used for regular algorithms. Large and complex problems are not regular by nature and regularization may cause a computational overhead which prevents the ability to meet real-time deadlines. In this paper, a systematic design methodology for mapping partially-regular as well as regular Dependence Graphs is presented. In this approach the set of all optimal solutions is generated under the given constraints. Due to nature of the problem and the tight timing constraints of real-time systems the set of alternative solutions is limited. An image processing example is discusse
FPGA Implementations Comparison of Neuro-cortical Inspired Convolution Processors for Spiking Systems
Image convolution operations in digital computer systems are usually
very expensive operations in terms of resource consumption (processor
resources and processing time) for an efficient Real-Time application. In these
scenarios the visual information is divided in frames and each one has to be
completely processed before the next frame arrives. Recently a new method for
computing convolutions based on the neuro-inspired philosophy of spiking
systems (Address-Event-Representation systems, AER) is achieving high
performances. In this paper we present two FPGA implementations of AERbased
convolution processors that are able to work with 64x64 images and
programmable kernels of up to 11x11 elements. The main difference is the use
of RAM for integrators in one solution and the absence of integrators in the
second solution that is based on mapping operations. The maximum equivalent
operation rate is 163.51 MOPS for 11x11 kernels, in a Xilinx Spartan 3 400
FPGA with a 50MHz clock. Formulations, hardware architecture, operation
examples and performance comparison with frame-based convolution
processors are presented and discussed.Ministerio de Ciencia e Innovación TEC2006-11730-C03-02Junta de Andalucía P06-TIC-0141
Hierarchical stack filtering : a bitplane-based algorithm for massively parallel processors
With the development of novel parallel architectures for image processing, the implementation
of well-known image operators needs to be reformulated to take advantage of the so-called
massive parallelism. In this work, we propose a general algorithm that implements a large
class of nonlinear filters, called stack filters, with a 2D-array processor. The proposed method consists of decomposing an image into bitplanes with the bitwise decomposition, and then process every bitplane hierarchically. The filtered image is reconstructed by simply stacking the filtered bitplanes according to their order of significance. Owing to its hierarchical structure, our algorithm allows us to trade-off between image quality and processing time, and to significantly reduce the computation time of low-entropy images. Also, experimental tests show that the processing time of our method is substantially lower than that of classical methods when using large structuring elements. All these features are of interest to a variety of real-time applications based on morphological operations such as video segmentation and video enhancement
On the AER Convolution Processors for FPGA
Image convolution operations in digital computer
systems are usually very expensive operations in terms of
resource consumption (processor resources and processing time)
for an efficient Real-Time application. In these scenarios the
visual information is divided into frames and each one has to be
completely processed before the next frame arrives in order to
warranty the real-time. A spike-based philosophy for computing
convolutions based on the neuro-inspired Address-Event-
Representation (AER) is achieving high performances. In this
paper we present two FPGA implementations of AER-based
convolution processors for relatively small Xilinx FPGAs
(Spartan-II 200 and Spartan-3 400), which process 64x64 images
with 11x11 convolution kernels. The maximum equivalent
operation rate that can be reached is 163.51 MOPS for 11x11
kernels, in a Xilinx Spartan 3 400 FPGA with a 50MHz clock.
Formulations, hardware architecture, operation examples and
performance comparison with frame-based convolution
processors are presented and discussed.Ministerio de Ciencia e Innovación TEC2006-11730-C03-02Ministerio de Ciencia e Innovación TEC2009-10639-C04-02Junta de Andalucía P06-TIC-0141
Formal and Informal Methods for Multi-Core Design Space Exploration
We propose a tool-supported methodology for design-space exploration for
embedded systems. It provides means to define high-level models of applications
and multi-processor architectures and evaluate the performance of different
deployment (mapping, scheduling) strategies while taking uncertainty into
account. We argue that this extension of the scope of formal verification is
important for the viability of the domain.Comment: In Proceedings QAPL 2014, arXiv:1406.156
A survey of parallel algorithms for fractal image compression
This paper presents a short survey of the key research work that has been undertaken in the application of parallel algorithms for Fractal image compression. The interest in fractal image compression techniques stems from their ability to achieve high compression ratios whilst maintaining a very high quality in the reconstructed image. The main drawback of this compression method is the very high computational cost that is associated with the encoding phase. Consequently, there has been significant interest in exploiting parallel computing architectures in order to speed up this phase, whilst still maintaining the advantageous features of the approach. This paper presents a brief introduction to fractal image compression, including the iterated function system theory upon
which it is based, and then reviews the different techniques that have been, and can be, applied in order to parallelize the compression algorithm
An SMP Soft Classification Algorithm for Remote Sensing
This work introduces a symmetric multiprocessing (SMP) version of the continuous iterative
guided spectral class rejection (CIGSCR) algorithm, a semiautomated classification algorithm for remote
sensing (multispectral) images. The algorithm uses soft data clusters to produce a soft classification
containing inherently more information than a comparable hard classification at an increased computational
cost. Previous work suggests that similar algorithms achieve good parallel scalability, motivating the parallel
algorithm development work here. Experimental results of applying parallel CIGSCR to an image with
approximately 10^8 pixels and six bands demonstrate superlinear speedup. A soft two class classification is
generated in just over four minutes using 32 processors
- …