Improving silicon technology has made possible heterogeneous platforms comprising multiple multiprocessors, DSP processors, and FPGAs, but the key issue is the creation of methodologies and tools that allow designers to map complex DSP systems onto such platforms quickly and efficiently. Typically, these design processes involve application modeling and the development of transformations for mapping these application models onto hardware architecture models against key performance criteria such as timing, area, or power consumption.
The purpose of this special issue is to highlight work that addresses the limitations in mapping from the application model onto the architecture model for complex DSP systems. Current design tool offerings do not address these limitations to any great extent. The open issues include automatic translation of features in application-specific models, such as tokens and actors, into "architecture model" specific expressions; exploration of algorithmic parallelism so that it matches the hardware platform; and development of transformations that reduce energy consumption and area against a throughput budget.
The papers address a number of topics, ranging from model of computation (MoC) representations to tools that explore realizations from SystemC descriptions.
In "SPRINT: a tool to generate concurrent transaction level models from sequential code," J. Cockx et al. describe a tool that generates a concurrent SystemC transaction level model from sequential code. Using this tool, different parallelization alternatives were evaluated during the design of an MPEG-4 simple profile encoder and an embedded zero tree coder. With their approach, generation was carried out in minutes, allowing extensive exploration of the design space.
In "Self-timed scheduling analysis for real-time applications," O. M. Moreira and M. J. G. Bekooij describe an approach that uses multirate dataflow (MRDF) graphs to schedule the tasks of a hard real-time streaming application onto a multiprocessor system-on-chip. They extend the temporal analysis of self-timed scheduling (STS) for MRDF graphs to model not only average throughput but also latency. This allows the maximum latency to be determined for jobs with periodic, sporadic, and bursty sources. The approach is demonstrated for a simplified channel equalizer and a wireless LAN receiver.
In "Development and evaluation of high-performance image decorrelation algorithms for the nonalternating 3D wavelet transform," E. Moyano-Ávila et al. show that exploiting the inherent features of the application and the computing platform yields performance speedups. They describe the implementation of a standard three-dimensional wavelet transform on an SGI Origin 3800 shared-memory supercomputer and show that, by partitioning the video sequences into groups and implementing parallel versions of the algorithms, the workload can be distributed efficiently across several processors.
In "Exploiting the expressiveness of cyclo-static dataflow to model multimedia implementations," K. Denolf et al. look at the design of complex multimedia systems using MoC descriptions. They show how cyclo-static dataflow (CSDF) makes a suitable MoC and show how "implementation specific" aspects can be expressed using CSDF. In particular, they
