2 research outputs found

A Comparative Study of Scheduling Techniques for Multimedia Applications on SIMD Pipelines

Authors: Arslan, Mehmet Ali; Gruian, Flavius; Kuchcinski, Krzysztof
Publication date: 01/01/2015

Parallel architectures are essential in order to take advantage of the parallelism inherent in streaming applications. One particular branch of these employs hardware SIMD pipelines. In this paper, we analyse several scheduling techniques, namely ad hoc overlapped execution, modulo scheduling and modulo scheduling with unrolling, all of which aim to efficiently utilize the special architecture design. Our investigation focuses on improving throughput while analysing other metrics that are important for streaming applications, such as register pressure, buffer sizes and code size. Through experiments conducted on several media benchmarks, we present and discuss the trade-offs involved in selecting any one of these scheduling techniques.

Comment: Presented at DATE Friday Workshop on Heterogeneous Architectures and Design Methods for Embedded Image Systems (HIS 2015) (arXiv:1502.07241)

arXiv.org e-Print Archive

Lund University Publications

Programming support for reconfigurable custom vector architectures

Authors: Arslan, Mehmet Ali; Gruian, Flavius; Kuchcinski, Krzysztof; Liu, Yangxurui
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2015

High performance requirements have increased the popularity of unconventional architectures. While providing better performance, such architectures are generally harder to program and generate code for. In this paper, we present our approach to ease programmability and code generation for such architectures. We present a domain specific language (DSL) for the programming part, and a constraint programming approach to scheduling with memory allocation. Our experiments on implementing a kernel extracted from a DSP application on an example reconfigurable custom architecture show that it is possible to achieve performance close to hand-written machine code that is scheduled without memory allocation.

Crossref

Lund University Publications