Programming Heterogeneous MPSoCs using MAPS by Sheng, Weihua et al.
Weihua Sheng, Jeronimo Castrillon, Anastasia Stulova, Maximilian Odendahl,
Rainer Leupers, Gerd Ascheid
Institute for Communication Technologies and Embedded Systems
RWTH Aachen University, Germany
sheng@ice.rwth-aachen.de
ABSTRACT
Programming heterogeneous MPSoCs (Multi-Processor Systems
on Chip) is a grand challenge for SoC providers and users
today. The HW integration level of commercial MPSoC
platforms increases at a fast pace. However, the lack of
tools to support MPSoC programming means that SW
productivity grows at a much slower pace, which
significantly hampers the deployment of MPSoCs. In this
paper, we introduce MAPS (MPSoC Application Programming
Studio), a programming tool suite for heterogeneous MPSoC
architectures. It accepts both sequential C and a C language
extension for describing applications in the form of process
networks, and it performs optimized temporal and spatial
task-to-processor mapping for MPSoC platforms. Several case
studies are presented to show the capabilities of MAPS as a
promising solution for MPSoC compilation and programming.
1. PROBLEM STATEMENT
Heterogeneous MPSoCs are widely used in modern embed-
ded systems. Programming such platforms remains a grand
challenge for SoC providers and users today [1]. What is
the right MPSoC programming model that captures both
parallel computations and certified sequential C code? How
to achieve optimized temporal and spatial task-to-processor
mapping to meet real-time constraints? How to parallelize
legacy C code? How to explore the vast SW mapping design
space? Those are just a few examples of the multitude of
SW design issues that MPSoCs pose.
To illustrate the problem of MPSoC programming, Fig. 1
shows a comparison of programming flows for uni-processor
systems and for MPSoCs. In the uni-processor flow, SW
programmers follow the sequential programming model (C
being the most popular language) and rely on the compilers
to generate target-specific code correctly and optimally, as
shown in Fig. 1 (a). This has been a successful practice
for the past few decades, before the advent of MPSoCs,
thanks to the heroic role of compilers. However, traditional
compiler technology does not scale to MPSoCs. MPSoC SW
requirements double every ten months, while SW productivity
doubles only every two years [2]. Fig. 1 (b) shows the
current, problematic programming flow for MPSoCs, which
results in low SW productivity. Applications first need to
be partitioned into parallel tasks, and those tasks must
then be mapped spatially and temporally onto the MPSoC
processing elements. The programmable processing elements of
a heterogeneous MPSoC often come with their own SW stack
(API, OS) and their own compiler. Therefore, after
partitioning and mapping, correct code must be generated
for each processing element so that it can be compiled by
the respective tool chain. This process is currently manual
and error-prone, and several iterations are usually needed
to reach optimized performance.
Figure 1: Programming Flow (a) Uni-processor (b) MPSoC
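The gap implied by the two doubling periods cited from [2]
can be made explicit. The following is a back-of-the-envelope
derivation, assuming that both quantities grow exponentially
at constant rates (an idealization that the source does not
state). The symbols R(t) and P(t), for SW requirements and SW
productivity after t months, are introduced here purely for
illustration.

```latex
\begin{align*}
R(t) &= R_0 \cdot 2^{t/10}, \qquad P(t) = P_0 \cdot 2^{t/24}, \\
\frac{R(t)}{P(t)} &= \frac{R_0}{P_0} \cdot 2^{t\left(\frac{1}{10} - \frac{1}{24}\right)}
                   = \frac{R_0}{P_0} \cdot 2^{7t/120}.
\end{align*}
```

Under this assumption, the ratio of requirements to
productivity doubles roughly every 120/7, i.e. about 17,
months and grows by a factor of about 2.6 over two years.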
Compared to the uni-processor case, the programming flow
for MPSoCs places more tasks on the programmers’ shoulders,
e.g. partitioning and mapping. These are non-trivial tasks.
Whereas uni-processor programming can rely on the
traditional compiler, little comparable tool support exists
for MPSoC programming.
2. MAPS OVERVIEW
As a result of a long-term R&D investment, the MAPS compiler
has been developed at ICE, RWTH Aachen University. It is a
tool framework with an Eclipse-based IDE that eases
programming of heterogeneous MPSoC architectures. The main
features of MAPS are:
• Compilation framework for multi-core systems:
The MAPS compiler takes as input both sequential C and
a C language extension for describing applications in
the form of process networks. It performs source-to-source
translation to generate target-specific code, leveraging
the existing C compiler technology for the processing
elements. By executing the code on a real or virtual
target platform, the user can quickly evaluate the
quality of the result and, if required, explore further
SW mapping options. The compiler framework is
retargetable, which also minimizes the investment in SW
tools.
• Light-weight C extension for parallel programming:
MAPS is developed in the context of embedded systems,
e.g. wireless and multimedia. Dataflow models such as
KPN (Kahn Process Networks) [3] and SDF (Synchronous
Dataflow) [4] closely match applications in those
domains. CPN (C for Process Networks) [5] is an
easy-to-use C extension used in MAPS that models
concurrent processes and applications as well as legacy
code. Programmers with C experience need little
learning time to start parallel programming.
• Advanced mapping and scheduling: Mapping and
scheduling are critical in MPSoC programming, as they
largely determine system performance. Besides allowing
the spatial and temporal task-to-processor mapping to
be specified manually, MAPS provides means to compute
scheduling and mapping automatically to meet given
constraints. This applies not only to single
applications but also to multi-application scenarios.
• Sequential C partitioning facilities: Manual
partitioning of sequential legacy code is known to be
difficult and error-prone. MAPS provides a set of
profiling and analysis facilities for semi-automatic
program partitioning. Programmers are guided by
parallelization hints that ease the migration path.
• Easy usability through an IDE: MAPS integrates its
features into an Eclipse-based IDE with extensive
graphical visualization of analysis results. It
provides a comfortable programming environment for
multi-core developers.
• Collaboration with other state-of-the-art tools:
MAPS is not a stand-alone solution. It is connected
to many state-of-the-art design tools such as Synopsys
virtual platforms [6]. Combining MAPS with other tools
creates additional synergy for embedded system
developers.
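To make the process-network model behind these features more
concrete, the following is a minimal sketch in plain C. It is
not CPN syntax, which this paper does not show. The sketch
assumes a bounded FIFO channel and two cooperative processes
that a simple round-robin loop steps until neither can make
progress. All names (channel_t, producer_step, scaler_step,
run_network) are hypothetical. Kahn's model specifies blocking
reads on unbounded channels; the bounded channel used here
additionally blocks writers, which is an implementation
choice of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CHANNEL_CAPACITY 4

/* Bounded FIFO channel connecting exactly one writer and one reader. */
typedef struct {
    int data[CHANNEL_CAPACITY];
    size_t head;   /* index of the oldest token */
    size_t count;  /* number of tokens currently stored */
} channel_t;

/* Returns false when the channel is full (the writer has to wait). */
static bool channel_write(channel_t *ch, int token)
{
    if (ch->count == CHANNEL_CAPACITY)
        return false;
    ch->data[(ch->head + ch->count) % CHANNEL_CAPACITY] = token;
    ch->count++;
    return true;
}

/* Returns false when the channel is empty (the reader has to wait). */
static bool channel_read(channel_t *ch, int *token)
{
    if (ch->count == 0)
        return false;
    *token = ch->data[ch->head];
    ch->head = (ch->head + 1) % CHANNEL_CAPACITY;
    ch->count--;
    return true;
}

/* Source process: emits 0, 1, ..., limit-1, one token per successful step. */
typedef struct { int next; int limit; } producer_t;

static bool producer_step(producer_t *p, channel_t *out)
{
    if (p->next >= p->limit)
        return false;                  /* finished */
    if (!channel_write(out, p->next))
        return false;                  /* blocked on a full channel */
    p->next++;
    return true;
}

/* Sink process: doubles each token and accumulates the result. */
typedef struct { long sum; int consumed; } scaler_t;

static bool scaler_step(scaler_t *s, channel_t *in)
{
    int token;
    if (!channel_read(in, &token))
        return false;                  /* blocked on an empty channel */
    s->sum += 2L * token;
    s->consumed++;
    return true;
}

/* Steps both processes round-robin until neither can make progress. */
static long run_network(int n_tokens)
{
    channel_t ch = { {0}, 0, 0 };
    producer_t prod = { 0, n_tokens };
    scaler_t sink = { 0, 0 };
    bool progress = true;

    assert(n_tokens >= 0);
    while (progress) {
        bool p_progress = producer_step(&prod, &ch);
        bool s_progress = scaler_step(&sink, &ch);
        progress = p_progress || s_progress;
    }
    return sink.sum;
}
```

Because the two processes communicate only through the
channel, the accumulated result does not depend on the order
in which the loop steps them. This is the property that makes
such models attractive when tasks are later mapped to
different processing elements.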
3. CASE STUDIES
MAPS, as an MPSoC compiler infrastructure, has been used
in a number of case studies to demonstrate its practicality
and its value in increasing SW productivity for modern
embedded systems. A few examples are presented in this
section.
3.1 TI OMAP
TI’s OMAP3530 [7] is a heterogeneous MPSoC that features
an ARM Cortex-A8 processor and a TI C64x+ DSP (Digital
Signal Processor). Software-wise, Linux is used as the OS
on the ARM side, while a lightweight multitasking operating
system from TI, called DSP/BIOS [8], is used on the DSP
side. OMAP thus exhibits not only a typical heterogeneous
HW set-up; its software tool chains are heterogeneous as
well. For instance, the Linux OS running on the ARM is
maintained by the open-source community, whereas proprietary
TI SW runs on the DSP and provides a DSP/Link layer that
handles inter-processor communication. The compiler tool
chains of the two processors are also separate. All of this
adds up to a high entry barrier for multi-core programmers.
We have retargeted the MAPS compiler towards the OMAP3530
[5]. The inputs to the MAPS compiler are the programs
written in CPN and the mapping description, which specifies
the spatial task-to-processor mapping. Source-to-source
translation is then performed to generate partitioned C code
for the ARM and the DSP, respectively, which can be compiled
by the native C compilers. The results [5] have shown that
the automated compilation greatly increases the productivity
of SW development.
Figure 2: OSIP-based Heterogeneous MPSoC Platform
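The role of the mapping description in the OMAP flow can be
illustrated with a small sketch. It assumes that a spatial
mapping is simply a table from task names to processing
elements and shows how such a table yields one task list per
processor, plus the number of channels that would need
inter-processor communication. The data layout and all
identifiers (task_mapping_t, partition_tasks,
count_cross_pe_channels, the example pipeline) are
hypothetical and do not reflect the actual MAPS mapping
format.

```c
#include <assert.h>
#include <stddef.h>

/* Processing elements of the example platform (names are illustrative). */
typedef enum { PE_ARM, PE_DSP } pe_t;

/* One row of a hypothetical mapping description: task -> processing element. */
typedef struct {
    const char *task;
    pe_t pe;
} task_mapping_t;

/* A channel between two tasks, given as indices into the mapping table. */
typedef struct {
    size_t src;
    size_t dst;
} channel_desc_t;

/* Collects the names of all tasks mapped to `pe`; returns how many were found. */
static size_t partition_tasks(const task_mapping_t *map, size_t n_tasks,
                              pe_t pe, const char **out, size_t out_cap)
{
    size_t n = 0;
    for (size_t i = 0; i < n_tasks; i++) {
        if (map[i].pe == pe) {
            assert(n < out_cap);
            out[n++] = map[i].task;
        }
    }
    return n;
}

/* Counts channels whose two endpoints are mapped to different processors. */
static size_t count_cross_pe_channels(const task_mapping_t *map,
                                      const channel_desc_t *channels,
                                      size_t n_channels)
{
    size_t n = 0;
    for (size_t i = 0; i < n_channels; i++) {
        if (map[channels[i].src].pe != map[channels[i].dst].pe)
            n++;
    }
    return n;
}

/* Example: a four-stage pipeline with the two middle stages on the DSP. */
static const task_mapping_t example_map[] = {
    { "source", PE_ARM },
    { "filter", PE_DSP },
    { "fft",    PE_DSP },
    { "sink",   PE_ARM },
};

static const channel_desc_t example_channels[] = {
    { 0, 1 }, { 1, 2 }, { 2, 3 },
};
```

In this example, two of the three channels cross the ARM/DSP
boundary. A code generator would have to implement those two
on top of the inter-processor communication layer, while the
remaining channel stays local to the DSP.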
3.2 Architecture with HW Task Scheduler
HW-based task scheduling is popular in embedded systems
that have stringent hard real-time constraints. OSIP [9]
is an Application Specific Instruction Set Processor (ASIP)
tailored for task management in MPSoCs. It uses inter-
rupt signals to notify the processors of a scheduling event.
Upon an interrupt, the processors in an MPSoC fetch the
information of the task to be executed from the OSIP’s in-
terface. A lightweight Application Programming Interface
(OSIP-API) is provided with OSIP that enables low latency
task dispatching in a heterogeneous platform.
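The interrupt-driven dispatch described above can be sketched
as a small software model. It assumes a scheduler that keeps
a ready list, picks the highest-priority task on a scheduling
event, places its descriptor in a per-processor interface
slot, and raises that processor's interrupt line; the
processor's handler then fetches the descriptor and runs the
task. The priority policy and every identifier below
(sched_model_t, sched_submit, sched_dispatch, pe_handle_irq)
are hypothetical and are not the actual OSIP-API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TASKS 8
#define NUM_PES   2

typedef void (*task_fn)(int *state);

typedef struct {
    task_fn entry;
    int priority;   /* larger value = more urgent (assumed policy) */
} task_desc_t;

/* Software stand-in for the hardware scheduler and its interface. */
typedef struct {
    task_desc_t ready[MAX_TASKS];
    size_t n_ready;
    task_desc_t assigned[NUM_PES];  /* per-processor interface slot */
    bool irq[NUM_PES];              /* pending interrupt per processor */
} sched_model_t;

/* Adds a task to the ready list; returns false when the list is full. */
static bool sched_submit(sched_model_t *s, task_fn entry, int priority)
{
    if (s->n_ready == MAX_TASKS)
        return false;
    s->ready[s->n_ready].entry = entry;
    s->ready[s->n_ready].priority = priority;
    s->n_ready++;
    return true;
}

/* Scheduling event: assigns the most urgent ready task to `pe` and
 * raises its interrupt. Returns false if nothing can be dispatched. */
static bool sched_dispatch(sched_model_t *s, size_t pe)
{
    size_t best = 0;

    assert(pe < NUM_PES);
    if (s->n_ready == 0 || s->irq[pe])
        return false;
    for (size_t i = 1; i < s->n_ready; i++) {
        if (s->ready[i].priority > s->ready[best].priority)
            best = i;
    }
    s->assigned[pe] = s->ready[best];
    s->ready[best] = s->ready[--s->n_ready];  /* remove from ready list */
    s->irq[pe] = true;
    return true;
}

/* Interrupt handler on a processor: fetches the task descriptor from
 * the interface slot, acknowledges the interrupt, and runs the task. */
static bool pe_handle_irq(sched_model_t *s, size_t pe, int *state)
{
    task_desc_t task;

    assert(pe < NUM_PES);
    if (!s->irq[pe])
        return false;
    task = s->assigned[pe];
    s->irq[pe] = false;
    task.entry(state);
    return true;
}

/* Two illustrative task bodies. */
static void task_add_one(int *state)   { *state += 1; }
static void task_times_ten(int *state) { *state *= 10; }
```

A processor in this model never polls the ready list itself.
It only reacts to its interrupt flag, which mirrors the
division of labour between the scheduler and the processors
described in the text.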
Figure 2 shows an example heterogeneous MPSoC platform
with an OSIP scheduler, modeled in Synopsys Platform
Architect [6]. It includes one ARM926EJ-S processor, two
4-slot VLIW processors and an OSIP, interconnected by an
AMBA AHB bus.
MAPS has been used and extended to support programming of
the OSIP-based platform and to provide debugging support.
The case study [10] has demonstrated that MAPS is flexible
and extensible enough to support complicated heterogeneous
MPSoCs with specialized APIs and configurations. It has also
shown the productivity increase provided by MAPS: the MAPS
compiler allows different configurations to be tested faster
than coding each of them by hand using the OSIP-API. The
debugging facilities greatly simplify application
development cycles as well.
3.3 SDR Design
Mobile terminals nowadays have to support several wireless
communication standards (so-called waveforms), some of which
continue to undergo modifications after product release.
The concept of Software Defined Radio (SDR) has emerged to
provide such flexibility and to reduce silicon costs. SDR
aims at describing waveforms in a high-level language,
freeing the description from complex hardware/software
design decisions and thereby increasing productivity. The
Nucleus Concept [11], illustrated in Figure 3, describes a
component-based methodology in which waveforms are composed
out of library blocks called Nuclei. Nuclei are algorithmic
entities that represent demanding computational kernels
common across different wireless communication standards. A
Nucleus can be implemented in several ways on different
platforms. Each of these implementations is called a flavor
and includes platform-dependent implementation details,
several of which are crucial for the tool flow. The Nucleus
Concept envisions a flow in which a mapper selects the
best-matching flavors to implement a waveform under time and
energy constraints (see Figure 3).
Figure 3: The Nucleus Waveform Development Concept [12]
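The flavor selection described above can be illustrated with
a deliberately simplified sketch. It assumes that the Nuclei
of a waveform run one after another, that each flavor is
characterized by a single execution time and energy figure,
and that the mapper minimizes total energy subject to a
deadline, using exhaustive search with pruning. These
assumptions, the numbers, and all identifiers (flavor_t,
nucleus_t, select_flavors) are hypothetical; the actual
Nucleus Mapper is described in [12].

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NUCLEI  4
#define MAX_FLAVORS 3

/* One implementation of a Nucleus on some processing element. */
typedef struct {
    const char *pe;
    unsigned time_us;
    unsigned energy_uj;
} flavor_t;

typedef struct {
    const char *name;
    size_t n_flavors;
    flavor_t flavors[MAX_FLAVORS];
} nucleus_t;

/* Result of the search: chosen flavor index per Nucleus, if feasible. */
typedef struct {
    int feasible;
    unsigned energy_uj;
    unsigned time_us;
    size_t choice[MAX_NUCLEI];
} selection_t;

/* Depth-first search over all flavor combinations with two pruning rules. */
static void search(const nucleus_t *nuclei, size_t n, size_t idx,
                   unsigned time_us, unsigned energy_uj,
                   unsigned deadline_us, size_t *current, selection_t *best)
{
    if (time_us > deadline_us)
        return;                                   /* deadline violated */
    if (best->feasible && energy_uj >= best->energy_uj)
        return;                                   /* cannot improve */
    if (idx == n) {
        best->feasible = 1;
        best->energy_uj = energy_uj;
        best->time_us = time_us;
        for (size_t i = 0; i < n; i++)
            best->choice[i] = current[i];
        return;
    }
    for (size_t f = 0; f < nuclei[idx].n_flavors; f++) {
        const flavor_t *fl = &nuclei[idx].flavors[f];
        current[idx] = f;
        search(nuclei, n, idx + 1, time_us + fl->time_us,
               energy_uj + fl->energy_uj, deadline_us, current, best);
    }
}

/* Picks one flavor per Nucleus: minimum energy within the deadline. */
static selection_t select_flavors(const nucleus_t *nuclei, size_t n,
                                  unsigned deadline_us)
{
    selection_t best = { 0, 0, 0, {0} };
    size_t current[MAX_NUCLEI] = {0};

    assert(n <= MAX_NUCLEI);
    search(nuclei, n, 0, 0, 0, deadline_us, current, &best);
    return best;
}

/* Example waveform with made-up numbers: the DSP is faster, the ARM cheaper. */
static const nucleus_t example_waveform[] = {
    { "fft",      2, { { "DSP", 40, 90 }, { "ARM", 100, 60 } } },
    { "demapper", 2, { { "DSP", 30, 50 }, { "ARM",  60, 20 } } },
};
```

With a deadline of 120 us, this sketch selects the DSP flavor
for the first Nucleus and the ARM flavor for the second (100
us, 110 uJ). Relaxing the deadline to 200 us lets both run on
the ARM (80 uJ), and a deadline of 50 us is infeasible.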
The Nucleus tool flow, an implementation based on the
Nucleus Concept, is presented in [12]. The flow is built on
top of the MAPS framework and integrated with
state-of-the-art tools for system-level design. The CPN
language of the MAPS framework serves as the Waveform
Description Language (WDL); it allows applications to be
specified following the KPN model. The Nucleus Mapper
selects the best flavors available on a given platform to
implement the waveform. The best options are exported in the
form of simulation models at different levels of
abstraction. For functional verification and debugging,
abstract simulators based on the Virtual Processing Unit
(VPU) technology [13] are generated automatically by the
MAPS compiler. Low-level simulators for timing verification
are set up in the form of virtual platforms with
cycle-accurate processor models. The case study [12] on a
MIMO OFDM system shows that the MAPS-based tooling meets the
goals of the SDR methodology, such as correctness,
portability and efficiency.
4. SUMMARY
This paper presents the MAPS compiler, which aims at tackling
the challenge of MPSoC programming. It is a tool
framework with an Eclipse-based IDE that eases program-
ming of heterogeneous MPSoC architectures, while ensur-
ing optimized system performance. Over several years of
R&D, MAPS has taken the step from basic research to a
tool framework that enables programming of real-life com-
plex MPSoC platforms. We plan to present this new tech-
nology to a wider audience, receive feedback for future en-
hancements, and make MAPS accessible to early adopters
in industry.
5. ACKNOWLEDGMENT
This work has been supported by the UMIC Research Cen-
tre, RWTH Aachen University.
6. REFERENCES
[1] G. Martin, “Overview of the MPSoC Design Challenge,” in
Proceedings of the 43rd ACM/IEEE Design Automation
Conference, pp. 274–279, 2006.
[2] W. Ecker, W. Mueller, and R. Doemer,
Hardware-dependent Software - Principles and Practice,
ch. Hardware-dependent Software - Introduction and
Overview. Springer, 2008.
[3] G. Kahn, “The Semantics of a Simple Language for Parallel
Programming,” in Information Processing ’74: Proceedings
of the IFIP Congress (J. L. Rosenfeld, ed.), pp. 471–475,
New York, NY: North-Holland, 1974.
[4] E. A. Lee and D. G. Messerschmitt, “Synchronous
Data Flow,” in Proceedings of the IEEE, vol. 75, 1987.
[5] W. Sheng, S. Schürmans, M. Odendahl, R. Leupers, and
G. Ascheid, “Automatic Calibration of Streaming
Applications for Software Mapping Exploration,” in
Proceedings of the International Symposium on
System-on-Chip (SoC), 2011.
[6] Synopsys, “System Level Design: Platform Architect and
Processor Designer.”
[7] Texas Instruments, “OMAP35x Product Bulletin.” [Online]
Available http://www.ti.com/lit/sprt457 (accessed
11/2010).
[8] D. Dart, “DSP/BIOS Kernel Technical Overview,” Texas
Instruments Application Report, August 2001.
[9] J. Castrillon et al., “Task Management in MPSoCs: an
ASIP Approach,” in ICCAD ’09: Proc. of the 2009 Intl.
Conf. on Computer-Aided Design, ACM, 2009.
[10] J. Castrillon, A. Shah, L. Murillo, R. Leupers, and
G. Ascheid, “Backend for Virtual Platforms with Hardware
Scheduler in the MAPS Framework,” in Circuits and Systems
(LASCAS), 2011 IEEE Second Latin American
Symposium on, pp. 1–4, Feb. 2011.
[11] V. Ramakrishnan, E. M. Witte, T. Kempf, D. Kammler,
G. Ascheid, H. Meyr, M. Adrat, and M. Antweiler,
“Efficient and Portable SDR Waveform Development: The
Nucleus Concept,” in IEEE Military Communications
Conference (MILCOM 2009), (Boston, USA), Oct 2009.
[12] J. Castrillon, S. Schürmans, A. Stulova, W. Sheng,
T. Kempf, R. Leupers, G. Ascheid, and H. Meyr,
“Component-based waveform development: the Nucleus
tool flow for efficient and portable software defined radio,”
in Analog Integrated Circuits and Signal Processing, June
2011.
[13] T. Kempf, M. Doerper, R. Leupers, G. Ascheid, H. Meyr,
T. Kogel, and B. Vanthournout, “A Modular Simulation
Framework for Spatial and Temporal Task Mapping onto
Multi-Processor SoC Platforms,” in DATE ’05: Proceedings
of the conference on Design, Automation and Test in
Europe, (Washington, DC, USA), pp. 876–881, vol. 2,
IEEE Computer Society, Mar 2005.
