GPU4S: Embedded GPUs in Space by Kosmidis, Leonidas et al.
GPU4S: Embedded GPUs in Space
Leonidas Kosmidis∗, Jérôme Lachaize†, Jaume Abella∗
Olivier Notebaert†, Francisco J. Cazorla∗,‡, David Steenari§
∗Barcelona Supercomputing Center (BSC), Spain
†Airbus Defence and Space, France
‡Spanish National Research Council (IIIA-CSIC), Spain
§European Space Agency, The Netherlands
Abstract—Following the same trend of automotive and avion-
ics, the space domain is witnessing an increase in the on-board
computing performance demands. This rise in performance
needs comes from both control and payload parts of the space-
craft and calls for advanced electronics able to provide high
computational power under the constraints of the harsh space
environment. On the non-technical side, for strategic reasons it is
mandatory to get European independence on the used computing
technology. In this project, which is still in its early phases, we
study the applicability of embedded GPUs in space, which have
shown a dramatic improvement of their performance-per-watt
ratio coming from their proliferation in consumer markets based
on competitive European technology. To that end, we perform
an analysis of the existing space application domains to identify
which software domains can benefit from their use. Moreover,
we survey the embedded GPU domain in order to assess whether
embedded GPUs can provide the required computational power
and identify the challenges which need to be addressed for their
adoption in space. In this paper, we describe the steps to be
followed in the project, as well as the results of our preliminary
analyses in the first months of the project.
I. INTRODUCTION AND BACKGROUND
The space market is in constant search for high performance,
scalable processing solutions to satisfy the increased compu-
tational needs of future missions for increased autonomy and
data processing. While reusing solutions from other domains
can reduce non-recurrent costs, space has its unique set of
constraints. Higher performance is required for
the platform computers, which are in charge of controlling
critical functionalities like the spacecraft power distribution,
navigation and guidance, as well as for the payload comput-
ers (in charge of controlling the payload devices, and pre-
processing acquired payload data before its transmission to
the ground) to process more data.
Graphics processing units (GPUs), initially a special-purpose
type of accelerator for visualisation tasks, have for several
years outperformed Central Processing Units (CPUs) in both
raw performance and energy efficiency. This opens the door to
unprecedented performance at very high energy efficiency for
demanding computations, and has made GPUs essential
for high-performance computing. As a matter of fact, one
quarter (125) of the supercomputers in the recent edition of
the TOP500 / Green500 list (as of June 2019) are based on
GPUs, including the two most powerful supercomputers. Past
studies analysed the applicability of high-performance GPUs
in space [1][2]. Those studies concluded that although their
energy efficiency is high, their power consumption is an order
of magnitude higher than the power budget of a space
system, which is limited to a couple of watts.
Interestingly, GPUs entered the embedded domain to
satisfy the increasing demand for multimedia-based hand-
held and consumer devices such as smartphones, in-vehicle
entertainment systems, televisions, set-top boxes etc. They
were re-designed compared to their high-performance desk-
top/server counterparts to exhibit low-power requirements,
essential for battery-powered and thermally-constrained devices.
Improvements in the transistor technology allowed achieving
impressive performance capabilities that were only possible
in high performance systems of the past decade [3]. Although
the GPU vs CPU performance ratio is lower in the embedded
domain than in the high-performance one due to power and
thermal constraints, mobile GPUs are increasingly considered
for accelerating heavy workloads, for applications ranging
from signal processing, to advanced driving assistance systems
(ADAS) in cars, as well as prototype supercomputers for
exascale [4][5].
Despite their promising characteristics, embedded GPUs
have not been explored for their applicability in the space domain.
This project aims at covering this gap by providing an initial
assessment of existing embedded GPUs, as a first step of their
further exploration and adoption in this domain.
The rest of the paper is organised as follows. Section II
introduces the project, which includes the on-going and future
activities. Section III analyses the performance demand of
current and future space missions. Section IV details the
outcome of work on the analysis of the embedded GPU
market. Finally, Section V describes the future work and
Section VI presents the main conclusions of our work so far.
II. THE PROJECT
The project explores the suitability of embedded GPUs for
space from both software and hardware perspectives. In order
to reach its main goal, the project is organised in 4 main
activities as shown in Figure 1.
Space mission analysis: In this activity, we survey potential
space areas and particular algorithms and applications that
can benefit from the use of GPUs in space, ensuring that
the identified applications are suitable for the GPU execution
© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, 
including reprinting/republishing this material for advertising or promotional purposes,creating new collective works, for resale or redistribution to 
Fig. 1. Main activities. Box width does not represent activity duration
model. This is important because there are fundamental design
differences between GPUs and CPUs which are required to
be taken into account to understand how certain algorithms
used in space, e.g. those with divergent path execution among
different threads or those not performing coalesced memory
accesses, behave differently in embedded GPUs w.r.t. high-
performance GPUs. In particular, certain space algorithms
may not be suitable for embedded GPUs despite their
computationally intensive nature, and might need to be redesigned.
We provide a preliminary list of applications already vali-
dated as GPU-compatible in Section III. Moreover, we define
a preliminary set of criteria that can be used for the selection
of GPU candidates depending on the mission profile.
Embedded GPU analysis: In parallel to the previous activity,
we survey the available commercial off-the-shelf (COTS) hardware
and soft-IP (Intellectual Property), in order to identify
their characteristics with the goal of selecting a candidate
board for evaluation in the next activities. We mainly focus
on European IP, which in turn dominates the market and can
provide complete independence in the European space sector.
In this line, in order to understand the wide range of embedded
GPUs, we perform a taxonomy of existing products.
In addition to the mission profile criteria from the Space
mission analysis, an important aspect for the selection of
a candidate board is the performance and energy-efficiency
characteristics, due to the fundamental architectural differ-
ences between their designs and that of their high-performance
counterparts, which need to be exploited by the software. This
fact has only been superficially explored to date [6], with
many works in the literature overlooking these differences
and treating them as equivalent. Another equally important
aspect is the available software ecosystem that is crucial for
the applicability in space. Moreover, we identify software
applications in the space domain that can benefit from GPUs
and justify their use. Furthermore, it is important to ensure
that these applications and their corresponding algorithms fit
the programming model of GPUs.
Next, we survey the potential market options in order to
select the appropriate candidates for evaluation. We started by
providing a taxonomy of the available GPU options, listing
the characteristics and the differences between products from
different vendors or families of GPUs in each category, namely
GPUs and soft-IP.
The classification and the summary of our preliminary
observations of the survey are presented in Section IV.
Candidate Platform Comparison: From both previous
activities, we will identify promising GPU candidate plat-
forms. As a next step we will perform a thorough comparison
between the selected embedded GPU candidates and existing
and future processing devices for the space domain. This way,
the potential benefits of embedded GPUs can be evaluated,
specifically regarding the high-performance requirements of
future missions, while respecting the power and thermal lim-
itations of the space environment. In order to do this, we
will perform the evaluation by porting representative kernels
derived from existing space algorithms to the GPU. Moreover,
performance and other data from past missions and previous
Application-Specific Integrated Circuit (ASIC) processes will
be used for comparison, by normalising them according to
the current space technology node (65nm). This will allow
the selection of the most appropriate GPU platform for space
from the evaluated candidates.
Next Steps for GPU adoption: As the final step of the project,
we will define the future steps required towards the adoption
of GPUs in space, by identifying current limitations and
proposing appropriate solutions to overcome them. In order
to assess the next steps for the adoption of GPUs in the
space domain with a system integrator view, we will examine
the necessary steps for qualification of COTS systems by
addressing their reliability concerns at system level or the
development of radiation hardened components. In the former
case we will also provide software fault-tolerance solutions
specifically designed for these platforms, in case a COTS
embedded GPU is selected in the previous task. Finally, an
ESA-provided algorithm will be ported to the GPU platform.
III. SPACE APPLICATION SURVEY
In this Section, we examine the space applications domain
regarding the use of embedded GPUs. The performance re-
quirements of space missions are constantly increasing. As an
example of current missions, the Gaia astrometry mission
(devoted to the measurement of the positions of celestial
bodies), launched in 2013 by the European Space Agency as
a follow-up mission to the Hipparcos mission in 1989, has a
hundred times better accuracy and aims to map 1.7 billion
stars, 4 orders of magnitude more stars than its predeces-
sor. Additionally, scientific missions can generate a massive
amount of data, which may be challenging to transmit even on
Earth’s surface. Most on-board scientific instruments, including
the ones used for observation, use arrays of sensors, which
allow for parallel processing. For example, the Alpha Magnetic
Spectrometer (AMS-02) instrument at the International Space
Station (ISS) [7] produces data at a rate of 7 Gigabit/s. For
this reason, it uses an array of 600 CPUs to reduce the amount
of data by 3 orders of magnitude, before transmission [8].
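The scale of this reduction can be made concrete with a short calculation. The sketch below is illustrative only: it uses the 7 Gigabit/s rate, the 600 CPUs and the 3 orders of magnitude quoted above, and it assumes (our assumption, not stated in the source) that the input data is spread evenly over the CPUs. The function name `reduced_rate_bps` is hypothetical.

```python
def reduced_rate_bps(raw_rate_bps: float, orders_of_magnitude: int) -> float:
    """Data rate left after reducing the volume by the given orders of magnitude."""
    return raw_rate_bps / (10 ** orders_of_magnitude)


RAW_RATE_BPS = 7e9       # 7 Gigabit/s produced by the instrument (from the text)
REDUCTION_ORDERS = 3     # reduction by 3 orders of magnitude (from the text)
NUM_CPUS = 600           # size of the CPU array (from the text)

# Rate that remains to be transmitted after on-board reduction.
downlink_bps = reduced_rate_bps(RAW_RATE_BPS, REDUCTION_ORDERS)

# Input rate each CPU would handle, ASSUMING an even split across the array.
per_cpu_input_bps = RAW_RATE_BPS / NUM_CPUS

print(f"after reduction: {downlink_bps / 1e6:.0f} Mbit/s")
print(f"input per CPU:   {per_cpu_input_bps / 1e6:.1f} Mbit/s")
```

Under these assumptions the stream to be transmitted drops from gigabits to a few megabits per second, which is the kind of on-board reduction that parallel processing of sensor arrays makes possible.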
Finally, future space missions such as the Active Debris
Removal (ADR) will need an unparalleled amount of computational
power to achieve new functionalities for autonomous
guidance and navigation control (GNC) based on image
processing and machine learning [9], in order to identify,
approach, rendezvous with and finally remove the target. Similarly,
on-board data processing needs will increase in almost
every kind of next-generation mission [10][11][12]:
• For Science and robotic exploration, high-performance
data acquisition chains will be required to meet the high
data rate generated by instruments (spectrometers, imagers
etc.) as the importance of on-board autonomy and
processing for planetary exploration increases.
• For Earth Observation, the evolution in sensor resolution,
dynamic range and faster readout rates has led to a
dramatic increase of sensor bandwidth and data volume,
creating significant bottlenecks in downlink capacity and
a need of very high on-board data processing capabilities
for both data processing and compression. Moreover, new
technologies such as deformable mirrors require significant
on-board processing. Video from space could also be a
new application, as well as image enhancement based
on the acquisition of several images.
• For next generation launchers, considerably increasing
data rate and on-board processing capabilities can enable
interesting applications, such as increased user telemetry
transmission channels. This includes the transmission of
video data to monitor critical agile launch manoeuvres,
which is not possible with today’s transmission rates of
up to 400 Mbits/s with a compression ratio of 20.
• The navigation equipment for new missions like the
space tug concept required for Active Debris Removal
(ADR) [9] which will be able to change the orbit of
non-cooperative spacecraft, will be based on computer
vision. This type of processing, including localisation and
mapping, is essential and very demanding.
• In the telecommunication domain, future applications
go beyond the classical geosynchronous equatorial orbit
(GEO) based telecommunication systems, and include
missions such as spectrum surveillance or machine to
machine communication from the Low Earth Orbit with
medium or large spacecraft constellations.
• Radar processing is also expected to follow a technologi-
cal evolution similar to and synergistic with the telecom-
munication domain, since both require high-performance
signal processing.
• The launcher domain is the only one in which performance
requirements are not expected to grow. For expendable
launchers, requirements for high performance processing
are not growing very fast since the need for video
transmission during the flight (mainly for post-mortem
analysis) requires a strong data reduction ratio, on the order
of 20. This is currently covered and no fast evolution
of the need is foreseen, remaining in a similar range to
the current Earth Observation requirement. On reusable
launchers, the re-entry/landing phases will require
high-performance, highly available processing devices, especially
for vision based navigation and guidance for landing.
In this study, such an application is considered similar
to the Planetary Approach and Landing already covered
by the exploration domain. The launcher use case will
therefore not be considered further in this study.
• Finally, future missions will require significant autonomy
and agility provided by advanced robotics for in-orbit
operation and space exploration, including landing and
rover navigation. In the long term, it is anticipated that
increasing artificial intelligence will be required on-board
most spacecraft.

TABLE I
EXAMPLE SPACE DOMAINS AND REPRESENTATIVE ALGORITHMS,
SUITABLE FOR GPU ACCELERATION.

Use Case type             Algorithm name   Algorithm description
Science/Image processing  GAIA VPU         Complex Image Processing
                          Euclid NIR       Near-Infra-Red image sensor data processing
Optical Observation       GO3S             High resolution image processing algorithm
Compression               ESA CCSDS 122    Image data compression
                          ADS-B – OCE      Automatic dependent surveillance – broadcast (ADS–B) algorithm - surveillance technology in which an aircraft determines
                                           Object detection and classification

The emergence of embedded GPUs with high performance
and low power consumption presents a real opportunity to
drastically increase the on-board processing capability and
satisfy the future space applications’ performance needs.
A first set of applications fitting in the above-identified
areas has been selected and is presented in Table I. Several
of these computationally demanding algorithms have already
been selected for high performance data processing in cur-
rent and future missions. Our analysis indicates that all the
representative algorithms in the table could be appropriate for
GPU acceleration. This is based on the fact that most GPU
microarchitectures suffer a significant performance penalty
with software exhibiting irregular behaviour, e.g. not accessing
consecutive memory positions (memory coalescing) or taking
different paths (branch divergence). A first study of those
applications shows that they are free of path divergence, while
their memory accesses are regular.
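The two kinds of irregularity mentioned above can be illustrated with a deliberately simplified model. The sketch below is not a model of any particular GPU: the group size of 32 threads and the 128-byte memory segment are assumptions chosen for illustration, and embedded GPUs may use different values. The function names are hypothetical.

```python
WARP_SIZE = 32        # ASSUMED number of threads executing in lockstep
SEGMENT_BYTES = 128   # ASSUMED size of one memory transaction


def memory_transactions(byte_addresses, segment_bytes=SEGMENT_BYTES):
    """Count the distinct memory segments touched by one group of threads."""
    return len({addr // segment_bytes for addr in byte_addresses})


def serialized_passes(branch_choices):
    """Count the passes needed if each distinct branch path is executed in turn."""
    return len(set(branch_choices))


# Regular access: thread i reads the i-th consecutive 4-byte element.
coalesced = [4 * i for i in range(WARP_SIZE)]
# Irregular access: neighbouring threads read elements 128 bytes apart.
strided = [128 * i for i in range(WARP_SIZE)]

print(memory_transactions(coalesced))  # 1
print(memory_transactions(strided))    # 32

# Uniform control flow: every thread takes the same path.
uniform = [True] * WARP_SIZE
# Divergent control flow: even and odd threads take different paths.
divergent = [i % 2 == 0 for i in range(WARP_SIZE)]

print(serialized_passes(uniform))    # 1
print(serialized_passes(divergent))  # 2
```

In this model, consecutive accesses are served by a single transaction while the strided pattern needs one transaction per thread, and a two-way divergent branch doubles the number of passes. Algorithms with regular accesses and uniform control flow avoid both penalties.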
The knowledge already acquired from using those existing
and operational software extracts in previous and current
parallel studies provides both a good starting point for this
GPU work and, potentially, an existing baseline for performance
and porting effort comparisons. In order to complete
the survey and to cover as many potential domains
as possible, an internal meeting with all Airbus programs
will be organised to identify further algorithms that could be
interesting for GPU implementation.
The space application domain survey also includes the
definition of a set of criteria for the selected GPU candidates
based on the specific details of particular application domains
and missions. For example, Low Earth Orbit (LEO) missions
have less stringent availability requirements, which allow the
use of COTS components, especially if they have reliability
features, as do products that are already qualified for use in
industrial or safety critical domains such as automotive.
Some of the criteria we have identified so far coming from
the mission profile include: the availability of the
technology/equipment in a form usable in space, the feasibility
of implementing the algorithms, the power and thermal constraints
of the mission, the I/O interfaces, the non-recurring development cost for
radiation testing and the development of mitigation techniques,
the recurring cost of the equipment given the number
of pieces, and the flexibility to use different algorithms at
different steps of the mission timeline and to update the
algorithms for correction or extension.
IV. GPU TAXONOMY
In this Section, we present a taxonomy of mobile GPUs,
summarised in Fig. 2, in order to better explain the various
options of potential candidates for the space domain.
At the top level we classify GPUs as either embedded or
high-performance and focus just on the former category, since
Fig. 2. GPU Taxonomy
The first difference we observed in the mobile GPU market
is that, unlike the desktop GPU market which is dominated
by 3 major players (NVIDIA, AMD and to a lesser degree
Intel), the embedded GPU market features several big GPU
IP providers. Most of them are European (ARM, Imagination
Technologies) or started as European (Adreno was initially
designed under the Imageon brand name by the Finnish
company BitBoys, now Qualcomm Finland, after acquisition
by ATI and AMD). Similarly, Broadcom’s VideoCore, used
in the low-cost European educational computer Raspberry Pi,
was developed by the UK-based Alphamosaic before it
was purchased by Broadcom. In addition, ThinkSilicon, which
specialises in GPUs for IoT devices, is based in Greece.
Another key element in our work is that based on previous
works [6][13][14] we were able to extend the survey beyond
the obvious choice of OpenCL programmable devices (right
part in Fig. 2), which allows better coverage of the available
embedded GPU market and provides a wider spectrum for
the appropriate selection of embedded GPUs. Note that in
order to complete the survey, especially regarding the potential
embedded GPU IP acquisition to make a rad-hard chip, it is
necessary to contact the various vendors, which requires a long
administrative process that is currently under way.
Mobile GPUs can be first divided into two broad categories,
COTS GPUs and FPGA implementations. In the former cate-
gory we find mobile GPUs which are implemented as ASICs
(Application-Specific Integrated Circuits), typically as part of
a larger SoC (System-On-Chip) which usually contains CPUs
and other devices e.g. DSPs, embedded memory etc. In the
latter category, FPGA implementations allow reconfiguration
of the circuitry in order to facilitate hardware development,
especially when a product is not fully defined or additional
requirements are introduced at a later time. Both categories
are interesting for the space domain.
A. COTS GPUs
COTS designs provide small recurring costs and higher per-
formance compared to FPGA implementations using the same
design technology (often specified in nm). COTS components
are manufactured in advanced state-of-the-art silicon technolo-
gies like 10nm FinFET since they target widely used consumer
devices in order to benefit from the higher performance
and power characteristics of those processes. However, the
reliability in those markets regarding transient and permanent
faults has not been addressed, since these products are not
used in critical systems and they are expected to be replaced
voluntarily by the users every 2-3 years with later products
with more advanced capabilities. Therefore, additional effort
is required in order to shield those components from radiation
effects either in hardware or in software.
The COTS GPU category can be further subdivided into
Low-End and High-End products.
A.1. Low-end GPUs
Low-end GPUs support only graphics APIs such as OpenGL
ES 2 and have a large share of the mobile market [15][16].
These products feature a simpler architectural design, which
is incapable of supporting OpenCL but results in lower power
consumption. The most prominent example is ARM’s Mali-4XX,
probably the most licensed embedded GPU to date,
which has 20% of the smartphone market share [17], without
including in this market portion other massively produced
consumer devices such as single-board computers, set-top
boxes, TV sets and automotive systems, as well as FPGAs such as the
Zynq UltraScale+. An advantage of such a widely used
older device is the maturity of technology and its associated
development tools, which provide evidence from use and
documentation of their known problems, if not already fixed
in later hardware/driver revisions.
Other products in this category from different vendors
are Qualcomm’s Adreno 2xx, Broadcom’s VideoCore IV,
Imagination Technologies’ PowerVR SGX and ThinkSilicon’s
NemaPico and NemaTiny. Although OpenCL support is not
provided on low-end GPUs, and cannot be implemented with-
out low-level access to the GPU design, efficient solutions for
general purpose computations on them exist [13][14].
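One widely used way of running general purpose computations through a graphics-only API such as OpenGL ES 2 is to store data in the 8-bit colour channels of textures. The sketch below illustrates only that host-side encoding step. It is an assumption made for illustration and is not taken from [13][14]; the function names and the byte order (lowest byte in the red channel) are hypothetical choices.

```python
def pack_u32_to_rgba(value: int) -> tuple:
    """Split a 32-bit unsigned integer into four 8-bit channels (R = lowest byte)."""
    if not 0 <= value < 2 ** 32:
        raise ValueError("value must fit in 32 bits")
    return tuple((value >> (8 * i)) & 0xFF for i in range(4))


def unpack_rgba_to_u32(rgba) -> int:
    """Rebuild the 32-bit unsigned integer from its four 8-bit channels."""
    return sum(channel << (8 * i) for i, channel in enumerate(rgba))


texel = pack_u32_to_rgba(0x12345678)
print(texel)                            # (120, 86, 52, 18)
print(hex(unpack_rgba_to_u32(texel)))   # 0x12345678
```

The same round trip has to be mirrored inside the shader program, which is part of the extra effort that general purpose use of graphics-only GPUs entails.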
A potential issue with the adoption in space of both low and
high-end embedded GPUs, is the non-disclosed architectural
design of almost all embedded GPUs, which prevents getting
the required full control over the execution and observability
for space qualification. The vendors provide only high-level
information about the implementation and all the development
tools are closed source. The only GPU with open specification
so far is the VideoCore IV from Broadcom, however the
limited available development tools from the open source
community work at assembly level. This translates to low
productivity and high complexity, while there is no debugging
or profiling method available.
A.2. High-end GPUs
GPUs in the high-end of the embedded spectrum support
both graphics and compute APIs, such as OpenCL. Such architectures
are ARM’s Mali T6xx-T8xx and G7X families, Imagination’s
latest SGX and Rogue families, Qualcomm’s Adreno
3xx and above, ThinkSilicon’s NemaSmall and Vivante’s latest
GC series. Although those architectures can theoretically support
an OpenCL runtime, GPU vendors do not always provide a
driver. In fact, OpenCL is usually only available to developers
for use with certain development kits, while recently Google
dropped its use in Android. As a consequence, GPU vendors
are less keen to develop, release or support OpenCL drivers
for mobile GPUs, unless there is an explicit interest from
large companies such as Samsung or Sony for certain GPU
models in their flagship products, or from other domains such
as supercomputing. Therefore, the selection of a high-end GPU
has to involve a deeper analysis than a simple analysis of
vendors’ product sheets.
Since most of the embedded GPUs target mainly consumer
markets, they do not explicitly address safety requirements.
The exceptions in this case are Imagination’s PowerVR
6XT (GX6650) GPU, found in ASIL-B certified automotive
platforms such as the Renesas R-Car H3, and Imagination’s
latest GPU Series Furian and NVIDIA’s Xavier, which are
designed targeting ASIL-D certification, the highest assurance
level in automotive systems.
B. FPGA Solutions
FPGA solutions offer lower performance but the underlying
COTS FPGA device can be radiation hardened and qualified
for space use, as is the case for Xilinx’s high-density
single-event immune V5QV. Moreover, studies of new silicon technologies
have shown better reliability than the current technology used
in space (65 nm), increasing the ambitions for use in space.
The Xilinx report on Failures in Time (FIT) of Xilinx
FPGAs [18] shows an improvement with newer technologies, which
is to be confirmed with the next FPGA generation. Such solutions
can provide additional advantages for long interplanetary
missions. In particular, the FPGA configurability can reduce
the time-to-launch of a multi-year mission, even in the case
that the desired hardware accelerator is not fully developed,
debugged or tested. Or, in the case that a new more effective
image compression algorithm is invented, a hardware acceler-
ator can be reprogrammed to support it, thus reducing the data
size and therefore the time for downlink communications. For
the above reasons, both potential solutions have their unique
advantages, and therefore they are considered in our survey.
This group can be further divided into two categories:
B.1. Soft GPU cores
Soft embedded GPU cores are implemented in RTL (Reg-
ister Transfer Level) using a hardware description language
such as Verilog or VHDL. The design is then synthesised
on the FPGA, where it can be used transparently to the
software either using graphics or compute APIs. Some GPU
designer firms explicitly offer evaluation solutions for certain
FPGA devices such as ThinkSilicon with their high-end
NEMA GPU and low-end Think2.5D products for Xilinx’s
Zynq platforms. However, the feedback we have received from
all commercial GPU IP vendors so far is that most of the
high-end embedded GPUs exceed the capacity of existing
FPGAs and only reduced configurations of such designs can fit
on very expensive (∼50K euros) FPGA development boards.
In addition, according to the vendors the achieved target
frequencies of such designs on the FPGA are very low and in
conjunction with the reduced configuration, they result in very
low performance compared to their ASIC implementation.
Therefore, we do not suggest using commercial GPU IP cores
for FPGA implementation but only for rad-hard ASICs.
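The combined effect of a reduced configuration and a low clock frequency can be estimated to first order by assuming that peak throughput scales with the number of shader cores multiplied by the clock frequency. The sketch below applies that assumption. The core counts and frequencies are hypothetical placeholders, not vendor data, and the function name is our own.

```python
def relative_peak_throughput(cores: int, freq_mhz: float) -> float:
    """First-order proxy for peak throughput: cores times clock frequency."""
    return cores * freq_mhz


# HYPOTHETICAL full configuration implemented as an ASIC.
asic = relative_peak_throughput(cores=4, freq_mhz=600.0)
# HYPOTHETICAL reduced configuration of the same design on an FPGA.
fpga = relative_peak_throughput(cores=1, freq_mhz=50.0)

slowdown = asic / fpga
print(f"FPGA prototype is about {slowdown:.0f}x slower")  # about 48x slower
```

Because the two factors multiply, even a moderate reduction in each one results in a large overall performance gap between the FPGA and ASIC implementations.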
Open source research-oriented soft-GPU cores are also
available, such as MIAOW [19] and FlexGrip [20], which
implement AMD-like and NVIDIA-like GPU microarchitectures respectively,
or FGPU [21], which does not resemble any commercial
GPU. However, these cores do not have a low-power GPU
microarchitecture (such as tile-based and deferred rendering
architectures [6]), only implement a subset of the instruction
sets (typically limited to integer instructions) and support
only compute APIs, not graphics. Moreover, the
projects come with limited-functionality development tools,
and without debugging or profiling capabilities, while most
of them are not maintained any more and therefore lack any
type of support. Besides these problems, the most important
roadblock for using these designs in space is their licensing
conditions. Since most of them are GPL-licensed,
using them in a commercial setup with proprietary hardware
IP blocks such as SpaceWire would require the release of the
RTL code of the entire platform. For this reason, open source
GPU designs are not suitable for this domain.
B.2. High-Level Synthesis
Modern FPGAs also support OpenCL using High-Level
Synthesis (HLS). Such products translate OpenCL to custom
circuits, which are configured for execution on the FPGA
fabric. Although this is not a GPU solution per se, this solution
is based on OpenCL, which provides the same software
interface as a high-end embedded GPU or soft-GPU core.
Such products are available from both Xilinx and Altera,
including Intel’s recent HARP prototypes, which integrate both
a CPU and an FPGA in the same chip. The reconfiguration
of this solution provides the advantage that the hardware
resources of the FPGA have the potential to be utilised more
effectively among various algorithms compared to a fixed-
design soft-GPU solution. However, FPGA reconfiguration
(flashing) takes much longer than executing different kernels
on a GPU, and although FPGAs are currently used in space
missions, this functionality has never been used before.
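The trade-off between better resource utilisation and a long reconfiguration time can be expressed as a break-even point: the number of kernel runs between two algorithm switches above which a device with faster runs but costlier switches pays off. The sketch below is a simple illustration under stated assumptions. All timing values are hypothetical, including the assumption that the FPGA circuit executes each run faster than the GPU.

```python
# HYPOTHETICAL timings in seconds, chosen only for illustration.
GPU_RUN_S, GPU_SWITCH_S = 0.010, 0.0001   # switching = launching another kernel
FPGA_RUN_S, FPGA_SWITCH_S = 0.005, 1.0    # switching = reconfiguring the fabric


def time_per_phase_s(runs: int, run_s: float, switch_s: float) -> float:
    """Time for one switch to an algorithm followed by `runs` executions of it."""
    return switch_s + runs * run_s


def break_even_runs(gpu_run_s, gpu_switch_s, fpga_run_s, fpga_switch_s) -> float:
    """Runs per switch at which both devices take the same time."""
    if fpga_run_s >= gpu_run_s:
        return float("inf")   # the FPGA never catches up
    return (fpga_switch_s - gpu_switch_s) / (gpu_run_s - fpga_run_s)


n = break_even_runs(GPU_RUN_S, GPU_SWITCH_S, FPGA_RUN_S, FPGA_SWITCH_S)
print(f"break-even at about {n:.0f} runs per reconfiguration")
```

With these placeholder values, the reconfigurable solution only pays off when an algorithm is executed a few hundred times before the next switch; missions that alternate frequently between algorithms would favour the GPU.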
High-level synthesis in OpenCL significantly reduces the
development and debugging effort compared to a hardware
description language. However, the efficiency of the circuitry
generated by high-level synthesis has to be evaluated.
Moreover, existing space qualified FPGAs such as Xilinx’s
V5QV do not support high-level synthesis. Finally, our analysis
indicates that the existing HLS tools are unable to take
the same OpenCL code and run it unmodified on an FPGA,
either because they require additional glue code between the
host and accelerator side or because the kernel code has to be
heavily modified and annotated so that the generated hardware
is efficient, to a degree that it will have nothing in common
with the original OpenCL code. For this reason, we have
concluded that investigating this path further is beyond the
scope of our project, which is focused only on embedded GPUs,
and is better suited to a future project related to ASIC
hardware design for space.
C. GPU Survey Summary
Based on the 4 categories of embedded GPUs which we
have identified in our taxonomy, our analysis indicates that the
FPGA path should not be pursued further for implementing
either commercial GPU designs or open source GPU designs.
HLS has potential, but it has been deemed out of the
scope of this project.
Therefore only commercial embedded GPU devices are
considered for the next project phases, either in their COTS
SoC implementations or in IP for potential use in future rad-
hard ASICs. Both high-end and low-end products can be used,
although they offer different trade-offs in performance, energy
efficiency, programming interfaces, maturity, development and
debugging tools, open specification and functional safety.
Based on the additional vendor information we will collect
in the near future and the importance of each of the trade-offs,
we will select the most appropriate embedded GPU candidates
for evaluation and comparison with existing on-board devices.
V. CURRENT AND FUTURE PROJECT WORK
Next, we will perform the experimental evaluation of var-
ious embedded GPUs from the ones mentioned earlier. The
experimental evaluation involves measuring the performance
and the power efficiency. We are currently finishing porting
algorithms from the representative space application list to
the target GPUs and expect to have experimental results by
the conference date to include in our presentation. In order to
cope with the multiple categories of GPUs and their different
programming APIs, we envision the use of Brook Auto [14]
which can support multiple back-ends and can facilitate the
certification/qualification of the GPU software.
For the comparison, we will establish a set of trade-off
criteria, based not only on performance efficiency, but also
on other properties that affect the adoption of GPUs in space,
such as the availability and the capabilities of development
and profiling tools, the radiation tolerance and the total cost
of the hardware and qualification, to name a few. All results
will be normalised to the current space silicon technology
(65 nm) and will be compared with existing on-board devices.
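As an illustration of how such trade-off criteria could be combined, the following minimal sketch ranks candidates by a weighted average of per-criterion scores. It is not the project's selection method: the criterion names, the weights, the candidate names (gpu_a, gpu_b) and all score values are hypothetical placeholders, not measurements, and the scores are assumed to be already rescaled to [0, 1] with higher being better. The normalisation to the 65 nm technology is a separate step and is not modelled here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    name: str
    # criterion name -> score in [0, 1], higher is better (assumed pre-scaled)
    scores: Dict[str, float]


def weighted_score(candidate: Candidate, weights: Dict[str, float]) -> float:
    """Return the weighted average of the candidate's per-criterion scores."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(w * candidate.scores[c] for c, w in weights.items())
    return weighted_sum / total_weight


def rank(candidates: List[Candidate], weights: Dict[str, float]) -> List[Candidate]:
    """Sort candidates from the highest to the lowest weighted score."""
    return sorted(candidates, key=lambda c: weighted_score(c, weights), reverse=True)


# Hypothetical weights: placeholders chosen only for illustration.
WEIGHTS = {
    "performance_per_watt": 0.4,
    "tooling": 0.2,
    "radiation_tolerance": 0.3,
    "cost": 0.1,
}

# Hypothetical candidates with placeholder scores (not measured data).
CANDIDATES = [
    Candidate("gpu_a", {"performance_per_watt": 0.9, "tooling": 0.5,
                        "radiation_tolerance": 0.3, "cost": 0.6}),
    Candidate("gpu_b", {"performance_per_watt": 0.6, "tooling": 0.8,
                        "radiation_tolerance": 0.7, "cost": 0.4}),
]

if __name__ == "__main__":
    for c in rank(CANDIDATES, WEIGHTS):
        print(f"{c.name}: {weighted_score(c, WEIGHTS):.2f}")
```

A weighted average keeps the relative importance of each criterion explicit and easy to revise once vendor information and measurements become available; other multi-criteria methods could be substituted without changing the structure.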
The comparison will result in the selection of the most
appropriate GPU target, which will be used to implement a
demonstrator with an ESA space application. Moreover, if
the most appropriate device is a COTS one, a software
reliability plan will be proposed. Finally, the study will
conclude with the definition of a roadmap for the adoption of
this target in the space domain.
VI. SUMMARY
In this paper we described the goals and the preliminary
results of the GPU4S project, which studies the applicability
of embedded GPUs in the space domain.
From our early survey of space applications and
domains, we have so far identified that GPUs are appropriate
for a wide range of algorithms from several domains, such as
vision-based navigation, image processing, neural network
processing and signal processing. Depending on
the particular mission, different characteristics, such as
reliability requirements and thermal constraints, need to be taken
into account when selecting the candidate GPUs.
Regarding the embedded GPU domain, we have seen that
several embedded GPU IP options are potential candidates,
most of which are European. Each one offers different
trade-offs, which have to be further evaluated in the next steps of the
project, once all information is obtained, in order to perform
the final selection of the platforms that will be evaluated
experimentally. However, our analysis indicates that high-level
synthesis is not appropriate for further exploration within this
project, and that soft GPU IP solutions on FPGAs are not
appropriate for further exploration at all.
ACKNOWLEDGMENTS
This work has received funding from the European
Space Agency (ESA) under the GPU4S (GPU for Space)
project, in response to the ESA ITT AO/1-9010/17/NL/AF
tender titled “Low Power GPU Solutions For High
Performance On-Board Data Processing”, and from the
European Research Council (ERC) under the European Union’s
Horizon 2020 research and innovation programme (grant
agreement No. 772773). This work has also been partially
supported by the Spanish Ministry of Economy and
Competitiveness (MINECO) under grant TIN2015-65316-P and
the HiPEAC Network of Excellence. MINECO partially
supported Leonidas Kosmidis under a Juan de la Cierva Formación
postdoctoral fellowship (FJCI-2017-34095) and Jaume Abella
under a Ramón y Cajal postdoctoral fellowship
(RYC-2013-14717).
REFERENCES
[1] D. Gonzalez-Arjona and G. Furano, “High-Performance Avionics Solu-
tion for Advanced and Complex GNC Systems for ADR (HIPNOS),”
TEC-ED & TEC-SW Final Presentation Days at ESA, December 2017.
[2] G. Lentaris, K. Maragos, I. Stratakos, L. Papadopoulos, O. Papaniko-
laou, D. Soudris, M. Lourakis, X. Zabulis, D. Gonzalez-Arjona, and
G. Furano, “High-Performance Embedded Computing in Space: Evalu-
ation of Platforms for Vision-Based Navigation,” Journal of Aerospace
Information Systems, vol. 15, no. 4, pp. 178–192, February 2018.
[3] “Bringing Console Quality Lighting to Mobile (Presented by Imagi-
nation Technologies), Game Developer Conference 2014,” http://www.
gdcvault.com/play/1020691/Bringing-Console-Quality-Lighting-to,
Last accessed: 24-06-2019.
[4] N. Rajovic, P. M. Carpenter, I. Gelado, N. Puzovic, A. Ramirez, and
M. Valero, “Supercomputing with commodity CPUs: Are mobile SoCs
ready for HPC?” in SC ’13: Proceedings of the International Conference
on High Performance Computing, Networking, Storage and Analysis,
Nov 2013, pp. 1–12.
[5] J. A. Ross, D. A. Richie, S. J. Park, D. R. Shires, and L. L. Pollock,
“A case study of OpenCL on an Android mobile GPU,” in 2014 IEEE
High Performance Extreme Computing Conference (HPEC), Sep. 2014,
pp. 1–6.
[6] M. M. Trompouki and L. Kosmidis, “Optimisation Opportunities and
Evaluation for GPGPU applications on Low-End Mobile GPUs,” in
Design, Automation Test in Europe Conference Exhibition (DATE), 2017.
[7] A. Behcet, “Alpha Magnetic Spectrometer (AMS02) experiment on the
International Space Station (ISS),” Nuclear Science and Techniques,
vol. 14, pp. 182–194, August 2003.
[8] NASA, “Alpha Magnetic Spectrometer - 02,” https://ams.nasa.gov, Last
accessed: 24-06-2019.
[9] S. Kawamoto, Y. Ohkawa, H. Okamoto, K. Iki, T. Okumura,
Y. Katayama, M. Hayashi, Y. Horikawa, H. Kato, N. Murakami,
T. Yamamoto, K. Inoue, and M. Ohnishi, “Current Status of Research
and Development on Active Debris Removal at JAXA,” 7th European
Conference on Space Debris (SDC7), 2017.
[10] O. Notebaert, J. Franklin, V. Lefftz, J. Moreno, M. Patte, M. Syed, and
A. Wagner, “Way Forward for High Performance Payload Processing
Development,” in Data Systems in Aerospace (DASIA), 2012.
[11] M. Patte and O. Notebaert, “Enabling Technologies for Efficient High
Performance Processing in Space Applications,” in European Conference
for Aeronautics and Space Sciences (EUCASS), 2015.
[12] B. Glass and R. Jansen, Eds., International Workshop on Analogue
and Mixed-Signal Integrated Circuits for Space Applications (AMICSA)
2016, Gothenburg, Sweden. ESA, 2016.
[13] M. M. Trompouki and L. Kosmidis, “Towards General Purpose Com-
putations on Low-end Mobile GPUs,” in Design, Automation Test in
Europe Conference Exhibition (DATE), 2016.
[14] M. M. Trompouki and L. Kosmidis, “Brook Auto: High-Level
Certification-Friendly Programming for GPU-powered Automotive Sys-
tems,” in Proceedings of the 55th Annual Design Automation Conference
(DAC), 2018.
[15] Khronos, “OpenGL ES Overview,” 2018, http://www.khronos.org/
opengles, Last accessed: 24-06-2019.
[16] Google Developers, “OpenGL ES Version,” 2018, https://developer.
android.com/about/dashboards/index.html, Last accessed: 24-06-2019.
[17] ARM, “Mali-400,” 2018, https://developer.arm.com/products/
graphics-and-multimedia/mali-gpus/mali-400-gpu, Last accessed:
24-06-2019.
[18] Xilinx, “Device Reliability Report- Xilinx UG116 (v10.5.2).”
[19] R. Balasubramanian, V. Gangadhar, Z. Guo, C.-H. Ho, C. Joseph,
J. Menon, M. P. Drumond, R. Paul, S. Prasad, P. Valathol, and K. Sankar-
alingam, “Enabling GPGPU Low-Level Hardware Explorations with
MIAOW: An Open-Source RTL Implementation of a GPGPU,” ACM
Trans. Archit. Code Optim., vol. 12, no. 2, Jun. 2015.
[20] K. Andryc, M. Merchant, and R. Tessier, “FlexGrip: A soft GPGPU for
FPGAs,” in International Conference on Field-Programmable Technol-
ogy (FPT), 2013.
[21] M. Al Kadi, B. Janssen, and M. Huebner, “FGPU: An SIMT-
Architecture for FPGAs,” in Proceedings of the 2016 ACM/SIGDA
International Symposium on Field-Programmable Gate Arrays (FPGA),
2016.
