Efficient compilation of .NET programs for embedded systems
The overhead associated with object-oriented languages has been the major drawback in their adoption by the embedded world. In this paper, we propose a compilation approach based on the closed-world assumption (CWA) that should enable OO technologies such as .NET on small embedded systems. Our implementation is based on a type analysis algorithm, which extends RTA so that it eliminates some subtype tests due to array covariance, and on coloring, which maintains single-subtyping invariants under the CWA. The impact of our global optimizations has been evaluated on embedded applications written in C#. Preliminary results show a noticeable reduction in code size, class-hierarchy size, and object mechanisms such as virtual calls and subtype tests.
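To make the type-analysis step concrete, here is a minimal sketch of closed-world Rapid Type Analysis (RTA), the algorithm the paper extends. The class names, method tables, and call sites are invented for illustration; this is not the paper's implementation.

```python
# Minimal closed-world RTA sketch (illustrative, not the paper's code).
# Class hierarchy: subclass -> superclass.
HIERARCHY = {"Circle": "Shape", "Square": "Shape", "Shape": None}

# For each method body: which classes it instantiates, and which
# virtual calls it makes, as (receiver_static_type, method_name) pairs.
METHODS = {
    "main":        {"new": {"Circle"}, "calls": {("Shape", "area")}},
    "Circle.area": {"new": set(),      "calls": set()},
    "Square.area": {"new": set(),      "calls": set()},
}

def subclasses_of(cls):
    """cls together with its direct subclasses (flat hierarchy here)."""
    return {c for c, p in HIERARCHY.items() if c == cls or p == cls}

def rta(entry="main"):
    """Fixed-point: collect instantiated classes and live methods."""
    live, instantiated = {entry}, set()
    changed = True
    while changed:
        changed = False
        for m in list(live):
            info = METHODS[m]
            for c in info["new"]:
                if c not in instantiated:
                    instantiated.add(c)
                    changed = True
            for recv, name in info["calls"]:
                # A virtual call can only dispatch to methods of
                # classes that are actually instantiated by live code.
                for c in subclasses_of(recv) & instantiated:
                    target = f"{c}.{name}"
                    if target in METHODS and target not in live:
                        live.add(target)
                        changed = True
    return live, instantiated

live, inst = rta()
# Only Circle is ever instantiated, so the Shape.area call has a single
# possible target: the virtual call can be devirtualized, and
# Square.area is dead code -- the kind of global optimization the
# closed-world assumption makes possible.
```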
In-place Graph Rewriting with Interaction Nets
An algorithm is in-place, or runs in situ, when it needs no additional memory to execute beyond a small constant amount. Many algorithms owe their efficiency to this property, so it is an important aspect of an algorithm. In most programming languages, it is not obvious when an algorithm can run in-place, and moreover it is often unclear whether the implementation respects that idea. In this paper we study interaction nets as a formalism where we can see directly, visually, that an algorithm is in-place, and where the implementation will respect that it is in-place. Not all algorithms can run in-place, however. We can nevertheless still use the same language, but now annotate the parts of the algorithm that can run in-place. We suggest an annotation for rules, and give an algorithm to find it automatically through analysis of the interaction rules.
Comment: In Proceedings TERMGRAPH 2016, arXiv:1609.0301
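To make the notion concrete, here is a textbook in-place algorithm, written in Python only for illustration (the paper's own formalism is interaction nets, not a conventional language): reversing an array using O(1) auxiliary memory.

```python
def reverse_in_place(xs):
    """Reverse the list xs using only a constant amount of extra space."""
    i, j = 0, len(xs) - 1
    while i < j:
        xs[i], xs[j] = xs[j], xs[i]  # swap the two ends, move inward
        i += 1
        j -= 1
    return xs  # same list object, mutated in place
```

In Python, nothing about this definition tells the reader it runs in-place; one must inspect the body. The abstract's point is that in interaction nets the property can be read off the rules themselves, roughly because a rewrite rule is in-place when its right-hand side uses no more agent nodes than its left-hand side.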
Compiler-assisted Adaptive Program Scheduling in big.LITTLE Systems
Energy-aware architectures provide applications with a mix of low (LITTLE)
and high (big) frequency cores. Choosing the best hardware configuration for a
program running on such an architecture is difficult, because program parts
benefit differently from the same hardware configuration. State-of-the-art
techniques to solve this problem adapt the program's execution to dynamic
characteristics of the runtime environment, such as energy consumption and
throughput. We claim that these purely dynamic techniques can be improved if
they are aware of the program's syntactic structure. To support this claim, we
show how to use the compiler to partition source code into program phases:
regions whose syntactic characteristics lead to similar runtime behavior. We
use reinforcement learning to map pairs formed by a program phase and a
hardware state to the configuration that best fits this setup. To demonstrate
the effectiveness of our ideas, we have implemented the Astro system. Astro
uses Q-learning to associate syntactic features of programs with hardware
configurations. As a proof of concept, we provide evidence that Astro
outperforms GTS, the ARM-based Linux scheduler tailored for heterogeneous
architectures, on the parallel benchmarks from Rodinia and Parsec.
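The mapping Astro learns can be sketched as tabular Q-learning over (program phase, hardware state) pairs. Everything below is invented for illustration: the phase names, the hardware states, the configurations, and the toy reward standing in for measured performance; the update is simplified to a contextual bandit (no bootstrapping), which is enough to learn a phase-to-configuration mapping.

```python
import random

# Hypothetical state/action spaces -- not Astro's actual ones.
PHASES  = ["compute", "memory"]           # compiler-identified phases
STATES  = ["hot", "cool"]                 # simplified hardware state
CONFIGS = ["4big", "2big2little", "4little"]

ALPHA, EPS = 0.5, 0.1
Q = {(p, s): {c: 0.0 for c in CONFIGS} for p in PHASES for s in STATES}

def reward(phase, config):
    # Toy stand-in for measured throughput-per-watt: compute-bound
    # phases prefer big cores, memory-bound phases prefer LITTLE ones.
    best = "4big" if phase == "compute" else "4little"
    return 1.0 if config == best else 0.0

def choose(phase, state):
    if random.random() < EPS:                  # explore occasionally
        return random.choice(CONFIGS)
    q = Q[(phase, state)]
    return max(q, key=q.get)                   # otherwise exploit

def train(episodes=3000):
    for _ in range(episodes):
        p, s = random.choice(PHASES), random.choice(STATES)
        c = choose(p, s)
        r = reward(p, c)
        # Bandit-style Q-update: move the estimate toward the reward.
        Q[(p, s)][c] += ALPHA * (r - Q[(p, s)][c])

random.seed(0)
train()
```

After training, the greedy policy picks big cores for the compute-bound phase and LITTLE cores for the memory-bound one, which is the kind of phase-aware decision a purely dynamic scheduler cannot make without the compiler's structural information.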
FastDepth: Fast Monocular Depth Estimation on Embedded Systems
Depth sensing is a critical function for robotic tasks such as localization,
mapping and obstacle detection. There has been a significant and growing
interest in depth estimation from a single RGB image, due to the relatively low
cost and size of monocular cameras. However, state-of-the-art single-view depth
estimation algorithms are based on fairly complex deep neural networks that are
too slow for real-time inference on an embedded platform, for instance, mounted
on a micro aerial vehicle. In this paper, we address the problem of fast depth
estimation on embedded systems. We propose an efficient and lightweight
encoder-decoder network architecture and apply network pruning to further
reduce computational complexity and latency. In particular, we focus on the
design of a low-latency decoder. Our methodology demonstrates that it is
possible to achieve similar accuracy as prior work on depth estimation, but at
inference speeds that are an order of magnitude faster. Our proposed network,
FastDepth, runs at 178 fps on an NVIDIA Jetson TX2 GPU and at 27 fps when using
only the TX2 CPU, with active power consumption under 10 W. FastDepth achieves
close to state-of-the-art accuracy on the NYU Depth v2 dataset. To the best of
the authors' knowledge, this paper demonstrates real-time monocular depth
estimation using a deep neural network with the lowest latency and highest
throughput on an embedded platform that can be carried by a micro aerial
vehicle.
Comment: Accepted for presentation at ICRA 2019. 8 pages, 6 figures, 7 tables.
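A back-of-the-envelope calculation illustrates why lightweight encoder-decoder designs of this kind are fast: depthwise-separable convolutions, the building block of MobileNet-style networks (assumed here as a representative lightweight design, not a claim about FastDepth's exact layers), need far fewer multiply-accumulates than standard convolutions.

```python
def conv_macs(h, w, c_in, c_out, k):
    """MACs for a standard k x k convolution on an h x w feature map."""
    return h * w * c_in * c_out * k * k

def sep_conv_macs(h, w, c_in, c_out, k):
    """MACs for depthwise (k x k per channel) plus pointwise (1 x 1)."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise

# Illustrative mid-network layer sizes (assumed, not from the paper).
h = w = 56
c_in = c_out = 128
k = 3
std = conv_macs(h, w, c_in, c_out, k)
sep = sep_conv_macs(h, w, c_in, c_out, k)
# The ratio is roughly 1/c_out + 1/k^2, about an 8-9x reduction here --
# the kind of saving that, combined with pruning, turns a too-slow
# network into one that fits an embedded GPU's real-time budget.
```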