
    Efficient compilation of .NET programs for embedded systems

    The overhead associated with object-oriented languages has been the major drawback in their adoption by the embedded world. In this paper, we propose a compilation approach based on the closed-world assumption (CWA) that should enable OO technologies such as .NET on small embedded systems. Our implementation is based on a type analysis algorithm, which extends RTA so that it eliminates some subtype tests due to array covariance, and on coloring, which maintains single subtyping invariants under the CWA. The impact of our global optimizations has been evaluated on embedded applications written in C#. Preliminary results show a noticeable reduction in code size, class hierarchy, and object mechanisms such as virtual calls and subtype tests.

    In-place Graph Rewriting with Interaction Nets

    An algorithm is in-place, or runs in-situ, when it does not need any additional memory to execute beyond a small constant amount. Many algorithms are efficient because of this feature, so it is an important property of an algorithm. In most programming languages, it is not obvious when an algorithm can run in-place, and moreover it is often not clear that the implementation respects that idea. In this paper we study interaction nets as a formalism where we can see directly, visually, that an algorithm is in-place, and moreover the implementation will respect that it is in-place. Not all algorithms can run in-place, however. We can nevertheless still use the same language, but now we can annotate parts of the algorithm that can run in-place. We suggest an annotation for rules, and give an algorithm to find this automatically through analysis of the interaction rules. Comment: In Proceedings TERMGRAPH 2016, arXiv:1609.0301
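
To make the notion of "in-place" concrete, here is a minimal sketch in Python. It is not the interaction-net formalism studied in the paper, and the function name `reverse_in_place` is a hypothetical one chosen for illustration. The point is only that the extra memory used (two integer indices) stays constant regardless of input size.

```python
def reverse_in_place(items):
    """Reverse a list by swapping elements, using only two extra index variables."""
    lo, hi = 0, len(items) - 1
    while lo < hi:
        # Swap the outermost pair, then move both indices inward.
        items[lo], items[hi] = items[hi], items[lo]
        lo += 1
        hi -= 1
    return items  # the same list object, now reversed
```

By contrast, `items[::-1]` allocates a second list of the same length, so it would not count as in-place under the definition above.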

    Compiler-assisted Adaptive Program Scheduling in big.LITTLE Systems

    Energy-aware architectures provide applications with a mix of low-frequency (LITTLE) and high-frequency (big) cores. Choosing the best hardware configuration for a program running on such an architecture is difficult, because program parts benefit differently from the same hardware configuration. State-of-the-art techniques to solve this problem adapt the program's execution to dynamic characteristics of the runtime environment, such as energy consumption and throughput. We claim that these purely dynamic techniques can be improved if they are aware of the program's syntactic structure. To support this claim, we show how to use the compiler to partition source code into program phases: regions whose syntactic characteristics lead to similar runtime behavior. We use reinforcement learning to map pairs formed by a program phase and a hardware state to the configuration that best fits this setup. To demonstrate the effectiveness of our ideas, we have implemented the Astro system. Astro uses Q-learning to associate syntactic features of programs with hardware configurations. As a proof of concept, we provide evidence that Astro outperforms GTS, the ARM-based Linux scheduler tailored for heterogeneous architectures, on the parallel benchmarks from Rodinia and Parsec.
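
The mapping from (program phase, hardware state) pairs to configurations can be illustrated with a generic tabular Q-learning sketch. This is not Astro's implementation: the class `QTable`, the configuration labels in `CONFIGS`, the state tuples, and the values of `alpha` and `gamma` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical configuration labels: counts of active LITTLE and big cores.
CONFIGS = ["4L0B", "0L4B", "4L4B"]

class QTable:
    """Tabular Q-learning over (state, configuration) pairs."""

    def __init__(self, alpha=0.5, gamma=0.9):
        self.q = defaultdict(float)  # (state, config) -> estimated value, 0.0 by default
        self.alpha = alpha           # learning rate
        self.gamma = gamma           # discount factor

    def best(self, state):
        """Return the configuration with the highest estimated value for this state."""
        return max(CONFIGS, key=lambda c: self.q[(state, c)])

    def update(self, state, config, reward, next_state):
        """Apply the standard Q-learning update after observing a reward."""
        best_next = max(self.q[(next_state, c)] for c in CONFIGS)
        target = reward + self.gamma * best_next
        self.q[(state, config)] += self.alpha * (target - self.q[(state, config)])
```

A state here would be a tuple such as `("cpu_bound_phase", "low_load")`, and the reward would come from a measured quantity such as throughput per unit of energy; both choices are placeholders rather than details taken from the abstract.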

    FastDepth: Fast Monocular Depth Estimation on Embedded Systems

    Depth sensing is a critical function for robotic tasks such as localization, mapping and obstacle detection. There has been a significant and growing interest in depth estimation from a single RGB image, due to the relatively low cost and size of monocular cameras. However, state-of-the-art single-view depth estimation algorithms are based on fairly complex deep neural networks that are too slow for real-time inference on an embedded platform, for instance, one mounted on a micro aerial vehicle. In this paper, we address the problem of fast depth estimation on embedded systems. We propose an efficient and lightweight encoder-decoder network architecture and apply network pruning to further reduce computational complexity and latency. In particular, we focus on the design of a low-latency decoder. Our methodology demonstrates that it is possible to achieve accuracy similar to prior work on depth estimation, but at inference speeds that are an order of magnitude faster. Our proposed network, FastDepth, runs at 178 fps on an NVIDIA Jetson TX2 GPU and at 27 fps when using only the TX2 CPU, with active power consumption under 10 W. FastDepth achieves close to state-of-the-art accuracy on the NYU Depth v2 dataset. To the best of the authors' knowledge, this paper demonstrates real-time monocular depth estimation using a deep neural network with the lowest latency and highest throughput on an embedded platform that can be carried by a micro aerial vehicle. Comment: Accepted for presentation at ICRA 2019. 8 pages, 6 figures, 7 tables