93 research outputs found

    GCC-Plugin for Automated Accelerator Generation and Integration on Hybrid FPGA-SoCs

    Full text link
    In recent years, architectures combining a reconfigurable fabric and a general purpose processor on a single chip became increasingly popular. Such hybrid architectures allow extending embedded software with application specific hardware accelerators to improve performance and/or energy efficiency. Aiding system designers and programmers at handling the complexity of the required process of hardware/software (HW/SW) partitioning is an important issue. Current methods are often restricted, either to bare-metal systems, to subsets of mainstream programming languages, or require special coding guidelines, e.g., via annotations. These restrictions still represent a high entry barrier for the wider community of programmers that new hybrid architectures are intended for. In this paper we revisit HW/SW partitioning and present a seamless programming flow for unrestricted, legacy C code. It consists of a retargetable GCC plugin that automatically identifies code sections for hardware acceleration and generates code accordingly. The proposed workflow was evaluated on the Xilinx Zynq platform using unmodified code from an embedded benchmark suite.Comment: Presented at Second International Workshop on FPGAs for Software Programmers (FSP 2015) (arXiv:1508.06320

    Non-Intrusive Online Timing Analysis of Large Embedded Applications

    Get PDF
    A thorough understanding of the timing behavior of embedded systems software has become very important. With the advent of ever more complex embedded software e.g. in autonomous driving, the size of this software is growing at a fast pace. Execution time profiles (ETP) have proven to be a useful way to understand the timing behavior of embedded software. Collecting these ETPs was either limited to small applications or required multiple runs of the same software for calibration processes. In this contribution, we present a novel method for collecting ETPs in a single shot of the software at very high quality even for large applications

    X-Rel: Energy-Efficient and Low-Overhead Approximate Reliability Framework for Error-Tolerant Applications Deployed in Critical Systems

    Full text link
    Triple Modular Redundancy (TMR) is one of the most common techniques in fault-tolerant systems, in which the output is determined by a majority voter. However, the design diversity of replicated modules and/or soft errors that are more likely to happen in the nanoscale era may affect the majority voting scheme. Besides, the significant overheads of the TMR scheme may limit its usage in energy consumption and area-constrained critical systems. However, for most inherently error-resilient applications such as image processing and vision deployed in critical systems (like autonomous vehicles and robotics), achieving a given level of reliability has more priority than precise results. Therefore, these applications can benefit from the approximate computing paradigm to achieve higher energy efficiency and a lower area. This paper proposes an energy-efficient approximate reliability (X-Rel) framework to overcome the aforementioned challenges of the TMR systems and get the full potential of approximate computing without sacrificing the desired reliability constraint and output quality. The X-Rel framework relies on relaxing the precision of the voter based on a systematical error bounding method that leverages user-defined quality and reliability constraints. Afterward, the size of the achieved voter is used to approximate the TMR modules such that the overall area and energy consumption are minimized. The effectiveness of employing the proposed X-Rel technique in a TMR structure, for different quality constraints as well as with various reliability bounds are evaluated in a 15-nm FinFET technology. The results of the X-Rel voter show delay, area, and energy consumption reductions of up to 86%, 87%, and 98%, respectively, when compared to those of the state-of-the-art approximate TMR voters.Comment: This paper has been published in IEEE Transactions on Very Large Scale Integration (VLSI) System

    Continuous Non-Intrusive Hybrid WCET Estimation Using Waypoint Graphs

    Get PDF
    Traditionally, the Worst-Case Execution Time (WCET) of Embedded Software has been estimated using analytical approaches. This is effective, if good models of the processor/System-on-Chip (SoC) architecture exist. Unfortunately, modern high performance SoCs often contain unpredictable and/or undocumented components that influence the timing behaviour. Thus, analytical results for such processors are unrealistically pessimistic. One possible alternative approach seems to be hybrid WCET analysis, where measurement data together with an analytical approach is used to estimate worst-case behaviour. Previously, we demonstrated how continuous evaluation of basic block trace data can be used to produce detailed statistics of basic blocks in embedded software. In the meantime it has become clear that the trace data provided by modern SoCs delivers a different type of information. In this contribution, we show that even under realistic conditions, a meaningful analysis can be conducted with the trace data

    Concuerda: una solucion alternativa

    Get PDF
    52 p.Concuerda: una solución alterativa se plantea inicialmente como una indagación, en lo que concierne a soluciones que involucren el atar. Este primer punto comprende maneras cotidianas tales como: el calzado, juegos, alimentos, etc., así como también se hace alusión a los modos encontrados en el bordemar de la Región del Maule. Además se muestran ejemplos de escala mundial tales como Ia técnica scout o Ia destreza nautica. Luego de esto se hace una comparación de los casos del Maule con las soluciones convencionales para desembocar en un ítem de estudio de métodos en directa conexión con el atar desarrollados por arquitectos como Gaudí o los contemporáneos NOX. Como punto intermedio, luego de Ia indagación, aparece la fase de experimentación, la que se muestra en cuatro casos desarrollados empíricamente, tendientes a desarrollar una propuesta final. Estos anteproyectos son: el prototipo habitacional nómade, Ia texred, patio 1014 y el aeroancia. Finalmente se recogen los dos puntos anteriores (Investigación y Experimentación) y se aplica la carga de conocimiento, tanto adquirida como desarrollada, a una intervención arquitectónica ubicada en el frontis de la Biblioteca Regional del Maule que da cuenta de las temáticas anteriormente expuestas

    Context-Aware Technology Mapping in Genetic Design Automation

    Get PDF
    Genetic design automation (GDA) tools hold promise to speed-up circuit design in synthetic biology. Their widespread adoption is hampered by their limited predictive power, resulting in frequent deviations between the in silico and in vivo performance of a genetic circuit. Context effects, i.e., the change in overall circuit functioning, due to the intracellular environment of the host and due to cross-talk among circuits components are believed to be a major source for the aforementioned deviations. Incorporating these effects in computational models of GDA tools is challenging but is expected to boost their predictive power and hence their deployment. Using fine-grained thermodynamic models of promoter activity, we show in this work how to account for two major components of cellular context effects: (i) crosstalk due to limited specificity of used regulators and (ii) titration of circuit regulators to off-target binding sites on the host genome. We show how we can compensate the incurred increase in computational complexity through dedicated branch-and-bound techniques during the technology mapping process. Using the synthesis of several combinational logic circuits based on Cello’s device library as a case study, we analyze the effect of different intensities and distributions of crosstalk on circuit performance and on the usability of a given device library

    FPGA-Based Neural Thrust Controller for UAVs

    Full text link
    The advent of unmanned aerial vehicles (UAVs) has improved a variety of fields by providing a versatile, cost-effective and accessible platform for implementing state-of-the-art algorithms. To accomplish a broader range of tasks, there is a growing need for enhanced on-board computing to cope with increasing complexity and dynamic environmental conditions. Recent advances have seen the application of Deep Neural Networks (DNNs), particularly in combination with Reinforcement Learning (RL), to improve the adaptability and performance of UAVs, especially in unknown environments. However, the computational requirements of DNNs pose a challenge to the limited computing resources available on many UAVs. This work explores the use of Field Programmable Gate Arrays (FPGAs) as a viable solution to this challenge, offering flexibility, high performance, energy and time efficiency. We propose a novel hardware board equipped with an Artix-7 FPGA for a popular open-source micro-UAV platform. We successfully validate its functionality by implementing an RL-based low-level controller using real-world experiments

    Cellular Automata Applications in Shortest Path Problem

    Full text link
    Cellular Automata (CAs) are computational models that can capture the essential features of systems in which global behavior emerges from the collective effect of simple components, which interact locally. During the last decades, CAs have been extensively used for mimicking several natural processes and systems to find fine solutions in many complex hard to solve computer science and engineering problems. Among them, the shortest path problem is one of the most pronounced and highly studied problems that scientists have been trying to tackle by using a plethora of methodologies and even unconventional approaches. The proposed solutions are mainly justified by their ability to provide a correct solution in a better time complexity than the renowned Dijkstra's algorithm. Although there is a wide variety regarding the algorithmic complexity of the algorithms suggested, spanning from simplistic graph traversal algorithms to complex nature inspired and bio-mimicking algorithms, in this chapter we focus on the successful application of CAs to shortest path problem as found in various diverse disciplines like computer science, swarm robotics, computer networks, decision science and biomimicking of biological organisms' behaviour. In particular, an introduction on the first CA-based algorithm tackling the shortest path problem is provided in detail. After the short presentation of shortest path algorithms arriving from the relaxization of the CAs principles, the application of the CA-based shortest path definition on the coordinated motion of swarm robotics is also introduced. Moreover, the CA based application of shortest path finding in computer networks is presented in brief. Finally, a CA that models exactly the behavior of a biological organism, namely the Physarum's behavior, finding the minimum-length path between two points in a labyrinth is given.Comment: To appear in the book: Adamatzky, A (Ed.) Shortest path solvers. From software to wetware. Springer, 201

    Fast Fitting of the Dynamic Memdiode Model to the Conduction Characteristics of RRAM Devices Using Convolutional Neural Networks

    Get PDF
    In this paper, the use of Artificial Neural Networks (ANNs) in the form of Convolutional Neural Networks (AlexNET) for the fast and energy-efficient fitting of the Dynamic Memdiode Model (DMM) to the conduction characteristics of bipolar-type resistive switching (RS) devices is investigated. Despite an initial computationally intensive training phase the ANNs allow obtaining a mapping between the experimental Current-Voltage (I-V) curve and the corresponding DMM parameters without incurring a costly iterative process as typically considered in error minimization-based optimization algorithms. In order to demonstrate the fitting capabilities of the proposed approach, a complete set of I-Vs obtained from Y₂O₃-based RRAM devices, fabricated with different oxidation conditions and measured with different current compliances, is considered. In this way, in addition to the intrinsic RS variability, extrinsic variation is achieved by means of external factors (oxygen content and damage control during the set process). We show that the reported method provides a significant reduction of the fitting time (one order of magnitude), especially in the case of large data sets. This issue is crucial when the extraction of the model parameters and their statistical characterization are required

    Automatic generation of two phased models with CDL

    No full text
    • …
    corecore