461,813 research outputs found

    parMERASA Multi-Core Execution of Parallelised Hard Real-Time Applications Supporting Analysability

    Get PDF
    International audienceEngineers who design hard real-time embedded systems express a need for several times the performance available today while keeping safety as major criterion. A breakthrough in performance is expected by parallelizing hard real-time applications and running them on an embedded multi-core processor, which enables combining the requirements for high-performance with timing-predictable execution. parMERASA will provide a timing analyzable system of parallel hard real-time applications running on a scalable multicore processor. parMERASA goes one step beyond mixed criticality demands: It targets future complex control algorithms by parallelizing hard real-time programs to run on predictable multi-/many-core processors. We aim to achieve a breakthrough in techniques for parallelization of industrial hard real-time programs, provide hard real-time support in system software, WCET analysis and verification tools for multi-cores, and techniques for predictable multi-core designs with up to 64 cores

    Towards a verified compiler prototype for the synchronous language SIGNAL

    Get PDF
    International audienceSIGNAL belongs to the synchronous languages family which are widely used in the design of safety-critical real-time systems such as avionics, space systems, and nuclear power plants. This paper reports a compiler prototype for SIGNAL. Compared with the existing SIGNAL compiler, we propose a new intermediate representation (named S-CGA, a variant of clocked guarded actions), to integrate more synchronous programs into our compiler prototype in the future. The front-end of the compiler, i.e., the translation from SIGNAL to S-CGA, is presented. As well, the proof of semantics preservation is mechanized in the theorem prover Coq. Moreover, we present the back-end of the compiler, including sequential code generation and multithreaded code generation with time-predictable properties. With the rising importance of multi-core processors in safety-critical embedded systems or cyber-physical systems (CPS), there is a growing need for model-driven generation of multithreaded code and thus mapping on multi-core. We propose a time-predictable multi-core architecture model in architecture analysis and design language (AADL), and map the multi-threaded code to this model

    WCET Analysis of a Parallel 3D Multigrid Solver Executed on the MERASA Multi-Core

    Get PDF
    To meet performance requirements as well as constraints on cost and power consumption, future embedded systems will be designed with multi-core processors. However, the question of timing analysability is raised with these architectures. In the MERASA project, a WCET-aware multi-core processor has been designed with the appropriate system software. They both guarantee that the WCET of tasks running on different cores can be safely analyzed since their possible interactions can be bounded. Nevertheless, computing the WCET of a parallel application is still not straightforward and a high-level preliminary analysis of the communication and synchronization patterns must be performed. In this paper, we report on our experience in evaluating the WCET of a parallel 3D multigrid solver code and we propose lines for further research on this topic

    parMERASA – multicore execution of parallelised hard real-time applications supporting analysability

    Get PDF
    Abstract-Engineers who design hard real-time embedded systems express a need for several times the performance available today while keeping safety as major criterion. A breakthrough in performance is expected by parallelizing hard real-time applications and running them on an embedded multi-core processor, which enables combining the requirements for high-performance with timing-predictable execution. parMERASA will provide a timing analyzable system of parallel hard real-time applications running on a scalable multicore processor. parMERASA goes one step beyond mixed criticality demands: It targets future complex control algorithms by parallelizing hard real-time programs to run on predictable multi-/many-core processors. We aim to achieve a breakthrough in techniques for parallelization of industrial hard real-time programs, provide hard real-time support in system software, WCET analysis and verification tools for multi-cores, and techniques for predictable multi-core designs with up to 64 cores

    CROSS-LAYER CUSTOMIZATION FOR LOW POWER AND HIGH PERFORMANCE EMBEDDED MULTI-CORE PROCESSORS

    Get PDF
    Due to physical limitations and design difficulties, computer processor architecture has shifted to multi-core and even many-core based approaches in recent years. Such architectures provide potentials for sustainable performance scaling into future peta-scale/exa-scale computing platforms, at affordable power budget, design complexity, and verification efforts. To date, multi-core processor products have been replacing uni-core processors in almost every market segment, including embedded systems, general-purpose desktops and laptops, and super computers. However, many issues still remain with multi-core processor architectures that need to be addressed before their potentials could be fully realized. People in both academia and industry research community are still seeking proper ways to make efficient and effective use of these processors. The issues involve hardware architecture trade-offs, the system software service, the run-time management, and user application design, which demand more research effort into this field. Due to the architectural specialties with multi-core based computers, a Cross-Layer Customization framework is proposed in this work, which combines application specific information and system platform features, along with necessary operating system service support, to achieve exceptional power and performance efficiency for targeted multi-core platforms. Several topics are covered with specific optimization goals, including snoop cache coherence protocol, inter-core communication for producer-consumer applications, synchronization mechanisms, and off-chip memory bandwidth limitations. Analysis of benchmark program execution with conventional mechanisms is made to reveal the overheads in terms of power and performance. Specific customizations are proposed to eliminate such overheads with support from hardware, system software, compiler, and user applications. Experiments show significant improvement on system performance and power efficiency

    Smart Sensor Architectures for Multimedia Sensing in IoMT

    Full text link
    [EN] Today, a wide range of developments and paradigms require the use of embedded systems characterized by restrictions on their computing capacity, consumption, cost, and network connection. The evolution of the Internet of Things (IoT) towards Industrial IoT (IIoT) or the Internet of Multimedia Things (IoMT), its impact within the 4.0 industry, the evolution of cloud computing towards edge or fog computing, also called near-sensor computing, or the increase in the use of embedded vision, are current examples of this trend. One of the most common methods of reducing energy consumption is the use of processor frequency scaling, based on a particular policy. The algorithms to define this policy are intended to obtain good responses to the workloads that occur in smarthphones. There has been no study that allows a correct definition of these algorithms for workloads such as those expected in the above scenarios. This paper presents a method to determine the operating parameters of the dynamic governor algorithm called Interactive, which offers significant improvements in power consumption, without reducing the performance of the application. These improvements depend on the load that the system has to support, so the results are evaluated against three different loads, from higher to lower, showing improvements ranging from 62% to 26%.This work has been supported by the MCyU (Spanish Ministry of Science and Universities) under the project ATLAS (PGC2018-094151-B-I00), which is partially funded by AEI, FEDER and EU.Silvestre-Blanes, J.; Sempere Paya, VM.; Albero Albero, T. (2020). Smart Sensor Architectures for Multimedia Sensing in IoMT. Sensors. 20(5):1-16. https://doi.org/10.3390/s20051400S116205Bangemann, T., Riedl, M., Thron, M., & Diedrich, C. (2016). Integration of Classical Components Into Industrial Cyber–Physical Systems. Proceedings of the IEEE, 104(5), 947-959. doi:10.1109/jproc.2015.2510981Wollschlaeger, M., Sauter, T., & Jasperneite, J. (2017). The Future of Industrial Communication: Automation Networks in the Era of the Internet of Things and Industry 4.0. IEEE Industrial Electronics Magazine, 11(1), 17-27. doi:10.1109/mie.2017.2649104Salehi, M., & Ejlali, A. (2015). A Hardware Platform for Evaluating Low-Energy Multiprocessor Embedded Systems Based on COTS Devices. IEEE Transactions on Industrial Electronics, 62(2), 1262-1269. doi:10.1109/tie.2014.2352215Alvi, S. A., Afzal, B., Shah, G. A., Atzori, L., & Mahmood, W. (2015). Internet of multimedia things: Vision and challenges. Ad Hoc Networks, 33, 87-111. doi:10.1016/j.adhoc.2015.04.006Jridi, M., Chapel, T., Dorez, V., Le Bougeant, G., & Le Botlan, A. (2018). SoC-Based Edge Computing Gateway in the Context of the Internet of Multimedia Things: Experimental Platform. Journal of Low Power Electronics and Applications, 8(1), 1. doi:10.3390/jlpea8010001Memos, V. A., Psannis, K. E., Ishibashi, Y., Kim, B.-G., & Gupta, B. B. (2018). An Efficient Algorithm for Media-based Surveillance System (EAMSuS) in IoT Smart City Framework. Future Generation Computer Systems, 83, 619-628. doi:10.1016/j.future.2017.04.039Chianese, A., Piccialli, F., & Riccio, G. (2015). Designing a Smart Multisensor Framework Based on Beaglebone Black Board. Lecture Notes in Electrical Engineering, 391-397. doi:10.1007/978-3-662-45402-2_60Wang, W., Wang, Q., & Sohraby, K. (2016). Multimedia Sensing as a Service (MSaaS): Exploring Resource Saving Potentials of at Cloud-Edge IoTs and Fogs. IEEE Internet of Things Journal, 1-1. doi:10.1109/jiot.2016.2578722Munir, A., Gordon-Ross, A., & Ranka, S. (2014). Multi-Core Embedded Wireless Sensor Networks: Architecture and Applications. IEEE Transactions on Parallel and Distributed Systems, 25(6), 1553-1562. doi:10.1109/tpds.2013.219Baali, H., Djelouat, H., Amira, A., & Bensaali, F. (2018). Empowering Technology Enabled Care Using IoT and Smart Devices: A Review. IEEE Sensors Journal, 18(5), 1790-1809. doi:10.1109/jsen.2017.2786301Kim, Y. G., Kong, J., & Chung, S. W. (2018). A Survey on Recent OS-Level Energy Management Techniques for Mobile Processing Units. IEEE Transactions on Parallel and Distributed Systems, 29(10), 2388-2401. doi:10.1109/tpds.2018.2822683Chaib Draa, I., Niar, S., Tayeb, J., Grislin, E., & Desertot, M. (2016). Sensing user context and habits for run-time energy optimization. EURASIP Journal on Embedded Systems, 2017(1). doi:10.1186/s13639-016-0036-8Chen, Y.-L., Chang, M.-F., Yu, C.-W., Chen, X.-Z., & Liang, W.-Y. (2018). Learning-Directed Dynamic Voltage and Frequency Scaling Scheme with Adjustable Performance for Single-Core and Multi-Core Embedded and Mobile Systems. Sensors, 18(9), 3068. doi:10.3390/s18093068Tamilselvan, K., & Thangaraj, P. (2020). Pods – A novel intelligent energy efficient and dynamic frequency scalings for multi-core embedded architectures in an IoT environment. Microprocessors and Microsystems, 72, 102907. doi:10.1016/j.micpro.2019.10290

    The case for a Hardware Filesystem

    Get PDF
    As secondary storage devices get faster with flash based solid state drives (SSDs) and emerging technologies like phase change memories (PCM), overheads in system software like operating system (OS) and filesystem become prominent and may limit the potential performance improvements. Moreover, with rapidly increasing on-chip core count, monolithic operating systems will face scalability issues on these many-core chips. Future operating systems are likely to have a distributed nature, with a separation of operating system services amongst cores. Also, general purpose processors are known to be both performance and power inefficient while executing operating system code. In the domain of High Performance Computing with FPGAs too, relying on the OS for file I/O transactions using slow embedded processors, hinders performance. Migrating the filesystem into a dedicated hardware core, has the potential of improving the performance of data-intensive applications by bypassing the OS stack to provide higher bandwdith and reduced latency while accessing disks. To test the feasibility of this idea, an FPGA-based Hardware Filesystem (HWFS) was designed with five basic operations (open, read, write, delete and seek). Furthermore, multi-disk and RAID-0 (striping) support has been implemented as an option in the filesystem. In order to reduce design complexity and facilitate easier testing of the HWFS, a RAM disk was used initially. The filesystem core has been integrated and tested with a hardware application core (BLAST) as well as a multi-node FPGA network to provide remote-disk access. Finally, a SATA IP core was developed and directly integrated with HWFS to test with SSDs. For evaluation, HWFS's performance was compared to an Ext2 filesystem, both on an FPGA-based soft processor as well as a modern AMD Opteron Linux server with sequential and random workloads. Results prove that the Hardware Filesystem and supporting infrastructure provide substantial performance improvement over software only systems. The system is also resource efficient consuming less than 3% of logic and 5% of the Block RAMs of a Xilinx Virtex-6 chip

    Performance analysis techniques for multi-soft-core and many-soft-core systems

    Get PDF
    Multi-soft-core systems are a viable and interesting solution for embedded systems that need a particular tradeoff between performance, flexibility and development speed. As the growing capacity allows it, many-soft-cores are also expected to have relevance to future embedded systems. As a consequence, parallel programming methods and tools will be necessarily embraced as a part of the full system development process. Performance analysis is an important part of the development process for parallel applications. It is usually mandatory when you want to get a desired performance or to verify that the system is meeting some real-time constraints. One of the usual techniques used by the HPC community is the postmortem analysis of application traces. However, this is not easily transported to the embedded systems based on FPGA due to the resource limitations of the platforms. We propose several techniques and some hardware architectural support to be able to generate traces on multiprocessor systems based on FPGAs and use them to optimize the performance of the running applications

    A time-predictable many-core processor design for critical real-time embedded systems

    Get PDF
    Critical Real-Time Embedded Systems (CRTES) are in charge of controlling fundamental parts of embedded system, e.g. energy harvesting solar panels in satellites, steering and breaking in cars, or flight management systems in airplanes. To do so, CRTES require strong evidence of correct functional and timing behavior. The former guarantees that the system operates correctly in response of its inputs; the latter ensures that its operations are performed within a predefined time budget. CRTES aim at increasing the number and complexity of functions. Examples include the incorporation of \smarter" Advanced Driver Assistance System (ADAS) functionality in modern cars or advanced collision avoidance systems in Unmanned Aerial Vehicles (UAVs). All these new features, implemented in software, lead to an exponential growth in both performance requirements and software development complexity. Furthermore, there is a strong need to integrate multiple functions into the same computing platform to reduce the number of processing units, mass and space requirements, etc. Overall, there is a clear need to increase the computing power of current CRTES in order to support new sophisticated and complex functionality, and integrate multiple systems into a single platform. The use of multi- and many-core processor architectures is increasingly seen in the CRTES industry as the solution to cope with the performance demand and cost constraints of future CRTES. Many-cores supply higher performance by exploiting the parallelism of applications while providing a better performance per watt as cores are maintained simpler with respect to complex single-core processors. Moreover, the parallelization capabilities allow scheduling multiple functions into the same processor, maximizing the hardware utilization. However, the use of multi- and many-cores in CRTES also brings a number of challenges related to provide evidence about the correct operation of the system, especially in the timing domain. Hence, despite the advantages of many-cores and the fact that they are nowadays a reality in the embedded domain (e.g. Kalray MPPA, Freescale NXP P4080, TI Keystone II), their use in CRTES still requires finding efficient ways of providing reliable evidence about the correct operation of the system. This thesis investigates the use of many-core processors in CRTES as a means to satisfy performance demands of future complex applications while providing the necessary timing guarantees. To do so, this thesis contributes to advance the state-of-the-art towards the exploitation of parallel capabilities of many-cores in CRTES contributing in two different computing domains. From the hardware domain, this thesis proposes new many-core designs that enable deriving reliable and tight timing guarantees. From the software domain, we present efficient scheduling and timing analysis techniques to exploit the parallelization capabilities of many-core architectures and to derive tight and trustworthy Worst-Case Execution Time (WCET) estimates of CRTES.Los sistemas críticos empotrados de tiempo real (en ingles Critical Real-Time Embedded Systems, CRTES) se encargan de controlar partes fundamentales de los sistemas integrados, e.g. obtención de la energía de los paneles solares en satélites, la dirección y frenado en automóviles, o el control de vuelo en aviones. Para hacerlo, CRTES requieren fuerte evidencias del correcto comportamiento funcional y temporal. El primero garantiza que el sistema funciona correctamente en respuesta de sus entradas; el último asegura que sus operaciones se realizan dentro de unos limites temporales establecidos previamente. El objetivo de los CRTES es aumentar el número y la complejidad de las funciones. Algunos ejemplos incluyen los sistemas inteligentes de asistencia a la conducción en automóviles modernos o los sistemas avanzados de prevención de colisiones en vehiculos aereos no tripulados. Todas estas nuevas características, implementadas en software,conducen a un crecimiento exponencial tanto en los requerimientos de rendimiento como en la complejidad de desarrollo de software. Además, existe una gran necesidad de integrar múltiples funciones en una sóla plataforma para así reducir el número de unidades de procesamiento, cumplir con requisitos de peso y espacio, etc. En general, hay una clara necesidad de aumentar la potencia de cómputo de los actuales CRTES para soportar nueva funcionalidades sofisticadas y complejas e integrar múltiples sistemas en una sola plataforma. El uso de arquitecturas multi- y many-core se ve cada vez más en la industria CRTES como la solución para hacer frente a la demanda de mayor rendimiento y las limitaciones de costes de los futuros CRTES. Las arquitecturas many-core proporcionan un mayor rendimiento explotando el paralelismo de aplicaciones al tiempo que proporciona un mejor rendimiento por vatio ya que los cores se mantienen más simples con respecto a complejos procesadores de un solo core. Además, las capacidades de paralelización permiten programar múltiples funciones en el mismo procesador, maximizando la utilización del hardware. Sin embargo, el uso de multi- y many-core en CRTES también acarrea ciertos desafíos relacionados con la aportación de evidencias sobre el correcto funcionamiento del sistema, especialmente en el ámbito temporal. Por eso, a pesar de las ventajas de los procesadores many-core y del hecho de que éstos son una realidad en los sitemas integrados (por ejemplo Kalray MPPA, Freescale NXP P4080, TI Keystone II), su uso en CRTES aún precisa de la búsqueda de métodos eficientes para proveer evidencias fiables sobre el correcto funcionamiento del sistema. Esta tesis ahonda en el uso de procesadores many-core en CRTES como un medio para satisfacer los requisitos de rendimiento de aplicaciones complejas mientras proveen las garantías de tiempo necesarias. Para ello, esta tesis contribuye en el avance del estado del arte hacia la explotación de many-cores en CRTES en dos ámbitos de la computación. En el ámbito del hardware, esta tesis propone nuevos diseños many-core que posibilitan garantías de tiempo fiables y precisas. En el ámbito del software, la tesis presenta técnicas eficientes para la planificación de tareas y el análisis de tiempo para aprovechar las capacidades de paralelización en arquitecturas many-core, y también para derivar estimaciones de peor tiempo de ejecución (Worst-Case Execution Time, WCET) fiables y precisas
    • …
    corecore