83 research outputs found
FAT-DBT engine (framework for application-tailorcd, co-designcd dynamic binary translation enginc)
Tese de Doutoramento em Engenharia Eletrónica e de Computadores (PDEEC)Dynamic binary translation (DBT) has emerged as an execution engine that monitors,
modifies and possibly optimizes running applications for specific purposes.
DBT is deployed as an execution layer between the application binary and the operating
system or host-machine, which creates opportunities for collecting runtime
information. Initially, DBT supported binary-level compatibility, but based on the
collected runtime information, it also became popular for code instrumentation,
ISA-virtualization and dynamic-optimization purposes.
Building a DBT system brings many challenges, as it involves complex components
integration and requires deep architectural level knowledge. Moreover, DBT incurs
in significant overheads, mainly due to code decoding and translation, as well as
execution along with general functionalities emulation. While initially conceived
bearing in mind high-end architectures for performance demanding applications,
such challenges become even more evident when directing DBT to embedded systems.
The latter makes an effective deployment very challenging due to its complexity,
tight constraints on memory, and limited performance and power. Legacy
support and binary compatibility is a topic of relevant interest in such systems,
due to their broad dissemination among industrial environments and wide utilization
in sensing and monitoring processes, from yearly times, with considerable
maintenance and replacement costs.
To address such issues, this thesis intents to contribute with a solution that leverages
an optimized and accelerated dynamic binary translator targeting resourceconstrained
embedded systems while supporting legacy systems.
The developed work allows to: (1) evaluate the potential of DBT for legacy support
purposes on the resource-constrained embedded systems; (2) achieve a configurable
DBT architecture specialized for resource-constrained embedded systems;
(3) address DBT translation, execution and emulation overheads through the combination
of software and hardware; and (4) promote DBT utilization as a legacy
support tool for the industry as a end-product.A tradução binária dinâmica (TBD) emergiu como um motor de execução que
permite a modificação e possível optimização de código executável para um determinado
propósito. A TBD é integrada nos sistemas como uma camada de execução
entre o código binário executável e o sistema operativo ou a máquina hospedeira
alvo, o que origina oportunidades de recolha de informação de execução.
A criação de um sistema de TBD traz consigo diversos desafios, uma vez que envolve
a integração de componentes complexos e conhecimentos aprofundados das
arquitecturas de processadores envolvidas. Ademais, a utilização de TBD gera diversos
custos computacionais indirectos, maioritariamente devido à descodificação
e tradução de código, bem como emulação de funcionalidades em geral. Considerando
que a TBD foi inicialmente pensada para sistemas de gama alta, os
desafios mencionados tornam-se ainda mais evidentes quando a mesma é aplicada
em sistemas embebidos. Nesta área os limitados recursos de memória e os exigentes
requisitos de desempenho e consumo energético,tornam uma implementação eficiente
de TBD muito difícil de obter. Compatibilidade binária e suporte a código
de legado são tópicos de interesse em sistemas embebidos, justificado pela ampla
disseminação dos mesmos no meio industrial para tarefas de sensorização e monitorização
ao longo dos tempos, reforçado pelos custos de manutenção adjacentes
à sua utilização.
Para endereçar os desafios descritos, nesta tese propõe-se uma solução para potencializar
a tradução binária dinâmica, optimizada e com aceleração, para suporte a
código de legado em sistemas embebidos de baixa gama.
O trabalho permitiu (1) avaliar o potencial da TBD quando aplicada ao suporte
a código de legado em sistemas embebidos de baixa gama; (2) a obtenção de
uma arquitectura de TBD configurável e especializada para este tipo de sistemas;
(3) reduzir os custos computacionais associados à tradução, execução e emulação,
através do uso combinado de software e hardware; (4) e promover a utilização na
industria de TBD como uma ferramenta de suporte a código de legado.This thesis was supported by a PhD scholarship from Fundação para a Ciência e
Tecnologia, SFRH/BD/81681/201
Correct synthesis and integration of compiler-generated function units
PhD ThesisComputer architectures can use custom logic in addition to general pur-
pose processors to improve performance for a variety of applications. The
use of custom logic allows greater parallelism for some algorithms. While
conventional CPUs typically operate on words, ne-grained custom logic
can improve e ciency for many bit level operations. The commodi ca-
tion of eld programmable devices, particularly FPGAs, has improved
the viability of using custom logic in an architecture.
This thesis introduces an approach to reasoning about the correctness of
compilers that generate custom logic that can be synthesized to provide
hardware acceleration for a given application. Compiler intermediate
representations (IRs) and transformations that are relevant to genera-
tion of custom logic are presented. Architectures may vary in the way
that custom logic is incorporated, and suitable abstractions are used in
order that the results apply to compilation for a variety of the design
parameters that are introduced by the use of custom logic
Design and management of image processing pipelines within CPS : Acquired experience towards the end of the FitOptiVis ECSEL Project
Cyber-Physical Systems (CPSs) are dynamic and reactive systems interacting with processes, environment and, sometimes, humans. They are often distributed with sensors and actuators, characterized for being smart, adaptive, predictive and react in real-time. Indeed, image- and video-processing pipelines are a prime source for environmental information for systems allowing them to take better decisions according to what they see. Therefore, in FitOptiVis, we are developing novel methods and tools to integrate complex image- and video-processing pipelines. FitOptiVis aims to deliver a reference architecture for describing and optimizing quality and resource management for imaging and video pipelines in CPSs both at design- and run-time. The architecture is concretized in low-power, high-performance, smart components, and in methods and tools for combined design-time and run-time multi-objective optimization and adaptation within system and environment constraints.Peer reviewe
A Survey and Evaluation of FPGA High-Level Synthesis Tools
High-level synthesis (HLS) is increasingly popular for the design of high-performance and energy-efficient heterogeneous systems, shortening time-to-market and addressing today's system complexity. HLS allows designers to work at a higher-level of abstraction by using a software program to specify the hardware functionality. Additionally, HLS is particularly interesting for designing field-programmable gate array circuits, where hardware implementations can be easily refined and replaced in the target device. Recent years have seen much activity in the HLS research community, with a plethora of HLS tool offerings, from both industry and academia. All these tools may have different input languages, perform different internal optimizations, and produce results of different quality, even for the very same input description. Hence, it is challenging to compare their performance and understand which is the best for the hardware to be implemented. We present a comprehensive analysis of recent HLS tools, as well as overview the areas of active interest in the HLS research community. We also present a first-published methodology to evaluate different HLS tools. We use our methodology to compare one commercial and three academic tools on a common set of C benchmarks, aiming at performing an in-depth evaluation in terms of performance and the use of resources
Combining FPGA prototyping and high-level simulation approaches for Design Space Exploration of MPSoCs
Modern embedded systems are parallel, component-based, heterogeneous and finely tuned on the basis of the workload that must be executed on them. To improve design reuse, Application Specific Instruction-set Processors (ASIPs) are often employed as building blocks in such systems, as a solution capable of satisfying the required functional and physical constraints (e.g. throughput, latency, power or energy consumption etc.), while providing, at
the same time, high flexibility and adaptability. Composing a multi-processor architecture including ASIPs and mapping parallel applications onto it is a design activity that require an extensive Design Space Exploration process (DSE), to result in cost-effective systems. The
work described here aims at defining novel methodologies for the application-driven customizations of such highly heterogeneous embedded systems. The issue is tackled at different levels, integrating different tools.
High-level event-based simulation is a widely used technique that offers speed and flexibility as main points of strength, but needs, as a preliminary input and periodically during the iteration process, calibration data that must be acquired by means of more accurate
evaluation methods. Typically, this calibration is performed using instruction-level cycleaccurate
simulators that, however, turn out to be very slow, especially when complete multiprocessor
systems must be evaluated or when the grain of the calibration is too fine, while FPGA approaches have shown to performbetter for this particular applications.
FPGA-based emulation techniques have been proposed in the recent past as an alternative solution to the software-based simulation approach, but some further steps are needed
before they can be effectively exploitedwithin architectural design space exploration. Firstly,
some kind of technology-awareness must be introduced, to enable the translation of the emulation results into a pre-estimation of a prospective ASIC implementation of the design. Moreover, when performing architectural DSE, a significant number of different candidate design points has to be evaluated and compared. In this case, if no countermeasures are taken, the advantages achievable with FPGAs, in terms of emulation speed, are counterbalanced
by the overhead introduced by the time needed to go through the physical synthesis and implementation flow.
Developed FPGA-based prototyping platform overcomes such limitations, enabling the use of FPGA-based prototyping for micro-architectural design space exploration of ASIP processors. In this approach, to increase the emulation speed-up, two different methods are proposed: the first is based on automatic instantiation of additional hardware modules, able to reconfigure at runtime the prototype, while the second leverages manipulation of application
binary code, compiled for a custom VLIW ASIP architecture, that is transformed into code executable on a different configuration. This allows to prototype a whole set of ASIP
solutions after one single FPGA implementation flow, mitigating the afore-mentioned overhead.A short overview on the tools used throughout the work will also be offered, covering basic aspects of Intel-Silicon Hive ASIP development toolchain, SESAME framework general
description, along with a review of state-of-art simulation and prototyping techniques for complex multi-processor systems. Each proposed approach will be validated through a real-world use case, confirming the validity of this solution
Application specific instruction set processor design for embedded application using the coware tool
An Application Specific Instruction Set Processor (ASIP) is widely used as a System on a Chip(SoC) Component. ASIPs possess an instruction set which is tai-lored to benefit a specific application. Such specialization allows ASIPs to serve as an intermediate between two dominant processor design styles- ASICs which has high processing abilities at the cost of limited programmability and Programmable solu-tions such as FPGAs that provide programming exibility at the cost of less energy eficiency. In this dissertation the goal is to design ASIP, keeping in mind a temper-ature sensor system. The platform used for processor design is LISA 2.0 description language and processor designing environment from CoWare. Coware processor de-signer allows processor architecture to be defined at an abstract level and automatic generation of chain of software tools like assembler, linker and simulator for functional verification followed by RTL level description. RTL level description is used to gen-erate synthesized report of the design using RTL compiler and finally the layout is created using Cadence encounter
Combining FPGA prototyping and high-level simulation approaches for Design Space Exploration of MPSoCs
Modern embedded systems are parallel, component-based, heterogeneous and finely tuned on the basis of the workload that must be executed on them. To improve design reuse, Application Specific Instruction-set Processors (ASIPs) are often employed as building blocks in such systems, as a solution capable of satisfying the required functional and physical constraints (e.g. throughput, latency, power or energy consumption etc.), while providing, at
the same time, high flexibility and adaptability. Composing a multi-processor architecture including ASIPs and mapping parallel applications onto it is a design activity that require an extensive Design Space Exploration process (DSE), to result in cost-effective systems. The
work described here aims at defining novel methodologies for the application-driven customizations of such highly heterogeneous embedded systems. The issue is tackled at different levels, integrating different tools.
High-level event-based simulation is a widely used technique that offers speed and flexibility as main points of strength, but needs, as a preliminary input and periodically during the iteration process, calibration data that must be acquired by means of more accurate
evaluation methods. Typically, this calibration is performed using instruction-level cycleaccurate
simulators that, however, turn out to be very slow, especially when complete multiprocessor
systems must be evaluated or when the grain of the calibration is too fine, while FPGA approaches have shown to performbetter for this particular applications.
FPGA-based emulation techniques have been proposed in the recent past as an alternative solution to the software-based simulation approach, but some further steps are needed
before they can be effectively exploitedwithin architectural design space exploration. Firstly,
some kind of technology-awareness must be introduced, to enable the translation of the emulation results into a pre-estimation of a prospective ASIC implementation of the design. Moreover, when performing architectural DSE, a significant number of different candidate design points has to be evaluated and compared. In this case, if no countermeasures are taken, the advantages achievable with FPGAs, in terms of emulation speed, are counterbalanced
by the overhead introduced by the time needed to go through the physical synthesis and implementation flow.
Developed FPGA-based prototyping platform overcomes such limitations, enabling the use of FPGA-based prototyping for micro-architectural design space exploration of ASIP processors. In this approach, to increase the emulation speed-up, two different methods are proposed: the first is based on automatic instantiation of additional hardware modules, able to reconfigure at runtime the prototype, while the second leverages manipulation of application
binary code, compiled for a custom VLIW ASIP architecture, that is transformed into code executable on a different configuration. This allows to prototype a whole set of ASIP
solutions after one single FPGA implementation flow, mitigating the afore-mentioned overhead.A short overview on the tools used throughout the work will also be offered, covering basic aspects of Intel-Silicon Hive ASIP development toolchain, SESAME framework general
description, along with a review of state-of-art simulation and prototyping techniques for complex multi-processor systems. Each proposed approach will be validated through a real-world use case, confirming the validity of this solution
- …