197 research outputs found

    A Probabilistic Approach for the System-Level Design of Multi-ASIP Platforms

    Get PDF

    10281 Abstracts Collection -- Dynamically Reconfigurable Architectures

    Get PDF
    From 11.07.10 to 16.07.10, Dagstuhl Seminar 10281 ``Dynamically Reconfigurable Architectures \u27\u27 was held in Schloss Dagstuhl~--~Leibniz Center for Informatics. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available

    O pior caso estático de otimização do tempo de execução utilizando dpso para arquitetura ASIP

    Get PDF
    Introduction: The application of specific instructions significantly improves energy, performance, and code size of configurable processors. The design of these instructions is performed by the conversion of patterns related to application-specific operations into effective complex instructions. This research was presented at the icitkm Conference, University of Delhi, India in 2017. Methods: Static analysis was a prominent research method during late the 1980’s. However, end-to-end measurements consist of a standard approach in industrial settings. Both static analysis tools perform at a high-level in order to determine the program structure, which works on source code, or is executable in a disassembled binary. It is possible to work at a low-level if the real hardware timing information for the executable task has the desired features. Results: We experimented, tested and evaluated using a H.264 encoder application that uses nine cis, covering most of the computation intensive kernels. Multimedia applications are frequently subject to hard real time constraints in the field of computer vision. The H.264 encoder consists of complicated control flow with more number of decisions and nested loops. The parameters evaluated were different numbers of A partitions (300 slices on a Xilinx Virtex 7each), reconfiguration bandwidths, as well as relations of cpu frequency and fabric frequency fCPU/ffabric. ffabric remains constant at 100MHz, and we selected a multiplicity of its values for fCPU that resemble realistic units. Note that while we anticipate the wcet in seconds (wcetcycles/ f CPU) to be lower (better) with higher fCPU, the wcet cycles increase (at a constant ffabric) because hardware cis perform less computations on the reconfigurable fabric within one cpu cycle.    Introducción: la aplicación de instrucciones específicas mejora significativamente la energía, el rendimiento y el tamaño del código de los procesadores configurables. El diseño de estas instrucciones se realiza mediante conversión de patrones relacionados con operaciones específicas de la aplicación con instrucciones complejas y efectivas. Esta investigación se presentó en la Conferencia icitkm, Universidad de Delhi, India en 2017. Métodos: el análisis estático fue un método de investigación prominente durante la década de 1980; sin embargo, las mediciones de extremo a extremo son un enfoque convencional en los entornos industriales. Ambas herramientas de análisis estático se desempeñan a un alto nivel para determinar la estructura del programa que funciona en el código fuente, o que se ejecuta en un binario desmontado. Es posible trabajar a bajo nivel si la información de tiempo de hardware real para la tarea ejecutable presenta las características deseadas.  Introdução: a aplicação de instruções específicas melhora significativamente a energia, o desempenho e o tamanho do código dos processadores configuráveis. O desenho dessas instruções é realizado mediante a conversão de padrões relacionados com operações específicas da aplicação com instruções complexas e efetivas. Esta pesquisa foi apresentada na Conferência icitkm, Universidade de Délhi, Índia em 2017.Métodos: a análise estática foi um método de pesquisa proeminente durante a década de 1980; contudo, as medições de extremo a extremo são uma abordagem convencional nos contextos industriais. Ambas as ferramentas de análise estática se desempenham a um alto nível para determinar a estrutura do programa que funciona no código fonte ou que se executa num binário desmontado. É possível trabalhar a baixo nível se a informação de tempo de hardware real para a tarefa executável apresentar as características desejadas.Resultados: experimentamos, testamos e avaliamos com uma aplicação de codificação H.264 que utiliza nove elementos de configuração e cobre a maioria dos núcleos de cálculo intensivo. As aplicações multimídias estão com frequência sujeitas a duras restrições em tempo real no campo da visão por computador. O codificador H.264 consiste num complicado fluxo de controle com mais número de decisões e circuitos aninhados. Os parâmetros avaliados foram de diferentes números de particiones A (300 cortes num Xilinx Virtex 7 cada um) e largos de banda de reconfiguração, bem como de relações de frequência de cpu e frequência de fabric fcpu/ffabric. ffabric permanece constante a 100MHz. Selecionamos vários de seus valores para fcpu que são semelhantes a unidades realistas. É importante considerar que, ainda quando antecipamos o wcet em segundos (ciclos wcet/ fcpu), para que fossem inferiores (melhores) com fcpu mais alta, os ciclos wcet aumentam (num tecido constante f) porque os ci de hardware realizam menos cálculos no tecido reconfigurável dentro de uma cpu de ciclo.Conclusões: o método é similar à hibridação de árvores e métodos baseados en rotas, os quais são menos precisos, e ao método I pet global, que é mais preciso. A otimização é avaliada com o algoritmo de otimização por enxame de partículas discretas (dpso) para wcet. Para várias aplicações do mundo real que envolvem processadores integrados, a técnica proposta desenvolve conjuntos de instruções melhoradas em comparação com os conjuntos de instruções nativas.Originalidade: para a estimativa de wcet, deve-se considerar a análise de fluxo, a análise de baixo nível e as fases de cálculo do programa. A fase de análise de fluxo ou alto nível de análise ajuda a extrair o comportamento dinâmico do programa que proporciona informação sobre as funções invocadas, sobre o número de iterações de circuito, as dependências entre sentenças if, etc. Isso se deve a que a análise desconhece a rota de execução correspondente ao tempo de execução mais longo.Limitações: essa rota é executada dentro de uma iteração do núcleo que depende da natureza de mb, seja i-mb, seja p-mb, determinada pelo núcleo de estimativa de movimento, quer dizer que sua entrada depende das rotas i-mb e p-mb, que também contêm elementos de configuração separados que conduzem à instabilidade da rota do pior dos casos; em outras palavras, adicionar mais partições à rota atual do pior dos casos pode fazer com que a outra rota se converta no pior dos casos. A tubulação se detém pela demora de reconfiguração e continua ao ingressar no núcleo assim que finaliza o processo de reconfiguraçã

    Instruction-set architecture synthesis for VLIW processors

    Get PDF

    A proposed synthesis method for Application-Specific Instruction Set Processors

    Get PDF
    Due to the rapid technology advancement in integrated circuit era, the need for the high computation performance together with increasing complexity and manufacturing costs has raised the demand for high-performance con fi gurable designs; therefore, the Application-Speci fi c Instruction Set Processors (ASIPs) are widely used in SoC design. The automated generation of software tools for ASIPs is a commonly used technique, but the automated hardware model generation is less frequently applied in terms of fi nal RTL implementations. Contrary to this, the fi nal register-transfer level models are usually created, at least partly, manually. This paper presents a novel approach for automated hardware model generation for ASIPs. The new solution is based on a novel abstract ASIP model and a modeling language (Algorithmic Microarchitecture Description Language, AMDL) optimized for this architecture model. The proposed AMDL-based pre-synthesis method is based on a set of pre-de fi ned VHDL implementation schemes, which ensure the qualities of the automatically generated register-transfer level models in terms of resource requirement and operation frequency. The design framework implementing the algorithms required by the synthesis method is also presented

    Combining FPGA prototyping and high-level simulation approaches for Design Space Exploration of MPSoCs

    Get PDF
    Modern embedded systems are parallel, component-based, heterogeneous and finely tuned on the basis of the workload that must be executed on them. To improve design reuse, Application Specific Instruction-set Processors (ASIPs) are often employed as building blocks in such systems, as a solution capable of satisfying the required functional and physical constraints (e.g. throughput, latency, power or energy consumption etc.), while providing, at the same time, high flexibility and adaptability. Composing a multi-processor architecture including ASIPs and mapping parallel applications onto it is a design activity that require an extensive Design Space Exploration process (DSE), to result in cost-effective systems. The work described here aims at defining novel methodologies for the application-driven customizations of such highly heterogeneous embedded systems. The issue is tackled at different levels, integrating different tools. High-level event-based simulation is a widely used technique that offers speed and flexibility as main points of strength, but needs, as a preliminary input and periodically during the iteration process, calibration data that must be acquired by means of more accurate evaluation methods. Typically, this calibration is performed using instruction-level cycleaccurate simulators that, however, turn out to be very slow, especially when complete multiprocessor systems must be evaluated or when the grain of the calibration is too fine, while FPGA approaches have shown to performbetter for this particular applications. FPGA-based emulation techniques have been proposed in the recent past as an alternative solution to the software-based simulation approach, but some further steps are needed before they can be effectively exploitedwithin architectural design space exploration. Firstly, some kind of technology-awareness must be introduced, to enable the translation of the emulation results into a pre-estimation of a prospective ASIC implementation of the design. Moreover, when performing architectural DSE, a significant number of different candidate design points has to be evaluated and compared. In this case, if no countermeasures are taken, the advantages achievable with FPGAs, in terms of emulation speed, are counterbalanced by the overhead introduced by the time needed to go through the physical synthesis and implementation flow. Developed FPGA-based prototyping platform overcomes such limitations, enabling the use of FPGA-based prototyping for micro-architectural design space exploration of ASIP processors. In this approach, to increase the emulation speed-up, two different methods are proposed: the first is based on automatic instantiation of additional hardware modules, able to reconfigure at runtime the prototype, while the second leverages manipulation of application binary code, compiled for a custom VLIW ASIP architecture, that is transformed into code executable on a different configuration. This allows to prototype a whole set of ASIP solutions after one single FPGA implementation flow, mitigating the afore-mentioned overhead.A short overview on the tools used throughout the work will also be offered, covering basic aspects of Intel-Silicon Hive ASIP development toolchain, SESAME framework general description, along with a review of state-of-art simulation and prototyping techniques for complex multi-processor systems. Each proposed approach will be validated through a real-world use case, confirming the validity of this solution

    ASAM: Automatic Architecture Synthesis and Application Mapping,

    Get PDF
    Abstract -This paper focuses on mastering the automatic architecture synthesis and application mapping for heterogeneous massively-parallel MPSoCs based on customizable applicationspecific instruction-set processors (ASIPs). It presents an overview of the research being currently performed in the scope of the European project ASAM (Architecture Synthesis and Application Mapping) of the ARTEMIS program. The paper briefly presents the results of our analysis of the main problems to be solved and challenges to be faced in the design of such heterogeneous MPSoCs. It explains which system, design, and electronic design automation (EDA) concepts seem to be adequate to resolve the problems and address the challenges. Finally, it introduces and briefly discusses the design-flow and its main stages proposed by the ASAM project consortium to enable an effective and efficient solution of these problems. Index Terms-embedded systems, heterogeneous multiprocessor system-on-chip (MPSoC), customizable ASIPs, architecture synthesis, MPSoC and ASIP design automation

    Design of multimedia processor based on metric computation

    Get PDF
    Media-processing applications, such as signal processing, 2D and 3D graphics rendering, and image compression, are the dominant workloads in many embedded systems today. The real-time constraints of those media applications have taxing demands on today's processor performances with low cost, low power and reduced design delay. To satisfy those challenges, a fast and efficient strategy consists in upgrading a low cost general purpose processor core. This approach is based on the personalization of a general RISC processor core according the target multimedia application requirements. Thus, if the extra cost is justified, the general purpose processor GPP core can be enforced with instruction level coprocessors, coarse grain dedicated hardware, ad hoc memories or new GPP cores. In this way the final design solution is tailored to the application requirements. The proposed approach is based on three main steps: the first one is the analysis of the targeted application using efficient metrics. The second step is the selection of the appropriate architecture template according to the first step results and recommendations. The third step is the architecture generation. This approach is experimented using various image and video algorithms showing its feasibility

    Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review

    Get PDF
    The technological evolution has increased the number of transistors for a given die area significantly and increased the switching speed from few MHz to GHz range. Such inversely proportional decline in size and boost in performance consequently demands shrinking of supply voltage and effective power dissipation in chips with millions of transistors. This has triggered substantial amount of research in power reduction techniques into almost every aspect of the chip and particularly the processor cores contained in the chip. This paper presents an overview of techniques for achieving the power efficiency mainly at the processor core level but also visits related domains such as buses and memories. There are various processor parameters and features such as supply voltage, clock frequency, cache and pipelining which can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Also, emerging power efficient processor architectures are overviewed and research activities are discussed which should help reader identify how these factors in a processor contribute to power consumption. Some of these concepts have been already established whereas others are still active research areas. © 2009 ACADEMY PUBLISHER
    corecore