10,251 research outputs found

    A software controlled voltage tuning system using multi-purpose ring oscillators

    Full text link
    This paper presents a novel software driven voltage tuning method that utilises multi-purpose Ring Oscillators (ROs) to provide process variation and environment sensitive energy reductions. The proposed technique enables voltage tuning based on the observed frequency of the ROs, taken as a representation of the device speed and used to estimate a safe minimum operating voltage at a given core frequency. A conservative linear relationship between RO frequency and silicon speed is used to approximate the critical path of the processor. Using a multi-purpose RO not specifically implemented for critical path characterisation is a unique approach to voltage tuning. The parameters governing the relationship between RO and silicon speed are obtained through the testing of a sample of processors from different wafer regions. These parameters can then be used on all devices of that model. The tuning method and software control framework is demonstrated on a sample of XMOS XS1-U8A-64 embedded microprocessors, yielding a dynamic power saving of up to 25% with no performance reduction and no negative impact on the real-time constraints of the embedded software running on the processor

    Formal and Informal Methods for Multi-Core Design Space Exploration

    Full text link
    We propose a tool-supported methodology for design-space exploration for embedded systems. It provides means to define high-level models of applications and multi-processor architectures and evaluate the performance of different deployment (mapping, scheduling) strategies while taking uncertainty into account. We argue that this extension of the scope of formal verification is important for the viability of the domain.Comment: In Proceedings QAPL 2014, arXiv:1406.156

    Temperature Regulation in Multicore Processors Using Adjustable-Gain Integral Controllers

    Full text link
    This paper considers the problem of temperature regulation in multicore processors by dynamic voltage-frequency scaling. We propose a feedback law that is based on an integral controller with adjustable gain, designed for fast tracking convergence in the face of model uncertainties, time-varying plants, and tight computing-timing constraints. Moreover, unlike prior works we consider a nonlinear, time-varying plant model that trades off precision for simple and efficient on-line computations. Cycle-level, full system simulator implementation and evaluation illustrates fast and accurate tracking of given temperature reference values, and compares favorably with fixed-gain controllers.Comment: 8 pages, 6 figures, IEEE Conference on Control Applications 2015, Accepted Versio

    GCC-Plugin for Automated Accelerator Generation and Integration on Hybrid FPGA-SoCs

    Full text link
    In recent years, architectures combining a reconfigurable fabric and a general purpose processor on a single chip became increasingly popular. Such hybrid architectures allow extending embedded software with application specific hardware accelerators to improve performance and/or energy efficiency. Aiding system designers and programmers at handling the complexity of the required process of hardware/software (HW/SW) partitioning is an important issue. Current methods are often restricted, either to bare-metal systems, to subsets of mainstream programming languages, or require special coding guidelines, e.g., via annotations. These restrictions still represent a high entry barrier for the wider community of programmers that new hybrid architectures are intended for. In this paper we revisit HW/SW partitioning and present a seamless programming flow for unrestricted, legacy C code. It consists of a retargetable GCC plugin that automatically identifies code sections for hardware acceleration and generates code accordingly. The proposed workflow was evaluated on the Xilinx Zynq platform using unmodified code from an embedded benchmark suite.Comment: Presented at Second International Workshop on FPGAs for Software Programmers (FSP 2015) (arXiv:1508.06320

    Vector support for multicore processors with major emphasis on configurable multiprocessors

    Get PDF
    It recently became increasingly difficult to build higher speed uniprocessor chips because of performance degradation and high power consumption. The quadratically increasing circuit complexity forbade the exploration of more instruction-level parallelism (JLP). To continue raising the performance, processor designers then focused on thread-level parallelism (TLP) to realize a new architecture design paradigm. Multicore processor design is the result of this trend. It has proven quite capable in performance increase and provides new opportunities in power management and system scalability. But current multicore processors do not provide powerful vector architecture support which could yield significant speedups for array operations while maintaining arealpower efficiency. This dissertation proposes and presents the realization of an FPGA-based prototype of a multicore architecture with a shared vector unit (MCwSV). FPGA stands for Filed-Programmable Gate Array. The idea is that rather than improving only scalar or TLP performance, some hardware budget could be used to realize a vector unit to greatly speedup applications abundant in data-level parallelism (DLP). To be realistic, limited by the parallelism in the application itself and by the compiler\u27s vectorizing abilities, most of the general-purpose programs can only be partially vectorized. Thus, for efficient resource usage, one vector unit should be shared by several scalar processors. This approach could also keep the overall budget within acceptable limits. We suggest that this type of vector-unit sharing be established in future multicore chips. The design, implementation and evaluation of an MCwSV system with two scalar processors and a shared vector unit are presented for FPGA prototyping. The MicroBlaze processor, which is a commercial IP (Intellectual Property) core from Xilinx, is used as the scalar processor; in the experiments the vector unit is connected to a pair of MicroBlaze processors through standard bus interfaces. The overall system is organized in a decoupled and multi-banked structure. This organization provides substantial system scalability and better vector performance. For a given area budget, benchmarks from several areas show that the MCwSV system can provide significant performance increase as compared to a multicore system without a vector unit. However, a MCwSV system with two MicroBlazes and a shared vector unit is not always an optimized system configuration for various applications with different percentages of vectorization. On the other hand, the MCwSV framework was designed for easy scalability to potentially incorporate various numbers of scalar/vector units and various function units. Also, the flexibility inherent to FPGAs can aid the task of matching target applications. These benefits can be taken into account to create optimized MCwSV systems for various applications. So the work eventually focused on building an architecture design framework incorporating performance and resource management for application-specific MCwSV (AS-MCwSV) systems. For embedded system design, resource usage, power consumption and execution latency are three metrics to be used in design tradeoffs. The product of these metrics is used here to choose the MCwSV system with the smallest value

    A proposed synthesis method for Application-Specific Instruction Set Processors

    Get PDF
    Due to the rapid technology advancement in integrated circuit era, the need for the high computation performance together with increasing complexity and manufacturing costs has raised the demand for high-performance con fi gurable designs; therefore, the Application-Speci fi c Instruction Set Processors (ASIPs) are widely used in SoC design. The automated generation of software tools for ASIPs is a commonly used technique, but the automated hardware model generation is less frequently applied in terms of fi nal RTL implementations. Contrary to this, the fi nal register-transfer level models are usually created, at least partly, manually. This paper presents a novel approach for automated hardware model generation for ASIPs. The new solution is based on a novel abstract ASIP model and a modeling language (Algorithmic Microarchitecture Description Language, AMDL) optimized for this architecture model. The proposed AMDL-based pre-synthesis method is based on a set of pre-de fi ned VHDL implementation schemes, which ensure the qualities of the automatically generated register-transfer level models in terms of resource requirement and operation frequency. The design framework implementing the algorithms required by the synthesis method is also presented

    MPSoCBench : um framework para avaliação de ferramentas e metodologias para sistemas multiprocessados em chip

    Get PDF
    Orientador: Rodolfo Jardim de AzevedoTese (doutorado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: Recentes metodologias e ferramentas de projetos de sistemas multiprocessados em chip (MPSoC) aumentam a produtividade por meio da utilização de plataformas baseadas em simuladores, antes de definir os últimos detalhes da arquitetura. No entanto, a simulação só é eficiente quando utiliza ferramentas de modelagem que suportem a descrição do comportamento do sistema em um elevado nível de abstração. A escassez de plataformas virtuais de MPSoCs que integrem hardware e software escaláveis nos motivou a desenvolver o MPSoCBench, que consiste de um conjunto escalável de MPSoCs incluindo quatro modelos de processadores (PowerPC, MIPS, SPARC e ARM), organizado em plataformas com 1, 2, 4, 8, 16, 32 e 64 núcleos, cross-compiladores, IPs, interconexões, 17 aplicações paralelas e estimativa de consumo de energia para os principais componentes (processadores, roteadores, memória principal e caches). Uma importante demanda em projetos MPSoC é atender às restrições de consumo de energia o mais cedo possível. Considerando que o desempenho do processador está diretamente relacionado ao consumo, há um crescente interesse em explorar o trade-off entre consumo de energia e desempenho, tendo em conta o domínio da aplicação alvo. Técnicas de escalabilidade dinâmica de freqüência e voltagem fundamentam-se em gerenciar o nível de tensão e frequência da CPU, permitindo que o sistema alcance apenas o desempenho suficiente para processar a carga de trabalho, reduzindo, consequentemente, o consumo de energia. Para explorar a eficiência energética e desempenho, foram adicionados recursos ao MPSoCBench, visando explorar escalabilidade dinâmica de voltaegem e frequência (DVFS) e foram validados três mecanismos com base na estimativa dinâmica de energia e taxa de uso de CPUAbstract: Recent design methodologies and tools aim at enhancing the design productivity by providing a software development platform before the definition of the final Multiprocessor System on Chip (MPSoC) architecture details. However, simulation can only be efficiently performed when using a modeling and simulation engine that supports system behavior description at a high abstraction level. The lack of MPSoC virtual platform prototyping integrating both scalable hardware and software in order to create and evaluate new methodologies and tools motivated us to develop the MPSoCBench, a scalable set of MPSoCs including four different ISAs (PowerPC, MIPS, SPARC, and ARM) organized in platforms with 1, 2, 4, 8, 16, 32, and 64 cores, cross-compilers, IPs, interconnections, 17 parallel version of software from well-known benchmarks, and power consumption estimation for main components (processors, routers, memory, and caches). An important demand in MPSoC designs is the addressing of energy consumption constraints as early as possible. Whereas processor performance comes with a high power cost, there is an increasing interest in exploring the trade-off between power and performance, taking into account the target application domain. Dynamic Voltage and Frequency Scaling techniques adaptively scale the voltage and frequency levels of the CPU allowing it to reach just enough performance to process the system workload while meeting throughput constraints, and thereby, reducing the energy consumption. To explore this wide design space for energy efficiency and performance, both for hardware and software components, we provided MPSoCBench features to explore dynamic voltage and frequency scalability (DVFS) and evaluated three mechanisms based on energy estimation and CPU usage rateDoutoradoCiência da ComputaçãoDoutora em Ciência da Computaçã
    corecore