6 research outputs found

    Energy profiling of FPGA-based PHY-layer building blocks encountered in modern wireless communication systems

    Proceedings of: IEEE 8th Sensor Array and Multichannel Signal Processing Workshop (SAM), held 22-25 June 2014 in Coruña (Spain). Event website: http://www.gtec.udc.es/sam2014/. Characterizing the energy cost of different physical (PHY) layer building blocks is becoming increasingly important in modern cellular-based communications, considering the cross-sector requirements for performance enhancements and energy savings. This paper presents energy profiling metrics of different PHY-layer FPGA implementations encountered in modern wireless communication systems. The results give insight into how the consumed energy is distributed across different baseband building blocks or configurations, before and after applying power optimizations in the FPGA design and implementation. This work was partially supported by the Spanish Government under projects TEC2011-29006-C03-01 (GRE3N-PHY), TEC2011-29006-C03-02 (GRE3N-LINKMAC) and TEC2011-29006-C03-03 (GRE3N-SYST), and by the European Commission under project NEWCOM# (GA 318306).

    Vector support for multicore processors with major emphasis on configurable multiprocessors

    It has recently become increasingly difficult to build faster uniprocessor chips because of performance degradation and high power consumption. Quadratically increasing circuit complexity has hindered further exploration of instruction-level parallelism (ILP). To continue raising performance, processor designers turned to thread-level parallelism (TLP) as a new architecture design paradigm. Multicore processor design is the result of this trend. It has proven effective at increasing performance and provides new opportunities in power management and system scalability. However, current multicore processors do not provide powerful vector architecture support, which could yield significant speedups for array operations while maintaining area/power efficiency. This dissertation proposes and presents the realization of an FPGA-based prototype of a multicore architecture with a shared vector unit (MCwSV). FPGA stands for Field-Programmable Gate Array. The idea is that, rather than improving only scalar or TLP performance, some of the hardware budget could be used to realize a vector unit that greatly speeds up applications abundant in data-level parallelism (DLP). Realistically, because of limits on the parallelism in the application itself and on the compiler's vectorizing abilities, most general-purpose programs can only be partially vectorized. Thus, for efficient resource usage, one vector unit should be shared by several scalar processors. This approach also keeps the overall budget within acceptable limits. We suggest that this type of vector-unit sharing be established in future multicore chips. The design, implementation and evaluation of an MCwSV system with two scalar processors and a shared vector unit are presented for FPGA prototyping. 
The MicroBlaze processor, a commercial IP (Intellectual Property) core from Xilinx, is used as the scalar processor; in the experiments the vector unit is connected to a pair of MicroBlaze processors through standard bus interfaces. The overall system is organized in a decoupled and multi-banked structure. This organization provides substantial system scalability and better vector performance. For a given area budget, benchmarks from several areas show that the MCwSV system can provide a significant performance increase compared to a multicore system without a vector unit. However, an MCwSV system with two MicroBlazes and a shared vector unit is not always the optimal configuration for applications with different percentages of vectorization. On the other hand, the MCwSV framework was designed for easy scalability, so that it can incorporate varying numbers of scalar/vector units and various function units. The flexibility inherent to FPGAs can also help match the system to target applications. These benefits can be taken into account to create optimized MCwSV systems for various applications. The work therefore eventually focused on building an architecture design framework incorporating performance and resource management for application-specific MCwSV (AS-MCwSV) systems. For embedded system design, resource usage, power consumption and execution latency are three metrics used in design tradeoffs. The product of these metrics is used here to choose the MCwSV system with the smallest value.
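
The selection rule described at the end of this abstract (take the configuration with the smallest product of resource usage, power consumption and execution latency) can be sketched as follows. This is a minimal illustration only: the `Config` class, the `pick_best` helper, the units and all numbers are hypothetical and are not taken from the dissertation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """One candidate system configuration (all values are made up)."""
    name: str
    resources: float   # e.g. FPGA slices used (assumed unit)
    power_w: float     # power consumption in watts
    latency_s: float   # execution latency in seconds

    @property
    def cost(self) -> float:
        # Product of the three metrics; smaller is better.
        return self.resources * self.power_w * self.latency_s


def pick_best(configs):
    """Return the configuration with the smallest metric product."""
    if not configs:
        raise ValueError("no candidate configurations given")
    return min(configs, key=lambda c: c.cost)


# Hypothetical candidates for one application.
candidates = [
    Config("2 scalar + 1 vector", resources=9000, power_w=1.5, latency_s=0.020),
    Config("2 scalar, no vector", resources=6000, power_w=1.0, latency_s=0.060),
    Config("4 scalar + 1 vector", resources=14000, power_w=2.0, latency_s=0.012),
]

best = pick_best(candidates)
print(best.name)
```

A plain product weights the three metrics equally; a design flow that cares more about one metric (say, latency) would need a weighted variant, which the abstract does not describe.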

    Caractérisation automatisée de la consommation de puissance des processeurs pour l'estimation au niveau système [Automated characterization of processor power consumption for system-level estimation]

    Nowadays, power consumption is a key constraint and an essential performance metric in digital system design. Excessive heat dissipation in integrated circuits degrades their performance. Also, more than ever, we need to increase the battery lifetime of new portable electronics. With classical design techniques such as RTL (Register Transfer Level), precise power estimation is only possible in the final stages of the development process. 
To solve this problem, the literature has recently proposed raising the abstraction level of embedded systems design using the ESL (Electronic System Level) methodology. In this context, this project proposes a methodology to automatically characterize the power consumption of configurable soft processors and generate an effective power model for estimating energy consumption at the system level. Using this model, a comparative study of three estimation techniques is also presented. The results for five benchmarks show that our power estimation is eight thousand times faster than conventional estimation techniques, with an average error of only ±3.98% for the LEON3 processor and ±10.70% for the MicroBlaze processor.

    Placement and routing for reconfigurable systems.

    Applications using reconfigurable logic have been widely demonstrated to offer better performance than software-based solutions. However, this performance advantage is often destroyed by poor reconfiguration latency - the time required to reconfigure hardware to perform a new task. Recent research focuses on design automation techniques to address the reconfiguration latency bottleneck. The novel contribution of this thesis is a set of new placement and routing techniques that minimise the reconfiguration latency of reconfigurable systems. Placement and routing is the part of the design process concerned with positioning and connecting design blocks in a logic gate array. The aim of the research is to optimise the placement and interconnect strategy such that dynamic changes in system functionality can be achieved with minimum delay. A review of previous work in the field is given and the relevant theoretical framework is developed. The dynamic reconfiguration problem is analysed for various reconfigurable technologies. Several algorithms are developed and evaluated using a representative set of problem domains to assess their effectiveness. Results obtained with the novel placement and routing techniques demonstrate a reduction in configuration data size, leading to significant reconfiguration latency improvements.

    FPGA Dynamic Power Minimization through Placement and Routing Constraints

    Field-programmable gate arrays (FPGAs) are pervasive in embedded systems that require low power consumption. This article presents a novel power optimization methodology that reduces the dynamic power consumed by the routing of FPGA circuits by modifying the constraints applied to existing commercial tool sets. The power optimization techniques influence commercial FPGA Place and Route (PAR) tools by translating power goals into standard throughput and placement-based constraints. The Low-Power Intelligent Tool Environment (LITE), which was developed to support experimentation with power models and power optimization algorithms, is also presented. The generated constraints seek to implement one of four power optimization approaches: slack minimization, clock tree paring, N-terminal net colocation, and area minimization. In an experimental study, we optimize the dynamic power of circuits mapped into 0.12 µm Xilinx Virtex-II FPGAs. Results show that several optimization algorithms can be combined on a single design, and that power is reduced by up to 19.4%, with an average power savings of 10.2%. Copyright © 2006 Li Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
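
To see why placement constraints can affect routing power at all, it may help to recall the textbook first-order model of CMOS switching power, P = α · C · V² · f, summed over nets. The sketch below uses that generic model; it is not the power model used by LITE, which the abstract does not specify. The function names and every numeric value are illustrative assumptions.

```python
def net_dynamic_power(activity, capacitance_f, vdd_v, freq_hz):
    """First-order switching power of one net, in watts.

    activity      -- switching activity factor (transitions per clock cycle)
    capacitance_f -- net capacitance in farads (grows with routed wire length)
    vdd_v         -- supply voltage in volts
    freq_hz       -- clock frequency in hertz
    """
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


def routing_power(nets, vdd_v, freq_hz):
    """Sum the switching power over (activity, capacitance) pairs."""
    return sum(net_dynamic_power(a, c, vdd_v, freq_hz) for a, c in nets)


# Hypothetical design: one busy net and one quiet net.
before = [(0.5, 4e-12), (0.05, 4e-12)]
# Same design after the busy net's terminals are placed closer together,
# modelled here simply as a smaller capacitance for that net.
after = [(0.5, 2e-12), (0.05, 4e-12)]

p_before = routing_power(before, vdd_v=1.5, freq_hz=100e6)
p_after = routing_power(after, vdd_v=1.5, freq_hz=100e6)
print(f"saving: {100 * (1 - p_after / p_before):.1f}%")
```

Under this model, shortening a high-activity net saves far more power than shortening a quiet one, which is one plausible reading of why constraints that keep the terminals of selected nets close together could reduce dynamic power.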

    Une méthode d'estimation de la consommation de puissance pour un système sur puce [A power consumption estimation method for a system on chip]

    Estimating the power consumption of a System on Chip as early as possible in the design life cycle is important for meeting time-to-market requirements. For this purpose, most research is turning toward high-level models, such as Transaction-Level Modeling (TLM), to estimate power earlier. This work presents a high-level power estimation methodology oriented toward Intellectual Property (IP) cores. The methodology separates the activity of the IP from its implementation. This makes it easy to create a model that can be reused with different frequencies, layouts and implementation technologies. 
By using data gathered at the Register-Transfer Level (RTL), a model can be created for high-level simulation that takes into account the technology and characteristics of the Field-Programmable Gate Array (FPGA) device. The methodology is demonstrated in this work on a processor, its local memory IP, a counter, a Processor Interrupt Controller (PIC) and a bus from Xilinx. Compared to estimations made at the RTL level, the resulting model estimates power with an accuracy of 25% ±10% relative to an XPower estimate, with a speedup of three to four orders of magnitude, across different implementations.
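
One way to picture the separation between IP activity and implementation is a model in which high-level simulation supplies event counts, while RTL-level characterization supplies per-event energy coefficients. The sketch below assumes a simple linear model of that kind. The thesis abstract does not state the actual model form, so the `estimate_power` function, the event names and all coefficients are hypothetical.

```python
def estimate_power(activity_counts, energy_per_event_j, static_power_w,
                   cycles, freq_hz):
    """Estimate average power in watts for one simulated run.

    activity_counts    -- {event name: count}, from high-level simulation
                          (independent of the implementation)
    energy_per_event_j -- {event name: joules}, from RTL characterization
                          (specific to one implementation)
    static_power_w     -- static power of that implementation, in watts
    cycles, freq_hz    -- run length in clock cycles and clock frequency
    """
    sim_time_s = cycles / freq_hz
    dynamic_energy_j = sum(
        count * energy_per_event_j[event]
        for event, count in activity_counts.items()
    )
    return static_power_w + dynamic_energy_j / sim_time_s


# Activity is recorded once at the high level...
activity = {"read": 1000, "write": 500}
# ...and can be paired with different (made-up) implementation coefficients.
impl_a = {"read": 2e-9, "write": 4e-9}

power_100mhz = estimate_power(activity, impl_a, static_power_w=0.05,
                              cycles=1_000_000, freq_hz=100e6)
power_50mhz = estimate_power(activity, impl_a, static_power_w=0.05,
                             cycles=1_000_000, freq_hz=50e6)
print(power_100mhz, power_50mhz)
```

Because the activity dictionary never changes, re-estimating for another clock frequency or another implementation only requires swapping the frequency or the coefficient table, with no new high-level simulation.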