5 research outputs found

    Unidad aritmética en coma flotante para sistemas autoreconfigurables dinámicamente sobre Spartan-3 basados en Microblaze

    Get PDF
    El presente artículo muestra la implementación de una unidad en coma flotante (FPU) que actúa como coprocesador dentro de un sistema auto-reconfigurable dinámicamente. La FPU tiene capacidad para resolver operaciones básicas como la suma, la resta, el producto, el cociente, la raíz cuadrada, la inversa y el cuadrado. Además, dispone de un registro en el que se almacena el último resultado obtenido con la intención de utilizarlo como operador en el siguiente cálculo, de modo que se reducen los accesos a los buses de comunicación en la resolución de las operaciones matemáticas. El diseño emplea Microblaze como microprocesador del sistema y su implementación se ha realizado sobre una FPGA Spartan 3 de bajo coste. El artículo muestra resultados experimentales en relación al área total ocupada, así como los tiempos de ejecución obtenidos con un ejemplo particular basado en un algoritmo de CORDIC resuelto en coma flotante.Peer ReviewedPostprint (author’s final draft

    Hardware Architectures and Implementations for Associative Memories : the Building Blocks of Hierarchically Distributed Memories

    Get PDF
    During the past several decades, the semiconductor industry has grown into a global industry with revenues around $300 billion. Intel no longer relies on only transistor scaling for higher CPU performance, but instead, focuses more on multiple cores on a single die. It has been projected that in 2016 most CMOS circuits will be manufactured with 22 nm process. The CMOS circuits will have a large number of defects. Especially when the transistor goes below sub-micron, the original deterministic circuits will start having probabilistic characteristics. Hence, it would be challenging to map traditional computational models onto probabilistic circuits, suggesting a need for fault-tolerant computational algorithms. Biologically inspired algorithms, or associative memories (AMs)—the building blocks of cortical hierarchically distributed memories (HDMs) discussed in this dissertation, exhibit a remarkable match to the nano-scale electronics, besides having great fault-tolerance ability. Research on the potential mapping of the HDM onto CMOL (hybrid CMOS/nanoelectronic circuits) nanogrids provides useful insight into the development of non-von Neumann neuromorphic architectures and semiconductor industry. In this dissertation, we investigated the implementations of AMs on different hardware platforms, including microprocessor based personal computer (PC), PC cluster, field programmable gate arrays (FPGA), CMOS, and CMOL nanogrids. We studied two types of neural associative memory models, with and without temporal information. In this research, we first decomposed the computational models into basic and common operations, such as matrix-vector inner-product and k-winners-take-all (k-WTA). We then analyzed the baseline performance/price ratio of implementing the AMs with a PC. We continued with a similar performance/price analysis of the implementations on more parallel hardware platforms, such as PC cluster and FPGA. However, the majority of the research emphasized on the implementations with all digital and mixed-signal full-custom CMOS and CMOL nanogrids. In this dissertation, we draw the conclusion that the mixed-signal CMOL nanogrids exhibit the best performance/price ratio over other hardware platforms. We also highlighted some of the trade-offs between dedicated and virtualized hardware circuits for the HDM models. A simple time-multiplexing scheme for the digital CMOS implementations can achieve comparable throughput as the mixed-signal CMOL nanogrids

    Power and Energy Aware Heterogeneous Computing Platform

    Get PDF
    During the last decade, wireless technologies have experienced significant development, most notably in the form of mobile cellular radio evolution from GSM to UMTS/HSPA and thereon to Long-Term Evolution (LTE) for increasing the capacity and speed of wireless data networks. Considering the real-time constraints of the new wireless standards and their demands for parallel processing, reconfigurable architectures and in particular, multicore platforms are part of the most successful platforms due to providing high computational parallelism and throughput. In addition to that, by moving toward Internet-of-Things (IoT), the number of wireless sensors and IP-based high throughput network routers is growing at a rapid pace. Despite all the progression in IoT, due to power and energy consumption, a single chip platform for providing multiple communication standards and a large processing bandwidth is still missing.The strong demand for performing different sets of operations by the embedded systems and increasing the computational performance has led to the use of heterogeneous multicore architectures with the help of accelerators for computationally-intensive data-parallel tasks acting as coprocessors. Currently, highly heterogeneous systems are the most power-area efficient solution for performing complex signal processing systems. Additionally, the importance of IoT has increased significantly the need for heterogeneous and reconfigurable platforms.On the other hand, subsequent to the breakdown of the Dennardian scaling and due to the enormous heat dissipation, the performance of a single chip was obstructed by the utilization wall since all cores cannot be clocked at their maximum operating frequency. Therefore, a thermal melt-down might be happened as a result of high instantaneous power dissipation. In this context, a large fraction of the chip, which is switched-off (Dark) or operated at a very low frequency (Dim) is called Dark Silicon. The Dark Silicon issue is a constraint for the performance of computers, especially when the up-coming IoT scenario will demand a very high performance level with high energy efficiency. Among the suggested solution to combat the problem of Dark-Silicon, the use of application-specific accelerators and in particular Coarse-Grained Reconfigurable Arrays (CGRAs) are the main motivation of this thesis work.This thesis deals with design and implementation of Software Defined Radio (SDR) as well as High Efficiency Video Coding (HEVC) application-specific accelerators for computationally intensive kernels and data-parallel tasks. One of the most important data transmission schemes in SDR due to its ability of providing high data rates is Orthogonal Frequency Division Multiplexing (OFDM). This research work focuses on the evaluation of Heterogeneous Accelerator-Rich Platform (HARP) by implementing OFDM receiver blocks as designs for proof-of-concept. The HARP template allows the designer to instantiate a heterogeneous reconfigurable platform with a very large amount of custom-tailored computational resources while delivering a high performance in terms of many high-level metrics. The availability of this platform lays an excellent foundation to investigate techniques and methods to replace the Dark or Dim part of chip with high-performance silicon dissipating very low power and energy. Furthermore, this research work is also addressing the power and energy issues of the embedded computing systems by tailoring the HARP for self-aware and energy-aware computing models. In this context, the instantaneous power dissipation and therefore the heat dissipation of HARP are mitigated on FPGA/ASIC by using Dynamic Voltage and Frequency Scaling (DVFS) to minimize the dark/dim part of the chip. Upgraded HARP for self-aware and energy-aware computing can be utilized as an energy-efficient general-purpose transceiver platform that is cognitive to many radio standards and can provide high throughput while consuming as little energy as possible. The evaluation of HARP has shown promising results, which makes it a suitable platform for avoiding Dark Silicon in embedded computing platforms and also for diverse needs of IoT communications.In this thesis, the author designed the blocks of OFDM receiver by crafting templatebased CGRA devices and then attached them to HARP’s Network-on-Chip (NoC) nodes. The performance of application-specific accelerators generated from templatebased CGRAs, the performance of the entire platform subsequent to integrating the CGRA nodes on HARP and the NoC traffic are recorded in terms of several highlevel performance metrics. In evaluating HARP on FPGA prototype, it delivers a performance of 0.012 GOPS/mW. Because of the scalability and regularity in HARP, the author considered its value as architectural constant. In addition to showing the gain and the benefits of maximizing the number of reconfigurable processing resources on a platform in comparison to the scaled performance of several state-of-the-art platforms, HARP’s architectural constant ensures application-independent figure of merit. HARP is further evaluated by implementing various sizes of Discrete Cosine transform (DCT) and Discrete Sine Transform (DST) dedicated for HEVC standard, which showed its ability to sustain Full HD 1080p format at 30 fps on FPGA. The author also integrated self-aware computing model in HARP to mitigate the power dissipation of an OFDM receiver. In the case of FPGA implementation, the total power dissipation of the platform showed 16.8% reduction due to employing the Feedback Control System (FCS) technique with Dynamic Frequency Scaling (DFS). Furthermore, by moving to ASIC technology and scaling both frequency and voltage simultaneously, significant dynamic power reduction (up to 82.98%) was achieved, which proved the DFS/DVFS techniques as one step forward to mitigate the Dark Silicon issue

    Hardware/software architectures for iris biometrics

    Get PDF
    Nowadays, the necessity of identifying users of facilities and services has become quite important not only to determine who accesses a system and/or service, but also to determine which privileges should be provided to each user. For achieving such identification, Biometrics is emerging as a technology that provides a high level of security, as well as being convenient and comfortable for the citizen. Most biometric systems are based on computer solutions, where the identification process is performed by servers or workstations, whose cost and processing time make them not feasible for some situations. However, Microelectronics can provide a suitable solution without the need of complex and expensive computer systems. Microelectronics is a subfield of Electronics and as the name suggests, is related to the study, development and/or manufacturing of electronic components, i.e. integrated circuits (ICs). We have focused our research in a concrete field of Microelectronics: hardware/software co-design. This technique is widely used for developing specific and high computational cost devices. Its basis relies on using both hardware and software solutions in an effective way, thus, obtaining a device faster than just a software solution, or smaller devices that use dedicated hardware developed for all the processes. The questions on how we can obtain an effective solution for Biometrics will be solved considering all the different aspects of these systems. In this Thesis, we have made two important contributions: the first one for a verification system based on ID token and secondly, a search engine used for massive recognition systems, both of them related to Iris Biometrics. The first relevant contribution is a biometric system architecture proposal based on ID tokens in a distributed system. In this contribution, we have specified some considerations to be done in the system and describe the different functionalities of the elements which form it, such as the central servers and/or the terminals. The main functionality of the terminal is just left to acquiring the initial biometric raw data, which will be transmitted under security cryptographic methods to the token, where all the biometric process will be performed. The ID token architecture is based on Hardware/software co-design. The architecture proposed, independent of the modality, divides the biometric process into hardware and software in order to achieve further performance functions, more than in the existing tokens. This partition considers not only the decrease of computational time hardware can provide, but also the reduction of area and power consumption, the increase in security levels and the effects on performance in all the design. To prove the proposal made, we have implemented an ID token based on Iris Biometrics following our premises. We have developed different modules for an iris algorithm both in hardware and software platforms to obtain results necessary for an effective combination of same. We have also studied different alternatives for solving the partition problem in the Hardware/software co-design issue, leading to results which point out tabu search as the fastest algorithm for this purpose. Finally, with all the data obtained, we have been able to obtain different architectures according to different constraints. We have presented architectures where the time is a major requirement, and we have obtained 30% less processing time than in all software solutions. Likewise, another solution has been proposed which provides less area and power consumption. When considering the performance as the most important constraint, two architectures have been presented, one which also tries to minimize the processing time and another which reduces hardware area and power consumption. In regard the security we have also shown two architectures considering time and hardware area as secondary requirements. Finally, we have presented an ultimate architecture where all these factors were considered. These architectures have allowed us to study how hardware improves the security against authentication attacks, how the performance is influenced by the lack of floating point operations in hardware modules, how hardware reduces time with software reducing the hardware area and the power consumption. The other singular contribution made is the development of a search engine for massive identification schemes, where time is a major constraint as the comparison should be performed over millions of users. We have initially proposed two implementations: following a centralized architecture, where memories are connected to the microprocessor, although the comparison is performed by a dedicated hardware co-processor, and a second approach, where we have connected the memory driver directly in the hardware coprocessor. This last architecture has showed us the importance of a correct connection between the elements used when time is a major requirement. A graphical representation of the different aspects covered in this Thesis is presented in Fig.1, where the relation between the different topics studied can be seen. The main topics, Biometrics and Hardware/Software Co-design have been studied, where several aspects of them have been described, such as the different Biometric modalities, where we have focussed on Iris Biometrics and the security related to these systems. Hardware/Software Co-design has been studied by presenting different design alternatives and by identifying the most suitable configuration for ID Tokens. All the data obtained from this analysis has allowed us to offer two main proposals: The first focuses on the development of a fast search engine device, and the second combines all the factors related to both sciences with regards ID tokens, where different aspects have been combined in its Hardware/Software Design. Both approaches have been implemented to show the feasibility of our proposal. Finally, as a result of the investigation performed and presented in this thesis, further work and conclusions can be presented as a consequence of the work developed.-----------------------------------------------------------------------------------------Actualmente la identificación usuarios para el acceso a recintos o servicios está cobrando importancia no sólo para poder permitir el acceso, sino además para asignar los correspondientes privilegios según el usuario del que se trate. La Biometría es una tecnología emergente que además de realizar estas funciones de identificación, aporta mayores niveles de seguridad que otros métodos empleados, además de resultar más cómodo para el usuario. La mayoría de los sistemas biométricos están basados en ordenadores personales o servidores, sin embargo, la Microelectrónica puede aportar soluciones adecuadas para estos sistemas, con un menor coste y complejidad. La Microelectrónica es un campo de la Electrónica, que como su nombre sugiere, se basa en el estudio, desarrollo y/o fabricación de componentes electrónicos, también denominados circuitos integrados. Hemos centrado nuestra investigación en un campo específico de la Microelectrónica llamado co-diseño hardware/software. Esta técnica se emplea en el desarrollo de dispositivos específicos que requieren un alto gasto computacional. Se basa en la división de tareas a realizar entre hardware y software, consiguiendo dispositivos más rápidos que aquellos únicamente basados en una de las dos plataformas, y más pequeños que aquellos que se basan únicamente en hardware. Las cuestiones sobre como podemos crear soluciones aplicables a la Biometría son las que intentan ser cubiertas en esta tesis. En esta tesis, hemos propuesto dos importantes contribuciones: una para aquellos sistemas de verificación que se apoyan en dispositivos de identificación y una segunda que propone el desarrollo de un sistema de búsqueda masiva. La primera aportación es la metodología para el desarrollo de un sistema distribuido basado en dispositivos de identificación. En nuestra propuesta, el sistema de identificación está formado por un proveedor central de servicios, terminales y dichos dispositivos. Los terminales propuestos únicamente tienen la función de adquirir la muestra necesaria para la identificación, ya que son los propios dispositivos quienes realizan este proceso. Los dispositivos se apoyan en una arquitectura basada en codiseño hardware/software, donde los procesos biométricos se realizan en una de las dos plataformas, independientemente de la modalidad biométrica que se trate. El reparto de tareas se realiza de tal manera que el diseñador pueda elegir que parámetros le interesa más enfatizar, y por tanto se puedan obtener distintas arquitecturas según se quiera optimizar el tiempo de procesado, el área o consumo, minimizar los errores de identificación o incluso aumentar la seguridad del sistema por medio de la implementación en hardware de aquellos módulos que sean más susceptibles a ser atacados por intrusos. Para demostrar esta propuesta, hemos implementado uno de estos dispositivos basándonos en un algoritmo de reconocimiento por iris. Hemos desarrollado todos los módulos de dicho algoritmo tanto en hardware como en software, para posteriormente realizar combinaciones de ellos, en busca de arquitecturas que cumplan ciertos requisitos. Hemos estudiado igualmente distintas alternativas para la solucionar el problema propuesto, basándonos en algoritmos genéticos, enfriamiento simulado y búsqueda tabú. Con los datos obtenidos del estudio previo y los procedentes de los módulos implementados, hemos obtenido una arquitectura que minimiza el tiempo de ejecución en un 30%, otra que reduce el área y el consumo del dispositivo, dos arquitecturas distintas que evitan la pérdida de precisión y por tanto minimizan los errores en la identificación: una que busca reducir el área al máximo posible y otra que pretende que el tiempo de procesado sea mínimo; dos arquitecturas que buscan aumentar la seguridad, minimizando ya sea el tiempo o el área y por último, una arquitectura donde todos los factores antes nombrados son considerados por igual. La segunda contribución de la tesis se refiere al desarrollo de un motor de búsqueda para identificación masiva. La premisa seguida en esta propuesta es la de minimizar el tiempo lo más posible para que los usuarios no deban esperar mucho tiempo para ser identificados. Para ello hemos propuesto dos alternativas: una arquitectura clásica donde las memorias están conectadas a un microprocesador central, el cual a su vez se comunica con un coprocesador que realiza las funciones de comparación. Una segunda alternativa, donde las memorias se conectan directamente a dicho co-procesador, evitándose el uso del microprocesador en el proceso de comparación. Ambas propuestas son comparadas y analizadas, mostrando la importancia de una correcta y apropiada conexión de los distintos elementos que forman un sistema. La Fig. 2 muestra los distintos temas tratados en esta tesis, señalando la relación existente entre ellos. Los principales temas estudiados son la Biometría y el co-diseño hardware/software, describiendo distintos aspectos de ellos, como las diferentes modalidades biométricas, centrándonos en la Biometría por iris o la seguridad relativa a estos sistemas. En el caso del co-diseño hardware/software se presenta un estado de la técnica donde se comentan diversas alternativas para el desarrollo de sistemas empotrados, el trabajo propuesto por otros autores en el ¶ambito del co-diseño y por último qué características deben cumplir los dispositivos de identificación como sistemas empotrados. Con toda esta información pasamos al desarrollo de las propuestas antes descritas y los desarrollos realizados. Finalmente, conclusiones y trabajo futuro son propuestos a raíz de la investigación realizada

    Libro de Memorias: II Congreso de Microelectrónica Aplicada 2011

    Get PDF
    Este volumen contiene los trabajos presentados para el “Segundo Congreso de Microelectrónica Aplicada (μEA 2011)” que se ha de celebrar en la Ciudad de La Plata durante los días 7, 8 y 9 de Septiembre de 2011. μEA 2011 tiene los siguientes objetivos: • Constituirse en un foro de intercambio de experiencias entre los profesionales y estudiantes de todas las universidades en las áreas de Electrónica. • Comunicar a la sociedad y en nuestro idioma, los logros y resultados obtenidos, en la actividad de investigación dedicada a las aplicaciones de las Micro y Nanotecnologías.. • Incrementar la cooperación entre los grupos industriales y académicos de la Argentina y Latinoamérica con la actividad en el campo de la Microelectrónica y sus Aplicaciones.Centro de Técnicas Analógico-Digitale
    corecore