407 research outputs found

    Multiple bit error correcting architectures over finite fields

    Get PDF
    This thesis proposes techniques to mitigate multiple bit errors in GF arithmetic circuits. As GF arithmetic circuits such as multipliers constitute the complex and important functional unit of a crypto-processor, making them fault tolerant will improve the reliability of circuits that are employed in safety applications and the errors may cause catastrophe if not mitigated. Firstly, a thorough literature review has been carried out. The merits of efficient schemes are carefully analyzed to study the space for improvement in error correction, area and power consumption. Proposed error correction schemes include bit parallel ones using optimized BCH codes that are useful in applications where power and area are not prime concerns. The scheme is also extended to dynamically correcting scheme to reduce decoder delay. Other method that suits low power and area applications such as RFIDs and smart cards using cross parity codes is also proposed. The experimental evaluation shows that the proposed techniques can mitigate single and multiple bit errors with wider error coverage compared to existing methods with lesser area and power consumption. The proposed scheme is used to mask the errors appearing at the output of the circuit irrespective of their cause. This thesis also investigates the error mitigation schemes in emerging technologies (QCA, CNTFET) to compare area, power and delay with existing CMOS equivalent. Though the proposed novel multiple error correcting techniques can not ensure 100% error mitigation, inclusion of these techniques to actual design can improve the reliability of the circuits or increase the difficulty in hacking crypto-devices. Proposed schemes can also be extended to non GF digital circuits

    Low Power Memory/Memristor Devices and Systems

    Get PDF
    This reprint focusses on achieving low-power computation using memristive devices. The topic was designed as a convenient reference point: it contains a mix of techniques starting from the fundamental manufacturing of memristive devices all the way to applications such as physically unclonable functions, and also covers perspectives on, e.g., in-memory computing, which is inextricably linked with emerging memory devices such as memristors. Finally, the reprint contains a few articles representing how other communities (from typical CMOS design to photonics) are fighting on their own fronts in the quest towards low-power computation, as a comparison with the memristor literature. We hope that readers will enjoy discovering the articles within

    Fault and Defect Tolerant Computer Architectures: Reliable Computing With Unreliable Devices

    Get PDF
    This research addresses design of a reliable computer from unreliable device technologies. A system architecture is developed for a fault and defect tolerant (FDT) computer. Trade-offs between different techniques are studied and yield and hardware cost models are developed. Fault and defect tolerant designs are created for the processor and the cache memory. Simulation results for the content-addressable memory (CAM)-based cache show 90% yield with device failure probabilities of 3 x 10(-6), three orders of magnitude better than non fault tolerant caches of the same size. The entire processor achieves 70% yield with device failure probabilities exceeding 10(-6). The required hardware redundancy is approximately 15 times that of a non-fault tolerant design. While larger than current FT designs, this architecture allows the use of devices much more likely to fail than silicon CMOS. As part of model development, an improved model is derived for NAND Multiplexing. The model is the first accurate model for small and medium amounts of redundancy. Previous models are extended to account for dependence between the inputs and produce more accurate results

    Semiconductor Memory Devices for Hardware-Driven Neuromorphic Systems

    Get PDF
    This book aims to convey the most recent progress in hardware-driven neuromorphic systems based on semiconductor memory technologies. Machine learning systems and various types of artificial neural networks to realize the learning process have mainly focused on software technologies. Tremendous advances have been made, particularly in the area of data inference and recognition, in which humans have great superiority compared to conventional computers. In order to more effectively mimic our way of thinking in a further hardware sense, more synapse-like components in terms of integration density, completeness in realizing biological synaptic behaviors, and most importantly, energy-efficient operation capability, should be prepared. For higher resemblance with the biological nervous system, future developments ought to take power consumption into account and foster revolutions at the device level, which can be realized by memory technologies. This book consists of seven articles in which most recent research findings on neuromorphic systems are reported in the highlights of various memory devices and architectures. Synaptic devices and their behaviors, many-core neuromorphic platforms in close relation with memory, novel materials enabling the low-power synaptic operations based on memory devices are studied, along with evaluations and applications. Some of them can be practically realized due to high Si processing and structure compatibility with contemporary semiconductor memory technologies in production, which provides perspectives of neuromorphic chips for mass production

    Unified turbo/LDPC code decoder architecture for deep-space communications

    Get PDF
    Deep-space communications are characterized by extremely critical conditions; current standards foresee the usage of both turbo and low-density-parity-check (LDPC) codes to ensure recovery from received errors, but each of them displays consistent drawbacks. Code concatenation is widely used in all kinds of communication to boost the error correction capabilities of single codes; serial concatenation of turbo and LDPC codes has been recently proven effective enough for deep space communications, being able to overcome the shortcomings of both code types. This work extends the performance analysis of this scheme and proposes a novel hardware decoder architecture for concatenated turbo and LDPC codes based on the same decoding algorithm. This choice leads to a high degree of datapath and memory sharing; postlayout implementation results obtained with complementary metal-oxide semiconductor (CMOS) 90 nm technology show small area occupation (0.98 mm 2 ) and very low power consumption (2.1 mW)

    Design and implementation of decoders for error correction in high-speed communication systems

    Full text link
    This thesis is focused on the design and implementation of binary low-density parity-check (LDPC) code decoders for high-speed modern communication systems. The basic of LDPC codes and the performance and bottlenecks, in terms of complexity and hardware efficiency, of the main soft-decision and hard-decision decoding algorithms (such as Min-Sum, Optimized 2-bit Min-Sum and Reliability-based iterative Majority-Logic) are analyzed. The complexity and performance of those algorithms are improved to allow efficient hardware architectures. A new decoding algorithm called One-Minimum Min-Sum is proposed. It reduces considerably the complexity of the check node update equations of the Min-Sum algorithm. The second minimum is estimated from the first minimum value by a means of a linear approximation that allows a dynamic adjustment. The Optimized 2-bit Min-Sum algorithm is modified to initialize it with the complete LLR values and to introduce the extrinsic information in the messages sent from the variable nodes. Its variable node equation is reformulated to reduce its complexity. Both algorithms were tested for the (2048,1723) RS-based LDPC code and (16129,15372) LDPC code using an FPGA-based hardware emulator. They exhibit BER performance very close to Min-Sum algorithm and do not introduce early error-floor. In order to show the hardware advantages of the proposed algorithms, hardware decoders were implemented in a 90 nm CMOS process and FPGA devices based on two types of architectures: full-parallel and partial-parallel one with horizontal layered schedule. The results show that the decoders are more area-time efficient than other published decoders and that the low-complexity of the Modified Optimized 2-bit Min-Sum allows the implementation of 10 Gbps decoders in current FPGA devices. Finally, a new hard-decision decoding algorithm, the Historical-Extrinsic Reliability-Based Iterative Decoder, is presented. This algorithm introduces the new idea of considering hard-decision votes as soft-decision to compute the extrinsic information of previous iterations. It is suitable for high-rate codes and improves the BER performance of the previous RBI-MLGD algorithms, with similar complexity.Esta tesis se ha centrado en el diseño e implementación de decodificadores binarios basados en códigos de comprobación de paridad de baja densidad (LDPC) válidos para los sistemas de comunicación modernos de alta velocidad. Los conceptos básicos de códigos LDPC, sus prestaciones y cuellos de botella, en términos de complejidad y eficiencia hardware, fueron analizados para los principales algoritmos de decisión soft y decisión hard (como Min-Sum, Optimized 2-bit Min-Sum y Reliability-based iterative Majority-Logic). La complejidad y prestaciones de estos algoritmos se han mejorado para conseguir arquitecturas hardware eficientes. Se ha propuesto un nuevo algoritmo de decodificación llamado One-Minimum Min-Sum. Éste reduce considerablemente la complejidad de las ecuaciones de actualización del nodo de comprobación del algoritmo Min-Sum. El segundo mínimo se ha estimado a partir del valor del primer mínimo por medio de una aproximación lineal, la cuál permite un ajuste dinámico. El algoritmo Optimized 2-bit Min-Sum se ha modificado para ser inicializado con los valores LLR e introducir la información extrínseca en los mensajes enviados desde los nodos variables. La ecuación del nodo variable de este algoritmo ha sido reformulada para reducir su complejidad. Ambos algoritmos fueron probados para el código (2048,1723) RS-based LDPC y para el código (16129,15372) LDPC utilizando un emulador hardware implementado en un dispositivo FPGA. Éstos han alcanzado unas prestaciones de BER muy cercanas a las del algoritmo Min-Sum evitando, además, la aparición temprana del fenómeno denominado suelo del error. Con el objetivo de mostrar las ventajas hardware de los algoritmos propuestos, los decodificadores se implementaron en hardware utilizando tecnología CMOS de 90 nm y en dispositivos FPGA basados en dos tipos de arquitecturas: completamente paralela y parcialmente paralela utilizando el método de actualización por capas horizontales. Los resultados muestran que los decodificadores propuestos e implementados son más eficientes en área-tiempo que otros decodificadores publicados y que la baja complejidad del algoritmo Modified Optimized 2-bit Min-Sum permite la implementación de decodificadores en los dispositivos FPGA actuales consiguiendo una tasa de 10 Gbps. Finalmente, se ha presentado un nuevo algoritmo de decodificación de decisión hard, el Historical-Extrinsic Reliability-Based Iterative Decoder. Este algoritmo introduce la nueva idea de considerar los votos de decisión hard como decisión soft para calcular la información extrínseca de iteracions anteriores. Este algoritmo es adecuado para códigos de alta velocidad y mejora el rendimiento BER de los algoritmos RBI-MLGD anteriores, con una complejidad similar.Aquesta tesi s'ha centrat en el disseny i implementació de descodificadors binaris basats en codis de comprovació de paritat de baixa densitat (LDPC) vàlids per als sistemes de comunicació moderns d'alta velocitat. Els conceptes bàsics de codis LDPC, les seues prestacions i colls de botella, en termes de complexitat i eficiència hardware, van ser analitzats pels principals algoritmes de decisió soft i decisió hard (com el Min-Sum, Optimized 2-bit Min-Sum y Reliability-based iterative Majority-Logic). La complexitat i prestacions d'aquests algoritmes s'han millorat per aconseguir arquitectures hardware eficients. S'ha proposat un nou algoritme de descodificació anomenat One-Minimum Min-Sum. Aquest redueix considerablement la complexitat de les equacions d'actualització del node de comprovació del algoritme Min-Sum. El segon mínim s'ha estimat a partir del valor del primer mínim per mitjà d'una aproximació lineal, la qual permet un ajust dinàmic. L'algoritme Optimized 2-bit Min-Sum s'ha modificat per ser inicialitzat amb els valors LLR i introduir la informació extrínseca en els missatges enviats des dels nodes variables. L'equació del node variable d'aquest algoritme ha sigut reformulada per reduir la seva complexitat. Tots dos algoritmes van ser provats per al codi (2048,1723) RS-based LDPC i per al codi (16129,15372) LDPC utilitzant un emulador hardware implementat en un dispositiu FPGA. Aquests han aconseguit unes prestacions BER molt properes a les del algoritme Min-Sum evitant, a més, l'aparició primerenca del fenomen denominat sòl de l'error. Per tal de mostrar els avantatges hardware dels algoritmes proposats, els descodificadors es varen implementar en hardware utilitzan una tecnologia CMOS d'uns 90 nm i en dispositius FPGA basats en dos tipus d'arquitectures: completament paral·lela i parcialment paral·lela utilitzant el mètode d'actualització per capes horitzontals. Els resultats mostren que els descodificadors proposats i implementats són més eficients en àrea-temps que altres descodificadors publicats i que la baixa complexitat del algoritme Modified Optimized 2-bit Min-Sum permet la implementació de decodificadors en els dispositius FPGA actuals obtenint una taxa de 10 Gbps. Finalment, s'ha presentat un nou algoritme de descodificació de decisió hard, el Historical-Extrinsic Reliability-Based Iterative Decoder. Aquest algoritme presenta la nova idea de considerar els vots de decisió hard com decisió soft per calcular la informació extrínseca d'iteracions anteriors. Aquest algoritme és adequat per als codis d'alta taxa i millora el rendiment BER dels algoritmes RBI-MLGD anteriors, amb una complexitat similar.Català Pérez, JM. (2017). Design and implementation of decoders for error correction in high-speed communication systems [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/86152TESI

    Architecture and algorithms for the implementation of digital wireless receivers in FPGA and ASIC: ISDB-T and DVB-S2 cases

    Full text link
    [EN] The first generation of Terrestrial Digital Television(DTV) has been in service for over a decade. In 2013, several countries have already completed the transition from Analog to Digital TV Broadcasting, most of which in Europe. In South America, after several studies and trials, Brazil adopted the Japanese standard with some innovations. Japan and Brazil started Digital Terrestrial Television Broadcasting (DTTB) services in December 2003 and December 2007 respectively, using Integrated Services Digital Broadcasting - Terrestrial (ISDB-T), also known as ARIB STD-B31. In June 2005 the Committee for the Information Technology Area (CATI) of Brazilian Ministry of Science and Technology and Innovation MCTI approved the incorporation of the IC-Brazil Program, in the National Program for Microelectronics (PNM) . The main goals of IC-Brazil are the formal qualification of IC designers, support to the creation of semiconductors companies focused on projects of ICs within Brazil, and the attraction of semiconductors companies focused on the design and development of ICs in Brazil. The work presented in this thesis originated from the unique momentum created by the combination of the birth of Digital Television in Brazil and the creation of the IC-Brazil Program by the Brazilian government. Without this combination it would not have been possible to make these kind of projects in Brazil. These projects have been a long and costly journey, albeit scientifically and technologically worthy, towards a Brazilian DTV state-of-the-art low complexity Integrated Circuit, with good economy scale perspectives, due to the fact that at the beginning of this project ISDB-T standard was not adopted by several countries like DVB-T. During the development of the ISDB-T receiver proposed in this thesis, it was realized that due to the continental dimensions of Brazil, the DTTB would not be enough to cover the entire country with open DTV signal, specially for the case of remote localizations far from the high urban density regions. Then, Eldorado Research Institute and Idea! Electronic Systems, foresaw that, in a near future, there would be an open distribution system for high definition DTV over satellite, in Brazil. Based on that, it was decided by Eldorado Research Institute, that would be necessary to create a new ASIC for broadcast satellite reception. At that time DVB-S2 standard was the strongest candidate for that, and this assumption still stands nowadays. Therefore, it was decided to apply to a new round of resources funding from the MCTI - that was granted - in order to start the new project. This thesis discusses in details the Architecture and Algorithms proposed for the implementation of a low complexity Intermediate Frequency(IF) ISDB-T Receiver on Application Specific Integrated Circuit (ASIC) CMOS. The Architecture proposed here is highly based on the COordinate Rotation Digital Computer (CORDIC) Algorithm, that is a simple and efficient algorithm suitable for VLSI implementations. The receiver copes with the impairments inherent to wireless channels transmission and the receiver crystals. The thesis also discusses the Methodology adopted and presents the implementation results. The receiver performance is presented and compared to those obtained by means of simulations. Furthermore, the thesis also presents the Architecture and Algorithms for a DVB-S2 receiver targeting its ASIC implementation. However, unlike the ISDB-T receiver, only preliminary ASIC implementation results are introduced. This was mainly done in order to have an early estimation of die area to prove that the project in ASIC is economically viable, as well as to verify possible bugs in early stage. As in the case of ISDB-T receiver, this receiver is highly based on CORDIC algorithm and it was prototyped in FPGA. The Methodology used for the second receiver is derived from that used for the ISDB-T receiver, with minor additions given the project characteristics.[ES] La primera generación de Televisión Digital Terrestre(DTV) ha estado en servicio por más de una década. En 2013, varios países completaron la transición de transmisión analógica a televisión digital, la mayoría de ellas en Europa. En América del Sur, después de varios estudios y ensayos, Brasil adoptó el estándar japonés con algunas innovaciones. Japón y Brasil comenzaron a prestar el servicio de Difusión de Televisión Digital Terrestre (DTTB) en diciembre de 2003 y diciembre de 2007 respectivamente, utilizando Radiodifusión Digital de Servicios Integrados Terrestres (ISDB-T), también conocida como ARIB STD-B31. En junio de 2005, el Comité del Área de Tecnología de la Información (CATI) del Ministerio de Ciencia, Tecnología e Innovación de Brasil - MCTI aprobó la incorporación del Programa CI-Brasil, en el Programa Nacional de Microelectrónica (PNM). Los principales objetivos de la CI-Brasil son la formación de diseñadores de CIs, apoyar la creación de empresas de semiconductores enfocadas en proyectos de circuitos integrados dentro de Brasil, y la atracción de empresas de semiconductores interesadas en el diseño y desarrollo de circuitos integrados. El trabajo presentado en esta tesis se originó en el impulso único creado por la combinación del nacimiento de la televisión digital en Brasil y la creación del Programa de CI-Brasil por el gobierno brasileño. Sin esta combinación no hubiera sido posible realizar este tipo de proyectos en Brasil. Estos proyectos han sido un trayecto largo y costoso, aunque meritorio desde el punto de vista científico y tecnológico, hacia un Circuito Integrado brasileño de punta y de baja complejidad para DTV, con buenas perspectivas de economía de escala debido al hecho que al inicio de este proyecto, el estándar ISDB-T no fue adoptado por varios países como DVB-T. Durante el desarrollo del receptor ISDB-T propuesto en esta tesis, se observó que debido a las dimensiones continentales de Brasil, la DTTB no sería suficiente para cubrir todo el país con la señal de televisión digital abierta, especialmente para el caso de localizaciones remotas, apartadas de las regiones de alta densidad urbana. En ese momento, el Instituto de Investigación Eldorado e Idea! Sistemas Electrónicos, previeron que en un futuro cercano habría un sistema de distribución abierto para DTV de alta definición por satélite en Brasil. Con base en eso, el Instituto de Investigación Eldorado decidió que sería necesario crear un nuevo ASIC para la recepción de radiodifusión por satélite, basada el estándar DVB-S2. En esta tesis se analiza en detalle la Arquitectura y algoritmos propuestos para la implementación de un receptor ISDB-T de baja complejidad y frecuencia intermedia (IF) en un Circuito Integrado de Aplicación Específica (ASIC) CMOS. La arquitectura aquí propuesta se basa fuertemente en el algoritmo Computadora Digital para Rotación de Coordenadas (CORDIC), el cual es un algoritmo simple, eficiente y adecuado para implementaciones VLSI. El receptor hace frente a las deficiencias inherentes a las transmisiones por canales inalámbricos y los cristales del receptor. La tesis también analiza la metodología adoptada y presenta los resultados de la implementación. Por otro lado, la tesis también presenta la arquitectura y los algoritmos para un receptor DVB-S2 dirigido a la implementación en ASIC. Sin embargo, a diferencia del receptor ISDB-T, se introducen sólo los resultados preliminares de implementación en ASIC. Esto se hizo principalmente con el fin de tener una estimación temprana del área del die para demostrar que el proyecto en ASIC es económicamente viable, así como para verificar posibles errores en etapa temprana. Como en el caso de receptor ISDB-T, este receptor se basa fuertemente en el algoritmo CORDIC y fue un prototipado en FPGA. La metodología utilizada para el segundo receptor se deriva de la utilizada para el re[CA] La primera generació de Televisió Digital Terrestre (TDT) ha estat en servici durant més d'una dècada. En 2013, diversos països ja van completar la transició de la radiodifusió de televisió analògica a la digital, i la majoria van ser a Europa. A Amèrica del Sud, després de diversos estudis i assajos, Brasil va adoptar l'estàndard japonés amb algunes innovacions. Japó i Brasil van començar els servicis de Radiodifusió de Televisió Terrestre Digital (DTTB) al desembre de 2003 i al desembre de 2007, respectivament, utilitzant la Radiodifusió Digital amb Servicis Integrats de (ISDB-T), coneguda com a ARIB STD-B31. Al juny de 2005, el Comité de l'Àrea de Tecnologia de la Informació (CATI) del Ministeri de Ciència i Tecnologia i Innovació del Brasil (MCTI) va aprovar la incorporació del programa CI Brasil al Programa Nacional de Microelectrònica (PNM). Els principals objectius de CI Brasil són la qualificació formal dels dissenyadors de circuits integrats, el suport a la creació d'empreses de semiconductors centrades en projectes de circuits integrats dins del Brasil i l'atracció d'empreses de semiconductors centrades en el disseny i desenvolupament de circuits integrats. El treball presentat en esta tesi es va originar en l'impuls únic creat per la combinació del naixement de la televisió digital al Brasil i la creació del programa Brasil CI pel govern brasiler. Sense esta combinació no hauria estat possible realitzar este tipus de projectes a Brasil. Estos projectes han suposat un viatge llarg i costós, tot i que digne científicament i tecnològica, cap a un circuit integrat punter de baixa complexitat per a la TDT brasilera, amb bones perspectives d'economia d'escala perquè a l'inici d'este projecte l'estàndard ISDB-T no va ser adoptat per diversos països, com el DVB-T. Durant el desenvolupament del receptor de ISDB-T proposat en esta tesi, va resultar que, a causa de les dimensions continentals de Brasil, la DTTB no seria suficient per cobrir tot el país amb el senyal de TDT oberta, especialment pel que fa a les localitzacions remotes allunyades de les regions d'alta densitat urbana.. En este moment, l'Institut de Recerca Eldorado i Idea! Sistemes Electrònics van preveure que, en un futur pròxim, no hi hauria a Brasil un sistema de distribució oberta de TDT d'alta definició a través de satèl¿lit. D'acord amb això, l'Institut de Recerca Eldorado va decidir que seria necessari crear un nou ASIC per a la recepció de radiodifusió per satèl¿lit. basat en l'estàndard DVB-S2. En esta tesi s'analitza en detall l'arquitectura i els algorismes proposats per l'execució d'un receptor ISDB-T de Freqüència Intermèdia (FI) de baixa complexitat sobre CMOS de Circuit Integrat d'Aplicacions Específiques (ASIC). L'arquitectura ací proposada es basa molt en l'algorisme de l'Ordinador Digital de Rotació de Coordenades (CORDIC), que és un algorisme simple i eficient adequat per implementacions VLSI. El receptor fa front a les deficiències inherents a la transmissió de canals sense fil i els cristalls del receptor. Esta tesi també analitza la metodologia adoptada i presenta els resultats de l'execució. Es presenta el rendiment del receptor i es compara amb els obtinguts per mitjà de simulacions. D'altra banda, esta tesi també presenta l'arquitectura i els algorismes d'un receptor de DVB-S2 de cara a la seua implementació en ASIC. No obstant això, a diferència del receptor ISDB-T, només s'introdueixen resultats preliminars d'implementació en ASIC. Això es va fer principalment amb la finalitat de tenir una estimació primerenca de la zona de dau per demostrar que el projecte en ASIC és econòmicament viable, així com per verificar possibles errors en l'etapa primerenca. Com en el cas del receptor ISDB-T, este receptor es basa molt en l'algorisme CORDIC i va ser un prototip de FPGA. La metodologia utilitzada per al segon receptor es deriva de la utilitzada per al receptor IRodrigues De Lima, E. (2016). Architecture and algorithms for the implementation of digital wireless receivers in FPGA and ASIC: ISDB-T and DVB-S2 cases [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/61967TESI

    In-Memory Computing by Using Nano-ionic Memristive Devices

    Get PDF
    By reaching to the CMOS scaling limitation based on the Moore’s law and due to the increasing disparity between the processing units and memory performance, the quest is continued to find a suitable alternative to replace the conventional technology. The recently discovered two terminal element, memristor, is believed to be one of the most promising candidates for future very large scale integrated systems. This thesis is comprised of two main parts, (Part I) modeling the memristor devices, and (Part II) memristive computing. The first part is presented in one chapter and the second part of the thesis contains five chapters. The basics and fundamentals regarding the memristor functionality and memristive computing are presented in the introduction chapter. A brief detail of these two main parts is as follows: Part I: Modeling- This part presents an accurate model based on the charge transport mechanisms for nanoionic memristor devices. The main current mechanism in metal/insulator/metal (MIM) structures are assessed, a physic-based model is proposed and a SPICE model is presented and tested for four different fabricated devices. An accuracy comparison is done for various models for Ag/TiO2/ITO fabricated device. Also, the functionality of the model is tested for various input signals. Part II: Memristive computing- Memristive computing is about utilizing memristor to perform computational tasks. This part of the thesis is divided into neuromorphic, analog and digital computing schemes with memristor devices. – Neuromorphic computing- Two chapters of this thesis are about biologicalinspired memristive neural networks using STDP-based learning mechanism. The memristive implementation of two well-known spiking neuron models, Hudgkin-Huxley and Morris-Lecar, are assessed and utilized in the proposed memristive network. The synaptic connections are also memristor devices in this design. Unsupervised pattern classification tasks are done to ensure the right functionality of the system. – Analog computing- Memristor has analog memory property as it can be programmed to different memristance values. A novel memristive analog adder is designed by Continuous Valued Number System (CVNS) scheme and its circuit is comprised of addition and modulo blocks. The proposed analog adder design is explained and its functionality is tested for various numbers. It is shown that the CVNS scheme is compatible with memristive design and the environment resolution can be adjusted by the memristance ratio of the memristor devices. – Digital computing- Two chapters are dedicated for digital computing. In the first one, a development over IMPLY-based logic with memristor is provided to implement a 4:2 compressor circuit. In the second chapter, A novel resistive over a novel mirrored memristive crossbar platform. Different logic gates are designed with the proposed memristive logic method and the simulations are provided with Cadence to prove the functionality of the logic. The logic implementation over a mirrored memristive crossbars is also assessed

    Energy Saving and Virtualization Technologies in Switching

    Get PDF
    Switching is the key functionality for many devices like electronic Router and Switch, optical Router, Network on Chips (NoCs) and so on. Basically, switching is responsible for moving data unit from one port/location to another (or multiple) port(s)/location(s). In past years, the high capacity, low delay were the main concerns when designing high-end switching unit. As new demands, requests and technologies emerge, flexibility and low power cost switching design become to weight the same as throughput and delay. On one hand, highly flexible (i.e, programming ability) switching can cope with variable needs stem from new applications (i.e, VoIP) and popular user behavior (i.e, p2p downloading); on the other hand, reduce the energy and power dissipation for switching could not only save bills and build echo system but also expand components life time. Many research efforts have been devoted to increase switching flexibility and reduce its power cost. In this thesis work, we consider to exploit virtualization as the main technique to build flexible software router in the first part, then in the second part we draw our attention on energy saving in NoC (i.e, a switching fabric designed to handle the on chip data transmission) and software router. In the first part of the thesis, we consider the virtualization inside Software Routers (SRs). SR, i.e, routers running in commodity Personal Computers (PCs), become an appealing solution compared to traditional Proprietary Routing Devices (PRD) for various reasons such as cost (the multi-vendor hardware used by SRs can be cheap, while the equipment needed by PRDs is more expensive and their training cost is higher), openness (SRs can make use of a large number of open source networking applications, while PRDs are more closed) and flexibility. The forwarding performance provided by SRs has been an obstacle to their deployment in real networks. For this reason, we proposed to aggregate multiple routing units that form an powerful SR known as the Multistage Software Router (MSR) to overcome the performance limitation for a single SR. Our results show that the throughput can increase almost linearly as the number of the internal routing devices. But some other features related to flexibility (such as power saving, programmability, router migration or easy management) have been investigated less than performance previously. We noticed that virtualization techniques become reality thanks to the quick development of the PC architectures, which are now able to easily support several logical PCs running in parallel on the same hardware. Virtualization could provide many flexible features like hardware and software decoupling, encapsulation of virtual machine state, failure recovery and security, to name a few. Virtualization permits to build multiple SRs inside one physical host and a multistage architecture exploiting only logical devices. By doing so, physical resources can be used in a more efficient way, energy savings features (switching on and off device when needed) can be introduced and logical resources could be rented on-demand instead of being owned. Since virtualization techniques are still difficult to deploy, several challenges need to be faced when trying to integrate them into routers. The main aim of the first part in this thesis is to find out the feasibility of the virtualization approach, to build and test virtualized SR (VSR), to implement the MSR exploiting logical, i.e. virtualized, resources, to analyze virtualized routing performance and to propose improvement techniques to VSR and virtual MSR (VMSR). More specifically, we considered different virtualization solutions like VMware, XEN, KVM to build VSR and VMSR, being VMware a closed source solution but with higher performance and XEN/KVM open source solutions. Firstly we built and tested each single component of our multistage architecture (i.e, back-end router, load balancer )inside the virtual infrastructure, then and we extended the performance experiments with more complex scenarios like multiple Back-end Router (BR) or Load Balancer (LB) which cooperate to route packets. Our results show that virtualization could introduce 40~\% performance penalty compare with the hardware only solution. Keep the performance limitation in mind, we developed the whole VMSR and we obtained low throughput with 64B packet flow as expected. To increase the VMSR throughput, two directions could be considered, the first one is to improve the single component ( i.e, VSR) performance and the other is to work from the topology (i.e, best allocation of the VMs into the hardware ) point of view. For the first method, we considered to tune the VSR inside the KVM and we studied closely such as Linux driver, scheduler, interconnect methodology which could impact the performance significantly with proper configuration; then we proposed two ways for the VMs allocation into physical servers to enhance the VMSR performance. Our results show that with good tuning and allocation of VMs, we could minimize the virtualization penalty and get reasonable throughput for running SRs inside virtual infrastructure and add flexibility functionalities into SRs easily. In the second part of the thesis, we consider the energy efficient switching design problem and we focus on two main architecture, the NoC and MSR. As many research works suggest, the energy cost in the Communication Technologies ( ICT ) is constantly increasing. Among the main ICT sectors, a large portion of the energy consumption is contributed by the telecommunication infrastructure and their devices, i.e, router, switch, cell phone, ip TV settle box, storage home gateway etc. More in detail, the linecards, links, System on Chip (SoC) including the transmitter/receiver on these variate devices are the main power consuming units. We firstly present the work on the power reduction of the data transmission in SoC, which is carried out by the NoC. NoC is an approach to design the communication subsystem between different Processing Units (PEs) in a SoC. PEs could be different elements such as CPU, memory, digital signal/analog signal processor etc. Different PEs performs specific tasks depending on the applications running on the chip. Different tasks need to exchange data information among each other, thus flits ( chopped packet with limited header information ) are generated by PEs. The flits are injected into the NoC by the proper interface and routed until reach the destination PEs. For the whole procedure, the NoC behaves as a packet switch network. Studies show that in general the information processing in the PEs only consume 60~\% energy while the remaining 40~\% are consumed by the NoC. More importantly, as the current network designing principle, the NoC capacity is devised to handle the peak load. This is a clear sign for energy saving when the network load is low. In our work, we considered to exploit Dynamic Voltage and Frequency Scaling (DVFS) technique, which can jointly decrease or increase the system voltage and frequency when necessary, i.e, decrease the voltage and frequency at low load scenario to save energy and reduce power dissipation. More precisely, we studied two different NoC architectures for energy saving, namely single plane chip and multi-plane chip architecture. In both cases we have a very strict constraint to be that all the links and transmitter/receivers on the same plane work at the same frequency/voltage to avoid synchronization problem. This is the main difference with many existing works in the literature which usually assume different links can work at different frequency, that is hard to be implemented in reality. For the single plane NoC, we exploited different routing schemas combined with DVFS to reduce the power for the whole chip. Our results haven been compared with the optimal value obtained by modeling the power saving formally as a quadratic programming problem. Results suggest that just by using simple load balancing routing algorithm, we can save considerable energy for the single chip NoC architecture. Furthermore, we noticed that in the single plane NoC architecture, the bottleneck link could limit the DVFS effectiveness. Then we discovered that multiplane NoC architecture is fairly easy to be implemented and it could help with the energy saving. Thus we focus on the multiplane architecture and we found out that DVFS could be more efficient when we concentrate more traffic into one plane and send the remaining flows to other planes. We compared load concentration and load balancing with different power modeling and all simulation results show that load concentration is better compared with load balancing for multiplan NoC architecture. Finally, we also present one of the the energy efficient MSR design technique, which permits the MSR to follow the day-night traffic pattern more efficiently with our on-line energy saving algorithm
    corecore