26 research outputs found

    Analyse und Erweiterung eines fehler-toleranten NoC fĂŒr SRAM-basierte FPGAs in Weltraumapplikationen

    Get PDF
    Data Processing Units for scientific space mission need to process ever higher volumes of data and perform ever complex calculations. But the performance of available space-qualified general purpose processors is just in the lower three digit megahertz range, which is already insufficient for some applications. As an alternative, suitable processing steps can be implemented in hardware on a space-qualified SRAM-based FPGA. However, suitable devices are susceptible against space radiation. At the Institute for Communication and Network Engineering a fault-tolerant, network-based communication architecture was developed, which enables the construction of processing chains on the basis of different processing modules within suitable SRAM-based FPGAs and allows the exchange of single processing modules during runtime, too. The communication architecture and its protocol shall isolate non SEU mitigated or just partial SEU mitigated modules affected by radiation-induced faults to prohibit the propagation of errors within the remaining System-on-Chip. In the context of an ESA study, this communication architecture was extended with further components and implemented in a representative hardware platform. Based on the acquired experiences during the study, this work analyses the actual fault-tolerance characteristics as well as weak points of this initial implementation. At appropriate locations, the communication architecture was extended with mechanisms for fault-detection and fault-differentiation as well as with a hardware-based monitoring solution. Both, the former measures and the extension of the employed hardware-platform with selective fault-injection capabilities for the emulation of radiation-induced faults within critical areas of a non SEU mitigated processing module, are used to evaluate the effects of radiation-induced faults within the communication architecture. By means of the gathered results, further measures to increase fast detection and isolation of faulty nodes are developed, selectively implemented and verified. In particular, the ability of the communication architecture to isolate network nodes without SEU mitigation could be significantly improved.Instrumentenrechner fĂŒr wissenschaftliche Weltraummissionen mĂŒssen ein immer höheres Datenvolumen verarbeiten und immer komplexere Berechnungen ausfĂŒhren. Die Performanz von verfĂŒgbaren qualifizierten Universalprozessoren liegt aber lediglich im unteren dreistelligen Megahertz-Bereich, was fĂŒr einige Anwendungen bereits nicht mehr ausreicht. Als Alternative bietet sich die Implementierung von entsprechend geeigneten Datenverarbeitungsschritten in Hardware auf einem qualifizierten SRAM-basierten FPGA an. Geeignete Bausteine sind jedoch empfindlich gegenĂŒber der Strahlungsumgebung im Weltraum. Am Institut fĂŒr Datentechnik und Kommunikationsnetze wurde eine fehlertolerante netzwerk-basierte Kommunikationsarchitektur entwickelt, die innerhalb eines geeigneten SRAM-basierten FPGAs Datenverarbeitungsmodule miteinander nach Bedarf zu Verarbeitungsketten verbindet, sowie den Austausch von einzelnen Modulen im Betrieb ermöglicht. Nicht oder nur partiell SEU mitigierte Module sollen bei strahlungsbedingten Fehlern im Modul durch das Protokoll und die Fehlererkennungsmechanismen der Kommunikationsarchitektur isoliert werden, um ein Ausbreiten des Fehlers im restlichen System-on-Chip zu verhindern. Im Kontext einer ESA Studie wurde diese Kommunikationsarchitektur um Komponenten erweitert und auf einer reprĂ€sentativen Hardwareplattform umgesetzt. Basierend auf den gesammelten Erfahrungen aus der Studie, wird in dieser Arbeit eine Analyse der tatsĂ€chlichen Fehlertoleranz-Eigenschaften sowie der Schwachstellen dieser ursprĂŒnglichen Implementierung durchgefĂŒhrt. Die Kommunikationsarchitektur wurde an geeigneten Stellen um Fehlerdetektierungs- und Fehlerunterscheidungsmöglichkeiten erweitert, sowie um eine hardwarebasierte Überwachung ergĂ€nzt. Sowohl diese Maßnahmen, als auch die Erweiterung der Hardwareplattform um gezielte Fehlerinjektions-Möglichkeiten zum Emulieren von strahlungsinduzierten Fehlern in kritischen Komponenten eines nicht SEU mitigierten Prozessierungsmoduls werden genutzt, um die tatsĂ€chlichen auftretenden Effekte in der Kommunikationsarchitektur zu evaluieren. Anhand der Ergebnisse werden weitere Verbesserungsmaßnahmen speziell zur schnellen Detektierung und Isolation von fehlerhaften Knoten erarbeitet, selektiv implementiert und verifiziert. Insbesondere die FĂ€higkeit, fehlerhafte, nicht SEU mitigierte Netzwerkknoten innerhalb der Kommunikationsarchitektur zu isolieren, konnte dabei deutlich verbessert werden

    Social Insect-Inspired Adaptive Hardware

    Get PDF
    Modern VLSI transistor densities allow large systems to be implemented within a single chip. As technologies get smaller, fundamental limits of silicon devices are reached resulting in lower design yields and post-deployment failures. Many-core systems provide a platform for leveraging the computing resource on offer by deep sub-micron technologies and also offer high-level capabilities for mitigating the issues with small feature sizes. However, designing for many-core systems that can adapt to in-field failures and operation variability requires an extremely large multi-objective optimisation space. When a many-core reaches the size supported by the densities of modern technologies (thousands of processing cores), finding design solutions in this problem space becomes extremely difficult. Many biological systems show properties that are adaptive and scalable. This thesis proposes a self-optimising and adaptive, yet scalable, design approach for many-core based on the emergent behaviours of social-insect colonies. In these colonies there are many thousands of individuals with low intelligence who contribute, without any centralised control, to complete a wide range of tasks to build and maintain the colony. The experiments presented translate biological models of social-insect intelligence into simple embedded intelligence circuits. These circuits sense low-level system events and use this manage the parameters of the many-core's Network-on-Chip (NoC) during runtime. Centurion, a 128-node many-core, was created to investigate these models at large scale in hardware. The results show that, by monitoring a small number of signals within each NoC router, task allocation emerges from the social-insect intelligence models that can self-configure to support representative applications. It is demonstrated that emergent task allocation supports fault tolerance with no extra hardware overhead. The response-threshold decision making circuitry uses a negligible amount of hardware resources relative to the size of the many-core and is an ideal technology for implementing embedded intelligence for system runtime management of large-complexity single-chip systems

    Wireless Interconnects for Intra-chip & Inter-chip Transmission

    Get PDF
    With the emergence of Internet of Things and information revolution, the demand of high performance computing systems is increasing. The copper interconnects inside the computing chips have evolved into a sophisticated network of interconnects known as Network on Chip (NoC) comprising of routers, switches, repeaters, just like computer networks. When network on chip is implemented on a large scale like in Multicore Multichip (MCMC) systems for High Performance Computing (HPC) systems, length of interconnects increases and so are the problems like power dissipation, interconnect delays, clock synchronization and electrical noise. In this thesis, wireless interconnects are chosen as the substitute for wired copper interconnects. Wireless interconnects offer easy integration with CMOS fabrication and chip packaging. Using wireless interconnects working at unlicensed mm-wave band (57-64GHz), high data rate of Gbps can be achieved. This thesis presents study of transmission between zigzag antennas as wireless interconnects for Multichip multicores (MCMC) systems and 3D IC. For MCMC systems, a four-chips 16-cores model is analyzed with only four wireless interconnects in three configurations with different antenna orientations and locations. Return loss and transmission coefficients are simulated in ANSYS HFSS. Moreover, wireless interconnects are designed, fabricated and tested on a 6’’ silicon wafer with resistivity of 55Ω-cm using a basic standard CMOS process. Wireless interconnect are designed to work at 30GHz using ANSYS HFSS. The fabricated antennas are resonating around 20GHz with a return loss of less than -10dB. The transmission coefficients between antenna pair within a 20mm x 20mm silicon die is found to be varying between -45dB to -55dB. Furthermore, wireless interconnect approach is extended for 3D IC. Wireless interconnects are implemented as zigzag antenna. This thesis extends the work of analyzing the wireless interconnects in 3D IC with different configurations of antenna orientations and coolants. The return loss and transmission coefficients are simulated using ANSYS HFSS

    Embedded electronic systems driven by run-time reconfigurable hardware

    Get PDF
    Abstract This doctoral thesis addresses the design of embedded electronic systems based on run-time reconfigurable hardware technology –available through SRAM-based FPGA/SoC devices– aimed at contributing to enhance the life quality of the human beings. This work does research on the conception of the system architecture and the reconfiguration engine that provides to the FPGA the capability of dynamic partial reconfiguration in order to synthesize, by means of hardware/software co-design, a given application partitioned in processing tasks which are multiplexed in time and space, optimizing thus its physical implementation –silicon area, processing time, complexity, flexibility, functional density, cost and power consumption– in comparison with other alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of such technology is evaluated through the prototyping of several engineering applications (control systems, mathematical coprocessors, complex image processors, etc.), showing a high enough level of maturity for its exploitation in the industry.Resumen Esta tesis doctoral abarca el diseño de sistemas electrĂłnicos embebidos basados en tecnologĂ­a hardware dinĂĄmicamente reconfigurable –disponible a travĂ©s de dispositivos lĂłgicos programables SRAM FPGA/SoC– que contribuyan a la mejora de la calidad de vida de la sociedad. Se investiga la arquitectura del sistema y del motor de reconfiguraciĂłn que proporcione a la FPGA la capacidad de reconfiguraciĂłn dinĂĄmica parcial de sus recursos programables, con objeto de sintetizar, mediante codiseño hardware/software, una determinada aplicaciĂłn particionada en tareas multiplexadas en tiempo y en espacio, optimizando asĂ­ su implementaciĂłn fĂ­sica –área de silicio, tiempo de procesado, complejidad, flexibilidad, densidad funcional, coste y potencia disipada– comparada con otras alternativas basadas en hardware estĂĄtico (MCU, DSP, GPU, ASSP, ASIC, etc.). Se evalĂșa el flujo de diseño de dicha tecnologĂ­a a travĂ©s del prototipado de varias aplicaciones de ingenierĂ­a (sistemas de control, coprocesadores aritmĂ©ticos, procesadores de imagen, etc.), evidenciando un nivel de madurez viable ya para su explotaciĂłn en la industria.Resum Aquesta tesi doctoral estĂ  orientada al disseny de sistemes electrĂČnics empotrats basats en tecnologia hardware dinĂ micament reconfigurable –disponible mitjançant dispositius lĂČgics programables SRAM FPGA/SoC– que contribueixin a la millora de la qualitat de vida de la societat. S’investiga l’arquitectura del sistema i del motor de reconfiguraciĂł que proporcioni a la FPGA la capacitat de reconfiguraciĂł dinĂ mica parcial dels seus recursos programables, amb l’objectiu de sintetitzar, mitjançant codisseny hardware/software, una determinada aplicaciĂł particionada en tasques multiplexades en temps i en espai, optimizant aixĂ­ la seva implementaciĂł fĂ­sica –àrea de silici, temps de processat, complexitat, flexibilitat, densitat funcional, cost i potĂšncia dissipada– comparada amb altres alternatives basades en hardware estĂ tic (MCU, DSP, GPU, ASSP, ASIC, etc.). S’evalĂșa el fluxe de disseny d’aquesta tecnologia a travĂ©s del prototipat de varies aplicacions d’enginyeria (sistemes de control, coprocessadors aritmĂštics, processadors d’imatge, etc.), demostrant un nivell de maduresa viable ja per a la seva explotaciĂł a la indĂșstria

    Fault-tolerant satellite computing with modern semiconductors

    Get PDF
    Miniaturized satellites enable a variety space missions which were in the past infeasible, impractical or uneconomical with traditionally-designed heavier spacecraft. Especially CubeSats can be launched and manufactured rapidly at low cost from commercial components, even in academic environments. However, due to their low reliability and brief lifetime, they are usually not considered suitable for life- and safety-critical services, complex multi-phased solar-system-exploration missions, and missions with a longer duration. Commercial electronics are key to satellite miniaturization, but also responsible for their low reliability: Until 2019, there existed no reliable or fault-tolerant computer architectures suitable for very small satellites. To overcome this deficit, a novel on-board-computer architecture is described in this thesis.Robustness is assured without resorting to radiation hardening, but through software measures implemented within a robust-by-design multiprocessor-system-on-chip. This fault-tolerant architecture is component-wise simple and can dynamically adapt to changing performance requirements throughout a mission. It can support graceful aging by exploiting FPGA-reconfiguration and mixed-criticality.  Experimentally, we achieve 1.94W power consumption at 300Mhz with a Xilinx Kintex Ultrascale+ proof-of-concept, which is well within the powerbudget range of current 2U CubeSats. To our knowledge, this is the first COTS-based, reproducible on-board-computer architecture that can offer strong fault coverage even for small CubeSats.European Space AgencyComputer Systems, Imagery and Medi

    Synaptic weight modification and storage in hardware neural networks

    Get PDF
    In 2011 the International Technology Roadmap for Semiconductors, ITRS 2011, outlined how the semiconductor industry should proceed to pursue Moore’s Law past the 18nm generation. It envisioned a concept of ‘More than Moore’, in which existing semiconductor technologies can be exploited to enable the fabrication of diverse systems and in particular systems which integrate non-digital and biologically based functionality. A rapid expansion and growing interest in the fields of microbiology, electrophysiology, and computational neuroscience occurred. This activity has provided significant understanding and insight into the function and structure of the human brain leading to the creation of systems which mimic the operation of the biological nervous system. As the systems expand a need for small area, low power devices which replicate the important biological features of neural networks has been established to implement large scale networks. In this thesis work is presented which focuses on the modification and storage of synaptic weights in hardware neural networks. Test devices were incorporated on 3 chip runs; each chip was fabricated in a 0.35ÎŒm process from Austria MicroSystems (AMS) and used for parameter extraction, in accordance with the theoretical analysis presented. A compact circuit is presented which can implement STDP, and has advantages over current implementations in that the critical timing window for synaptic modification is implemented within the circuit. The duration of the critical timing window is set by the subthreshold current controlled by the voltage, Vleak, applied to transistor Mleak in the circuit. A physical model to predict the time window for plasticity to occur is formulated and the effects of process variations on the window is analysed. The STDP circuit is implemented using two dedicated circuit blocks, one for potentiation and one for depression where each block consists of 4 transistors and a polysilicon capacitor, and an area of 980”m2. SpectreS simulations of the back-annotated layout of the circuit and experimental results indicate that STDP with biologically plausible critical timing windows over the range 10”s to 100ms can be implemented. Theoretical analysis using parameters extracted from MOS test devices is used to describe the operation of each device and circuit presented. Simulation results and results obtained from fabricated devices confirm the validity of these designs and approaches. Both the WP and WD circuits have a power consumption of approximately 2.4mW, during a weight update. If no weight update occurs the resting currents within the device are in the nA range, thus each circuit has a power consumption of approximately 1”W. A floating gate, FG, device fabricated using a standard CMOS process is presented. This device is to be integrated with both the WP and WD STDP circuits. The FG device is designed to store negative charge on a FG to represent the synaptic weight of the associated synapse. Charge is added or removed from the FG via Fowler-Nordheim tunnelling. This thesis outlines the design criteria and theoretical operation of this device. A model of the charge storage characteristics is presented and verified using HFCV and PCV experimental results. Limited precision weights, LPW, and its potential use in hardware neural networks is also considered. LPW offers a potential solution in the quest to design a compact FG device for use with CTS. The algorithms presented in this thesis show that LPW allows for a reduction in the synaptic weight storage device while permitting the network to function as intended
    corecore