
    A Method for Designing a Cellular Automata Signature Analyzer for Built-In Self-Test

    The design of signature analyzers based on cellular automata with an extended rule set is considered. A method for obtaining cellular automaton configurations for signature analyzer design is proposed. In the classical approach, an LFSR and a multichannel signature analyzer are used to implement BIST. A disadvantage of this approach is the long feedback path from the output of the last flip-flop to the input of the first one, which depends on the generator polynomial. This work proposes using a cellular automaton with an extended rule set as the multichannel signature analyzer, which reduces the aliasing (fault-masking) probability
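    The paper's specific extended rule set is not reproduced here, but the general idea of a cellular-automaton signature register can be sketched in a few lines. The following Python fragment is an illustrative assumption that uses the classic rule 90/150 hybrid CA rather than the authors' rules: it compresses multichannel test responses the way a multiple-input signature register would, with only neighbour-to-neighbour connections instead of a long LFSR feedback path.

```python
# Illustrative sketch (not the paper's design): a multiple-input signature
# register (MISR) built from a one-dimensional hybrid cellular automaton.
# Rule 90 cells use left XOR right; rule 150 cells also XOR in their own
# state. Circuit outputs are folded in each cycle, as in CA-based BIST.

def ca_misr_step(state, outputs, rule150_mask):
    """Advance the CA one step and absorb one vector of circuit outputs."""
    n = len(state)
    nxt = []
    for i in range(n):
        left = state[i - 1] if i > 0 else 0       # null boundary
        right = state[i + 1] if i < n - 1 else 0  # null boundary
        cell = left ^ right
        if rule150_mask[i]:            # rule 150 cell keeps its own state
            cell ^= state[i]
        nxt.append(cell ^ outputs[i])  # fold in the test-response bit
    return nxt

def signature(responses, rule150_mask):
    """Compress a sequence of output vectors into a final CA state."""
    state = [0] * len(rule150_mask)
    for vec in responses:
        state = ca_misr_step(state, vec, rule150_mask)
    return state

# Example: compress three 4-bit response vectors with a 90/150/90/150 CA.
print(signature([[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0]], [0, 1, 0, 1]))
```

    The absence of a global feedback wire in such a register is exactly the structural advantage the abstract attributes to CA-based analyzers over the LFSR approach.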

    Delay Measurements and Self Characterisation on FPGAs

    This thesis examines new timing measurement methods for self delay characterisation of Field-Programmable Gate Array (FPGA) components and delay measurement of complex circuits on FPGAs. Two novel measurement techniques based on analysis of a circuit's output failure rate and transition probability are proposed for accurate, precise and efficient measurement of propagation delays. The transition-probability-based method is especially attractive, since it requires no modification of the circuit under test and few hardware resources, making it an ideal method for physical delay analysis of FPGA circuits. The relentless advancements in process technology have led to smaller and denser transistors in integrated circuits. While FPGA users benefit from this in terms of increased hardware resources for more complex designs, the actual productivity with FPGAs in terms of timing performance (operating frequency, latency and throughput) has lagged behind the potential improvements from the improved technology due to delay variability in FPGA components and the inaccuracy of timing models used in FPGA timing analysis. The ability to measure the delay of an arbitrary circuit on an FPGA offers many opportunities for on-chip characterisation and physical timing analysis, allowing delay variability to be accurately tracked and variation-aware optimisations to be developed, reducing the productivity gap observed in today's FPGA designs. The measurement techniques are developed into complete self-measurement and characterisation platforms in this thesis, demonstrating their practical uses in actual FPGA hardware for cross-chip delay characterisation and accurate delay measurement of both complex combinatorial and sequential circuits, further reinforcing their position in solving the delay variability problem in FPGAs
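    A simplified way to picture the transition-probability approach (my own reading of the general principle, not the thesis's implementation): sweep the sampling clock period across the propagation delay of the path under test and record how often the registered value reflects the settled output; the delay is estimated from where that probability crosses one half. A hedged Python sketch with hypothetical numbers:

```python
# Simplified sketch (an assumption about the general idea, not the thesis's
# implementation): estimate a path delay by sweeping the sampling period and
# finding where the probability of a correct capture rises past 0.5.
import random

TRUE_DELAY_NS = 3.2   # hypothetical propagation delay of the path under test
JITTER_NS = 0.05      # hypothetical measurement noise

def capture_ok(period_ns):
    """Model one capture attempt: succeeds if the period exceeds the delay."""
    return period_ns + random.gauss(0.0, JITTER_NS) > TRUE_DELAY_NS

def transition_probability(period_ns, trials=1000):
    return sum(capture_ok(period_ns) for _ in range(trials)) / trials

def estimate_delay(periods_ns):
    """Return the smallest swept period whose capture probability is >= 0.5."""
    for p in periods_ns:
        if transition_probability(p) >= 0.5:
            return p
    return None

sweep = [round(2.0 + 0.1 * i, 2) for i in range(30)]   # 2.0 ns .. 4.9 ns
print("estimated delay:", estimate_delay(sweep), "ns")
```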

    Robust and reliable hardware accelerator design through high-level synthesis

    System-on-chip design is becoming increasingly complex as technology scaling enables more and more functionality on a chip. This scaling-driven complexity has resulted in a variety of reliability and validation challenges including logic bugs, hot spots, wear-out, and soft errors. To make matters worse, as we reach the limits of Dennard scaling, efforts to improve system performance and energy efficiency have resulted in the integration of a wide variety of complex hardware accelerators in SoCs. Thus the challenge is to design complex, custom hardware that is efficient, but also correct and reliable. High-level synthesis shows promise to address the problem of complex hardware design by providing a bridge from the high-productivity software domain to the hardware design process. Much research has been done on high-level synthesis efficiency optimizations. This dissertation shows that high-level synthesis also has the power to address validation and reliability challenges through three automated solutions targeting three key stages in the hardware design and use cycle: pre-silicon debugging, post-silicon validation, and post-deployment error detection. Our solution for rapid pre-silicon debugging of accelerator designs is hybrid tracing: comparing a datapath-level trace of hardware execution with a reference software implementation at a fine temporal and spatial granularity to detect logic bugs. An integrated backtrace process delivers source-code meaning to the hardware designer, pinpointing the location of bug activation and providing a strong hint for potential bug fixes. Experimental results show that we are able to detect and aid in localization of logic bugs from both C/C++ specifications as well as the high-level synthesis engine itself. A variation of this solution tailored for rapid post-silicon validation of accelerator designs is hybrid hashing: inserting signature generation logic in a hardware design to create a heavily compressed signature stream that captures the internal behavior of the design at a fine temporal and spatial granularity for comparison with a reference set of signatures generated by high-level simulation to detect bugs. Using hybrid hashing, we demonstrate an improvement in error detection latency (time elapsed from when a bug is activated to when it manifests as an observable failure) of two orders of magnitude and a threefold improvement in bug coverage compared to traditional post-silicon validation techniques. Hybrid hashing also uncovered previously unknown bugs in the CHStone benchmark suite, which is widely used by the HLS community. Hybrid hashing incurs less than 10% area overhead for the accelerator it validates with negligible performance impact, and we also introduce techniques to minimize any possible intrusiveness introduced by hybrid hashing. Finally, our solution for post-deployment error detection is modulo-3 shadow datapaths: performing lightweight shadow computations in modulo-3 space for each main computation. We leverage the binding and scheduling flexibility of high-level synthesis to detect control errors through diverse binding and minimize area cost through intelligent checkpoint scheduling and modulo-3 reducer sharing. We introduce logic and dataflow optimizations to further reduce cost. We evaluated our technique with 12 high-level synthesis benchmarks from the arithmetic-oriented PolyBench benchmark suite using FPGA emulated netlist-level error injection. 
We observe coverages of 99.1% for stuck-at faults, 99.5% for soft errors, and 99.6% for timing errors with a 25.7% area cost and negligible performance impact. Leveraging a mean error detection latency of 12.75 cycles (4150× faster than end result check) for soft errors, we also explore a rollback recovery method with an additional area cost of 28.0%, observing a 175× increase in reliability against soft errors. While the area cost of our modulo shadow datapaths is much lower than that of traditional modular redundancy approaches, we want to maximize the applicability of our approach. To this end, we delve into gate-level architectural design for modulo arithmetic functional units. We introduce new low-cost gate-level architectures for all four key functional units in a shadow datapath: (1) a modulo reduction algorithm that generates architectures consisting entirely of full-adder standard cells; (2) minimum-area modulo adder and subtractor architectures; (3) an array-based modulo multiplier design; and (4) a modulo equality comparator that handles the residue encoding produced by the above. We compare our new functional units to the previous state-of-the-art approach, observing a 12.5% reduction in area and a 47.1% reduction in delay for a 32-bit mod-3 reducer; that our reducer costs, which tend to dominate shadow datapath costs, do not increase with larger modulo bases; and that for modulo-15 and above, all of our modulo functional units have better area and delay than their previous counterparts. We also demonstrate the practicality of our approach by designing a custom shadow datapath for error detection of a multiply-accumulate functional unit, which has an area overhead of only 12% for a 32-bit main datapath and 2-bit modulo-3 shadow datapath. Taking our reliability solution further, we look at the bigger picture of modulo shadow datapaths combined with other solutions at different abstraction layers, looking to answer the following question: Given all of the existing reliability improvement techniques for application-specific hardware accelerators, what techniques or combinations of techniques are the most cost-effective? To answer this question, we consider a soft error fault model and empirically evaluate cross-layer combinations of ABFT, EDDI, and modulo shadow datapaths in the context of high-level synthesis; parity in logic synthesis; and flip-flop hardening techniques at the physical design level. We measure the reliability benefit and area, energy, and performance cost of each technique individually and for interesting technique combinations through FPGA emulated fault-injection and physical place-and-route. Our results show that a combination of parity and flip-flop hardening is the most cost-effective in general with an average 1.3% area cost and 5.7% energy cost for a 50× improvement in reliability. The addition of modulo-3 shadow datapaths to this combination provides some additional benefit for some applications, even without considering its combinational logic, stuck-at fault, and timing error protection benefits. We also observe new efficiency challenges for ABFT and EDDI when used for hardware accelerators
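    The modulo-3 shadow datapath idea reduces to a short worked example. The sketch below is an illustrative reconstruction with hypothetical function names, not the dissertation's RTL: the main multiply-accumulate is shadowed by the same computation on mod-3 residues, and a checkpoint compares the two.

```python
# Illustrative sketch of a modulo-3 shadow datapath check (names and
# structure are assumptions for exposition, not the dissertation's design).

MOD = 3

def residue(x):
    """Mod-3 residue of a value, as the shadow datapath would track it."""
    return x % MOD

def shadow_checked_mac(a, b, acc):
    """Main multiply-accumulate with a parallel mod-3 shadow computation."""
    main = a * b + acc                                        # main datapath
    shadow = (residue(a) * residue(b) + residue(acc)) % MOD   # shadow datapath
    if residue(main) != shadow:                               # checkpoint compare
        raise RuntimeError("shadow datapath mismatch: possible hardware error")
    return main

# Error-free case passes; a corrupted main result is caught by the residue check.
print(shadow_checked_mac(17, 23, 5))      # 396
corrupted = 17 * 23 + 5 + 1               # simulate a small datapath error
print(residue(corrupted) == (residue(17) * residue(23) + residue(5)) % MOD)  # False
```

    By construction, a mod-3 check misses errors whose numerical effect is a multiple of 3, so coverages slightly below 100% are expected.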

    Side Channel Attacks on IoT Applications


    A New Approach to Selected Optimization Problems

    Evolutionary algorithms hold a leading position in solving complex optimization tasks. The unconventional view of optimization algorithms presented in this thesis led to the creation of a new algorithm and to further work on its development, placing its metaphors within the field of artificial life. The resulting algorithms remain effective optimizers, while the proposed approach introduces new properties into their operation. The study presents a new observation algorithm as the base algorithm, with its metaphors placed within the families of immune algorithms and particle swarm optimization algorithms. Research on the mechanics of these algorithms revealed new properties: behavior resembling observation, and a co-evolution mechanism that makes the behavior independent of environmental influences. Implementing these assumptions required the development of an effective mutation mechanism for the immune algorithm. Behavior-scenario functions were defined for the particle swarm optimization algorithm. A group of immune systems is proposed as an equivalent of a multi-population system, and methods of information exchange between the systems in the group are defined. The thesis presents the theoretical background of the algorithms' operation together with a simulation study. To check the efficiency of the algorithms, typical test environments for stationary and non-stationary problems were applied. Fractal and multifractal analysis was used in the study, and its usefulness in research on the behavior of algorithms was demonstrated. Optimization of the diagnostic structure of a digital circuit is a multimodal optimization problem and a particular kind of challenge. A comprehensive approach to testing a multi-module circuit may lead to new solutions, also with respect to testing a single module. Such concepts are included in this study; based on an untypical approach to testing multi-module circuits, the conclusions have a strong theoretical basis. The original achievements of this dissertation are as follows: a proposal of a BIST architecture based on the so-called linear modification; the introduction of a diagnostic structure description and the determination of its theoretical basis; confirmation of the formulated theory, together with verification of the diagnostic efficiency of the proposed solutions, by simulation based on modelling with the ISCAS'89 benchmarks; the demonstration of permanent features of modules during testing; and the presentation of a formal description of an arbitrary diagnostic structure, with a description of the optimization framework and the concept of the simulation tools used in this research. The study also demonstrates an original use of a genetic algorithm that yields highly efficient optimization (a minimal illustrative sketch follows the keyword list below). This part of the study presents a complete system for describing an arbitrary diagnostic structure together with the optimization method. The solutions presented in the dissertation open the way for further research. The dissertation consists of two parts which, despite their common basis in evolutionary algorithms, address distinct and thematically self-contained issues.
    Keywords: optimization, multi-criteria optimization, multimodal optimization, evolutionary algorithms, genetic algorithms, immune algorithms, particle swarm optimization algorithms, a group of immune systems, the algorithm of observation, exchange of genetic material, fractal analysis, multifractal analysis, beset game algorithm, immune algorithm with auto-aggression, stationary problems, non-stationary problems, BIST structure, BIST structure optimization, BIST structure description, multi-modular circuit BIST
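    As a purely illustrative sketch of the genetic-algorithm machinery referred to above (the bitstring encoding and toy fitness are assumptions; the dissertation optimizes actual BIST diagnostic structures):

```python
# Minimal genetic algorithm sketch (illustrative only; the real work optimizes
# BIST diagnostic structures, here replaced by a toy bitstring fitness).
import random

GENOME_LEN, POP_SIZE, GENERATIONS = 24, 40, 60
TARGET = [random.randint(0, 1) for _ in range(GENOME_LEN)]  # stand-in optimum

def fitness(genome):
    """Toy objective: number of positions matching a hidden target pattern."""
    return sum(g == t for g, t in zip(genome, TARGET))

def tournament(pop):
    return max(random.sample(pop, 3), key=fitness)

def crossover(a, b):
    cut = random.randrange(1, GENOME_LEN)
    return a[:cut] + b[cut:]

def mutate(genome, rate=0.02):
    return [1 - g if random.random() < rate else g for g in genome]

pop = [[random.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    pop = [mutate(crossover(tournament(pop), tournament(pop))) for _ in range(POP_SIZE)]

best = max(pop, key=fitness)
print("best fitness:", fitness(best), "of", GENOME_LEN)
```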

    Methodologies for Accelerated Analysis of the Reliability and the Energy Efficiency Levels of Modern Microprocessor Architectures

    The evolution of semiconductor manufacturing technology, computer architecture and design leads to increased performance in modern microprocessors, which is also accompanied by an increase in the products' vulnerability to errors. Designers apply different techniques throughout a microprocessor's lifetime in order to ensure the high reliability requirements of the delivered products, defined as their ability to avoid service failures that are more frequent and more severe than is acceptable. This thesis proposes novel methods to guarantee the high reliability and energy efficiency requirements of modern microprocessors that can be applied during the early design phase, the manufacturing phase, or after the chips' release to the market. The contributions of this thesis can be grouped into the following two categories according to the phase of the CPU lifecycle at which they are applied:
    • Early design phase: Statistical fault injection into microarchitectural structures modeled in performance simulators is a state-of-the-art method to accurately measure reliability, but suffers from low simulation throughput. In this thesis, we first present a novel, fully automated, versatile microarchitecture-level fault injection framework (called MaFIN) for accurate reliability characterization of a wide range of hardware components of an x86-64 microarchitecture with respect to various fault models (transient, intermittent, and permanent faults). Next, using the same tool and focusing on transient faults, we present several reliability- and performance-related studies that can assist design decisions in the early design phases. Moreover, we propose two methodologies to accelerate statistical fault injection campaigns. In the first, we accelerate the fault injection campaigns after the actual injection of the faults into the simulated hardware structures. In the second, we further accelerate microarchitecture-level fault injection campaigns by proposing MeRLiN, a fault pre-processing methodology based on pruning the initial fault list by grouping faults into equivalent classes according to the instruction access patterns to hardware entries.
    • Manufacturing phase and release to the market: The contributions of this thesis in these phases of the microprocessor life-cycle cover two important aspects. First, using Intel's 48-core SCC architecture, we propose a technique to accelerate online error detection of permanent faults for many-core architectures by exploiting their high-speed message-passing on-chip network. Second, we propose a comprehensive statistical analysis methodology to accurately predict, at the system level, the safe voltage operating margins of the ARMv8 cores of the X-Gene 2 chip when it operates under scaled voltage conditions
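    The fault-grouping step behind MeRLiN can be pictured with a deliberately small example. In the sketch below, the data and interfaces are illustrative assumptions rather than the framework's own: faults whose corrupted entry is next read by the same static instruction fall into one equivalence class, and only one representative per class needs full simulation.

```python
# Simplified sketch of MeRLiN-style fault-list pruning (illustrative only;
# the real methodology works inside a microarchitectural simulator).
from collections import defaultdict

# Hypothetical fault list: (hardware entry, injection cycle) pairs.
faults = [("regfile[3]", 100), ("regfile[3]", 250), ("regfile[7]", 120),
          ("regfile[7]", 400), ("lq[2]", 90)]

# Hypothetical access trace: for each entry, the PC of the instruction that
# first reads it after a given cycle (stand-in for the simulator's trace).
def first_reader_pc(entry, cycle):
    trace = {"regfile[3]": [(150, 0x4004D0), (300, 0x4004F8)],
             "regfile[7]": [(200, 0x400510), (450, 0x400510)],
             "lq[2]":      [(95,  0x400530)]}
    for read_cycle, pc in trace[entry]:
        if read_cycle >= cycle:
            return pc
    return None  # never read again: the fault is masked

# Group faults into equivalence classes keyed by the first reader instruction.
classes = defaultdict(list)
for entry, cycle in faults:
    pc = first_reader_pc(entry, cycle)
    if pc is not None:
        classes[(entry, pc)].append((entry, cycle))

# Simulate only one representative per class and reuse its outcome.
representatives = [group[0] for group in classes.values()]
print(f"{len(faults)} faults pruned to {len(representatives)} simulations")
```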

    A DETECTION AND DATA ACQUISITION SYSTEM FOR PRECISION BETA DECAY SPECTROSCOPY

    Free neutron and nuclear beta decay spectroscopy serves as a robust laboratory for investigations of the Standard Model of Particle Physics. Observables such as decay product angular correlations and energy spectra overconstrain the Standard Model and serve as a sensitive probe for Beyond the Standard Model physics. Improved measurement of these quantities is necessary to complement the TeV scale physics being conducted at the Large Hadron Collider. The UCNB, 45Ca, and Nab experiments aim to improve upon existing measurements of free neutron decay angular correlations and set new limits in the search for exotic couplings in beta decay. To achieve these experimental goals, a highly-pixelated, thick silicon detector with a 100 nm entrance window has been developed for precision beta spectroscopy and the direct detection of 30 keV beta decay protons. The detector has been characterized for its performance in energy reconstruction and particle arrival time determination. A Monte Carlo simulation of signal formation in the silicon detector and propagation through the electronics chain has been written to develop optimal signal analysis algorithms for minimally biased energy and timing extraction. A tagged-electron timing test has been proposed and investigated as a means to assess the validity of these Monte Carlo efforts. A universal platform for data acquisition (DAQ) has been designed and implemented in National Instruments' PXIe-5171R digitizer/FPGA hardware. The DAQ retains a ring buffer of the most recent 400 ms of data in all 256 channels, so that a waveform trace can be returned from any combination of pixels and resolution for complete energy reconstruction. Low-threshold triggers on individual channels were implemented in FPGA as a generic piecewise-polynomial filter for universal, real-time digital signal processing, which allows for arbitrary filter implementation on a pixel-by-pixel basis. This system is universal in the sense that it has completely flexible, complex, and debuggable triggering at both the pixel and global level without recompiling the firmware. The culmination of this work is a system capable of a 10 keV trigger threshold, 3 keV resolution, and a maximum 300 ps arrival-time systematic, even in the presence of large-amplitude noise components
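    A generic flavour of per-channel digital triggering of this kind can be sketched as follows (an illustrative stand-in, not the experiment's piecewise-polynomial FPGA filter): run a short difference filter over the sampled waveform and fire when it crosses a per-pixel threshold.

```python
# Generic per-channel trigger sketch (illustrative; the actual DAQ implements
# a configurable piecewise-polynomial filter in FPGA, per pixel).

def moving_difference(samples, gap=4, width=4):
    """Fast shaping filter: difference of two short moving sums."""
    out = []
    for i in range(width + gap + width, len(samples) + 1):
        late = sum(samples[i - width:i])
        early = sum(samples[i - width - gap - width:i - gap - width])
        out.append(late - early)
    return out

def trigger_index(samples, threshold):
    """Return the first filtered sample index above threshold, or None."""
    for i, v in enumerate(moving_difference(samples)):
        if v > threshold:
            return i
    return None

# Toy waveform: flat baseline followed by a step (a stand-in for a pulse edge).
waveform = [0] * 40 + [50] * 40
print("trigger at filtered index:", trigger_index(waveform, threshold=100))
```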

    Embedded System Design

    A unique feature of this open access textbook is that it provides a comprehensive introduction to the fundamentals of embedded systems, with applications in cyber-physical systems and the Internet of Things. It starts with an introduction to the field and a survey of specification models and languages for embedded and cyber-physical systems. It provides a brief overview of hardware devices used for such systems and presents the essentials of system software for embedded systems, including real-time operating systems. The author also discusses evaluation and validation techniques for embedded systems and provides an overview of techniques for mapping applications to execution platforms, including multi-core platforms. Embedded systems have to operate under tight constraints and, hence, the book also contains a selected set of optimization techniques, including software optimization techniques. The book closes with a brief survey on testing. This fourth edition has been updated and revised to reflect new trends and technologies, such as the importance of cyber-physical systems (CPS) and the Internet of Things (IoT), the evolution from single-core processors to multi-core processors, and the increased importance of energy efficiency and thermal issues

    Cellular Automata

    Modelling and simulation are disciplines of major importance for science and engineering. There is no science without models, and simulation has nowadays become a very useful, sometimes indispensable, tool for the development of both science and engineering. The main attraction of cellular automata is that, despite a conceptual simplicity that makes them easy to implement in computer simulations and, in principle, amenable to detailed and complete mathematical analysis, they are able to exhibit a wide variety of amazingly complex behaviour. This feature has attracted the attention of researchers from a wide variety of fields within the exact sciences and engineering, but also from the social sciences and sometimes beyond. The collective complex behaviour of numerous systems, which emerges from the interaction of a multitude of simple individuals, is conveniently modelled and simulated with cellular automata for very different purposes. In this book, a number of innovative applications of cellular automata models in the fields of Quantum Computing, Materials Science, Cryptography and Coding, and Robotics and Image Processing are presented
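    The point about simple local rules producing complex global behaviour is easy to demonstrate with an elementary automaton (a generic rule 30 example, not one of the book's specific models):

```python
# Elementary cellular automaton (rule 30): each cell's next state depends only
# on itself and its two neighbours, yet the global pattern is highly complex.
RULE = 30

def step(cells):
    n = len(cells)
    return [
        (RULE >> ((cells[(i - 1) % n] << 2) | (cells[i] << 1) | cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

cells = [0] * 31 + [1] + [0] * 31          # single seed cell in the middle
for _ in range(16):
    print("".join("#" if c else "." for c in cells))
    cells = step(cells)
```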

    Multimodal Wearable Sensors for Human-Machine Interfaces

    Certain areas of the body, such as the hands, eyes and organs of speech production, provide high-bandwidth information channels from the conscious mind to the outside world. The objective of this research was to develop an innovative wearable sensor device that records signals from these areas more conveniently than has previously been possible, so that they can be harnessed for communication. A novel bioelectrical and biomechanical sensing device, the wearable endogenous biosignal sensor (WEBS), was developed and tested in various communication and clinical measurement applications. One ground-breaking feature of the WEBS system is that it digitises biopotentials almost at the point of measurement. Its electrode connects directly to a high-resolution analog-to-digital converter. A second major advance is that, unlike previous active biopotential electrodes, the WEBS electrode connects to a shared data bus, allowing a large or small number of them to work together with relatively few physical interconnections. Another unique feature is its ability to switch dynamically between recording and signal source modes. An accelerometer within the device captures real-time information about its physical movement, not only facilitating the measurement of biomechanical signals of interest, but also allowing motion artefacts in the bioelectrical signal to be detected. Each of these innovative features has potentially far-reaching implications in biopotential measurement, both in clinical recording and in other applications. Weighing under 0.45 g and being remarkably low-cost, the WEBS is ideally suited for integration into disposable electrodes. Several such devices can be combined to form an inexpensive digital body sensor network, with shorter set-up time than conventional equipment, more flexible topology, and fewer physical interconnections. One phase of this study evaluated areas of the body as communication channels. The throat was selected for detailed study since it yields a range of voluntarily controllable signals, including laryngeal vibrations and gross movements associated with vocal tract articulation. A WEBS device recorded these signals and several novel methods of human-to-machine communication were demonstrated. To evaluate the performance of the WEBS system, recordings were validated against a high-end biopotential recording system for a number of biopotential signal types. To demonstrate an application for use by a clinician, the WEBS system was used to record a 12‑lead electrocardiogram with augmented mechanical movement information