28 research outputs found

    Automated Design of Approximate Accelerators

    Get PDF
    Over the past decade, the need for computational efficiency has motivated the development of new devices, architectures, and design techniques. Approximate computing has emerged as a modern, energy-efficient design paradigm for applications that exhibit inherent error tolerance. When the accuracy of results in current applications such as image processing, computer vision, and machine learning is relaxed to an acceptable level, savings in circuit area, circuit delay, and power consumption can be achieved. With the advent of this approximate computing paradigm, many approximate functional units have been reported in the literature, in particular approximate adders and multipliers. Given the large variety of such approximate circuits, and considering their use as building blocks for designing approximate accelerators for error-tolerant applications, a challenge arises: selecting, for a given application, those approximate circuits that minimize the required resources while meeting a defined accuracy. This dissertation proposes automated methods for designing and implementing approximate accelerators built from approximate arithmetic circuits. To achieve this, it addresses the following challenges and delivers the following novel contributions: Many approximate adders and multipliers have been presented in the literature, either as approximate designs derived from exact implementations such as the ripple-carry adder, or generated by approximate logic synthesis (ALS) methods. A representative set of these approximate components is required to build approximate accelerators. To this end, this dissertation presents two approaches for creating such approximate arithmetic circuits. First, AUGER is introduced, a tool that generates register-transfer level (RTL) descriptions for a broad set of approximate adders and multipliers across different data bit-widths and accuracy configurations. With AUGER, a design space exploration (DSE) of approximate components can be performed to find those that are Pareto-optimal for a given bit-width, approximation range, and circuit metric. Next, AxLS is presented, a framework for ALS that enables the implementation of state-of-the-art methods, and the proposal of novel ones, that perform structural netlist transformations to generate approximate arithmetic circuits from exact circuits. In addition, both tools provide an error characterization, in the form of an error distribution, and circuit characteristics (area, circuit delay, and power) for every approximate circuit they generate. This information is essential to the research goals of this dissertation. Despite their error tolerance, approximate accelerators must be designed to meet accuracy requirements.
To design such accelerators using approximate arithmetic circuits, it is therefore essential to evaluate how the errors introduced by approximate circuits propagate through other computations, whether exact or approximate, and eventually accumulate at the output. This dissertation proposes analytical models that describe error propagation through exact and approximate computations. Based on these models, an automated, compiler-based methodology is proposed to estimate the error propagation in approximate accelerator designs. This methodology is integrated into a tool, CEDA, that performs fast, simulation-free accuracy estimation of approximate accelerator models described in C code. When designing approximate accelerators, repeated gate-level simulation and circuit synthesis take substantial time to explore many, or even all, possible combinations for a given set of approximate arithmetic circuits. At the same time, current trends in accelerator design rely on high-level synthesis (HLS) tools. This dissertation presents analytical models for estimating the computational resources required when approximate adders and multipliers are used in approximate accelerator designs. Moreover, these models, together with the proposed analytical models for accuracy estimation, are integrated into a DSE methodology for error-tolerant applications, DSEwam, to identify Pareto-optimal or near-Pareto-optimal approximate accelerator solutions. DSEwam is integrated into an HLS tool to automatically generate RTL descriptions of approximate accelerators from C-language descriptions, for a given error threshold and minimization target. The use of approximate accelerators must ensure that the errors produced by approximate computations stay within a defined maximum for a given accuracy metric. However, the errors produced by approximate accelerators depend on the input data, which may differ from the data used during design. This dissertation presents ECAx, an automated method for exploring and applying fine-grained, low-overhead error correction in approximate accelerators, which lowers the cost of error correction at the software level (as done in the literature). This is achieved by selectively correcting the most significant errors (in terms of their magnitude) produced by the approximate components, without forfeiting the benefits of the approximations. The experimental evaluation shows speedup improvements for the application in exchange for a slightly increased area and power consumption in the approximate accelerator design.
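As a concrete illustration of the kind of approximate arithmetic component discussed above, the following is a minimal Python sketch of a lower-part-OR adder in the spirit of the LOA family (simplified: the carry-in AND gate of the full LOA is omitted). The bit-width and the number of approximated bits are hypothetical parameters; this is not AUGER's generated RTL.

```python
# Simplified LOA-style approximate adder: the low `approx_bits` bits are
# combined with a bitwise OR (no carry propagation there), the rest exactly.

def loa_add(a: int, b: int, width: int = 16, approx_bits: int = 4) -> int:
    """Approximately add two unsigned integers of the given width."""
    mask_lo = (1 << approx_bits) - 1
    # Lower part: OR instead of a real addition (saves the carry chain).
    lo = (a & mask_lo) | (b & mask_lo)
    # Upper part: exact addition of the remaining bits.
    hi = ((a >> approx_bits) + (b >> approx_bits)) << approx_bits
    return (hi | lo) & ((1 << width) - 1)

# The error is confined to the approximated lower bits.
exact = 1234 + 5678
approx = loa_add(1234, 5678)
print(exact, approx, exact - approx)   # 6912 6910 2
```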

    A Review of Bayesian Methods in Electronic Design Automation

    Full text link
    The utilization of Bayesian methods has been widely acknowledged as a viable solution for tackling various challenges in electronic integrated circuit (IC) design under stochastic process variation, including circuit performance modeling, yield/failure rate estimation, and circuit optimization. As the post-Moore era brings about new technologies (such as silicon photonics and quantum circuits), many of the associated issues are similar to those encountered in electronic IC design and can be addressed using Bayesian methods. Motivated by this observation, we present a comprehensive review of Bayesian methods in electronic design automation (EDA). By doing so, we hope to equip researchers and designers with the ability to apply Bayesian methods to solving stochastic problems in electronic circuits and beyond.
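To make the flavor of these methods concrete, here is a minimal, hedged sketch of one task the review covers: Bayesian estimation of a circuit failure rate from pass/fail Monte Carlo samples, using a conjugate Beta-Bernoulli model. The sample counts and the uniform prior are made-up placeholders, not taken from the paper.

```python
# Bayesian failure-rate estimation from pass/fail Monte Carlo outcomes.
from scipy import stats

alpha0, beta0 = 1.0, 1.0          # uniform Beta(1,1) prior on the failure rate
n_samples, n_fail = 5000, 12      # hypothetical Monte Carlo results

# Conjugacy: Beta prior + Bernoulli likelihood -> Beta posterior.
posterior = stats.beta(alpha0 + n_fail, beta0 + n_samples - n_fail)
print(f"posterior mean failure rate: {posterior.mean():.2e}")
print(f"95% credible interval: {posterior.interval(0.95)}")
```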

    PDNPulse: Sensing PCB Anomaly with the Intrinsic Power Delivery Network

    Full text link
    The ubiquitous presence of printed circuit boards (PCBs) in modern electronic systems and embedded devices makes their integrity a top security concern. To take advantage of economies of scale, today's PCB design and manufacturing are often performed by suppliers around the globe, exposing the boards to many security vulnerabilities along the segmented PCB supply chain. Moreover, the increasing complexity of PCB designs leaves ample room for numerous sneaky board-level attacks to be implemented throughout each stage of a PCB's lifetime, threatening many electronic devices. In this paper, we propose PDNPulse, a power delivery network (PDN) based PCB anomaly detection framework that can identify a wide spectrum of board-level malicious modifications. PDNPulse leverages the fact that the PDN's characteristics are inevitably affected by modifications to the PCB, no matter how minuscule. By detecting changes to the PDN impedance profile and using FrĂ©chet distance-based anomaly detection algorithms, PDNPulse can robustly and successfully discern malicious modifications across the system. Using PDNPulse, we conduct extensive experiments on seven commercial off-the-shelf PCBs, covering different design scales, different threat models, and seven different anomaly types. The results confirm that PDNPulse creates an effective security asymmetry between attack and defense.
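The core metric underlying the detection step can be illustrated with a short sketch: the discrete FrĂ©chet distance between a trusted ("golden") impedance profile and a profile under test, both sampled at the same frequency points. This is only the distance computation; PDNPulse's full detection pipeline is more involved, and the numbers below are invented.

```python
# Discrete Frechet distance between two 1-D impedance profiles.
from functools import lru_cache

def frechet(p, q):
    """Discrete Frechet distance between point sequences p and q."""
    @lru_cache(maxsize=None)
    def d(i, j):
        cur = abs(p[i] - q[j])            # pointwise distance (1-D profiles)
        if i == 0 and j == 0:
            return cur
        if i == 0:
            return max(d(0, j - 1), cur)
        if j == 0:
            return max(d(i - 1, 0), cur)
        return max(min(d(i - 1, j), d(i - 1, j - 1), d(i, j - 1)), cur)
    return d(len(p) - 1, len(q) - 1)

golden  = (0.10, 0.12, 0.30, 0.25, 0.11)   # made-up |Z(f)| of a trusted board
suspect = (0.10, 0.13, 0.45, 0.26, 0.11)   # made-up profile under test
print(frechet(golden, suspect))            # flag as anomalous above a threshold
```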

    Formal connectivity verification of clock and reset signals in ultra-low-power SoC designs

    Get PDF
    Abstract. This thesis investigates the use of formal connectivity verification for clock and reset signal connectivity in ultra-low-power SoC designs. The origin of power consumption in CMOS circuits is explained, and the conflict between dynamic and static power at the system parameter level is introduced. Common power reduction techniques are introduced and explained in some detail. An overview of functional verification and its role in the design flow is presented. The main classification of functional verification into logic simulation and formal verification is discussed, and details of both are explained and compared. Challenges arising from low-power design methodologies are introduced. A detailed view of connectivity and integration in SoC designs is provided, and a specific method of verifying connectivity is introduced in the form of formal connectivity verification. The practical part of the thesis starts with an explanation of the verification goal and the requirements for achieving it. The structure of the design environment used in the verification task is explained, along with the different stages at which the verification was conducted. The creation of the connectivity properties and the process flow used with the chosen software tool are presented. The process of confirming falsified properties as design bugs is introduced. The results of the verification task are presented, giving the total number of verification targets for each verification stage as well as the bugs found. The bugs found and their circumstances are explained. A comparison is made between the conventional method of verifying connectivity and the investigated formal method. The results show a great decrease in the overall work effort, resourcing, and time spent on connectivity verification.
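The thesis verifies connectivity with formal properties on RTL; purely as a language-agnostic illustration of the underlying question ("is this clock root structurally connected to this sink?"), the following Python sketch phrases it as reachability over a toy netlist graph. The net names and topology are hypothetical, and this is not the formal tool flow the thesis uses.

```python
# Structural connectivity as graph reachability over a toy netlist.
from collections import deque

netlist = {                        # driver net -> driven nets (hypothetical)
    "clk_root": ["clk_div", "clk_gate_in"],
    "clk_gate_in": ["clk_gated"],
    "clk_div": ["cpu_clk"],
}

def is_connected(netlist, src, dst):
    """BFS reachability: True if `dst` is structurally driven by `src`."""
    seen, queue = {src}, deque([src])
    while queue:
        net = queue.popleft()
        if net == dst:
            return True
        for nxt in netlist.get(net, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

assert is_connected(netlist, "clk_root", "cpu_clk")       # expected path
assert not is_connected(netlist, "clk_root", "uart_clk")  # missing connection
```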

    Module-lattice KEM Over a Ring of Dimension 128 for Embedded Systems

    Get PDF
    Following the development of quantum computing, the demand for post-quantum alternatives to current cryptosystems has increased markedly in recent years. The main disadvantage of those schemes is the amount of resources needed to implement them in comparison with their classical counterparts. In conjunction with the growth of the Internet of Things, it is crucial to know whether post-quantum algorithms can run in constrained environments without incurring an unacceptable performance penalty. In this paper, we propose an instantiation of a module-lattice-based KEM working over a ring of dimension 128 and using a limited amount of memory at runtime. It can be seen as a lightweight version of Kyber or a module version of Frodo. We propose parameters targeting popular 8-bit AVR microcontrollers and security level 1 of NIST. Our implementation fits in around 2 KB of RAM while still providing reasonable efficiency and 128 bits of security, but at the cost of reduced correctness.
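The arithmetic core of such a scheme is multiplication in R_q = Z_q[X]/(X^n + 1) with n = 128. Below is a hedged schoolbook sketch of that negacyclic multiplication in Python; the modulus q is an assumed placeholder (not the paper's parameter set), and a real embedded implementation would use a constant-time, memory-lean routine rather than this O(n^2) loop.

```python
# Schoolbook multiplication in Z_Q[X]/(X^N + 1) (negacyclic convolution).
N, Q = 128, 3329   # ring dimension from the paper; Q is an assumed modulus

def poly_mul(a, b):
    """Multiply a, b in Z_Q[X]/(X^N + 1); coefficients as length-N lists."""
    res = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < N:
                res[k] = (res[k] + ai * bj) % Q
            else:                  # X^N == -1, so wrap with a sign flip
                res[k - N] = (res[k - N] - ai * bj) % Q
    return res

# Sanity check: X^(N-1) * X == X^N == -1, i.e., constant term Q - 1.
x = [0] * N; x[1] = 1
xn1 = [0] * N; xn1[N - 1] = 1
assert poly_mul(xn1, x)[0] == Q - 1
```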

    Comparing the Effectiveness of Support Vector Machines and Convolutional Neural Networks for Determining User Intent in Conversational Agents

    Get PDF
    Over the last fifty years, conversational agent systems have evolved in their ability to understand natural language input. In recent years Natural Language Processing (NLP) and Machine Learning (ML) have allowed computer systems to make great strides in the area of natural language understanding. However, little research has been carried out in these areas within the context of conversational systems. This paper identifies Convolutional Neural Network (CNN) and Support Vector Machine (SVM) as the two ML algorithms with the best record of performance in existing NLP literature, with CNN indicated as generating the better results of the two. A comprehensive experiment is defined where the results of SVM models utilising several kernels are compared to the results of a selection of CNN models. To contextualise the experiment to conversational agents, a dataset based on conversational interactions is used. A state-of-the-art NLP pipeline is also created to work with both algorithms in the context of the agent dataset. By conducting a detailed statistical analysis of the results, this paper proposes to provide an extensive indicator as to which algorithm offers better performance for agent-based systems. Ultimately the experimental results indicate that CNN models do not necessarily generate better results than SVM models. In fact, the SVM model utilising a Radial Basis Function kernel generates statistically better results than all other models considered under these experimental conditions.
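For orientation, the winning configuration reported above, an SVM with a Radial Basis Function kernel over vectorized utterances, can be sketched in a few lines of scikit-learn. The tiny inline dataset is invented; the paper uses a conversational-agent corpus and a full NLP pipeline.

```python
# Intent classification with an RBF-kernel SVM over TF-IDF features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

utterances = ["what time do you open", "book a table for two",
              "cancel my reservation", "when do you close"]
intents    = ["hours", "booking", "cancel", "hours"]

model = make_pipeline(TfidfVectorizer(), SVC(kernel="rbf"))
model.fit(utterances, intents)
print(model.predict(["are you open on sunday"]))   # -> likely "hours"
```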

    Integrated Circuits Parasitic Capacitance Extraction Using Machine Learning and its Application to Layout Optimization

    Get PDF
    The impact of parasitic elements on the overall circuit performance keeps increasing from one technology generation to the next. In advanced process nodes, parasitic effects dominate the overall circuit performance. As a result, the accuracy requirements of parasitic extraction processes have increased significantly, especially for parasitic capacitance extraction. Existing parasitic capacitance extraction tools face many challenges in coping with the new accuracy requirements set by semiconductor foundries (< 5% error). Although field-solver methods can meet such requirements, they are very slow and have a limited capacity. The other alternative is rule-based parasitic capacitance extraction methods, which are faster and have a high capacity; however, they cannot consistently provide good accuracy, as they use a pre-characterized library of capacitance formulas that covers a limited number of layout patterns. On the other hand, the new parasitic extraction accuracy requirements also added challenges to existing parasitic-aware routing optimization methods, where simplified parasitic models are used to optimize layouts. This dissertation provides new solutions for interconnect parasitic capacitance extraction and parasitic-aware routing optimization methodologies in order to cope with the new accuracy requirements of advanced process nodes, as follows. First, machine learning compact models are developed in rule-based extractors to predict parasitic capacitances of cross-section layout patterns efficiently. The developed models mitigate the problems of the pre-characterized library approach, where each compact model is designed to efficiently extract parasitic capacitances of cross-sections of arbitrarily distributed metal polygons that belong to a specific set of metal layers (i.e., a layer combination). Therefore, the number of covered layout patterns is significantly increased. Second, machine learning compact models are developed to predict parasitic capacitances of middle-end-of-line (MEOL) layers around FinFETs and MOSFETs. Each compact model extracts parasitic capacitances of 3D MEOL patterns of a specific device type regardless of its metal polygon distribution. Therefore, the developed MEOL models can replace field-solvers in extracting MEOL patterns. Third, a novel accuracy-based hybrid parasitic capacitance extraction method is developed. The proposed hybrid flow divides a layout into windows and extracts the parasitic capacitances of each window using one of three parasitic capacitance extraction methods: 1) rule-based; 2) novel deep-neural-network-based; and 3) field-solver methods. This hybrid methodology uses neural-network classifiers to determine an appropriate extraction method for each window. Moreover, as an intermediate parasitic capacitance extraction method between rule-based and field-solver methods, a novel deep-neural-network-based extraction method is developed. This intermediate level of accuracy and speed is needed because using only rule-based and field-solver methods (for hybrid extraction) results in using the field-solver most of the time whenever high accuracy is required. Finally, a parasitic-aware layout routing optimization and analysis methodology is implemented based on incremental parasitic extraction and a fast optimization methodology.
Unlike existing flows, which do not provide a mechanism to analyze the impact of modifying layout geometries on circuit performance, the proposed methodology provides novel sensitivity circuit models to analyze the integrity of signals in layout routes. These circuit models are based on an accurate matrix circuit representation, a cost function, and an accurate parasitic sensitivity extraction. The circuit models identify critical parasitic elements, along with the corresponding layout geometries in a given route, and measure the sensitivity of the route's performance to those geometries very quickly. Moreover, the proposed methodology uses a nonlinear programming technique to optimize problematic routes with pre-determined degrees of freedom using the proposed circuit models. Furthermore, it uses a novel incremental parasitic extraction method to extract the parasitic elements of modified geometries efficiently, where the incremental extraction is used as part of the routing optimization process to improve the optimization runtime and increase the optimization accuracy.
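The idea behind the compact models can be sketched as follows: train a small neural network to map cross-section layout features to a parasitic capacitance, in place of a pre-characterized formula library. The features, data, and network size below are synthetic placeholders, not the dissertation's actual feature set.

```python
# Toy regression from layout features to a capacitance value.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# hypothetical features: [conductor width, spacing, dielectric thickness]
X = rng.uniform(0.01, 0.1, size=(500, 3))
# toy target with a parallel-plate-like trend: cap ~ width / thickness
y = X[:, 0] / X[:, 2] + 0.1 * rng.standard_normal(500)

model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
model.fit(X, y)
print(model.predict([[0.05, 0.03, 0.02]]))   # predicted capacitance (a.u.)
```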

    Discrete Gate Sizing Methodologies for Delay, Area and Power Optimization

    Get PDF
    The modeling of an individual gate and the optimization of circuit performance have long been critical issues in the VLSI industry. In this work, we first study the gate sizing problem for today's industrial designs and explore the contributions and limitations of all the existing approaches, which mainly suffer from producing only continuous solutions, using outdated timing models, or being inefficient. In this dissertation, we present our new discrete gate sizing technique, which optimizes different aspects of circuit performance, including delay, area, and power consumption. Our method is fast and efficient, as it applies a local search instead of a global exhaustive search during the gate size selection process, which greatly reduces the search space and the computational cost. In addition, it is flexible with respect to different timing models, and it is able to deal with constraints on input/output slew and output load capacitance, for which very few previous research works have been reported. We then propose a new timing model, which is derived from the classic Elmore delay model but takes on the features of modern timing models from standard cell libraries. With our new timing model, we are able to formulate the combinatorial discrete sizing problem as a simplified mathematical expression and apply it to the existing Lagrangian relaxation method, which is shown to converge to the optimal solution. We demonstrate that classic Elmore-delay-based gate sizing approaches can still be valid. Therefore, our work might provide a new look into the numerous Elmore-delay-based research works in various areas (such as placement, routing, layout, buffer insertion, timing analysis, etc.).
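For reference, the classic Elmore delay that the proposed timing model extends computes the delay to a node i in an RC tree as the sum over all nodes k of R(shared path) * C_k, where the shared path is the portion of the root-to-k resistance path that also lies on the root-to-i path. A minimal sketch with made-up tree values (this illustrates the textbook formula, not the dissertation's extended model):

```python
# Elmore delay in an RC tree given parent pointers, per-edge resistances R
# (the resistance of the edge driving each node) and node capacitances C.

def elmore_delay(parent, R, C, node):
    """Elmore delay from the root to `node`."""
    def path(n):                       # nodes from the root down to n
        return [] if n is None else path(parent[n]) + [n]
    target_path = set(path(node))
    delay = 0.0
    for k in C:                        # each capacitor contributes through
        shared = [n for n in path(k) if n in target_path]  # shared resistance
        delay += sum(R[n] for n in shared) * C[k]
    return delay

# Toy tree: root -> a -> b, with a side branch a -> c.
parent = {"a": None, "b": "a", "c": "a"}
R = {"a": 10.0, "b": 20.0, "c": 15.0}   # ohms
C = {"a": 1e-3, "b": 2e-3, "c": 1e-3}   # farads
print(elmore_delay(parent, R, C, "b"))  # 10*1e-3 + 30*2e-3 + 10*1e-3 = 0.08
```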

    Dependable Embedded Systems

    Get PDF
    This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. The book introduces the most prominent reliability concerns from today's point of view and roughly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level, such as the circuit level or the system level alone, the focus of this book is to address the different reliability challenges across levels, from the physical level all the way up to the system level (cross-layer approaches). The book aims at demonstrating how new hardware/software co-design solutions can be proposed to effectively mitigate reliability degradation such as transistor aging, process variation, temperature effects, soft errors, etc. Provides readers with the latest insights into novel, cross-layer methods and models with respect to the dependability of embedded systems; describes cross-layer approaches that can leverage reliability through techniques that are proactively designed with respect to techniques at other layers; explains run-time adaptation and concepts/means of self-organization, in order to achieve error resiliency in complex, future many-core systems.