207 research outputs found

    Machine Learning for Resource-Constrained Computing Systems

    Get PDF
    Die verfügbaren Ressourcen in Informationsverarbeitungssystemen wie Prozessoren sind in der Regel eingeschränkt. Das umfasst z. B. die elektrische Leistungsaufnahme, den Energieverbrauch, die Wärmeabgabe oder die Chipfläche. Daher ist die Optimierung der Verwaltung der verfügbaren Ressourcen von größter Bedeutung, um Ziele wie maximale Performanz zu erreichen. Insbesondere die Ressourcenverwaltung auf der Systemebene hat über die (dynamische) Zuweisung von Anwendungen zu Prozessorkernen und über die Skalierung der Spannung und Frequenz (dynamic voltage and frequency scaling, DVFS) einen großen Einfluss auf die Performanz, die elektrische Leistung und die Temperatur während der Ausführung von Anwendungen. Die wichtigsten Herausforderungen bei der Ressourcenverwaltung sind die hohe Komplexität von Anwendungen und Plattformen, unvorhergesehene (zur Entwurfszeit nicht bekannte) Anwendungen oder Plattformkonfigurationen, proaktive Optimierung und die Minimierung des Laufzeit-Overheads. Bestehende Techniken, die auf einfachen Heuristiken oder analytischen Modellen basieren, gehen diese Herausforderungen nur unzureichend an. Aus diesem Grund ist der Hauptbeitrag dieser Dissertation der Einsatz maschinellen Lernens (ML) für Ressourcenverwaltung. ML-basierte Lösungen ermöglichen die Bewältigung dieser Herausforderungen durch die Vorhersage der Auswirkungen potenzieller Entscheidungen in der Ressourcenverwaltung, durch Schätzung verborgener (unbeobachtbarer) Eigenschaften von Anwendungen oder durch direktes Lernen einer Ressourcenverwaltungs-Strategie. Diese Dissertation entwickelt mehrere neuartige ML-basierte Ressourcenverwaltung-Techniken für verschiedene Plattformen, Ziele und Randbedingungen. Zunächst wird eine auf Vorhersagen basierende Technik zur Maximierung der Performanz von Mehrkernprozessoren mit verteiltem Last-Level Cache und limitierter Maximaltemperatur vorgestellt. Diese verwendet ein neuronales Netzwerk (NN) zur Vorhersage der Auswirkungen potenzieller Migrationen von Anwendungen zwischen Prozessorkernen auf die Performanz. Diese Vorhersagen erlauben die Bestimmung der bestmöglichen Migration und ermöglichen eine proaktive Verwaltung. Das NN ist so trainiert, dass es mit unbekannten Anwendungen und verschiedenen Temperaturlimits zurechtkommt. Zweitens wird ein Boosting-Verfahren zur Maximierung der Performanz homogener Mehrkernprozessoren mit limitierter Maximaltemperatur mithilfe von DVFS vorgestellt. Dieses basiert auf einer neuartigen {Boostability}-Metrik, die die Abhängigkeiten von Performanz, elektrischer Leistung und Temperatur auf Spannungs/Frequenz-Änderungen in einer Metrik vereint. % ignorerepeated Die Abhängigkeiten von Performanz und elektrischer Leistung hängen von der Anwendung ab und können zur Laufzeit nicht direkt beobachtet (gemessen) werden. Daher wird ein NN verwendet, um diese Werte für unbekannte Anwendungen zu schätzen und so die Komplexität der Boosting-Optimierung zu bewältigen. Drittens wird eine Technik zur Temperaturminimierung von heterogenen Mehrkernprozessoren mit Quality of Service-Zielen vorgestellt. Diese verwendet Imitationslernen, um eine Migrationsstrategie von Anwendungen aus optimalen Orakel-Demonstrationen zu lernen. Dafür wird ein NN eingesetzt, um die Komplexität der Plattform und des Anwendungsverhaltens zu bewältigen. Die Inferenz des NNs wird mit Hilfe eines vorhandenen generischen Beschleunigers, einer Neural Processing Unit (NPU), beschleunigt. Auch die ML Algorithmen selbst müssen auch mit begrenzten Ressourcen ausgeführt werden. Zuletzt wird eine Technik für ressourcenorientiertes Training auf verteilten Geräten vorgestellt, um einen konstanten Trainingsdurchsatz bei sich schnell ändernder Verfügbarkeit von Rechenressourcen aufrechtzuerhalten, wie es z.~B.~aufgrund von Konflikten bei gemeinsam genutzten Ressourcen der Fall ist. Diese Technik verwendet Structured Dropout, welches beim Training zufällige Teile des NNs auslässt. Dadurch können die erforderlichen Ressourcen für das Training dynamisch angepasst werden -- mit vernachlässigbarem Overhead, aber auf Kosten einer langsameren Trainingskonvergenz. Die Pareto-optimalen Dropout-Parameter pro Schicht des NNs werden durch eine Design Space Exploration bestimmt. Evaluierungen dieser Techniken werden sowohl in Simulationen als auch auf realer Hardware durchgeführt und zeigen signifikante Verbesserungen gegenüber dem Stand der Technik, bei vernachlässigbarem Laufzeit-Overhead. Zusammenfassend zeigt diese Dissertation, dass ML eine Schlüsseltechnologie zur Optimierung der Verwaltung der limitierten Ressourcen auf Systemebene ist, indem die damit verbundenen Herausforderungen angegangen werden

    Design Automation and Application for Emerging Reconfigurable Nanotechnologies

    Get PDF
    In the last few decades, two major phenomena have revolutionized the electronic industry – the ever-increasing dependence on electronic circuits and the Complementary Metal Oxide Semiconductor (CMOS) downscaling. These two phenomena have been complementing each other in a way that while electronics, in general, have demanded more computations per functional unit, CMOS downscaling has aptly supported such needs. However, while the computational demand is still rising exponentially, CMOS downscaling is reaching its physical limits. Hence, the need to explore viable emerging nanotechnologies is more imperative than ever. This thesis focuses on streamlining the existing design automation techniques for a class of emerging reconfigurable nanotechnologies. Transistors based on this technology exhibit duality in conduction, i.e. they can be configured dynamically either as a p-type or an n-type device on the application of an external bias. Owing to this dynamic reconfiguration, these transistors are also referred to as Reconfigurable Field-Effect Transistors (RFETs). Exploring and developing new technologies just like CMOS, require tackling two main challenges – first, design automation flow has to be modified to enable tailor- made circuit designs. Second, possible application opportunities should be explored where such technologies can outsmart the existing CMOS technologies. This thesis targets the above two objectives for emerging reconfigurable nanotechnologies by proposing approaches for enabling an Electronic Design Automation (EDA) flow for circuits based on RFETs and exploring hardware security as an application that exploits the transistor-level dynamic reconfiguration offered by this technology. This thesis explains the bottom-up approach adopted to propose a logic synthesis flow by identifying new logic gates and circuit design paradigms that can particularly exploit the dynamic reconfiguration offered by these novel nanotechnologies. This led to the subsequent need of finding natural Boolean logic abstraction for emerging reconfigurable nanotechnologies as it is shown that the existing abstraction of negative unate logic for CMOS technologies is sub-optimal for RFETs-based circuits. In this direction, it has been shown that duality in Boolean logic is a natural abstraction for this technology and can truly represent the duality in conduction offered by individual transistors. Finding this abstraction paved the way for defining suitable primitives and proposing various algorithms for logic synthesis and technology mapping. The following step is to explore compatible physical synthesis flow for emerging reconfigurable nanotechnologies. Using silicon nanowire-based RFETs, .lef and .lib files have been provided which can provide an end-to-end flow to generate .GDSII file for circuits exclusively based on RFETs. Additionally, new approaches have been explored to improve placement and routing for circuits based on reconfigurable nanotechnologies. It has been demonstrated how these approaches led to superior results as compared to the native flow meant for CMOS. Lastly, the unique property of transistor-level reconfiguration offered by RFETs is utilized to implement efficient Intellectual Property (IP) protection schemes against adversarial attacks. The ability to control the conduction of individual transistors can be argued as one of the impactful features of this technology and suitably fits into the paradigm of security measures. Prior security schemes based on CMOS technology often come with large overheads in terms of area, power, and delay. In contrast, RFETs-based hardware security measures such as logic locking, split manufacturing, etc. proposed in this thesis, demonstrate affordable security solutions with low overheads. Overall, this thesis lays a strong foundation for the two main objectives – design automation, and hardware security as an application, to push emerging reconfigurable nanotechnologies for commercial integration. Additionally, contributions done in this thesis are made available under open-source licenses so as to foster new research directions and collaborations.:Abstract List of Figures List of Tables 1 Introduction 1.1 What are emerging reconfigurable nanotechnologies? 1.2 Why does this technology look so promising? 1.3 Electronics Design Automation 1.4 The game of see-saw: key challenges vs benefits for emerging reconfigurable nanotechnologies 1.4.1 Abstracting ambipolarity in logic gate designs 1.4.2 Enabling electronic design automation for RFETs 1.4.3 Enhanced functionality: a suitable fit for hardware security applications 1.5 Research questions 1.6 Entire RFET-centric EDA Flow 1.7 Key Contributions and Thesis Organization 2 Preliminaries 2.1 Reconfigurable Nanotechnology 2.1.1 1D devices 2.1.2 2D devices 2.1.3 Factors favoring circuit-flexibility 2.2 Feasibility aspects of RFET technology 2.3 Logic Synthesis Preliminaries 2.3.1 Circuit Model 2.3.2 Boolean Algebra 2.3.3 Monotone Function and the property of Unateness 2.3.4 Logic Representations 3 Exploring Circuit Design Topologies for RFETs 3.1 Contributions 3.2 Organization 3.3 Related Works 3.4 Exploring design topologies for combinational circuits: functionality-enhanced logic gates 3.4.1 List of Combinational Functionality-Enhanced Logic Gates based on RFETs 3.4.2 Estimation of gate delay using the logical effort theory 3.5 Invariable design of Inverters 3.6 Sequential Circuits 3.6.1 Dual edge-triggered TSPC-based D-flip flop 3.6.2 Exploiting RFET’s ambipolarity for metastability 3.7 Evaluations 3.7.1 Evaluation of combinational logic gates 3.7.2 Novel design of 1-bit ALU 3.7.3 Comparison of the sequential circuit with an equivalent CMOS-based design 3.8 Concluding remarks 4 Standard Cells and Technology Mapping 4.1 Contributions 4.2 Organization 4.3 Related Work 4.4 Standard cells based on RFETs 4.4.1 Interchangeable Pull-Up and Pull-Down Networks 4.4.2 Reconfigurable Truth-Table 4.5 Distilling standard cells 4.6 HOF-based Technology Mapping Flow for RFETs-based circuits 4.6.1 Area adjustments through inverter sharings 4.6.2 Technology Mapping Flow 4.6.3 Realizing Parameters For The Generic Library 4.6.4 Defining RFETs-based Genlib for HOF-based mapping 4.7 Experiments 4.7.1 Experiment 1: Distilling standard-cells from a benchmark suite 4.7.2 Experiment 2A: HOF-based mapping . 4.7.3 Experiment 2B: Using the distilled standard-cells during mapping 4.8 Concluding Remarks 5 Logic Synthesis with XOR-Majority Graphs 5.1 Contributions 5.2 Organization 5.3 Motivation 5.4 Background and Preliminaries 5.4.1 Terminologies 5.4.2 Self-duality in NPN classes 5.4.3 Majority logic synthesis 5.4.4 Earlier work on XMG 5.4.5 Classification of Boolean functions 5.5 Preserving Self-Duality 5.5.1 During logic synthesis 5.5.2 During versatile technology mapping 5.6 Advanced Logic synthesis techniques 5.6.1 XMG resubstitution 5.6.2 Exact XMG rewriting 5.7 Logic representation-agnostic Mapping 5.7.1 Versatile Mapper 5.7.2 Support of supergates 5.8 Creating Self-dual Benchmarks 5.9 Experiments 5.9.1 XMG-based Flow 5.9.2 Experimental Setup 5.9.3 Synthetic self-dual benchmarks 5.9.4 Cryptographic benchmark suite 5.10 Concluding remarks and future research directions 6 Physical synthesis flow and liberty generation 6.1 Contributions 6.2 Organization 6.3 Background and Related Work 6.3.1 Related Works 6.3.2 Motivation 6.4 Silicon Nanowire Reconfigurable Transistors 6.5 Layouts for Logic Gates 6.5.1 Layouts for Static Functional Logic Gates 6.5.2 Layout for Reconfigurable Logic Gate 6.6 Table Model for Silicon Nanowire RFETs 6.7 Exploring Approaches for Physical Synthesis 6.7.1 Using the Standard Place & Route Flow 6.7.2 Open-source Flow 6.7.3 Concept of Driver Cells 6.7.4 Native Approach 6.7.5 Island-based Approach 6.7.6 Utilization Factor 6.7.7 Placement of the Island on the Chip 6.8 Experiments 6.8.1 Preliminary comparison with CMOS technology 6.8.2 Evaluating different physical synthesis approaches 6.9 Results and discussions 6.9.1 Parameters Which Affect The Area 6.9.2 Use of Germanium Nanowires Channels 6.10 Concluding Remarks 7 Polymporphic Primitives for Hardware Security 7.1 Contributions 7.2 Organization 7.3 The Shift To Explore Emerging Technologies For Security 7.4 Background 7.4.1 IP protection schemes 7.4.2 Preliminaries 7.5 Security Promises 7.5.1 RFETs for logic locking (transistor-level locking) 7.5.2 RFETs for split manufacturing 7.6 Security Vulnerabilities 7.6.1 Realization of short-circuit and open-circuit scenarios in an RFET-based inverter 7.6.2 Circuit evaluation on sub-circuits 7.6.3 Reliability concerns: A consequence of short-circuit scenario 7.6.4 Implication of the proposed security vulnerability 7.7 Analytical Evaluation 7.7.1 Investigating the security promises 7.7.2 Investigating the security vulnerabilities 7.8 Concluding remarks and future research directions 8 Conclusion 8.1 Concluding Remarks 8.2 Directions for Future Work Appendices A Distilling standard-cells B RFETs-based Genlib C Layout Extraction File (.lef) for Silicon Nanowire-based RFET D Liberty (.lib) file for Silicon Nanowire-based RFET

    Machine Learning-Aided Operations and Communications of Unmanned Aerial Vehicles: A Contemporary Survey

    Full text link
    The ongoing amalgamation of UAV and ML techniques is creating a significant synergy and empowering UAVs with unprecedented intelligence and autonomy. This survey aims to provide a timely and comprehensive overview of ML techniques used in UAV operations and communications and identify the potential growth areas and research gaps. We emphasise the four key components of UAV operations and communications to which ML can significantly contribute, namely, perception and feature extraction, feature interpretation and regeneration, trajectory and mission planning, and aerodynamic control and operation. We classify the latest popular ML tools based on their applications to the four components and conduct gap analyses. This survey also takes a step forward by pointing out significant challenges in the upcoming realm of ML-aided automated UAV operations and communications. It is revealed that different ML techniques dominate the applications to the four key modules of UAV operations and communications. While there is an increasing trend of cross-module designs, little effort has been devoted to an end-to-end ML framework, from perception and feature extraction to aerodynamic control and operation. It is also unveiled that the reliability and trust of ML in UAV operations and applications require significant attention before full automation of UAVs and potential cooperation between UAVs and humans come to fruition.Comment: 36 pages, 304 references, 19 Figure

    Cellular, Wide-Area, and Non-Terrestrial IoT: A Survey on 5G Advances and the Road Towards 6G

    Full text link
    The next wave of wireless technologies is proliferating in connecting things among themselves as well as to humans. In the era of the Internet of things (IoT), billions of sensors, machines, vehicles, drones, and robots will be connected, making the world around us smarter. The IoT will encompass devices that must wirelessly communicate a diverse set of data gathered from the environment for myriad new applications. The ultimate goal is to extract insights from this data and develop solutions that improve quality of life and generate new revenue. Providing large-scale, long-lasting, reliable, and near real-time connectivity is the major challenge in enabling a smart connected world. This paper provides a comprehensive survey on existing and emerging communication solutions for serving IoT applications in the context of cellular, wide-area, as well as non-terrestrial networks. Specifically, wireless technology enhancements for providing IoT access in fifth-generation (5G) and beyond cellular networks, and communication networks over the unlicensed spectrum are presented. Aligned with the main key performance indicators of 5G and beyond 5G networks, we investigate solutions and standards that enable energy efficiency, reliability, low latency, and scalability (connection density) of current and future IoT networks. The solutions include grant-free access and channel coding for short-packet communications, non-orthogonal multiple access, and on-device intelligence. Further, a vision of new paradigm shifts in communication networks in the 2030s is provided, and the integration of the associated new technologies like artificial intelligence, non-terrestrial networks, and new spectra is elaborated. Finally, future research directions toward beyond 5G IoT networks are pointed out.Comment: Submitted for review to IEEE CS&

    one6G white paper, 6G technology overview:Second Edition, November 2022

    Get PDF
    6G is supposed to address the demands for consumption of mobile networking services in 2030 and beyond. These are characterized by a variety of diverse, often conflicting requirements, from technical ones such as extremely high data rates, unprecedented scale of communicating devices, high coverage, low communicating latency, flexibility of extension, etc., to non-technical ones such as enabling sustainable growth of the society as a whole, e.g., through energy efficiency of deployed networks. On the one hand, 6G is expected to fulfil all these individual requirements, extending thus the limits set by the previous generations of mobile networks (e.g., ten times lower latencies, or hundred times higher data rates than in 5G). On the other hand, 6G should also enable use cases characterized by combinations of these requirements never seen before, e.g., both extremely high data rates and extremely low communication latency). In this white paper, we give an overview of the key enabling technologies that constitute the pillars for the evolution towards 6G. They include: terahertz frequencies (Section 1), 6G radio access (Section 2), next generation MIMO (Section 3), integrated sensing and communication (Section 4), distributed and federated artificial intelligence (Section 5), intelligent user plane (Section 6) and flexible programmable infrastructures (Section 7). For each enabling technology, we first give the background on how and why the technology is relevant to 6G, backed up by a number of relevant use cases. After that, we describe the technology in detail, outline the key problems and difficulties, and give a comprehensive overview of the state of the art in that technology. 6G is, however, not limited to these seven technologies. They merely present our current understanding of the technological environment in which 6G is being born. Future versions of this white paper may include other relevant technologies too, as well as discuss how these technologies can be glued together in a coherent system

    Beyond Transmitting Bits: Context, Semantics, and Task-Oriented Communications

    Full text link
    Communication systems to date primarily aim at reliably communicating bit sequences. Such an approach provides efficient engineering designs that are agnostic to the meanings of the messages or to the goal that the message exchange aims to achieve. Next generation systems, however, can be potentially enriched by folding message semantics and goals of communication into their design. Further, these systems can be made cognizant of the context in which communication exchange takes place, providing avenues for novel design insights. This tutorial summarizes the efforts to date, starting from its early adaptations, semantic-aware and task-oriented communications, covering the foundations, algorithms and potential implementations. The focus is on approaches that utilize information theory to provide the foundations, as well as the significant role of learning in semantics and task-aware communications.Comment: 28 pages, 14 figure

    Effects of age and stimulation strategies on cochlear implantation and a clinically feasible method for sound localization latency

    Get PDF
    Treating prelingual deafness with cochlear implants paves the way for spoken language development. Previous studies have shown that providing the intervention at six to 11 months is better than at 12-17 months. However, interventions at even earlier ages have not been researched to the same extent, for example by comparing five to eight months with nine to 11 months. That is why we retrospectively assessed the surgical risks, and analyzed the longitudinal spoken language tests, of 103 children who received their first cochlear implant between five and 30 months of age. This research particularly focused on surgery before 12 months of age (Paper I). Apart from language development, we expected that early implants would provide access to the interaural time differences that are crucial for localizing low frequency sounds. We were interested to examine this in combination with novel sound processing strategies with stimulation patterns that convey the fine structure of sounds. Therefore, in addition to the retrospective analysis, we studied the relationships between stimulation strategies, lateralization of interaural time differences and horizontal sound localization in 30 children (Paper II). Then we decided to develop a method to objectively assess sound localization latency to complement localization accuracy. A method that assesses latency needed to be validated in adults with normal hearing, and in hampered conditions, so that the relationship between accuracy and latency could be clarified. In our study, the gaze patterns from the localization recordings were modelled by optimizing a sigmoid function (Paper III). Furthermore, we addressed the lack of studies on the normal development of sound localization latency of gaze responses in infancy and early childhood (Paper IV). Our study of spoken language development showed the benefit of cochlear implantation before nine months of age, compared to nine to 11 months of age, without increased surgical risks. This finding was strongest when it came to the age at which the child’s language could be understood (Paper I). When our group of 30 subjects underwent tests for interaural time differences, 10 were able to discriminate within the range of naturally occurring differences. Interestingly, the choice of stimulation strategy was a prerequisite for lateralizing natural interaural time differences. However, no relationships between this ability to lateralize and the ability to localize low frequency sounds were found (Paper II). The localization setup meant that detailed investigations of gaze behavior could be carried out. Eight normal hearing adults demonstrated a mean sound localization latency of 280 ± 40 milliseconds (ms), with distinct prolongation with unilateral earplugging. It is interesting to observe the similarity in latency, dynamic behavior, and overlap of anatomical structures between the acoustic middle ear reflex and sound localization latency (Paper III). In addition, normal hearing infants showed diminished sound localization latency, from 1000 ms at six months of age down to 500 ms at three years of age (Paper IV). Latency in children with early cochlear implants still needs to be studied. The findings in this thesis have important clinical implications for counseling parents and they provide valuable data to guide clinical choices about the age when cochlear implants are provided and processor programming takes place. The fast, objective and non-invasive method of sound localization latency assessment may further enhance the clinical processes of diagnosing and monitoring interventions in children with hearing impairment
    • …
    corecore