
    Guided rewriting and constraint satisfaction for parallel GPU code generation

    Graphics Processing Units (GPUs) are notoriously hard to optimise for manually due to their scheduling and memory hierarchies. What is needed are good automatic code generators and optimisers for such parallel hardware. Functional approaches such as Accelerate, Futhark and LIFT leverage a high-level algorithmic Intermediate Representation (IR) to expose parallelism and abstract the implementation details away from the user. However, producing efficient code for a given accelerator remains challenging. Existing code generators depend on user input to choose a subset of hard-coded optimisations, or on automated exploration of the implementation search space. The former suffers from a lack of extensibility, while the latter is too costly due to the size of the search space. A hybrid approach is needed, where a space of valid implementations is built automatically and explored with the aid of human expertise. This thesis presents a solution combining user-guided rewriting and automatically generated constraints to produce high-performance code. The first contribution is an automatic tuning technique to find a balance between performance and memory consumption. Leveraging its functional patterns, the LIFT compiler is empowered to infer tuning constraints and limit the search to valid tuning combinations only. Next, the thesis reframes parallelisation as a constraint satisfaction problem. Parallelisation constraints are extracted automatically from the input expression, and a solver is used to identify valid rewritings. The constraints truncate the search space to valid parallel mappings only by capturing the scheduling restrictions of the GPU in the context of a given program. A synchronisation barrier insertion technique is proposed to prevent data races and improve the efficiency of the generated parallel mappings. The final contribution of this thesis is the guided rewriting method, where the user encodes a design space of structural transformations using high-level IR nodes called rewrite points. These strongly typed pragmas express macro rewrites and expose design choices as explorable parameters. The thesis proposes a small set of reusable rewrite points to achieve tiling, cache locality, data reuse and memory optimisation. A comparison with the handwritten kernels of the vendor-provided ARM Compute Library and with the TVM code generator demonstrates the effectiveness of this thesis' contributions. With convolution as a use case, LIFT-generated direct and GEMM-based convolution implementations are shown to perform on par with state-of-the-art solutions on a mobile GPU. Overall, this thesis demonstrates that a functional IR lends itself well to user-guided and automatic rewriting for high-performance code generation.
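    To make the constraint-satisfaction reframing concrete, the toy sketch below enumerates assignments of nested parallel loops to GPU scheduling levels and keeps only those satisfying a few hand-written rules. It is an illustrative simplification, not the LIFT constraint system: the level names, loop sizes and the MAX_LOCAL_SIZE limit are assumptions made for the example.

        # Toy sketch: parallel mapping as a constraint satisfaction problem.
        from itertools import product

        LEVELS = ["workgroup", "local", "sequential"]   # hypothetical mapping targets

        nest_sizes = [128, 64, 4]          # iteration sizes of a toy 3-level map nest
        MAX_LOCAL_SIZE = 256               # assumed per-workgroup thread limit

        def valid(assignment):
            """Constraints capturing a few (simplified) GPU scheduling rules."""
            # 1. A thread-level ('local') map may not enclose a 'workgroup' map.
            for outer, inner in zip(assignment, assignment[1:]):
                if outer == "local" and inner == "workgroup":
                    return False
            # 2. Sizes mapped to 'local' must jointly fit the device's thread limit.
            total = 1
            for size, level in zip(nest_sizes, assignment):
                if level == "local":
                    total *= size
            if total > MAX_LOCAL_SIZE:
                return False
            # 3. At least one level must actually be parallelised.
            return any(level != "sequential" for level in assignment)

        # Brute-force "solver": enumerate all assignments, keep only the valid ones.
        mappings = [a for a in product(LEVELS, repeat=len(nest_sizes)) if valid(a)]
        print(len(mappings), "valid parallel mappings out of", len(LEVELS) ** len(nest_sizes))
        for mapping in mappings[:5]:
            print(mapping)

    In the actual compiler setting, such constraints would be derived from the IR and the target device rather than written by hand, and a constraint solver would replace the brute-force enumeration.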

    IoT Transmission Technologies for Distributed Measurement Systems in Critical Environments

    Distributed measurement systems are spread across the most diverse application scenarios, and Internet of Things (IoT) transmission technologies are usually what enables such measurement systems, which need wireless connectivity to ensure pervasiveness. Because wireless measurement systems have been deployed in recent years even in critical environments, assessing the performance of transmission technologies in such contexts is fundamental. Indeed, they are the most challenging contexts for wireless data transmission due to the attenuation they introduce. Several scenarios in which measurement systems can be deployed are analysed. Firstly, marine contexts are treated by considering above-the-sea wireless links. Such a setting occurs in any application requiring remote monitoring of facilities and assets installed offshore, for instance offshore sea-farming plants or remote video monitoring systems installed on seamark buoys. Secondly, wireless communications from the underground to the aboveground are covered. This scenario is typical of precision agriculture applications, where accurate measurements of underground physical parameters must be sent remotely to optimise crops while reducing the waste of fundamental resources (e.g., irrigation water). Thirdly, wireless communications from underwater to above the water are addressed. Such a situation is unavoidable for all those infrastructures monitoring the conservation status of underwater species such as algae, seaweeds and reefs. Then, wireless links traversing metal surfaces and structures are tackled. Such a context is commonly encountered in asset tracking and monitoring (e.g., containers) and in smart metering applications (e.g., utility meters). Lastly, various harsh environments typical of industrial monitoring (e.g., vibrating machinery, rooms with harsh temperature and humidity, corrosive atmospheres) are tested to validate pervasive measurement infrastructures even in contexts usually encountered in Industrial Internet of Things (IIoT) applications. The performance of wireless measurement systems in these scenarios is assessed through ad hoc measurement campaigns. Finally, IoT measurement infrastructures deployed in above-the-sea and underground-to-aboveground settings, respectively, are described to provide real applications in which such facilities can be effectively installed. Nonetheless, the aforementioned application scenarios are only a few among many: nowadays, distributed pervasive measurement systems have to be conceived broadly, resulting in countless instances such as predictive maintenance, smart healthcare, smart cities, industrial monitoring, or smart agriculture. This Thesis aims at showing how distributed measurement systems in critical environments can form pervasive monitoring infrastructures enabled by IoT transmission technologies. The transmission technologies are presented first, then the harsh environments are introduced, along with the theoretical analysis modelling path loss in such conditions. It must be underlined that this Thesis aims neither at finding path loss models better than the existing ones, nor at improving them. Indeed, path loss models are exploited as they are, in order to derive loss estimates and understand the effectiveness of the deployed infrastructure. Transmission tests in those contexts are then described, together with examples of these types of applications in the field, showing the measurement infrastructures and the critical environments serving as deployment sites. The scientific relevance of this Thesis is evident since, at the moment, the literature lacks a comparative study of this kind, showing both transmission performance in critical environments and the deployment of real IoT distributed wireless measurement systems in such contexts.
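    As an illustration of how an existing path loss model can be exploited as-is to judge a deployment, the sketch below uses the standard log-distance model, PL(d) = PL(d0) + 10·n·log10(d/d0), to compute a link margin. The transmit power, receiver sensitivity and path-loss exponent are assumed example values, not figures from the thesis' measurement campaigns.

        # Sketch: link-budget estimate with the log-distance path loss model.
        import math

        def log_distance_path_loss(d_m, pl_d0_db=40.0, d0_m=1.0, n=3.0):
            """Path loss in dB at distance d_m, given the loss pl_d0_db at a
            reference distance d0_m and a path-loss exponent n (larger in
            harsher environments)."""
            return pl_d0_db + 10.0 * n * math.log10(d_m / d0_m)

        def link_margin_db(tx_power_dbm, rx_sensitivity_dbm, d_m, **model_kwargs):
            """Positive margin means the receiver should still decode at distance d_m."""
            received_dbm = tx_power_dbm - log_distance_path_loss(d_m, **model_kwargs)
            return received_dbm - rx_sensitivity_dbm

        # Example: a 14 dBm transmitter and a -120 dBm receiver (LoRa-class figures,
        # assumed) over a lossy link modelled with exponent n = 4.
        print(round(link_margin_db(14.0, -120.0, 200.0, n=4.0), 1), "dB margin at 200 m")

    In a critical environment the exponent n (and an extra attenuation term, e.g. for soil or water) would be chosen from the relevant model, and a positive margin indicates the deployment is expected to work.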

    Microcredentials to support PBL


    Design and Real-World Evaluation of Dependable Wireless Cyber-Physical Systems

    The ongoing effort for an efficient, sustainable, and automated interaction between humans, machines, and our environment will make cyber-physical systems (CPS) an integral part of the industry and our daily lives. At their core, CPS integrate computing elements, communication networks, and physical processes that are monitored and controlled through sensors and actuators. New and innovative applications become possible by extending or replacing static and expensive cable-based communication infrastructures with wireless technology. The flexibility of wireless CPS is a key enabler for many envisioned scenarios, such as intelligent factories, smart farming, personalized healthcare systems, autonomous search and rescue, and smart cities. High dependability, efficiency, and adaptivity requirements complement the demand for wireless and low-cost solutions in such applications. For instance, industrial and medical systems should work reliably and predictably with performance guarantees, even if parts of the system fail. Because emerging CPS will feature mobile and battery-driven devices that can execute various tasks, the systems must also quickly adapt to frequently changing conditions. Moreover, as applications become ever more sophisticated, featuring compact embedded devices that are deployed densely and at scale, efficient designs are indispensable to achieve desired operational lifetimes and satisfy high bandwidth demands. Meeting these partly conflicting requirements, however, is challenging due to imperfections of wireless communication and resource constraints along several dimensions, for example, computing, memory, and power constraints of the devices. More precisely, frequent and correlated message losses paired with very limited bandwidth and varying delays for the message exchange significantly complicate the control design. In addition, since communication ranges are limited, messages must be relayed over multiple hops to cover larger distances, such as an entire factory. Although the resulting mesh networks are more robust against interference, efficient communication is a major challenge as wireless imperfections get amplified, and significant coordination effort is needed, especially if the networks are dynamic. CPS combine various research disciplines, which are often investigated in isolation, ignoring their complex interaction. However, to address this interaction and build trust in the proposed solutions, it is necessary to evaluate CPS using real physical systems and wireless networks, paired with formal guarantees of a system's end-to-end behavior. Existing works that take this step can satisfy only a few of the above-mentioned requirements. Most notably, multi-hop communication has only been used to control slow physical processes while providing no guarantees. One of the reasons is that current communication protocols are not suited for dynamic multi-hop networks. This thesis closes the gap between existing works and the diverse needs of emerging wireless CPS. The contributions address different research directions and are split into two parts. In the first part, we specifically address the shortcomings of existing communication protocols and make the following contributions to provide a solid networking foundation:
    • We present Mixer, a communication primitive for reliable many-to-all message exchange in dynamic wireless multi-hop networks. Mixer runs on resource-constrained low-power embedded devices and combines synchronous transmissions and network coding for a highly scalable and topology-agnostic message exchange. As a result, it supports mobile nodes and can serve any possible traffic pattern, for example, to efficiently realize distributed control, as required by emerging CPS applications.
    • We present Butler, a lightweight and distributed synchronization mechanism with formally guaranteed correctness properties to improve the dependability of synchronous transmissions-based protocols. These protocols require precise time synchronization provided by a specific node. Upon failure of this node, the entire network cannot communicate. Butler removes this single point of failure by quickly synchronizing all nodes in the network without affecting the protocols' performance.
    In the second part, we focus on the challenges of integrating communication and various control concepts using classical time-triggered and modern event-based approaches. Based on the design, implementation, and evaluation of the proposed solutions using real systems and networks, we make the following contributions, which in many ways push the boundaries of previous approaches:
    • We are the first to demonstrate and evaluate fast feedback control over low-power wireless multi-hop networks. Essential for this achievement is a novel co-design and integration of communication and control. Our wireless embedded platform tames the imperfections impairing control, for example, message loss and varying delays, and considers the resulting key properties in the control design. Furthermore, the careful orchestration of control and communication tasks enables real-time operation and makes our system amenable to an end-to-end analysis. Due to this, we can provably guarantee closed-loop stability for physical processes with linear time-invariant dynamics.
    • We propose control-guided communication, a novel co-design for distributed self-triggered control over wireless multi-hop networks. Self-triggered control can save energy by transmitting data only when needed. However, there are no solutions that bring those savings to multi-hop networks and that can reallocate freed-up resources, for example, to other agents. Our control system informs the communication system of its transmission demands ahead of time so that communication resources can be allocated accordingly. Thus, we can transfer the energy savings from the control to the communication side and achieve an end-to-end benefit.
    • We present a novel co-design of distributed control and wireless communication that resolves overload situations in which the communication demand exceeds the available bandwidth. As systems scale up, featuring more agents and higher bandwidth demands, the available bandwidth will quickly be exceeded, resulting in overload. While event-triggered and self-triggered control approaches reduce the communication demand on average, they cannot prevent situations in which potentially all agents want to communicate simultaneously. We address this limitation by dynamically allocating the available bandwidth to the agents with the highest need. Thus, we can formally prove that our co-design guarantees closed-loop stability for physical systems with stochastic linear time-invariant dynamics.
    Contents: Abstract; Acknowledgements; List of Abbreviations; List of Figures; List of Tables; 1 Introduction (1.1 Motivation, 1.2 Application Requirements, 1.3 Challenges, 1.4 State of the Art, 1.5 Contributions and Road Map); 2 Mixer: Efficient Many-to-All Broadcast in Dynamic Wireless Mesh Networks (2.1 Introduction, 2.2 Overview, 2.3 Design, 2.4 Implementation, 2.5 Evaluation, 2.6 Discussion, 2.7 Related Work); 3 Butler: Increasing the Availability of Low-Power Wireless Communication Protocols (3.1 Introduction, 3.2 Motivation and Background, 3.3 Design, 3.4 Analysis, 3.5 Implementation, 3.6 Evaluation, 3.7 Related Work); 4 Feedback Control Goes Wireless: Guaranteed Stability over Low-Power Multi-Hop Networks (4.1 Introduction, 4.2 Related Work, 4.3 Problem Setting and Approach, 4.4 Wireless Embedded System Design, 4.5 Control Design and Analysis, 4.6 Experimental Evaluation, 4.A Control Details); 5 Control-Guided Communication: Efficient Resource Arbitration and Allocation in Multi-Hop Wireless Control Systems (5.1 Introduction, 5.2 Problem Setting, 5.3 Co-Design Approach, 5.4 Wireless Communication System Design, 5.5 Self-Triggered Control Design, 5.6 Experimental Evaluation); 6 Scaling Beyond Bandwidth Limitations: Wireless Control With Stability Guarantees Under Overload (6.1 Introduction, 6.2 Problem and Related Work, 6.3 Overview of Co-Design Approach, 6.4 Predictive Triggering and Control System, 6.5 Adaptive Communication System, 6.6 Integration and Stability Analysis, 6.7 Testbed Experiments, 6.A Proof of Theorem 4, 6.B Usage of the Network Bandwidth for Control); 7 Conclusion and Outlook (7.1 Contributions, 7.2 Future Directions); Bibliography; List of Publications
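    To illustrate the bandwidth-allocation idea behind the overload co-design described above, the toy sketch below grants a limited number of communication slots to the agents with the highest predicted need. It is a conceptual simplification, not the protocol of the thesis: the agent names, error values and slot budget are made-up example numbers.

        # Toy sketch: priority-based slot allocation under communication overload.
        def allocate_slots(predicted_errors, num_slots):
            """Grant the next round's slots to the agents with the largest predicted error."""
            ranked = sorted(predicted_errors, key=predicted_errors.get, reverse=True)
            return set(ranked[:num_slots])

        # Ten agents compete for three slots; only the control loops that would suffer
        # most from staying silent are allowed to transmit in this round.
        predicted_errors = {
            f"agent{i}": err
            for i, err in enumerate([0.1, 2.4, 0.3, 1.8, 0.05, 0.9, 3.1, 0.2, 0.7, 1.1])
        }
        print(sorted(allocate_slots(predicted_errors, num_slots=3)))

    In the real system the "need" metric would come from the predictive triggering and control design, and the allocation would have to be computed and disseminated over the wireless multi-hop network within each round.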

    Accurate quantum transport modelling and epitaxial structure design of high-speed and high-power In0.53Ga0.47As/AlAs double-barrier resonant tunnelling diodes for 300-GHz oscillator sources

    Terahertz (THz) wave technology is envisioned as an appealing and conceivable solution in the context of several potential high-impact applications, including sixth generation (6G) and beyond consumer-oriented ultra-broadband multi-gigabit wireless data-links, as well as high-resolution imaging, radar, and spectroscopy apparatuses employable in biomedicine, industrial processes, security/defence, and material science. Despite the technological challenges posed by the THz gap, recent scientific advancements suggest the practical viability of THz systems. However, the development of transmitters (Tx) and receivers (Rx) based on compact semiconductor devices operating at THz frequencies is urgently demanded to meet the performance requirements arising from emerging THz applications. Although there are several promising candidates, including high-speed III-V transistors and photodiodes, resonant tunnelling diode (RTD) technology offers a compact and high-performance option in many practical scenarios. However, the main weakness of the technology is currently the low output power capability of RTD THz Tx, which is mainly caused by underdeveloped and non-optimal device and circuit design approaches. Indeed, indium phosphide (InP) RTD devices can nowadays deliver only up to around 1 mW of radio-frequency (RF) power at around 300 GHz. In the context of THz wireless data-links, this severely impacts the Tx performance, limiting communication distance and data transfer capabilities which, at the current time, are of the order of a few tens of gigabits per second at distances below around 1 m. However, recent research studies suggest that several milliwatts of output power are required to achieve bit-rate capabilities of several tens of gigabits per second and beyond, and to reach several metres of communication distance in common operating conditions. Currently, the short-term target is set to 5-10 mW of output power at carrier frequencies around 300 GHz, which would allow bit-rates in excess of 100 Gb/s, as well as wireless communications well above 5 m distance, in first-stage short-range scenarios. In order to reach it, maximisation of the RTD high-frequency RF power capability is of utmost importance. Despite that, reliable epitaxial structure design approaches, as well as accurate physics-based numerical simulation tools, aimed at RF power maximisation in the 300 GHz band are lacking at the current time. This work aims at proposing practical solutions to address the aforementioned issues. First, a physics-based simulation methodology was developed to accurately and reliably simulate the static current-voltage (IV) characteristic of indium gallium arsenide/aluminium arsenide (InGaAs/AlAs) double-barrier RTD devices. The approach relies on the non-equilibrium Green's function (NEGF) formalism implemented in the Silvaco Atlas technology computer-aided design (TCAD) simulation package, requires a low computational budget, and allows In0.53Ga0.47As/AlAs RTD devices, which are grown pseudomorphically on lattice-matched InP substrates and are commonly employed in oscillators working at around 300 GHz, to be modelled correctly. By selecting the appropriate physical models, and by retrieving the correct material parameters, together with a suitable finite-element discretisation of the associated heterostructure spatial domain, it is shown, by comparing simulation data with experimental results, that the developed numerical approach can reliably compute several quantities of interest characterising the negative differential resistance (NDR) region of the DC IV curve, including peak current, peak voltage, and voltage swing, all of which are key parameters in RTD oscillator design. The demonstrated simulation approach was then used to study the impact of epitaxial structure design parameters, including those characterising the double-barrier quantum well as well as the emitter and collector regions, on the electrical properties of the RTD device. In particular, a comprehensive simulation analysis was conducted, and the retrieved trends are discussed on the basis of the heterostructure band diagram, transmission coefficient energy spectrum, charge distribution, and DC current density-voltage (JV) curve. General design guidelines aimed at enhancing the maximum RF power capability of the RTD device are then deduced and discussed. To validate the proposed epitaxial design approach, an In0.53Ga0.47As/AlAs double-barrier RTD epitaxial structure providing several milliwatts of RF power was designed by employing the developed simulation methodology, and experimentally investigated through the microfabrication of RTD devices and subsequent high-frequency characterisation up to 110 GHz. The analysis, which included fabrication optimisation, reveals an expected RF power performance of up to around 5 mW and 10 mW at 300 GHz for 25 μm² and 49 μm² RTD devices, respectively, which is up to five times higher than the current state-of-the-art. Finally, in order to prove the practical employability of the proposed RTDs in oscillator circuits realised using low-cost photolithography, both coplanar waveguide and microstrip inductive stubs were designed through a full three-dimensional electromagnetic simulation analysis. In summary, this work makes an important contribution to the rapidly evolving field of THz RTD technology, and demonstrates the practical feasibility of realising high-power 300-GHz RTD devices, which will underpin the future development of Tx systems capable of the power levels required by forthcoming THz applications.
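    As a back-of-the-envelope illustration of why the NDR peak current, peak voltage and voltage swing are the key quantities for RF power, the sketch below applies the commonly used first-order estimate P_max ≈ (3/16)·ΔV·ΔI for an RTD oscillator. The peak and valley values are invented example numbers, not data from the devices of this work.

        # Sketch: first-order maximum RF power estimate from the NDR region of an RTD.
        def rtd_max_rf_power(v_peak, v_valley, i_peak, i_valley):
            """Upper bound on oscillator RF power from the NDR voltage and current span."""
            delta_v = v_valley - v_peak      # voltage swing across the NDR region (V)
            delta_i = i_peak - i_valley      # current drop across the NDR region (A)
            return (3.0 / 16.0) * delta_v * delta_i

        # Example: 0.5 V of voltage swing and 60 mA of current drop give about 5.6 mW.
        p_watts = rtd_max_rf_power(v_peak=0.9, v_valley=1.4, i_peak=0.08, i_valley=0.02)
        print(f"{p_watts * 1e3:.1f} mW")

    This simple bound makes clear why epitaxial design choices that widen the voltage swing and the peak-to-valley current difference directly raise the achievable output power.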

    Towards A Practical High-Assurance Systems Programming Language

    Writing correct and performant low-level systems code is a notoriously demanding job, even for experienced developers. To make matters worse, formally reasoning about its correctness properties introduces yet another level of complexity to the task. It requires considerable expertise in both systems programming and formal verification. Development can be extremely costly due to the sheer complexity of the systems and the nuances in them, if not assisted with appropriate tools that provide abstraction and automation. Cogent is designed to alleviate the burden on developers when writing and verifying systems code. It is a high-level functional language with a certifying compiler, which automatically proves the correctness of the compiled code and also provides a purely functional abstraction of the low-level program to the developer. Equational reasoning techniques can then be used to prove functional correctness properties of the program on top of this abstract semantics, which is notably less laborious than directly verifying the C code. To make Cogent a more approachable and effective tool for developing real-world systems, we further strengthen the framework by extending the core language and its ecosystem. Specifically, we enrich the language to allow users to control the memory representation of algebraic data types, while retaining the automatic proof with a data layout refinement calculus. We repurpose existing tools in a novel way and develop an intuitive foreign function interface, which provides users with a seamless experience when using Cogent in conjunction with native C. We augment the Cogent ecosystem with a property-based testing framework, which helps developers better understand the impact formal verification has on their programs and enables a progressive approach to producing high-assurance systems. Finally, we explore refinement type systems, which we plan to incorporate into Cogent for more expressiveness and better integration of systems programmers with the verification process.
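    The sketch below illustrates the general idea behind property-based testing of a refinement relation: on randomly generated inputs, a low-level implementation is checked against the purely functional specification it is meant to refine. It is a generic illustration in Python, not Cogent's testing framework; the spec and implementation are made-up examples.

        # Sketch: property-based check that an implementation refines a functional spec.
        import random

        def spec_sum(xs):
            """Purely functional specification: the sum of a list."""
            return sum(xs)

        def impl_sum(xs):
            """'Low-level' implementation: an accumulator loop, as C code might do it."""
            acc = 0
            for x in xs:
                acc += x
            return acc

        def refines(impl, spec, trials=1000):
            """Property: on every generated input, impl returns what spec demands."""
            for _ in range(trials):
                xs = [random.randint(-1000, 1000) for _ in range(random.randint(0, 50))]
                if impl(xs) != spec(xs):
                    return False, xs          # counterexample found
            return True, None

        print(refines(impl_sum, spec_sum))    # expected: (True, None)

    A failing check yields a concrete counterexample long before any proof effort is invested, which is what makes this style of testing a useful stepping stone towards full verification.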

    Tools for efficient Deep Learning

    In the era of Deep Learning (DL), there is a fast-growing demand for building and deploying Deep Neural Networks (DNNs) on various platforms. This thesis proposes five tools to address the challenges of designing DNNs that are efficient in time, resources and power consumption. We first present Aegis and SPGC to address the challenges in improving the memory efficiency of DL training and inference. Aegis makes mixed precision training (MPT) more stable through layer-wise gradient scaling. Empirical experiments show that Aegis can improve MPT accuracy by up to 4%. SPGC focuses on structured pruning: replacing standard convolution with group convolution (GConv) to avoid irregular sparsity. SPGC formulates GConv pruning as a channel permutation problem and proposes a novel heuristic polynomial-time algorithm. Common DNNs pruned by SPGC achieve up to 1% higher accuracy than prior work. This thesis also addresses the gap between DNN descriptions and executables, with Polygeist for software and POLSCA for hardware. Many novel techniques, e.g. statement splitting and memory partitioning, are explored and used to extend polyhedral optimisation. Polygeist speeds up sequential and parallel software execution by 2.53 and 9.47 times respectively on Polybench/C. POLSCA achieves a 1.5 times speedup over hardware designs directly generated from high-level synthesis on Polybench/C. Moreover, this thesis presents Deacon, a framework that generates FPGA-based DNN accelerators with streaming architectures and advanced pipelining techniques to address the challenges posed by heterogeneous convolutions and residual connections. Deacon provides fine-grained pipelining, graph-level optimisation, and heuristic exploration by graph colouring. Compared with prior designs, Deacon improves resource/power efficiency by 1.2x/3.5x for MobileNets and 1.0x/2.8x for SqueezeNets. All these tools are open source, and some have already gained public engagement. We believe they can make efficient deep learning applications easier to build and deploy.
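    The sketch below illustrates the general idea of layer-wise gradient scaling in mixed precision training: each layer's gradients are scaled before the float16 cast so that small values survive the reduced range, then unscaled for the update. It is a toy numpy illustration, not the Aegis algorithm; the per-layer power-of-two scale heuristic and the gradient magnitudes are assumptions made for the example.

        # Toy sketch: per-layer gradient scaling around a float16 round trip.
        import numpy as np

        def scaled_fp16_round_trip(grad_fp32, scale):
            """Simulate storing a gradient in float16 with a per-layer scale factor."""
            grad_fp16 = (grad_fp32 * scale).astype(np.float16)    # scaled low-precision copy
            return grad_fp16.astype(np.float32) / scale           # unscale before the update

        rng = np.random.default_rng(0)
        layer_grads = {
            "conv1": rng.normal(0.0, 2e-8, 1000),   # tiny gradients, below float16 range
            "fc":    rng.normal(0.0, 1e-2, 1000),   # ordinary gradients
        }

        for name, g in layer_grads.items():
            naive = g.astype(np.float16).astype(np.float32)        # no scaling at all
            scale = 2.0 ** np.ceil(-np.log2(np.abs(g).max()))      # per-layer power-of-two scale
            scaled = scaled_fp16_round_trip(g, scale)
            print(name,
                  "zeroed without scaling:", int(np.sum((naive == 0) & (g != 0))),
                  "zeroed with scaling:", int(np.sum((scaled == 0) & (g != 0))))

    The point of choosing the scale per layer rather than globally is that layers with very different gradient magnitudes each get a factor that keeps their values inside the float16 range.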

    Proximity detection protocols for IoT devices

    In recent years, we have witnessed the growth of the Internet of Things paradigm, with its increased pervasiveness in our everyday lives. The possible applications are diverse: from a smartwatch able to measure the heartbeat and communicate it to the cloud, to a device that triggers an event when we approach an exhibit in a museum. Present in many of these applications is the Proximity Detection task: for instance, the heartbeat could be measured only when the wearer is near a well-defined location for medical purposes, or the touristic attraction must be triggered only if someone is very close to it. Indeed, the ability of an IoT device to sense the presence of other devices nearby and calculate the distance to them can be considered the cornerstone of various applications, motivating research on this fundamental topic. The energy constraints of IoT devices are often at odds with the need for continuous operation to sense the environment and to achieve highly accurate distance measurements to neighbouring devices, making the design of Proximity Detection protocols a challenging task.