
    Guided rewriting and constraint satisfaction for parallel GPU code generation

    Graphics Processing Units (GPUs) are notoriously hard to optimise for manually due to their scheduling and memory hierarchies. What is needed are good automatic code generators and optimisers for such parallel hardware. Functional approaches such as Accelerate, Futhark and LIFT leverage a high-level algorithmic Intermediate Representation (IR) to expose parallelism and abstract the implementation details away from the user. However, producing efficient code for a given accelerator remains challenging. Existing code generators depend on the user input to choose a subset of hard-coded optimizations or automated exploration of implementation search space. The former suffers from the lack of extensibility, while the latter is too costly due to the size of the search space. A hybrid approach is needed, where a space of valid implementations is built automatically and explored with the aid of human expertise. This thesis presents a solution combining user-guided rewriting and automatically generated constraints to produce high-performance code. The first contribution is an automatic tuning technique to find a balance between performance and memory consumption. Leveraging its functional patterns, the LIFT compiler is empowered to infer tuning constraints and limit the search to valid tuning combinations only. Next, the thesis reframes parallelisation as a constraint satisfaction problem. Parallelisation constraints are extracted automatically from the input expression, and a solver is used to identify valid rewriting. The constraints truncate the search space to valid parallel mappings only by capturing the scheduling restrictions of the GPU in the context of a given program. A synchronisation barrier insertion technique is proposed to prevent data races and improve the efficiency of the generated parallel mappings. 
The final contribution of this thesis is the guided rewriting method, where the user encodes a design space of structural transformations using high-level IR nodes called rewrite points. These strongly typed pragmas express macro rewrites and expose design choices as explorable parameters. The thesis proposes a small set of reusable rewrite points to achieve tiling, cache locality, data reuse and memory optimisation. A comparison with the vendor-provided handwritten kernel ARM Compute Library and the TVM code generator demonstrates the effectiveness of this thesis' contributions. With convolution as a use case, LIFT-generated direct and GEMM-based convolution implementations are shown to perform on par with the state-of-the-art solutions on a mobile GPU. Overall, this thesis demonstrates that a functional IR yields well to user-guided and automatic rewriting for high-performance code generation
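The idea of treating parallelisation as a constraint satisfaction problem can be illustrated with a toy sketch. Everything here is an assumption for illustration: the level names, the two constraints and the function names are hypothetical and are not taken from LIFT, and brute-force enumeration stands in for the solver the abstract mentions.

```python
from itertools import product

# Hypothetical levels a map in a loop nest can be assigned to.
LEVELS = ("seq", "workgroup", "local")


def valid(mapping):
    """Check a candidate mapping (outermost map first) against two
    simplified, assumed scheduling constraints."""
    # Assumed constraint 1: each parallel level is used at most once per nest.
    for level in ("workgroup", "local"):
        if mapping.count(level) > 1:
            return False
    # Assumed constraint 2: a work-item ("local") map must be nested
    # inside a work-group map.
    if "local" in mapping:
        if "workgroup" not in mapping:
            return False
        if mapping.index("workgroup") > mapping.index("local"):
            return False
    return True


def valid_mappings(depth):
    """Enumerate all mappings for a nest of `depth` maps and keep the valid ones."""
    return [m for m in product(LEVELS, repeat=depth) if valid(m)]


if __name__ == "__main__":
    for m in valid_mappings(2):
        print(m)
```

Even in this toy setting the constraints cut the nine candidate mappings of a two-level nest down to four, which is the kind of search-space truncation the abstract describes.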

    Evaluating the sustainability and resiliency of local food systems

    With an ever-rising global population and looming environmental challenges such as climate change and soil degradation, it is imperative to increase the sustainability of food production. The drastic rise in food insecurity during the COVID-19 pandemic has further shown a pressing need to increase the resiliency of food systems. One strategy to reduce the dependence on complex, vulnerable global supply chains is to strengthen local food systems, such as by producing more food in cities. This thesis uses an interdisciplinary, food systems approach to explore aspects of sustainability and resiliency within local food systems. Lifecycle assessment (LCA) was used to evaluate how farm scale, distance to consumer, and management practices influence environmental impacts for different local agriculture models in two case study locations: Georgia, USA and England, UK. Farms were grouped based on urbanisation level and management practices, including: urban organic, peri-urban organic, rural organic, and rural conventional. A total of 25 farms and 40 crop lifecycles were evaluated, focusing on two crops (kale and tomatoes) and including impacts from seedling production through final distribution to the point of sale. Results were extremely sensitive to the allocation of composting burdens (decomposition emissions), with impact variation between organic farms driven mainly by levels of compost use. When composting burdens were attributed to compost inputs, the rural conventional category in the U.S. and the rural organic category in the UK had the lowest average impacts per kg sellable crop produced, including the lowest global warming potential (GWP). However, when subtracting avoided burdens from the municipal waste stream from compost inputs, trends reversed entirely, with urban or peri-urban farm categories having the lowest impacts (often negative) for GWP and marine eutrophication. 
Overall, farm management practices were the most important factor driving environmental impacts from local food supply chains. A soil health assessment was then performed on a subset of the UK farms to provide insight to ecosystem services that are not captured within LCA frameworks. Better soil health was observed in organically-farmed and uncultivated soils compared to conventionally farmed soils, suggesting higher ecosystem service provisioning as related to improved soil structure, flood mitigation, erosion control, and carbon storage. However, relatively high heavy metal concentrations were seen on urban and peri-urban farms, as well as those located in areas with previous mining activity. This implies that there are important services and disservices on farms that are not captured by LCAs. Zooming out from a focus on food production, a qualitative methodology was used to explore experiences of food insecurity and related health and social challenges during the COVID-19 pandemic. Fourteen individuals receiving emergency food parcels from a community food project in Sheffield, UK were interviewed. Results showed that maintaining food security in times of crisis requires a diverse set of individual, household, social, and place-based resources, which were largely diminished or strained during the pandemic. Drawing upon social capital and community support was essential to cope with a multiplicity of hardship, highlighting a need to develop community food infrastructure that supports ideals of mutual aid and builds connections throughout the food supply chain. Overall, this thesis shows that a range of context-specific solutions are required to build sustainable and resilient food systems. This can be supported by increasing local control of food systems and designing strategies to meet specific community needs, whilst still acknowledging a shared global responsibility to protect ecosystem, human, and planetary health

    Substoichiometric Phases of Hafnium Oxide with Semiconducting Properties

    Since the dawn of the information age, all developments that provided a significant improvement in information processing and data transmission have been considered as key technologies. The impact of ever new data processing innovations on the economy and almost all areas of our daily lives is unprecedented and a departure from this trend is unimaginable in the near future. Even though the end of Moore's Law has been predicted all too often, the steady exponential growth of computing capacity remains unaffected to this day, due to tremendous commercial pressure. While the minimum physical size of the transistor architecture is a serious constraint, the steady evolution of computing effectiveness is not limited in the predictable future. However, the focus of development will have to expand more strongly to other technological aspects of information processing. For example, the development of new computer paradigms which mark a departure from the digitally dominated von Neumann architecture will play an increasingly significant role. The category of so-called next-generation non-volatile memory technologies, based on various physical principles such as phase transformation, magnetic or ferroelectric properties or ion diffusion, could play a central role here. These memory technologies promise in part strongly pronounced multi-bit properties up to quasi-analog switching behavior. These attributes are of fundamental importance especially for new promising concepts of information processing like in-memory computing and neuromorphic processing. In addition, many next-generation non-volatile memory technologies already show advantages over conventional media such as Flash memory.
For example, their application promises significantly reduced energy consumption and their write and especially read speeds are in some cases far superior to conventional technology and could therefore already contribute significant technological improvements to the existing memory hierarchy. However, these alternative concepts are currently still limited in terms of their statistical reliability, among other things. Even though phase change memory in the form of the 3D XPoint, for example, has already been commercialized, the developments have not yet been able to compete due to the enormous commercial pressure in Flash memory research. Nevertheless, the further development of alternative concepts for the next memory generations and beyond is essential and the in-depth research on next-generation non-volatile memory technologies is therefore a hot and extremely important scientific topic. This work focuses on hafnium oxide, a key material in next-generation non-volatile memory research. Hafnium oxide is very well known in the semiconductor industry, as it generated a lot of attention in the course of high-k research due to its excellent dielectric properties and established CMOS compatibility. However, with the growing interest in so-called memristive memory, research efforts have primarily focused on the value of hafnium oxide in the form of resistive random-access memory (RRAM) and, with the discovery of ferroelectricity in HfO₂, ferroelectric random-access memory (FeRAM). RRAM is a next-generation non-volatile memory technology that features a simple metal-insulator-metal (MIM) structure, excellent scalability, and potential 3D integration. In particular, the aforementioned gradual to quasi-continuous switching behavior has been demonstrated on a variety of RRAM systems.
A significant change of the switching properties is achievable, for example, by the choice of top and bottom electrodes, the introduction of doping elements, or by designated oxygen deficiency. In particular, the last point is based on the basic physical principle of the hafnium oxide-based RRAM mechanism, in which local oxygen ions are stimulated to diffuse by applying an electrical potential, and a so-called conducting filament is formed by the remaining vacancies, which electrically connects the two electrode sides. The process is characterized by the reversibility of the conducting filament which can be dissolved by a suitable I-V programming (e.g., reversal of the voltage direction). In the literature there are some predictions of sub-stoichiometric hafnium oxide phases, such as Hf₂O₃, HfO or Hf₆O, which could be considered as conducting filament phases, but there is a lack of conclusive experimental results. While there are studies that assign supposed structures in oxygen-deficient hafnium oxide thin films, these assignments are mostly based on references from various stoichiometric hafnium oxide high-temperature phases such as tetragonal t-HfO₂ (P4₂/nmc) or cubic c-HfO₂ (Fm-3m), or high-pressure phases such as orthorhombic o-HfO₂ (Pbca). Furthermore, the structural identification of such thin films proves to be difficult, as they are susceptible to arbitrary texturing and reflection broadening in X-ray diffraction. In addition, such thin films are usually synthesized as phase mixtures with monoclinic hafnium oxide. A further challenge in property determination is given by their usual arrangement in MIM configuration, which is determined by the quality of top and bottom electrodes and their interfaces to the active material. It is therefore a non-trivial task to draw conclusions on individual material properties such as electrical conductivity in such (e.g., oxygen-deficient) RRAM devices. 
To answer these open questions, this work is primarily devoted to material properties of oxygen-deficient hafnium oxide phases. Therefore, in the first comprehensive study of this work, Molecular-Beam Epitaxy (MBE) was used to synthesize hafnium oxide phases over a wide oxidation range from monoclinic to hexagonal hafnium oxide. The hafnium oxide films were deposited on c-cut sapphire to achieve effective phase selection and identification by epitaxial growth, taking into account the position of relative lattice planes. In addition, the choice of a substrate with a high band gap and optical transparency enabled the direct investigation of both optical and electrical properties by means of UV/Vis transmission spectroscopy and Hall effect measurements. With additional measurements via X-ray diffraction (XRD), X-ray reflectometry (XRR), X-ray photoelectron spectroscopy (XPS) and high-resolution transmission electron microscopy (HRTEM), the oxygen content-dependent changes in crystal as well as band structure could be correlated with electrical properties. Based on these results, a comprehensive band structure model over the entire oxidation range from insulating HfO₂ to metallic Hf was established, highlighting the discovered intermediate key structures of rhombohedral r-HfO₁.₇ and hexagonal hcp-HfO₀.₇. In the second topic of this work, the phase transition from stoichiometric monoclinic to oxygen-deficient rhombohedral hafnium oxide was complemented by DFT calculations in collaboration with the theory group of Prof. Valentí (Frankfurt am Main). A detailed comparison between experimental results and DFT calculations confirms previously assumed mechanisms for phase stabilization. In addition, the comparison shows a remarkable agreement between experimental and theoretical results on the crystal and band structure. The calculations made it possible to predict the positions of oxygen ions in oxygen-deficient hafnium oxide as well as the associated space group.
Also, the investigations provide information on the thermodynamic stability of the corresponding phases. Finally, the orbital-resolved hybridization of valence states influenced by oxygen vacancies is discussed. Another experimental study deals with the reproduction and investigation of the aforementioned substoichiometric hafnium oxide phases in MIM configuration, which is typical for RRAM devices. Special attention was given to the influence of surface oxidation effects. Here, it was found that the oxygen-deficient phases r-HfO₁.₇ and hcp-HfO₀.₇ exhibit high ohmic conductivity as expected, but show stable bipolar switching behavior as a result of oxidation in air. The mechanism of this behavior is discussed, in particular the role of the r-HfO₁.₇ and hcp-HfO₀.₇ phases as novel electrode materials in hafnium oxide-based RRAM. In collaboration with the electron microscopy group of Prof. Molina Luna, the studied phases, which have been characterized by rather macroscopic techniques so far, have been analyzed by wide-ranging TEM methodology. The strong oxygen deficiency in combination with the verified electrical conductivity of r-HfO₁.₇ and hcp-HfO₀.₇ shows the importance of the identification of these phases on the nanoscale. Such abilities are essential for the planned characterization of the "conducting-filament" mechanism. Here, the ability to distinguish m-HfO₂, r-HfO₁.₇, and hcp-HfO₀.₇ using high-resolution transmission electron microscopy (HRTEM), Automated Crystal Orientation and Phase Mapping (ACOM), and Electron Energy Loss Spectroscopy (EELS) is demonstrated, and the necessity of combined measurements for reliable phase identification is discussed. Finally, a series of monoclinic to rhombohedral hafnium oxide was investigated in a cooperative study with FZ Jülich using scanning probe microscopy.
Since recent studies in particular highlight the significance of the microstructure in stoichiometric hafnium oxide-based RRAM, the topological microstructure in the region of the phase transition to strongly oxygen-deficient rhombohedral hafnium oxide was investigated. Special attention was given to the correlation of microstructure and conductivity. In particular, the influences of grain boundaries on electrical properties were discussed. In summary, this work provides comprehensive insights into the nature and properties of sub-stoichiometric hafnium oxide phases and their implications for the research of hafnium oxide-based RRAM technology. Taking into account a wide range of scientific perspectives, both the validity of the obtained results and the wide range of their application are demonstrated. Thus, this dissertation provides a detailed scientific basis for the understanding of hafnium oxide-based electronics.

    Modelling, Dimensioning and Optimization of 5G Communication Networks, Resources and Services

    This reprint aims to collect state-of-the-art research contributions that address challenges in the design, dimensioning and optimization of emerging 5G networks. The design, dimensioning and optimization of communication network resources and services have been an inseparable part of telecom network development. Such networks must convey a large volume of traffic, providing service to traffic streams with highly differentiated requirements in terms of bit-rate and service time, required quality of service and quality of experience parameters. Such a communication infrastructure presents many important challenges, such as the study of necessary multi-layer cooperation, new protocols, performance evaluation of different network parts, low layer network design, network management and security issues, and new technologies in general, which will be discussed in this book.

    A multi-level functional IR with rewrites for higher-level synthesis of accelerators

    Specialised accelerators deliver orders of magnitude higher energy-efficiency than general-purpose processors. Field Programmable Gate Arrays (FPGAs) have become the substrate of choice, because the ever-changing nature of modern workloads, such as machine learning, demands reconfigurability. However, they are notoriously hard to program directly using Hardware Description Languages (HDLs). Traditional High-Level Synthesis (HLS) tools improve productivity, but come with their own problems. They often produce sub-optimal designs and programmers are still required to write hardware-specific code, thus development cycles remain long. This thesis proposes Shir, a higher-level synthesis approach for high-performance accelerator design with a hardware-agnostic programming entry point, a multi-level Intermediate Representation (IR), a compiler and rewrite rules for optimisation. First, a novel, multi-level functional IR structure for accelerator design is described. The IRs operate on different levels of abstraction, cleanly separating different hardware concerns. They enable the expression of different forms of parallelism and standard memory features, such as asynchronous off-chip memories or synchronous on-chip buffers, as well as arbitration of such shared resources. Exposing these features at the IR level is essential for achieving high performance. Next, mechanical lowering procedures are introduced to automatically compile a program specification through Shir’s functional IRs until low-level HDL code for FPGA synthesis is emitted. Each lowering step gradually adds implementation details. Finally, this thesis presents rewrite rules for automatic optimisations around parallelisation, buffering and data reshaping. Reshaping operations pose a challenge to functional approaches in particular. They introduce overheads that compromise performance or even prevent the generation of synthesisable hardware designs altogether. 
This fundamental issue is solved by the application of rewrite rules. The viability of this approach is demonstrated by running matrix multiplication and 2D convolution on an Intel Arria 10 FPGA. A limited design space exploration is conducted, confirming the ability of the IR to exploit various hardware features. Using rewrite rules for optimisation, it is possible to generate high-performance designs that are competitive with highly tuned OpenCL implementations and that outperform hardware-agnostic OpenCL code. The performance impact of the optimisations is further evaluated showing that they are essential to achieving high performance, and in many cases also necessary to produce hardware that fits the resource constraints
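As a generic illustration of optimising a functional IR with rewrite rules, the sketch below applies the classic map-fusion rule, `map f (map g xs)` becomes `map (f . g) xs`, to a tiny expression tree. The node types `Input` and `Map` and the helper names are hypothetical; this is not Shir's IR and not one of its actual rules, only a minimal example of the general technique.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Input:
    """Leaf node: a named input collection (hypothetical IR node)."""
    name: str


@dataclass(frozen=True)
class Map:
    """Apply `fn` to every element produced by `arg` (hypothetical IR node)."""
    fn: Callable
    arg: "Expr"


Expr = Union[Input, Map]


def fuse_maps(e):
    """Rewrite rule: map f (map g xs) -> map (f . g) xs, applied bottom-up."""
    if isinstance(e, Map):
        inner = fuse_maps(e.arg)
        if isinstance(inner, Map):
            f, g = e.fn, inner.fn
            # Compose the two functions so only one traversal remains.
            return Map(lambda x, f=f, g=g: f(g(x)), inner.arg)
        return Map(e.fn, inner)
    return e


def evaluate(e, env):
    """Reference interpreter used to check that rewriting preserves semantics."""
    if isinstance(e, Input):
        return env[e.name]
    return [e.fn(x) for x in evaluate(e.arg, env)]


if __name__ == "__main__":
    expr = Map(lambda x: x + 1, Map(lambda x: x * 2, Input("xs")))
    fused = fuse_maps(expr)
    print(evaluate(expr, {"xs": [1, 2, 3]}), evaluate(fused, {"xs": [1, 2, 3]}))
```

The rewritten expression computes the same result with a single traversal and no intermediate collection, which is the kind of overhead removal a rewrite-based optimiser aims for.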

    Applications

    Volume 3 describes how resource-aware machine learning methods and techniques are used to successfully solve real-world problems. The book provides numerous specific application examples: in health and medicine for risk modelling, diagnosis, and treatment selection for diseases; in electronics, steel production and milling for quality control during manufacturing processes; in traffic and logistics for smart cities; and for mobile communications.

    AI/ML Algorithms and Applications in VLSI Design and Technology

    An evident challenge ahead for the integrated circuit (IC) industry in the nanometer regime is the investigation and development of methods that can reduce the design complexity ensuing from growing process variations and curtail the turnaround time of chip manufacturing. Conventional methodologies employed for such tasks are largely manual; thus, time-consuming and resource-intensive. In contrast, the unique learning strategies of artificial intelligence (AI) provide numerous exciting automated approaches for handling complex and data-intensive tasks in very-large-scale integration (VLSI) design and testing. Employing AI and machine learning (ML) algorithms in VLSI design and manufacturing reduces the time and effort for understanding and processing the data within and across different abstraction levels via automated learning algorithms. It, in turn, improves the IC yield and reduces the manufacturing turnaround time. This paper thoroughly reviews the AI/ML automated approaches introduced in the past towards VLSI design and manufacturing. Moreover, we discuss the scope of AI/ML applications in the future at various abstraction levels to revolutionize the field of VLSI design, aiming for high-speed, highly intelligent, and efficient implementations

    LIPIcs, Volume 277, GIScience 2023, Complete Volume
