
    A Compiler Target Model for Line Associative Registers

    LARs (Line Associative Registers) are very wide tagged registers, used both for register-wide SWAR (SIMD Within a Register) operations and for scalar operations on arbitrary fields. LARs include a large data field, type tags, source addresses, and a dirty bit, which allow them not only to replace both caches and registers in the conventional memory hierarchy, but to improve on both of their functions. This thesis details a LAR-based architecture and describes the design of a compiler which can generate code for a LAR-based design. In particular, type conversion, alignment, and register allocation are discussed in detail.
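
    Each LAR is described above as a wide tagged line carrying a data field, type tags, a source address, and a dirty bit, so that it can stand in for both a register and a cache line. The following sketch is a minimal software model of that idea, assuming an illustrative 64-byte line and a simple load/write-back protocol; it is not the architecture or compiler model defined in the thesis.

```python
# Minimal, assumed model of a Line Associative Register (LAR); field names,
# line width, and the load/write-back behaviour are illustrative only.
from dataclasses import dataclass, field
from typing import Optional

LINE_BYTES = 64  # assumed line width

@dataclass
class LAR:
    data: bytearray = field(default_factory=lambda: bytearray(LINE_BYTES))
    type_tag: str = "u8"               # element type of the fields in the line
    source_addr: Optional[int] = None  # memory address the line was loaded from
    dirty: bool = False                # modified since load / last write-back

def load_line(lar: LAR, memory: dict, addr: int, type_tag: str) -> None:
    """Fill a LAR from memory, analogous to a cache fill."""
    lar.data = bytearray(memory.get(addr, bytes(LINE_BYTES)))
    lar.type_tag = type_tag
    lar.source_addr = addr
    lar.dirty = False

def write_back(lar: LAR, memory: dict) -> None:
    """Write a dirty line back to its source address, analogous to eviction."""
    if lar.dirty and lar.source_addr is not None:
        memory[lar.source_addr] = bytes(lar.data)
        lar.dirty = False
```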

    Doctor of Philosophy

    The increase in computational power of supercomputers is enabling complex scientific phenomena to be simulated at ever-increasing resolution and fidelity. With these simulations routinely producing large volumes of data, performing efficient I/O at this scale has become a very difficult task. Large-scale parallel writes are challenging due to the complex interdependencies between I/O middleware and hardware. Analytics-appropriate reads are traditionally hindered by bottlenecks in I/O access. Moreover, the two components of I/O, data generation from simulations (writes) and data exploration for analysis and visualization (reads), have substantially different data access requirements. Parallel writes, performed on supercomputers, often deploy aggregation strategies to permit large-sized contiguous access. Analysis and visualization tasks, usually performed on computationally modest resources, require fast access to localized subsets or multiresolution representations of the data. This dissertation tackles the problem of parallel I/O while bridging the gap between large-scale writes and analytics-appropriate reads. The focus of this work is to develop an end-to-end adaptive-resolution data movement framework that provides efficient I/O while supporting the full spectrum of modern HPC hardware. This is achieved by developing technology for highly scalable and tunable parallel I/O, applicable to both traditional parallel data formats and multiresolution data formats, which are directly appropriate for analysis and visualization. To demonstrate the efficacy of the approach, a novel library (PIDX) is developed that is highly tunable and capable of adaptive-resolution parallel I/O to a multiresolution data format. Adaptive-resolution storage and I/O, which allow subsets of a simulation to be accessed at varying spatial resolutions, can yield significant improvements to both storage performance and I/O time. The library provides a set of parameters that controls the storage format and the nature of data aggregation across the network; further, a machine learning-based model is constructed that tunes these parameters for maximum throughput. This work is empirically demonstrated by showing parallel I/O scaling up to 768K cores within a framework flexible enough to handle adaptive-resolution I/O.
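
    The abstract notes that PIDX exposes parameters controlling the storage format and network aggregation, and that a machine learning model tunes them for maximum throughput. The sketch below is not the PIDX API; it is a hypothetical illustration of that tuning idea, fitting a simple linear throughput model to a few past runs and choosing the aggregator count the model predicts to be fastest. All run data and parameter names are made up.

```python
# Hypothetical model-driven I/O tuning; not the PIDX API.
import numpy as np

# Past runs: columns are (core_count, aggregator_count, block_size_kb, bias term)
X = np.array([[1024, 16, 256, 1.0],
              [1024, 32, 512, 1.0],
              [4096, 32, 256, 1.0],
              [4096, 64, 512, 1.0]])
y = np.array([1.2, 1.6, 3.0, 4.9])   # measured throughput in GB/s (made-up numbers)

coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # fit a simple linear throughput model

def pick_aggregators(core_count, block_size_kb, candidates=(16, 32, 64, 128)):
    """Return the candidate aggregator count the fitted model predicts to be fastest."""
    return max(candidates,
               key=lambda a: np.dot([core_count, a, block_size_kb, 1.0], coef))

print(pick_aggregators(8192, 512))
```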

    Bridging the Scalability Gap by Exploiting Error Tolerance for Emerging Applications

    In recent years, there has been a surge in demand for intelligent applications. These emerging applications are powered by algorithms from domains such as computer vision, image processing, pattern recognition, and machine learning. Across these algorithms, there exist two key computational characteristics. First, the computational demands they place on computing infrastructure are large, with the potential to substantially outstrip existing compute resources. Second, they are necessarily resilient to errors due to their inputs and outputs being inherently noisy and imprecise. Despite the staggering computational requirements and resilience of intelligent applications, current infrastructure uses conventional software and hardware methodologies. These systems needlessly consume resources for every bit of precision and arithmetic. To address this inefficiency and help bridge the performance gap caused by intelligent applications, this dissertation investigates exploiting error tolerance across the hardware-software stack. Specifically, we propose (1) statistical machinery to guarantee that accuracy is not compromised when removing work or precision, (2) a GPU optimization framework for work skipping and bottleneck mitigation, and (3) exploration of unconventional numerical representations to steer future hardware designs.
    PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
    https://deepblue.lib.umich.edu/bitstream/2027.42/144025/1/parkerhh_1.pd
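
    The first contribution above is statistical machinery that guarantees accuracy is not compromised when work or precision is removed. The sketch below is a toy version of that general idea, not the dissertation's actual machinery: estimate a reduction from a random sample and only skip the remaining work when a 95% confidence bound on the estimate falls within a stated relative tolerance.

```python
# Toy sampling-based work skipping with a statistical accuracy check;
# the dissertation's statistical machinery may differ substantially.
import random
import statistics

def approx_mean(values, sample_frac=0.1, rel_tol=0.05, z=1.96):
    """Estimate the mean from a sample; fall back to the exact mean when the
    95% confidence half-width exceeds the requested relative tolerance."""
    n = max(2, int(len(values) * sample_frac))
    sample = random.sample(values, n)
    est = statistics.mean(sample)
    half_width = z * statistics.stdev(sample) / n ** 0.5
    if half_width <= rel_tol * abs(est):
        return est                     # cheap path: accuracy bound satisfied
    return statistics.mean(values)     # exact path: do all the work

data = [random.gauss(10.0, 0.5) for _ in range(100_000)]
print(approx_mean(data))
```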

    Γ (Gamma): cloud-based analog circuit design system

    Includes bibliographical references. 2016 Summer. With ever-increasing demand for lower power consumption, lower cost, and higher performance, designing analog circuits to meet design specifications has become an increasingly challenging task. On one hand, analog circuit designers must have intimate knowledge of the underlying silicon process technology's capability to achieve the desired specifications. On the other hand, they must understand the impact that tweaking circuits to satisfy a given specification has on all circuit performance parameters. Analog designers have traditionally learned to tackle design problems with numerous circuit simulations using accurate circuit simulators such as SPICE, and have increasingly relied on trial-and-error approaches to reach a converging point. However, the increased complexity of each generation of silicon technology and the high dimensionality of the search for solutions, even for some simple analog circuits, have made trial-and-error approaches extremely inefficient, causing long design cycles and often missed market opportunities. Novel rapid and accurate circuit evaluation methods that are tightly integrated with circuit search and optimization methods are needed to aid design productivity. Furthermore, the current design environment with fully distributed licensing and supporting structures is cumbersome at best for providing efficient and up-to-date support to design engineers. With increasing support and licensing costs, fewer and fewer design centers can afford it. The cloud-based software as a service (SaaS) model provides new opportunities for CAD applications. It enables immediate software delivery and update to customers at very low cost. SaaS tools benefit from fast feedback and sharing channels between users and developers and run on hardware resources tailored and provided for them by software vendors. However, web-based tools must perform on a very short turn-around schedule and be always responsive. A new class of analog design tools is presented in this dissertation. The tools provide effective design aid to analog circuit designers with dash-board control of many important circuit parameters. Fast and accurate circuit evaluations are achieved using novel lookup-table (LUT) transistor models with built-in features tightly integrated with the search engine to achieve the desired speed and accuracy. This enables circuit evaluation times several orders of magnitude faster than SPICE simulations. The proposed architecture for analog design attempts to break the traditional analog design flow of SPICE-based trial-and-error methods by providing designers with useful information about the effects of prior design decisions they have made and potential next steps they can take to meet specifications. Benefiting from the advantages offered by web-hosted architectures, the proposed architecture adopts SaaS as its operating model. The application of the proposed architecture is illustrated by an analog circuit sizer and optimizer. The Γ (Gamma) sizer and optimizer shows how a web-based design-decision support tool can help analog circuit designers reduce design time and achieve high-quality circuits.
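
    The fast circuit evaluation described above rests on lookup-table (LUT) transistor models. The sketch below illustrates only the generic idea of a table-driven device model, bilinear interpolation of drain current over a pre-characterised (Vgs, Vds) grid; the grid, the placeholder current values, and the interpolation scheme are assumptions for illustration and do not reproduce the novel built-in features developed in the dissertation.

```python
# Generic lookup-table (LUT) transistor model with bilinear interpolation;
# the table values here are placeholders, not real device characterisation data.
import numpy as np

vgs_grid = np.linspace(0.0, 1.2, 13)   # gate-source voltage grid (V)
vds_grid = np.linspace(0.0, 1.2, 13)   # drain-source voltage grid (V)
ids_table = np.array([[1e-6 * vgs * vgs * min(vds, vgs)   # placeholder Ids (A)
                       for vds in vds_grid] for vgs in vgs_grid])

def ids_lut(vgs: float, vds: float) -> float:
    """Bilinearly interpolate drain current from the pre-characterised table."""
    i = int(np.clip(np.searchsorted(vgs_grid, vgs) - 1, 0, len(vgs_grid) - 2))
    j = int(np.clip(np.searchsorted(vds_grid, vds) - 1, 0, len(vds_grid) - 2))
    tg = (vgs - vgs_grid[i]) / (vgs_grid[i + 1] - vgs_grid[i])
    td = (vds - vds_grid[j]) / (vds_grid[j + 1] - vds_grid[j])
    low = ids_table[i, j] * (1 - td) + ids_table[i, j + 1] * td
    high = ids_table[i + 1, j] * (1 - td) + ids_table[i + 1, j + 1] * td
    return float(low * (1 - tg) + high * tg)

print(ids_lut(0.8, 0.6))
```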

    AI and IoT Meet Mobile Machines

    Infrastructure construction is society's cornerstone and its economy's catalyst. Therefore, improving mobile machinery's efficiency and reducing its cost of use have enormous economic benefits in the vast and growing construction market. In this thesis, I envision a novel concept, the smart working site, to increase productivity through fleet management from multiple aspects, using Artificial Intelligence (AI) and the Internet of Things (IoT).

    Using Software-Defined Networking and Openflow Switching to Reroute Network Traffic Dynamically Based on Traffic Volume Measurements

    Traditional switching and routing have been very effective for network packet delivery, but they do create some constraints. For example, all packets from a given source to a given destination must always take the same path, and within a traditional Ethernet network a tree topology must be used. Software-Defined Networking (SDN) has the potential to bypass this tree-topology limitation by placing the control of the switches and their forwarding tables under a central device called a controller. SDN also allows for sets of controllers. The controller can identify individual network flows and issue commands to the switches to, in effect, assign individual flows to specific paths. This allows different flows between the same source and destination to take different paths. In this project we use SDN to assign TCP connections to specific paths through a network. Different connections between the same pair of endpoints can be assigned different paths. Different directions of the same TCP connection (different TCP flows) can be assigned different paths. Paths are chosen by the controller, with full knowledge of the network topology, so there is no need for restrictions on topological loops. Unlike with Ethernet link aggregation, our approach does not require that the propagation delays on different links are equal, or even similar. Each TCP flow gets a single path, which eliminates link-related packet reordering. One application of this is static load balancing: we create a specific topology in which there are multiple trunk lines between two host clusters, and we then spread the traffic load between the two host clusters evenly over the trunk lines. We are also able to achieve dynamic load balancing by periodically reassigning the TCP flows to different paths through the trunk lines, which distributes the traffic evenly over the trunk lines. For this portion of the project we assumed that individual TCP connections were rate limited, with the rate varying with time, so we could measure the per-connection bandwidths and assume these values would remain in effect for a reasonable interval. We create the networks and switches using the Mininet emulation environment.
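
    The dynamic load balancing described above periodically reassigns measured TCP flows to the trunk lines. One simple way to realise the reassignment step is a greedy scheme such as the sketch below, which places the heaviest flow on the least-loaded trunk; this is an illustration of the idea only, not the project's controller code, and the flow rates and trunk names are invented. In the actual system, the controller would then install the corresponding forwarding rules in the OpenFlow switches.

```python
# Illustrative greedy reassignment of measured TCP flows to trunk lines;
# not the project's controller code. Flow rates and trunk names are made up.

def assign_flows(flow_rates, trunks):
    """Place each flow, heaviest first, on the currently least-loaded trunk."""
    load = {t: 0.0 for t in trunks}
    assignment = {}
    for flow, rate in sorted(flow_rates.items(), key=lambda kv: -kv[1]):
        trunk = min(load, key=load.get)   # least-loaded trunk so far
        assignment[flow] = trunk
        load[trunk] += rate
    return assignment

flows = {"h1->h5": 40.0, "h2->h6": 25.0, "h5->h1": 30.0, "h3->h7": 10.0}  # Mbit/s
print(assign_flows(flows, ["trunk1", "trunk2", "trunk3"]))
```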

    Stress Testing Control Loops in Cyber-Physical Systems

    Cyber-Physical Systems (CPSs) are often safety-critical and deployed in uncertain environments. Identifying scenarios where CPSs do not comply with requirements is fundamental but difficult due to the multidisciplinary nature of CPSs. We investigate the testing of control-based CPSs, where control and software engineers develop the software collaboratively. Control engineers make design assumptions during system development to leverage control theory and obtain guarantees on CPS behaviour. In the implemented system, however, such assumptions are not always satisfied, and their falsification can lead to a loss of those guarantees. We define stress testing of control-based CPSs as generating tests to falsify such design assumptions. We highlight different types of assumptions, focusing on the use of linearised physics models. To generate stress tests falsifying such assumptions, we leverage control theory to qualitatively characterise the input space of a control-based CPS. We propose a novel test parametrisation for control-based CPSs and use it with the input space characterisation to develop a stress testing approach. We evaluate our approach on three case study systems, including a drone, a continuous-current motor (in five configurations), and an aircraft. Our results show the effectiveness of the proposed testing approach in falsifying the design assumptions and highlighting the causes of assumption violations. Comment: Accepted for publication in August 2023 in ACM Transactions on Software Engineering and Methodology (TOSEM).
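
    A central assumption type discussed above is the use of linearised physics models. As a toy illustration of what falsifying such an assumption means, the sketch below simulates a pendulum with and without the small-angle linearisation and flags initial conditions where the two diverge beyond a tolerance; it is not the paper's test-generation approach, only the underlying idea.

```python
# Toy check of a linearisation assumption on a pendulum: flag test inputs where
# the linearised model diverges from the nonlinear one. Not the paper's method.
import math

def simulate(theta0, linear=False, g=9.81, L=1.0, dt=0.001, steps=2000):
    """Forward-Euler simulation of a pendulum; returns the final angle (rad)."""
    theta, omega = theta0, 0.0
    for _ in range(steps):
        accel = -(g / L) * (theta if linear else math.sin(theta))
        omega += accel * dt
        theta += omega * dt
    return theta

def violates_assumption(theta0, tol=0.05):
    """True if the linearised model disagrees with the nonlinear one by > tol rad."""
    return abs(simulate(theta0, linear=True) - simulate(theta0)) > tol

for angle in (0.1, 0.5, 1.0, 1.5):     # candidate test inputs: initial angle (rad)
    print(angle, violates_assumption(angle))
```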

    Optimisation of residential battery integrated photovoltaics system: analyses and new machine learning methods

    Modelling and optimisation of battery integrated photovoltaics (PV) systems require a certain amount of high-quality input PV and load data. Despite the recent rollouts of smart meters, the amount of accessible proprietary load and PV data is still limited. This thesis addresses this data shortage issue by performing data analyses and proposing novel data extrapolation, interpolation, and synthesis models. First, a sensitivity analysis is conducted to investigate the impact of applying PV and load data with various temporal resolutions in PV-battery optimisation models. The explored data granularities range from 5 seconds to hourly, and the analysis indicates that 5-minute resolution is the most suitable for the proprietary data, achieving a good balance between accuracy and computational cost. A data extrapolation model is then proposed using net meter data clustering, which can extrapolate a month of 5-minute net/gross meter data to a year of data. This thesis also develops two generative adversarial network (GAN) based models: a deep convolutional generative adversarial network (DCGAN) model which can generate PV and load power from random noise, and a super-resolution generative adversarial network (SRGAN) model which synthetically interpolates 5-minute load and PV power data from 30-minute/hourly data. All the developed approaches have been validated using a large amount of real-time residential PV and load data and a battery size optimisation model as the end-use application of the extrapolated, interpolated, and synthetic datasets. The results indicate that these models lead to optimisation results with a satisfactory level of accuracy and, at the same time, outperform other comparative approaches. These newly proposed approaches can potentially assist researchers, end-users, installers and utilities with their battery sizing and scheduling optimisation analyses, with no or minimal requirements on the granularity and amount of the available input data.
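
    The battery size optimisation model serves as the end-use application of the extrapolated, interpolated, and synthetic data. The sketch below is a toy version of that kind of evaluation, assuming 5-minute profiles, a simple self-consumption dispatch rule, and made-up PV and load traces; it is not the optimisation model developed in the thesis.

```python
# Toy battery sizing evaluation over 5-minute PV/load profiles; the dispatch rule
# and the randomly generated profiles are illustrative assumptions only.
import numpy as np

def grid_import(pv, load, capacity_kwh, dt_h=5 / 60):
    """Simulate a simple self-consumption dispatch; return energy drawn from the grid (kWh)."""
    soc, imported = 0.0, 0.0
    for p, l in zip(pv, load):
        surplus = (p - l) * dt_h                    # kWh over one 5-minute step
        if surplus >= 0:
            soc = min(capacity_kwh, soc + surplus)  # charge the battery with excess PV
        else:
            discharge = min(soc, -surplus)          # discharge to cover the deficit
            soc -= discharge
            imported += -surplus - discharge
    return imported

rng = np.random.default_rng(0)
pv = np.clip(rng.normal(1.5, 1.0, 288), 0, None)      # one day of 5-minute PV power (kW)
load = np.clip(rng.normal(1.0, 0.5, 288), 0.1, None)  # one day of 5-minute load (kW)

for cap in (0, 5, 10, 13.5):                           # candidate capacities (kWh)
    print(cap, round(grid_import(pv, load, cap), 2))
```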