26 research outputs found

    Fully Synthesizable Low-Area Digital-to-Analog Converter With Graceful Degradation and Dynamic Power-Resolution Scaling

    In this paper, a fully synthesizable digital-to-analog converter (DAC) is proposed. Based on a digital standard cell approach, the proposed DAC allows very low design effort, enables digital-like shrinkage across CMOS generations, low area at down-scaled technologies, and operation down to near-threshold voltages. The proposed DAC can operate at supply voltages that are significantly lower and/or at clock frequencies that are significantly greater than the intended design point, at the expense of moderate resolution degradation. In a 12-bit 40-nm testchip, graceful degradation of 0.3bit/100mV is achieved when V_DD is over-scaled down to 0.8V, and 1.4bit/100mV when further scaled down to 0.6V. The proposed DAC enables dynamic power-resolution tradeoff with 3X (2X) power saving for 1-bit resolution degradation at iso-sample rate (iso-resolution). A 12-bit DAC testchip designed with a fully automated standard cell flow in 40nm consumes 55µW at 27kS/s (9.1µW at 13.5kS/s) at a compact area of 500µm^2 and low voltage of 0.55V

    System Abstractions for Scalable Application Development at the Edge

    Recent years have witnessed an explosive growth of Internet of Things (IoT) devices, which collect or generate huge amounts of data. Given diverse device capabilities and application requirements, data processing takes place across a range of settings, from on-device to a nearby edge server/cloud and remote cloud. Consequently, edge-cloud coordination has been studied extensively from the perspectives of job placement, scheduling and joint optimization. Typical approaches focus on performance optimization for individual applications. This often requires domain knowledge of the applications, but also leads to application-specific solutions. Application development and deployment over diverse scenarios thus incur repetitive manual efforts. There are two overarching challenges to provide system-level support for application development at the edge. First, there is inherent heterogeneity at the device hardware level. The execution settings may range from a small cluster as an edge cloud to on-device inference on embedded devices, differing in hardware capability and programming environments. Further, application performance requirements vary significantly, making it even more difficult to map different applications to already heterogeneous hardware. Second, there are trends towards incorporating edge and cloud and multi-modal data. Together, these add further dimensions to the design space and increase the complexity significantly. In this thesis, we propose a novel framework to simplify application development and deployment over a continuum of edge to cloud. Our framework provides key connections between different dimensions of design considerations, corresponding to the application abstraction, data abstraction and resource management abstraction respectively. First, our framework masks hardware heterogeneity with abstract resource types through containerization, and abstracts away the application processing pipelines into generic flow graphs. 
Further, our framework supports a notion of degradable computing for application scenarios at the edge that are driven by multimodal sensory input. Next, as video analytics is the killer app of edge computing, we include a generic data management service between video query systems and a video store to organize video data at the edge. We propose a video data unit abstraction based on a notion of distance between objects in the video, quantifying the semantic similarity among video data. Last, considering concurrent application execution, our framework supports multi-application offloading with device-centric control, with a userspace scheduler service that wraps over the operating system scheduler.

    Runtime Management of Multiprocessor Systems for Fault Tolerance, Energy Efficiency and Load Balancing

    Efficiency of modern multiprocessor systems is hurt by unpredictable events: aging causes permanent faults that disable components; application spawnings and terminations, taking place at arbitrary times, affect energy proportionality, causing energy waste; load imbalances reduce resource utilization, penalizing performance. This thesis demonstrates how runtime management can mitigate the negative effects of unpredictable events, making decisions guided by a combination of static information known in advance and parameters that only become known at runtime. We propose techniques for three different objectives: graceful degradation of aging-prone systems; energy efficiency of heterogeneous adaptive systems; and load balancing by means of work stealing. Managing aging-prone systems for graceful efficiency degradation is based on a high-level system description that encapsulates hardware reconfigurability and workload flexibility, making it possible to quantify system efficiency and use it as an objective function. Different custom heuristics, as well as simulated annealing and a genetic algorithm, are proposed to optimize this objective function as a response to component failures. Custom heuristics are one to two orders of magnitude faster, provide better efficiency for the first 20% of system lifetime and are less than 13% worse than a genetic algorithm at the end of this lifetime. Custom heuristics occasionally fail to satisfy reconfiguration cost constraints. As all algorithms' execution time scales well with respect to system size, a genetic algorithm can be used as backup in these cases.
Managing heterogeneous multiprocessors capable of Dynamic Voltage and Frequency Scaling is based on a model that accurately predicts performance and power: performance is predicted by combining static, application-specific profiling information and dynamic, runtime performance monitoring data; power is predicted using the aforementioned performance estimations and a set of platform-specific, static parameters, determined only once and used for every application mix. Three runtime heuristics are proposed that use this model to perform a partial search of the configuration space, evaluating a small set of configurations and selecting the best one. When best-effort performance is adequate, the proposed approach achieves 3% higher energy efficiency compared to the powersave governor and 2x better compared to the interactive and ondemand governors. When individual applications' performance requirements are considered, the proposed approach is able to satisfy them, giving away 18% of the system's energy efficiency compared to the powersave governor, which however misses the performance targets by 23%; at the same time, the proposed approach maintains an efficiency advantage of about 55% compared to the other governors, which also satisfy the requirements. Lastly, to improve load balancing of multiprocessors, a partial and approximate view of the current load distribution among system cores is proposed, which consists of lightweight data structures and is maintained by each core through cheap operations. A runtime algorithm is developed, using this view whenever a core becomes idle, to perform victim core selection for work stealing, also considering system topology and memory hierarchy. Among 12 diverse imbalanced workloads, the proposed approach achieves better performance than random, hierarchical and local stealing for six workloads.
Furthermore, it is at most 8% slower among the other six workloads, while competing strategies incur a penalty of at least 89% on some workload

    Data partitioning and load balancing in parallel disk systems

    Parallel disk systems provide opportunities for exploiting I/O parallelism in two possible ways, namely via inter-request and intra-request parallelism. In this paper we discuss the main issues in performance tuning of such systems, namely striping and load balancing, and show their relationship to response time and throughput. We outline the main components of an intelligent file system that optimizes striping by taking into account the requirements of the applications, and performs load balancing by judicious file allocation and dynamic redistributions of the data when access patterns change. Our system uses simple but effective heuristics that incur only little overhead. We present performance experiments based on synthetic workloads and real-life traces

    The Road to General Intelligence

    Humans have always dreamed of automating laborious physical and intellectual tasks, but the latter has proved more elusive than naively suspected. Seven decades of systematic study of Artificial Intelligence have witnessed cycles of hubris and despair. The successful realization of General Intelligence (evidenced by the kind of cross-domain flexibility enjoyed by humans) will spawn an industry worth billions and transform the range of viable automation tasks. The recent notable successes of Machine Learning have led to the conjecture that it might be the appropriate technology for delivering General Intelligence. In this book, we argue that the framework of machine learning is fundamentally at odds with any reasonable notion of intelligence and that essential insights from previous decades of AI research are being forgotten. We claim that a fundamental change in perspective is required, mirroring that which took place in the philosophy of science in the mid 20th century. We propose a framework for General Intelligence, together with a reference architecture that emphasizes the need for anytime bounded rationality and a situated denotational semantics. We give necessary emphasis to compositional reasoning, with the required compositionality being provided via principled symbolic-numeric inference mechanisms based on universal constructions from category theory. • Details the pragmatic requirements for real-world General Intelligence. • Describes how machine learning fails to meet these requirements. • Provides a philosophical basis for the proposed approach. • Provides mathematical detail for a reference architecture. • Describes a research program intended to address issues of concern in contemporary AI.
The book includes an extensive bibliography, with ~400 entries covering the history of AI and many related areas of computer science and mathematics. The target audience is the entire gamut of Artificial Intelligence/Machine Learning researchers and industrial practitioners. There is a mixture of descriptive and rigorous sections, according to the nature of the topic. Undergraduate mathematics is in general sufficient. Familiarity with category theory is advantageous for a complete understanding of the more advanced sections, but these may be skipped by the reader who desires an overall picture of the essential concepts. This is an open access book.

    Design for manufacturability : a feature-based agent-driven approach

    Modelling and performability evaluation of Wireless Sensor Networks

    This thesis presents generic analytical models of homogeneous clustered Wireless Sensor Networks (WSNs) with a centrally located Cluster Head (CH) coordinating cluster communication with the sink directly or through other intermediate nodes. The focus is to integrate performance and availability studies of WSNs in the presence of sensor node and channel failures and repair/replacement. The main purpose is to improve WSN Quality of Service (QoS). Other work considered in this thesis includes modelling of packet arrival distribution at the CH and intermediate nodes, and modelling of energy consumption at the sensor nodes. An investigation and critical analysis of wireless sensor network architectures, energy conservation techniques and QoS requirements are performed in order to improve performance and availability of the network. Existing techniques used for performance evaluation of single and multi-server systems with several operative states are investigated and analysed in detail. To begin with, existing approaches for independent (pure) performance modelling are critically analysed with highlights on merits and drawbacks. Similarly, pure availability modelling approaches are also analysed. Considering that pure performance models tend to be too optimistic and pure availability models are too conservative, performability, which is the integration of performance and availability studies, is used for the evaluation of the WSN models developed in this study. Two-dimensional Markov state space representations of the systems are used for performability modelling. Following critical analysis of the existing solution techniques, the spectral expansion method and a system of simultaneous linear equations are developed and used to solve the proposed models. To validate the results obtained with the two techniques, a discrete event simulation tool is explored.
In this research, open queuing networks are used to model the behaviour of the CH when subjected to streams of traffic from cluster nodes in addition to dynamics of operating in the various states. The research begins with a model of a CH with an infinite queue capacity subject to failures and repair/replacement. The model is developed progressively to consider bounded queue capacity systems, channel failures and sleep scheduling mechanisms for performability evaluation of WSNs. Using the developed models, various performance measures of the considered system including mean queue length, throughput, response time and blocking probability are evaluated. Finally, energy models considering mean power consumption in each of the possible operative states are developed. The resulting models are in turn employed for the evaluation of energy saving for the proposed case study model. Numerical solutions and discussions are presented for all the queuing models developed. Simulation is also performed in order to validate the accuracy of the results obtained. In order to address issues of performance and availability of WSNs, current research presents independent performance and availability studies. The concerns resulting from such studies have therefore remained unresolved over the years, hence the persistently poor system performance. The novelty of this research is a proposed integrated performance and availability modelling approach for WSNs meant to address challenges of independent studies. In addition, a novel methodology for modelling and evaluation of power consumption is also offered. Proposed model results provide remarkable improvement in system performance and availability in addition to providing tools for further optimisation studies. A significant power saving is also observed from the proposed model results. In order to improve QoS for WSN, it is possible to improve the proposed models by incorporating priority queuing in a mixed traffic environment.
A multi-server system model is also appropriate for addressing traffic routing. It is also possible to extend the proposed energy model to consider sleep scheduling mechanisms other than the On-demand mechanism proposed herein. Analysis and classification of possible arrival distributions of WSN packets for various application environments would also help enable robust scientific research.

    Contributions in statistical process control for high quality products

    Ph.D. (Doctor of Philosophy)

    Reprogrammable In Vivo Architecture

    The biological cell is the intricate, yet ubiquitous component of life, able to grow, adapt and reproduce. The genetic material contained within a cell encodes information which directs its development and behaviour, and this information is passed down from one generation of cell to the next. One emerging interest, resulting from collaborations between the disciplines of Molecular Biology and Computer Science, is to encode computational programs, sets of engineered, information processing instructions, in genetic material, to be executed by living cells. So far, the large majority of in vivo computation research has been based on the detection and conditional manipulation of protein concentrations inside cells, which is the biological method of gene expression. In contrast, this thesis describes how a computational program, encoded in genetic material inside a bacterium, can be triggered by external stimuli to reassemble itself in a directed manner to create a newly arranged computational program. In order to investigate the potential utility of in vivo self-arranging programs, software was designed to explore a search space of candidate computational programs, encoded in genetic material, which are able to rearrange themselves; to simulate these candidates and to evaluate their behaviour against a set of criteria. Rearrangements were facilitated by biological catalysts which can selectively sever and rejoin genetic material in a cooperative manner. Their ability to perform compound operations was found to allow for a general purpose mechanism. As a proof of concept, one of the candidate computational programs, a two-colour switch which can be set irreversibly through its rearrangement, was encoded in genetic material. Measurements of in vivo expression were observed resulting from in vitro rearrangement manipulations, to illustrate its operation.
