4,027 research outputs found

    Doctor of Philosophy

    Dissertation. Portable electronic devices will be limited to the available energy of existing battery chemistries for the foreseeable future. However, the systems-on-chip (SoCs) used in these devices are under demand to offer more functionality and increased battery life. A difficult problem in SoC design is providing energy-efficient communication between components while maintaining the required performance. This dissertation introduces a novel energy-efficient network-on-chip (NoC) communication architecture. A NoC is used within complex SoCs due to its superior performance, energy usage, modularity, and scalability over traditional bus and point-to-point methods of connecting SoC components. This is the first academic research that combines asynchronous NoC circuits, a focus on energy-efficient design, and a software framework to customize a NoC for a particular SoC. Its key contribution is demonstrating that a simple, asynchronous NoC concept is a good match for low-power devices, and is a fruitful area for additional investigation. The proposed NoC is energy-efficient in several ways: simple switch and arbitration logic, low port radix, latch-based router buffering, a topology with the minimum number of 3-port routers, and the asynchronous advantages of zero dynamic power consumption while idle and the lack of a clock tree. The tool framework developed for this work uses novel methods to optimize the topology and router floorplan based on simulated annealing and force-directed movement. It studies link pipelining techniques that yield improved throughput in an energy-efficient manner. A simulator is automatically generated for each customized NoC, and its traffic generators use a self-similar message distribution, as opposed to Poisson, to better match application behavior. Compared to a conventional synchronous NoC, this design is superior, achieving comparable message latency with half the energy.

    Improving efficiency and resilience in large-scale computing systems through analytics and data-driven management

    Applications running in large-scale computing systems such as high performance computing (HPC) or cloud data centers are essential to many aspects of modern society, from weather forecasting to financial services. As the number and size of data centers increase with the growing computing demand, scalable and efficient management becomes crucial. However, data center management is a challenging task due to the complex interactions between applications, middleware, and hardware layers such as processors, network, and cooling units. This thesis claims that to improve robustness and efficiency of large-scale computing systems, significantly higher levels of automated support than what is available in today's systems are needed, and this automation should leverage the data continuously collected from various system layers. Towards this claim, we propose novel methodologies to automatically diagnose the root causes of performance and configuration problems and to improve efficiency through data-driven system management. We first propose a framework to diagnose software and hardware anomalies that cause undesired performance variations in large-scale computing systems. We show that by training machine learning models on resource usage and performance data collected from servers, our approach successfully diagnoses 98% of the injected anomalies at runtime in real-world HPC clusters with negligible computational overhead. We then introduce an analytics framework to address another major source of performance anomalies in cloud data centers: software misconfigurations. Our framework discovers and extracts configuration information from cloud instances such as containers or virtual machines. This is the first framework to provide comprehensive visibility into software configurations in multi-tenant cloud platforms, enabling systematic analysis for validating the correctness of software configurations. 
This thesis also contributes to the design of robust and efficient system management methods that leverage continuously monitored resource usage data. To improve performance under power constraints, we propose a workload- and cooling-aware power budgeting algorithm that distributes the available power among servers and cooling units in a data center, achieving up to 21% improvement in throughput per Watt compared to the state-of-the-art. Additionally, we design a network- and communication-aware HPC workload placement policy that reduces communication overhead by up to 30% in terms of hop-bytes compared to existing policies.
    2019-07-02T00:00:00
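The hop-bytes figure quoted in the abstract above is commonly computed as the sum, over all messages, of message size times the number of network hops travelled. The following is a minimal sketch of that metric only, not the thesis's placement policy. The 2D-mesh hop count, the function names, and the sample traffic are all assumptions made for illustration.

```python
def mesh_hops(a, b):
    """Hop count between two nodes of a 2D mesh (assumed topology):
    the Manhattan distance between their (row, col) coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def hop_bytes(traffic, placement, hops=mesh_hops):
    """Sum of bytes * hops over every communicating pair of ranks.

    traffic:   {(src_rank, dst_rank): bytes_sent}   (hypothetical input format)
    placement: {rank: (row, col)} node assigned to each rank
    """
    return sum(
        nbytes * hops(placement[src], placement[dst])
        for (src, dst), nbytes in traffic.items()
    )


# Illustrative data: two placements of the same three-rank job.
traffic = {(0, 1): 100, (1, 2): 50}
near = {0: (0, 0), 1: (0, 1), 2: (1, 1)}
far = {0: (0, 0), 1: (3, 3), 2: (0, 1)}
```

With these sample inputs, `hop_bytes(traffic, near)` is lower than `hop_bytes(traffic, far)`, which is the kind of difference a communication-aware placement policy tries to exploit.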

    Towards Optimal Application Mapping for Energy-Efficient Many-Core Platforms

    Transferred from Doria

    Doctor of Philosophy

    Dissertation. The embedded system space is characterized by a rapid evolution in the complexity and functionality of applications. In addition, the short time-to-market nature of the business motivates the use of programmable devices capable of meeting the conflicting constraints of low energy, high performance, and short design times. The keys to achieving these conflicting constraints are specialization and maximally extracting available application parallelism. General purpose processors are flexible but are either too power hungry or lack the necessary performance. Application-specific integrated circuits (ASICs) efficiently meet the performance and power needs but are inflexible. Programmable domain-specific architectures (DSAs) are an attractive middle ground, but their design requires significant time, resources, and expertise in a variety of specialties, which range from application algorithms to architecture and, ultimately, circuit design. This dissertation presents CoGenE, a design framework that automates the design of energy-performance-optimal DSAs for embedded systems. For a given application domain and a user-chosen initial architectural specification, CoGenE consists of a Compiler to generate the execution binary, a simulator Generator to collect performance/energy statistics, and an Explorer that modifies the current architecture to improve energy-performance-area characteristics. The above process repeats automatically until the user-specified constraints are achieved. This removes or alleviates the time needed to understand the application, manually design the DSA, and generate object code for the DSA. Thus, CoGenE is a new design methodology that represents a significant improvement in performance, energy dissipation, design time, and resources. This dissertation employs the face recognition domain to showcase a flexible architectural design methodology that creates "ASIC-like" DSAs.
The DSAs are instruction set architecture (ISA)-independent and achieve good energy-performance characteristics by coscheduling the often conflicting constraints of data access, data movement, and computation through a flexible interconnect. This represents a significant increase in programming complexity and code generation time. To address this problem, the CoGenE compiler employs integer linear programming (ILP)-based 'interconnect-aware' scheduling techniques for automatic code generation. The CoGenE explorer employs an iterative technique to search the complete design space and select a set of energy-performance-optimal candidates. When compared to manual designs, results demonstrate that CoGenE produces superior designs for three application domains: face recognition, speech recognition and wireless telephony. While CoGenE is well suited to applications that exhibit a streaming behavior, multithreaded applications like ray tracing present a different but important challenge. To demonstrate its generality, CoGenE is evaluated in designing a novel multicore N-wide SIMD architecture, known as StreamRay, for the ray tracing domain. CoGenE is used to synthesize the SIMD execution cores, the compiler that generates the application binary, and the interconnection subsystem. Further, separating address and data computations in space reduces data movement and contention for resources, thereby significantly improving performance compared to existing ray tracing approaches.

    Integration of tools for the Design and Assessment of High-Performance, Highly Reliable Computing Systems (DAHPHRS), phase 1

    Systems for Space Defense Initiative (SDI) space applications typically require both high performance and very high reliability. These requirements present the systems engineer evaluating such systems with the extremely difficult problem of conducting performance and reliability trade-offs over large design spaces. A controlled development process supported by appropriate automated tools must be used to assure that the system will meet design objectives. This report describes an investigation of methods, tools, and techniques necessary to support performance and reliability modeling for SDI systems development. Models of the JPL Hypercubes, the Encore Multimax, and the C.S. Draper Lab Fault-Tolerant Parallel Processor (FTPP) parallel-computing architectures using candidate SDI weapons-to-target assignment algorithms as workloads were built and analyzed as a means of identifying the necessary system models, how the models interact, and what experiments and analyses should be performed. As a result of this effort, weaknesses in the existing methods and tools were revealed and capabilities that will be required for both individual tools and an integrated toolset were identified.

    Comparing energy and latency of asynchronous and synchronous NoCs for embedded SoCs

    Journal Article. Power consumption of on-chip interconnects is a primary concern for many embedded system-on-chip (SoC) applications. In this paper, we compare energy and performance characteristics of asynchronous (clockless) and synchronous network-on-chip implementations, optimized for a number of SoC designs. We adapted the COSI-2.0 framework with ORION 2.0 router and wire models for synchronous network generation. Our own tool, ANetGen, specifies the asynchronous network by determining the topology with simulated annealing and router locations with force-directed placement. It uses energy and delay models from our 65 nm bundled-data router design. SystemC simulations varied traffic burstiness using the self-similar b-model. Results show that the asynchronous network provided lower median and maximum message latency, especially under bursty traffic, and used far less router energy with a slight overhead for the inter-router wires.
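The abstract above varies burstiness with the self-similar b-model. As commonly described, the b-model recursively splits a traffic volume in two, giving a fraction b to one half of the time interval and 1 - b to the other, with the favoured half chosen at random. The sketch below illustrates that idea under stated assumptions. It is not the paper's SystemC traffic generator, and the function name and parameters are hypothetical.

```python
import random


def b_model(total, bias, levels, rng=None):
    """Spread `total` traffic over 2**levels equal time slots.

    bias = 0.5 gives perfectly uniform traffic; values closer to 1.0
    give burstier traffic. All names here are illustrative.
    """
    rng = rng or random.Random()
    volumes = [float(total)]
    for _ in range(levels):
        split = []
        for v in volumes:
            # One half receives fraction `bias`, the other the remainder;
            # which half is favoured is decided by a fair coin flip.
            if rng.random() < 0.5:
                split.extend((v * bias, v * (1.0 - bias)))
            else:
                split.extend((v * (1.0 - bias), v * bias))
        volumes = split
    return volumes


# Illustrative use: 1000 units of traffic over 8 slots with bias 0.8.
slots = b_model(1000.0, 0.8, 3, random.Random(0))
```

The total volume is conserved at every level, so only its distribution across slots changes as `bias` moves away from 0.5.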

    A Comprehensive Analysis of Literature Reported Mac and Phy Enhancements of Zigbee and its Alliances

    Wireless communication is among the technologies most in demand in everyday life. It is progressing steadily in several novel directions toward personal wireless networks built on low-power systems. Current communication technologies such as Bluetooth, Wi-Fi and ZigBee play a prime role in catering to the basic needs of individuals. ZigBee is one such technology, steadily gaining popularity for establishing personal wireless networks built on small, low-power digital radios. ZigBee builds on the physical and MAC layers defined by the IEEE 802.15.4 standard. This paper presents a comprehensive survey of MAC and PHY enhancements of ZigBee and its contemporary technologies reported in the literature, with respect to performance, power consumption, scheduling, resource management, and timing and address binding. The work also discusses areas of ZigBee MAC and PHY design for specific applications.

    Book of Abstracts of the Sixth SIAM Workshop on Combinatorial Scientific Computing

    Book of Abstracts of CSC14, edited by Bora Uçar. International audience. The Sixth SIAM Workshop on Combinatorial Scientific Computing, CSC14, was organized at the Ecole Normale Supérieure de Lyon, France, from 21 to 23 July 2014. This two-and-a-half-day event marked the sixth in a series that started ten years ago in San Francisco, USA. The CSC14 Workshop's focus was on combinatorial mathematics and algorithms in high performance computing, broadly interpreted. The workshop featured three invited talks, 27 contributed talks and eight poster presentations. The three invited talks focused on two fields of research: randomized algorithms for numerical linear algebra, and network analysis. The contributed talks and the posters targeted modeling, analysis, bisection, clustering, and partitioning of graphs, applied in the context of networks, sparse matrix factorizations, iterative solvers, fast multipole methods, automatic differentiation, high-performance computing, and linear programming. The workshop was held at the premises of the LIP laboratory of ENS Lyon and was generously supported by the LABEX MILYON (ANR-10-LABX-0070, Université de Lyon, within the program "Investissements d'Avenir" ANR-11-IDEX-0007 operated by the French National Research Agency), and by SIAM.

    Energy Efficient and Cooperative Solutions for Next-Generation Wireless Networks

    Energy efficiency is increasingly important for next-generation wireless systems due to the limited battery resources of mobile clients. While fourth-generation cellular standards emphasize low client battery consumption, existing techniques do not explicitly focus on reducing the power consumed when a client is actively communicating with the network. Given the high data rate demands of modern multimedia applications, active-mode power consumption is expected to become a critical consideration for the development and deployment of future wireless technologies. Another reason to focus more attention on energy efficiency is the relatively slow progress in battery technology set against the growing quality of service requirements of multimedia applications. The disproportion between demanded and available battery capacity is becoming especially significant for small-scale mobile client devices, where wireless power consumption dominates the total device power budget. To compensate for this growing gap, aggressive improvements in all aspects of wireless system design are necessary. Recent work in this area indicates that joint link adaptation and resource allocation techniques that optimize energy efficiency metrics can provide a considerable reduction in client power consumption. Consequently, it is crucial to adapt state-of-the-art energy-efficient approaches for practical use, to illustrate the pros and cons of applying power-bandwidth optimization to improve client energy efficiency, and to develop insights for future research in this area. This constitutes the first objective of the present research. Together with energy efficiency, next-generation cellular technologies are emphasizing stronger support for heterogeneous multimedia applications.
Since the integration of diverse services within a single radio platform is expected to result in higher operator profits and, at the same time, reduce network management expenses, intensive research efforts have been invested into design principles of such networks. However, as wireless resources are limited and shared by clients, service integration may become challenging. A key element in such systems is the packet scheduler, which typically helps ensure that the individual quality of service requirements of wireless clients are satisfied. In distributed wireless environments, by contrast, random multiple access protocols are beginning to provide mechanisms for statistical quality of service assurance. However, there is currently a lack of comprehensive analytical frameworks which allow reliable control of the quality of service parameters for both cellular and local area networks. Providing such frameworks is therefore the second objective of this thesis. Additionally, the study addresses the simultaneous operation of a cellular and a local area network in spectrally intense metropolitan deployments and solves some related problems. To further improve the performance of battery-driven mobile clients, cooperative communications are regarded as a promising and practical concept. In particular, they are capable of mitigating the negative effects of fading in a wireless channel and are thus expected to enhance next-generation cellular networks in terms of client spectral and energy efficiencies. At the cell edges or in areas missing any supportive relaying infrastructure, client-based cooperative techniques are becoming even more important. As such, a mobile client with poor channel quality may take advantage of neighboring clients which would relay data on its behalf. The key idea behind the concept of client relay is to provide flexible and distributed control over cooperative communications by the wireless clients themselves.
In contrast to fully centralized control, this is expected to minimize overhead protocol signaling and hence ensure simpler implementation. Compared to infrastructure relay, client relay will also be cheaper to deploy. Developing the novel concept of client relay, proposing simple and feasible cooperation protocols, and analyzing the basic trade-offs behind client relay functionality form the third objective of this research. Envisioning the evolution of cellular technologies beyond their fourth generation, it appears important to study a wireless network capable of supporting machine-to-machine applications. Recent standardization documents cover a plethora of machine-to-machine use cases and also outline the respective technical requirements and features according to the application or network environment. As follows from this activity, a smart grid is one of the primary machine-to-machine use cases; it involves meters autonomously reporting usage and alarm information to the grid infrastructure to help reduce operational cost, as well as regulate a customer's utility usage. The preliminary analysis of the reference smart grid scenario indicates weak system architecture components. For instance, the large population of machine-to-machine devices may connect nearly simultaneously to the wireless infrastructure and, consequently, suffer from excessive network entry delays. Another concern is the performance of cell-edge machine-to-machine devices with weak wireless links. Therefore, mitigating the above architecture vulnerabilities and improving the performance of future smart grid deployments is the fourth objective of this thesis. In summary, this thesis aims to improve the energy efficiency of mobile devices in next-generation wireless networks. The related research also embraces a novel cooperation technique where clients may assist each other to increase per-client and network-wide performance.
With the proposed solutions, the operating time of mobile clients between recharges may be increased dramatically. Our approach incorporates both analytical and simulation components to evaluate complex interactions between the studied objectives. It yields important conclusions about energy-efficient and cooperative client behaviors, which are crucial for the further development of wireless communications technologies.