Search CORE

161 research outputs found

Energy Measurements of High Performance Computing Systems: From Instrumentation to Analysis

Author: Ilsche Thomas
Publication venue
Publication date: 31/07/2020
Field of study

Energy efficiency is a major criterion for computing in general and High Performance Computing in particular. When optimizing for energy efficiency, it is essential to measure the underlying metric: energy consumption. To fully leverage energy measurements, their quality needs to be well-understood. To that end, this thesis provides a rigorous evaluation of various energy measurement techniques. I demonstrate how the deliberate selection of instrumentation points, sensors, and analog processing schemes can enhance the temporal and spatial resolution while preserving a well-known accuracy. Further, I evaluate a scalable energy measurement solution for production HPC systems and address its shortcomings. Such high-resolution and large-scale measurements present challenges regarding the management of large volumes of generated metric data. I address these challenges with a scalable infrastructure for collecting, storing, and analyzing metric data. With this infrastructure, I also introduce a novel persistent storage scheme for metric time series data, which allows efficient queries for aggregate timelines. To ensure that it satisfies the demanding requirements for scalable power measurements, I conduct an extensive performance evaluation and describe a productive deployment of the infrastructure. Finally, I describe different approaches and practical examples of analyses based on energy measurement data. In particular, I focus on the combination of energy measurements and application performance traces. However, interweaving fine-grained power recordings and application events requires accurately synchronized timestamps on both sides. To overcome this obstacle, I develop a resilient and automated technique for time synchronization, which utilizes crosscorrelation of a specifically influenced power measurement signal. Ultimately, this careful combination of sophisticated energy measurements and application performance traces yields a detailed insight into application and system energy efficiency at full-scale HPC systems and down to millisecond-range regions.:1 Introduction 2 Background and Related Work 2.1 Basic Concepts of Energy Measurements 2.1.1 Basics of Metrology 2.1.2 Measuring Voltage, Current, and Power 2.1.3 Measurement Signal Conditioning and Analog-to-Digital Conversion 2.2 Power Measurements for Computing Systems 2.2.1 Measuring Compute Nodes using External Power Meters 2.2.2 Custom Solutions for Measuring Compute Node Power 2.2.3 Measurement Solutions of System Integrators 2.2.4 CPU Energy Counters 2.2.5 Using Models to Determine Energy Consumption 2.3 Processing of Power Measurement Data 2.3.1 Time Series Databases 2.3.2 Data Center Monitoring Systems 2.4 Influences on the Energy Consumption of Computing Systems 2.4.1 Processor Power Consumption Breakdown 2.4.2 Energy-Efficient Hardware Configuration 2.5 HPC Performance and Energy Analysis 2.5.1 Performance Analysis Techniques 2.5.2 HPC Performance Analysis Tools 2.5.3 Combining Application and Power Measurements 2.6 Conclusion 3 Evaluating and Improving Energy Measurements 3.1 Description of the Systems Under Test 3.2 Instrumentation Points and Measurement Sensors 3.2.1 Analog Measurement at Voltage Regulators 3.2.2 Instrumentation with Hall Effect Transducers 3.2.3 Modular Instrumentation of DC Consumers 3.2.4 Optimal Wiring for Shunt-Based Measurements 3.2.5 Node-Level Instrumentation for HPC Systems 3.3 Analog Signal Conditioning and Analog-to-Digital Conversion 3.3.1 Signal Amplification 3.3.2 Analog Filtering and Analog-To-Digital Conversion 3.3.3 Integrated Solutions for High-Resolution Measurement 3.4 Accuracy Evaluation and Calibration 3.4.1 Synthetic Workloads for Evaluating Power Measurements 3.4.2 Improving and Evaluating the Accuracy of a Single-Node Measuring System 3.4.3 Absolute Accuracy Evaluation of a Many-Node Measuring System 3.5 Evaluating Temporal Granularity and Energy Correctness 3.5.1 Measurement Signal Bandwidth at Different Instrumentation Points 3.5.2 Retaining Energy Correctness During Digital Processing 3.6 Evaluating CPU Energy Counters 3.6.1 Energy Readouts with RAPL 3.6.2 Methodology 3.6.3 RAPL on Intel Sandy Bridge-EP 3.6.4 RAPL on Intel Haswell-EP and Skylake-SP 3.7 Conclusion 4 A Scalable Infrastructure for Processing Power Measurement Data 4.1 Requirements for Power Measurement Data Processing 4.2 Concepts and Implementation of Measurement Data Management 4.2.1 Message-Based Communication between Agents 4.2.2 Protocols 4.2.3 Application Programming Interfaces 4.2.4 Efficient Metric Time Series Storage and Retrieval 4.2.5 Hierarchical Timeline Aggregation 4.3 Performance Evaluation 4.3.1 Benchmark Hardware Specifications 4.3.2 Throughput in Symmetric Configuration with Replication 4.3.3 Throughput with Many Data Sources and Single Consumers 4.3.4 Temporary Storage in Message Queues 4.3.5 Persistent Metric Time Series Request Performance 4.3.6 Performance Comparison with Contemporary Time Series Storage Solutions 4.3.7 Practical Usage of MetricQ 4.4 Conclusion 5 Energy Efficiency Analysis 5.1 General Energy Efficiency Analysis Scenarios 5.1.1 Live Visualization of Power Measurements 5.1.2 Visualization of Long-Term Measurements 5.1.3 Integration in Application Performance Traces 5.1.4 Graphical Analysis of Application Power Traces 5.2 Correlating Power Measurements with Application Events 5.2.1 Challenges for Time Synchronization of Power Measurements 5.2.2 Reliable Automatic Time Synchronization with Correlation Sequences 5.2.3 Creating a Correlation Signal on a Power Measurement Channel 5.2.4 Processing the Correlation Signal and Measured Power Values 5.2.5 Common Oversampling of the Correlation Signals at Different Rates 5.2.6 Evaluation of Correlation and Time Synchronization 5.3 Use Cases for Application Power Traces 5.3.1 Analyzing Complex Power Anomalies 5.3.2 Quantifying C-State Transitions 5.3.3 Measuring the Dynamic Power Consumption of HPC Applications 5.4 Conclusion 6 Summary and Outloo

Technische Universität Dresden: Qucosa

Goddard Conference on Mass Storage Systems and Technologies, volume 2

Author: Hariharan P. C.
Kobler Ben
Publication venue
Publication date
Field of study

Papers and viewgraphs from the conference are presented. Discussion topics include the IEEE Mass Storage System Reference Model, data archiving standards, high-performance storage devices, magnetic and magneto-optic storage systems, magnetic and optical recording technologies, high-performance helical scan recording systems, and low end helical scan tape drives. Additional discussion topics addressed the evolution of the identifiable unit for processing (file, granule, data set, or some similar object) as data ingestion rates increase dramatically, and the present state of the art in mass storage technology

NASA Technical Reports Server

Discrete-time queueing models: generalized service mechanisms and correlation effects

Author: Feyaerts Bart
Publication venue: Ghent University. Faculty of Engineering and Architecture
Publication date: 01/01/2014
Field of study

Ghent University Academic Bibliography

Sixth Goddard Conference on Mass Storage Systems and Technologies Held in Cooperation with the Fifteenth IEEE Symposium on Mass Storage Systems

Author: Hariharan P. C.
Kobler Benjamin
Publication venue
Publication date
Field of study

This document contains copies of those technical papers received in time for publication prior to the Sixth Goddard Conference on Mass Storage Systems and Technologies which is being held in cooperation with the Fifteenth IEEE Symposium on Mass Storage Systems at the University of Maryland-University College Inn and Conference Center March 23-26, 1998. As one of an ongoing series, this Conference continues to provide a forum for discussion of issues relevant to the management of large volumes of data. The Conference encourages all interested organizations to discuss long term mass storage requirements and experiences in fielding solutions. Emphasis is on current and future practical solutions addressing issues in data management, storage systems and media, data acquisition, long term retention of data, and data distribution. This year's discussion topics include architecture, tape optimization, new technology, performance, standards, site reports, vendor solutions. Tutorials will be available on shared file systems, file system backups, data mining, and the dynamics of obsolescence

NASA Technical Reports Server

Modelling and performability evaluation of Wireless Sensor Networks

Author: Adero F.
Adero F.
Publication venue
Publication date: 01/01/2016
Field of study

This thesis presents generic analytical models of homogeneous clustered Wireless Sensor Networks (WSNs) with a centrally located Cluster Head (CH) coordinating cluster communication with the sink directly or through other intermediate nodes. The focus is to integrate performance and availability studies of WSNs in the presence of sensor nodes and channel failures and repair/replacement. The main purpose is to enhance improvement of WSN Quality of Service (QoS). Other research works also considered in this thesis include modelling of packet arrival distribution at the CH and intermediate nodes, and modelling of energy consumption at the sensor nodes. An investigation and critical analysis of wireless sensor network architectures, energy conservation techniques and QoS requirements are performed in order to improve performance and availability of the network. Existing techniques used for performance evaluation of single and multi-server systems with several operative states are investigated and analysed in details. To begin with, existing approaches for independent (pure) performance modelling are critically analysed with highlights on merits and drawbacks. Similarly, pure availability modelling approaches are also analysed. Considering that pure performance models tend to be too optimistic and pure availability models are too conservative, performability, which is the integration of performance and availability studies is used for the evaluation of the WSN models developed in this study. Two-dimensional Markov state space representations of the systems are used for performability modelling. Following critical analysis of the existing solution techniques, spectral expansion method and system of simultaneous linear equations are developed and used to solving the proposed models. To validate the results obtained with the two techniques, a discrete event simulation tool is explored. In this research, open queuing networks are used to model the behaviour of the CH when subjected to streams of traffic from cluster nodes in addition to dynamics of operating in the various states. The research begins with a model of a CH with an infinite queue capacity subject to failures and repair/replacement. The model is developed progressively to consider bounded queue capacity systems, channel failures and sleep scheduling mechanisms for performability evaluation of WSNs. Using the developed models, various performance measures of the considered system including mean queue length, throughput, response time and blocking probability are evaluated. Finally, energy models considering mean power consumption in each of the possible operative states is developed. The resulting models are in turn employed for the evaluation of energy saving for the proposed case study model. Numerical solutions and discussions are presented for all the queuing models developed. Simulation is also performed in order to validate the accuracy of the results obtained. In order to address issues of performance and availability of WSNs, current research present independent performance and availability studies. The concerns resulting from such studies have therefore remained unresolved over the years hence persistence poor system performance. The novelty of this research is a proposed integrated performance and availability modelling approach for WSNs meant to address challenges of independent studies. In addition, a novel methodology for modelling and evaluation of power consumption is also offered. Proposed model results provide remarkable improvement on system performance and availability in addition to providing tools for further optimisation studies. A significant power saving is also observed from the proposed model results. In order to improve QoS for WSN, it is possible to improve the proposed models by incorporating priority queuing in a mixed traffic environment. A model of multi-server system is also appropriate for addressing traffic routing. It is also possible to extend the proposed energy model to consider other sleep scheduling mechanisms other than On-demand proposed herein. Analysis and classification of possible arrival distribution of WSN packets for various application environments would be a great idea for enabling robust scientific research

Middlesex University Research Repository

Telecommunications Networks

Author
Publication venue: 'IntechOpen'
Publication date: 20/04/2021
Field of study

This book guides readers through the basics of rapidly emerging networks to more advanced concepts and future expectations of Telecommunications Networks. It identifies and examines the most pressing research issues in Telecommunications and it contains chapters written by leading researchers, academics and industry professionals. Telecommunications Networks - Current Status and Future Trends covers surveys of recent publications that investigate key areas of interest such as: IMS, eTOM, 3G/4G, optimization problems, modeling, simulation, quality of service, etc. This book, that is suitable for both PhD and master students, is organized into six sections: New Generation Networks, Quality of Services, Sensor Networks, Telecommunications, Traffic Engineering and Routing

Directory of Open Access Books (DOAB)

A PC-based data acquisition system for sub-atomic physics measurements

Author: Chabot Daron
Publication venue: 'University of Saskatchewan Library'
Publication date: 01/01/2008
Field of study

Modern particle physics measurements are heavily dependent upon automated data acquisition systems (DAQ) to collect and process experiment-generated information. One research group from the University of Saskatchewan utilizes a DAQ known as the Lucid data acquisition and analysis system. This thesis examines the project undertaken to upgrade the hardware and software components of Lucid. To establish the effectiveness of the system upgrades, several performance metrics were obtained including the system's dead time and input/output bandwidth.Hardware upgrades to Lucid consisted of replacing its aging digitization equipment with modern, faster-converting Versa-Module Eurobus (VME) technology and replacing the instrumentation processing platform with common, PC hardware. The new processor platform is coupled to the instrumentation modules via a fiber-optic bridging-device, the sis1100/3100 from Struck Innovative Systems.The software systems of Lucid were also modified to follow suit with the new hardware. Originally constructed to utilize a proprietary real-time operating system, the data acquisition application was ported to run under the freely available Real-Time Executive for Multiprocessor Systems (RTEMS). The device driver software provided with sis1100/3100 interface also had to be ported for use under the RTEMS-based system. Performance measurements of the upgraded DAQ indicate that the dead time has been reduced from being on the order of milliseconds to being on the order of several tens of microseconds. This increased capability means that Lucid's users may acquire significantly more data in a shorter period of time, thereby decreasing both the statistical uncertainties and data collection duration associated with a given experiment

eCommons@USASK

University of Saskatchewan Research Archive

Recommended from our members

Construction of a support tool for the design of the activity structures based computer system architectures

Author: Mohamad Sabah Mohamad Amin
Publication venue: Brunel University, School of Information Systems, Computing and Mathematics
Publication date: 01/01/1986
Field of study

This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University.This thesis is a reapproachment of diverse design concepts, brought to bear upon the computer system engineering problem of identification and control of highly constrained multiprocessing (HCM) computer machines. It contributes to the area of meta/general systems methodology, and brings a new insight into the design formalisms, and results afforded by bringing together various design concepts that can be used for the construction of highly constrained computer system architectures. A unique point of view is taken by assuming the process of identification and control of HCM computer systems to be the process generated by the Activity Structures Methodology (ASM). The research in ASM has emerged from the Neuroscience research, aiming at providing the techniques for combining the diverse knowledge sources that capture the 'deep knowledge' of this application field in an effective formal and computer representable form. To apply the ASM design guidelines in the realm of the distributed computer system design, we provide new design definitions for the identification and control of such machines in terms of realisations. These realisation definitions characterise the various classes of the identification and control problem. The classes covered consist of: 1. the identification of the designer activities, 2. the identification and control of the machine's distributed structures of behaviour, 3. the identification and control of the conversational environment activities (i.e. the randomised/ adaptive activities and interactions of both the user and the machine environments), 4. the identification and control of the substrata needed for the realisation of the machine, and 5. the identification of the admissible design data, both user-oriented and machineoriented, that can force the conversational environment to act in a self-regulating manner. All extent results are considered in this context, allowing the development of both necessary conditions for machine identification in terms of their distributed behaviours as well as the substrata structures of the unknown machine and sufficient conditions in terms of experiments on the unknown machine to achieve the self-regulation behaviour. We provide a detailed description of the design and implementation of the support software tool which can be used for aiding the process of constructing effective, HCM computer systems, based on various classes of identification and control. The design data of a highly constrained system, the NUKE, are used to verify the tool logic as well as the various identification and control procedures. Possible extensions as well as future work implied by the results are considered.Government of Ira

Brunel University Research Archive

Recommended from our members

Algorithms and Experimentation for Future Wireless Networks: From Internet-of-Things to Full-Duplex

Author: Chen Tingjun
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2020
Field of study

Future and next-generation wireless networks are driven by the rapidly growing wireless traffic stemming from diverse services and applications, such as the Internet-of-Things (IoT), virtual reality, autonomous vehicles, and smart intersections. Many of these applications require massive connectivity between IoT devices as well as wireless access links with ultra-high bandwidth (Gbps or above) and ultra-low latency (10ms or less). Therefore, realizing the vision of future wireless networks requires significant research efforts across all layers of the network stack. In this thesis, we use a cross-layer approach and focus on several critical components of future wireless networks including IoT systems and full-duplex (FD) wireless, and on experimentation with advanced wireless technologies in the NSF PAWR COSMOS testbed. First, we study tracking and monitoring applications in the IoT and focus on ultra-low-power energy harvesting networks. Based on realistic hardware characteristics, we design and optimize Panda, a centralized probabilistic protocol for maximizing the neighbor discovery rate between energy harvesting nodes under a power budget. Via testbed evaluation using commercial off-the-shelf energy harvesting nodes, we show that Panda outperforms existing protocols by up to 3x in terms of the neighbor discovery rate. We further explore this problem and consider a general throughput maximization problem among a set of heterogeneous energy-constrained ultra-low-power nodes. We analytically identify the theoretical fundamental limits of the rate at which data can be exchanged between these nodes, and design the distributed probabilistic protocol, EconCast, which approaches the maximum throughput in the limiting sense. Performance evaluations of EconCast using both simulations and real-world experiments show that it achieves up to an order of magnitude higher throughput than Panda and other known protocols. We then study FD wireless - simultaneous transmission and reception at the same frequency - a key technology that can significantly improve the data rate and reduce communication latency by employing self-interference cancellation (SIC). In particular, we focus on enabling FD on small-form-factor devices leveraging the technique of frequency-domain equalization (FDE). We design, model, and optimize the FDE-based RF canceller, which can achieve >50dB RF SIC across 20MHz bandwidth, and experimentally show that our prototyped FD radios can achieve a link-level throughput gain of 1.85-1.91x. We also focus on combining FD with phased arrays, employing optimized transmit and receive beamforming, where the spatial degrees of freedom in multi-antenna systems are repurposed to achieve wideband RF SIC. Moving up in the network stack, we study heterogeneous networks with half-duplex and FD users, and develop the novel Hybrid-Greedy Maximum Scheduling (H-GMS) algorithm, which achieves throughput optimality in a distributed manner. Analytical and simulation results show that H-GMS achieves 5-10x better delay performance and improved fairness compared with state-of-the-art approaches. Finally, we described experimentation and measurements in the city-scale COSMOS testbed being deployed in West Harlem, New York City. COSMOS' key building blocks include software-defined radios, millimeter-wave radios, a programmable optical network, and edge cloud, and their convergence will enable researchers to remotely explore emerging technologies in a real world environment. We provide a brief overview of the testbed and focus on experimentation with advanced technologies, including the integrating of open-access FD radios in the testbed and a pilot study on converged optical-wireless x-haul networking for cloud radio access networks (C-RANs). We also present an extensive 28GHz channel measurements in the testbed area, which is a representative dense urban canyon environment, and study the corresponding signal-to-noise ratio (SNR) coverage and achievable data rates. The results of this part helped drive and validate the design of the COSMOS testbed, and can inform further deployment and experimentation in the testbed. In this thesis, we make several theoretical and experimental contributions to ultra-low-power energy harvesting networks and the IoT, and FD wireless. We also contribute to the experimentation and measurements in the COSMOS advanced wireless testbed. We believe that these contributions are essential to connect fundamental theory to practical systems, and ultimately to real-world applications, in future wireless networks

Columbia University Academic Commons