Search CORE

14 research outputs found

Towards an Adaptive OS Noise Mitigation Technique for Microbenchmarking on Apple Ipad Devices

Author: Hamilton John R.
Holdsworth Jason
Rehn Adam
Publication venue: AIS Electronic Library (AISeL)
Publication date: 08/12/2014
Field of study

This study investigates levels of Operating System (OS) noise on Apple iPad mobile devices. OS noise causes variations in application performance that interfere with microbenchmark results. OS noise manifests in collected data through extreme outliers and variations in skewness. Using our collected data, we develop an iterative, semi-automated outlier removal process for Apple iPad OS noise profiles. The profiles generated by outlier removal represent the first step toward an adaptive noise mitigation technique, which presents opportunities for use in microbenchmarking across other mobile platforms

AIS Electronic Library (AISeL)

The problems you're having may not be the problems you think you're having: results from a latency study of windows NT

Author: Jones Michael B.
Regehr John
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/1999
Field of study

ManuscriptThis paper is intended to catalyze discussions on two intertwined systems topics. First, it presents early results from a latency study of Windows NT that identifies some specific causes of long thread scheduling latencies, many of which delay the dispatching of runnable threads for tens of milliseconds. Reasons for these delays, including technical, methodological, and economic are presented and possible solutions are discussed. Secondly, and equally importantly, it is intended to serve as a cautionary tale against believing one's own intuition about the causes of poor system performance. We went into this study believing we understood a number of the causes of these delays, with our beliefs informed more by conventional wisdom and hunches than data. In nearly all cases the reasons we discovered via instrumentation and measurement surprised us. In fact, some directly contradicted "facts" we thought we "knew"

The University of Utah: J. Willard Marriott Digital Library

Profiling I/O interrupts in modern architectures

Author: Davis Al
Schaelicke Lambert
Publication venue: University of Utah
Publication date: 01/01/1999
Field of study

Journal ArticleAs applications grow increasingly communication-oriented, interrupt performance quickly becomes a crucial component of high performance I/O system design. At the same time, accurately measuring interrupt handler performance is difficult with the traditional simulation, instrumentation, or statistical sampling approaches. One o f the most important components o f interrupt performance is cache behavior. This paper presents a portable method for measuring the cache effects o f I/O interrupt handling using native hardware performance counters. To provide a portability stress test, the method is demonstrated on two commercial platforms with different architectures, the SGI Origin 200 and the Sun LJltra-1. This case study uses the methodology to measure the overhead of the two most common forms o f interrupt traffic: disk and network interrupts. The study demonstrates that the method works well and is reasonably robust. In addition, the results show that disk interrupts behave similar on both platforms, while differences in OS organization cause network interrupts to behave very differently. Furthermore, network interrupts exhibit significantly larger cache footprints.

The University of Utah: J. Willard Marriott Digital Library

Operating system profiling via latency analysis

Author: Avishay Traeger
Charles P. Wright
Erez Zadok
Nikolai Joukov
Rakesh Iyer
Publication venue
Publication date
Field of study

Operating systems are complex and their behavior depends on many factors. Source code, if available, does not directly help one to understand the OS’s behavior, as the behavior depends on actual workloads and external inputs. Runtime profiling is a key technique to prove new concepts, debug problems, and optimize performance. Unfortunately, existing profiling methods are lacking in important areas—they do not provide enough information about the OS’s behavior, they require OS modification and therefore are not portable, or they incur high overheads thus perturbing the profiled OS. We developed OSprof: a versatile, portable, and efficient OS profiling method based on latency distributions analysis. OSprof automatically selects important profiles for subsequent visual analysis. We have demonstrated that a suitable workload can be used to profile virtually any OS component. OSprof is portable because it can intercept operations and measure OS behavior from user-level or from inside the kernel without requiring source code. OSprof has typical CPU time overheads below 4%. In this paper we describe our techniques and demonstrate their usefulness through a series of profiles conducted on Linux, FreeBSD, and Windows, including client/server scenarios. We discovered and investigated a number of interesting interactions, including scheduler behavior, multi-modal I/O distributions, and a previously unknown lock contention, which we fixed.

CiteSeerX

Recommended from our members

A Self-Scaling and Self-Configuring Benchmark for Web Servers

Author: Courage Michael
Manley Stephen
Seltzer Margo I.
Publication venue
Publication date: 04/03/2016
Field of study

World Wide Web clients and servers have become some of the most important applications in our computing base today, and as such, we need realistic and meaningful ways of capturing their performance. Current server benchmarks do not capture the wide variation that we see in servers and are not accurate in their characterization of web traffic. In this paper, we present a self-configuring, scalable benchmark, that generates a server benchmark load based on actual server loads. In contrast to other web benchmarks, our benchmark characterizes request latency instead of focusing exclusively on throughput sensitive metrics. We present our new benchmark, hbench:Web, and demonstrate how it accurately captures the load observed by an actual server. We then go on to show how it can be used to assess how continued growth or changes in the workload will affect future performance. Using existing log histories, we show that these predictions are sufficiently realistic to provide insight into tomorrow’s web performance.Engineering and Applied Science

Harvard University - DASH

HAPPE: Human and Application-Driven Frequency Scaling for Processor Power Efficiency

Author: L Yang
Lei Yang
Member IEEE Gokhan Memik
Member IEEE Peter Dinda
Member IEEE Robert P Dick
Publication venue
Publication date: 01/01/2013
Field of study

Abstract-Conventional dynamic voltage and frequency scaling techniques use high CPU utilization as a predictor for user dissatisfaction, to which they react by increasing CPU frequency. In this paper, we demonstrate that for many interactive applications, perceived performance is highly dependent upon the particular user and application, and is not linearly related to CPU utilization. This observation reveals an opportunity for reducing power consumption. We propose Human and Application driven frequency scaling for Processor Power Efficiency (HAPPE), an adaptive user-and-application-aware dynamic CPU frequency scaling technique. HAPPE continuously adapts processor frequency and voltage to the learned performance requirement of the current user and application. Adaptation to user requirements is quick and requires minimal effort from the user (typically a handful of key strokes). Once the system has adapted to the user's performance requirements, the user is not required to provide continued feedback but is permitted to provide additional feedback to adjust the control policy to changes in preferences. HAPPE was implemented on a Linux-based laptop and evaluated in 22 hours of controlled user studies. Compared to the default Linux CPU frequency controller, HAPPE reduces the measured system-wide power consumption of CPU-intensive interactive applications by 25 percent on average while maintaining user satisfaction. Index Terms-Power, CPU frequency scaling, user-driven study, mobile systems Ç 1I NTRODUCTION P OWER efficiency has been a major technology driver for battery-powered mobile systems, such as mobile phones, personal digital assistants, MP3 players, and laptops. Power efficiency has also become a new focus for line-powered desktop systems and data centers because of its impact on power dissipation and chip temperature, which affect performance, reliability, and lifetime. Processor power consumption is often a substantial portion of system power consumption in mobile systems Traditional CPU power management approaches can lose sight of an important fact: The ultimate goal of any computer system is to satisfy its users, not to execute a particular number of instructions per second. Although CPU utilization is a good indication of processor performance, the actual perceivable system performance depends on individual users and applications, and user satisfaction is not linearly related to CPU utilization. We conducted a study on 10 users with four interactive applications and found that for some applications, some users are satisfied with system performance when the processor is at the lowest frequency, while other users may not be satisfied even when it operates at the highest frequency. We also found that users may be insensitive to varying processor frequency for one application, but may be very sensitive to such changes for another application. Traditional DVFS policies that consider only CPU utilization or other useroblivious performance metrics are often too pessimistic about user performance requirements, and use a high frequency to satisfy all users, resulting in wasted power. Similar findings were also reported in other studies In this paper, we propose Human and Application driven frequency scaling for Processor Power Efficiency (HAPPE), a CPU DVFS technique that adapts voltage and frequency to the performance requirement of the curren

CiteSeerX

Understanding PCIe performance for end host networking

Author: Dixon Colin
Kalia Anuj
Lee Ki Suh
Lim Hyeontaek
McVoy Larry
Moll L.
Peleg Omer
Pratt Ian
Radhakrishnan Sivasankar
Rizzo Luigi
Zazo Jose Fernando
Publication venue: SIGCOMM '18 Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication
Publication date: 25/08/2018
Field of study

In recent years, spurred on by the development and availability of programmable NICs, end hosts have increasingly become the enforcement point for core network functions such as load balancing, congestion control, and application specific network offloads. However, implementing custom designs on programmable NICs is not easy: many potential bottlenecks can impact performance. This paper focuses on the performance implication of PCIe, the de-facto I/O interconnect in contemporary servers, when interacting with the host architecture and device drivers. We present a theoretical model for PCIe and pcie-bench, an open-source suite, that allows developers to gain an accurate and deep understanding of the PCIe substrate. Using pcie-bench, we characterize the PCIe subsystem in modern servers. We highlight surprising differences in PCIe implementations, evaluate the undesirable impact of PCIe features such as IOMMUs, and show the practical limits for common network cards operating at 40Gb/s and beyond. Furthermore, through pcie-bench we gained insights which guided software and future hardware architectures for both commercial and research oriented network cards and DMA engines

Crossref

Apollo (Cambridge)

Understanding and Leveraging Virtualization Technology in Commodity Computing Systems

Author: Le Duy
Publication venue: W&M ScholarWorks
Publication date: 01/01/2012
Field of study

Commodity computing platforms are imperfect, requiring various enhancements for performance and security purposes. In the past decade, virtualization technology has emerged as a promising trend for commodity computing platforms, ushering many opportunities to optimize the allocation of hardware resources. However, many abstractions offered by virtualization not only make enhancements more challenging, but also complicate the proper understanding of virtualized systems. The current understanding and analysis of these abstractions are far from being satisfactory. This dissertation aims to tackle this problem from a holistic view, by systematically studying the system behaviors. The focus of our work lies in performance implication and security vulnerabilities of a virtualized system.;We start with the first abstraction---an intensive memory multiplexing for I/O of Virtual Machines (VMs)---and present a new technique, called Batmem, to effectively reduce the memory multiplexing overhead of VMs and emulated devices by optimizing the operations of the conventional emulated Memory Mapped I/O in hypervisors. Then we analyze another particular abstraction---a nested file system---and attempt to both quantify and understand the crucial aspects of performance in a variety of settings. Our investigation demonstrates that the choice of a file system at both the guest and hypervisor levels has significant impact upon I/O performance.;Finally, leveraging utilities to manage VM disk images, we present a new patch management framework, called Shadow Patching, to achieve effective software updates. This framework allows system administrators to still take the offline patching approach but retain most of the benefits of live patching by using commonly available virtualization techniques. to demonstrate the effectiveness of the approach, we conduct a series of experiments applying a wide variety of software patches. Our results show that our framework incurs only small overhead in running systems, but can significantly reduce maintenance window

College of William & Mary: W&M Publish