Balancing Interactive Performance and Budgeted Resources in Mobile Computing.
In this dissertation, we explore the various limited resources involved in mobile applications --- battery energy, cellular data usage, and, critically, user attention --- and we devise principled methods for managing the tradeoffs involved in creating a good user experience. Building quality mobile applications requires developers to understand complex interactions between network usage, performance, and resource consumption. Because of this
difficulty, developers commonly choose simple but suboptimal approaches that strictly prioritize performance or resource conservation.
These extremes are symptoms of a lack of system-provided abstractions for managing the complexity inherent in performance/resource tradeoffs. By providing abstractions that help applications manage these tradeoffs, mobile systems can significantly improve user-visible performance without exhausting resource budgets. This dissertation explores three such abstractions in detail. We first present Intentional
Networking, a system that provides synchronization primitives and intelligent scheduling for multi-network traffic. Next, we present Informed Mobile Prefetching, a system that helps applications decide when to prefetch data and how aggressively to spend limited battery energy and cellular data resources toward that end. Finally, we present Meatballs, a library that helps applications consider the cloudy nature of predictions when making decisions, selectively employing redundancy to mitigate uncertainty and provide more
reliable performance. Overall, experiments show that these abstractions can significantly reduce interactive delay without overspending the available energy and data resources.
PHD, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/108956/1/brettdh_1.pd
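The prefetching tradeoff this abstract describes can be illustrated as a simple expected-value test. This is only a sketch: the function, weights, and formula below are assumptions for illustration, not the actual decision engine of Informed Mobile Prefetching.

```python
# Illustrative cost/benefit prefetch decision (hypothetical, not IMP's algorithm).
# A prefetch is worthwhile when the expected latency saved outweighs the
# weighted cost in battery energy and cellular data.

def should_prefetch(access_prob, latency_saved_s,
                    energy_cost_j, data_cost_bytes,
                    energy_weight, data_weight):
    """Return True when the expected benefit exceeds the weighted resource cost."""
    benefit = access_prob * latency_saved_s          # expected seconds saved
    cost = (energy_weight * energy_cost_j            # energy budget pressure
            + data_weight * data_cost_bytes)         # cellular data pressure
    return benefit > cost
```

Raising the weights as a budget nears exhaustion makes such a policy more conservative, which mirrors the budget-driven behavior the abstract describes.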
DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks
Data movement between the CPU and main memory is a first-order obstacle
against improving performance, scalability, and energy efficiency in modern
systems. Computer systems employ a range of techniques to reduce overheads tied
to data movement, spanning from traditional mechanisms (e.g., deep multi-level
cache hierarchies, aggressive hardware prefetchers) to emerging techniques such
as Near-Data Processing (NDP), where some computation is moved close to memory.
Our goal is to methodically identify potential sources of data movement over a
broad set of applications and to comprehensively compare traditional
compute-centric data movement mitigation techniques to more memory-centric
techniques, thereby developing a rigorous understanding of the best techniques
to mitigate each source of data movement.
With this goal in mind, we perform the first large-scale characterization of
a wide variety of applications, across a wide range of application domains, to
identify fundamental program properties that lead to data movement to/from main
memory. We develop the first systematic methodology to classify applications
based on the sources contributing to data movement bottlenecks. From our
large-scale characterization of 77K functions across 345 applications, we
select 144 functions to form the first open-source benchmark suite (DAMOV) for
main memory data movement studies. We select a diverse range of functions that
(1) represent different types of data movement bottlenecks, and (2) come from a
wide range of application domains. Using NDP as a case study, we identify new
insights about the different data movement bottlenecks and use these insights
to determine the most suitable data movement mitigation mechanism for a
particular application. We open-source DAMOV and the complete source code for
our new characterization methodology at https://github.com/CMU-SAFARI/DAMOV.
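A classification step of the kind the abstract describes can be sketched as below. The metric names, thresholds, and category labels are invented for illustration and are not DAMOV's actual taxonomy.

```python
# Toy classifier for memory-bottleneck analysis (illustrative only).
# mpki: last-level-cache misses per kilo-instruction;
# arithmetic_intensity: operations per byte moved from main memory.

def classify_function(mpki, arithmetic_intensity):
    """Bucket a function by its dominant data-movement behavior."""
    if mpki < 1.0:
        return "compute-bound"            # few misses: data movement is minor
    if arithmetic_intensity < 0.25:
        return "memory-bandwidth-bound"   # little reuse: NDP-style candidate
    return "cache-sensitive"             # reuse exists: deeper caches may help
```

In a real methodology these buckets would then suggest whether a compute-centric mechanism (prefetching, larger caches) or a memory-centric one (NDP) is the better fit for each function.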
On-chip mechanisms to reduce effective memory access latency
This dissertation develops hardware that automatically reduces the effective latency of accessing memory in both single-core and multi-core systems. To accomplish this, the dissertation shows that all last-level cache misses can be separated into two categories: dependent cache misses and independent cache misses. Independent cache misses have all of the source data required to generate the address of the memory access available on-chip, while dependent cache misses depend on data that is located off-chip. This dissertation proposes accelerating dependent cache misses by migrating the dependence chain that generates the address of the memory access to the memory controller for execution. Independent cache misses are accelerated using a new mode for runahead execution that executes only filtered dependence chains. With these mechanisms, this dissertation demonstrates a 62% increase in performance and a 19% decrease in effective memory access latency for a quad-core processor on a set of high-memory-intensity workloads.
Electrical and Computer Engineering
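The distinction between the two miss categories can be illustrated in software (a sketch of the access patterns, not the dissertation's hardware): an independent miss's address is computable without waiting for memory, while a dependent miss's address is itself the result of a prior load.

```python
# Independent vs. dependent cache-miss patterns (illustrative access patterns).

def independent_walk(array, stride):
    """Addresses i, i+stride, ... are known up front, so a runahead engine
    could issue all of them without waiting for any data to return."""
    total = 0
    for i in range(0, len(array), stride):
        total += array[i]
    return total

def dependent_walk(next_index, start):
    """Each access's address is the *value* of the previous access (pointer
    chasing), so the chain must be executed, e.g. near memory, to generate
    the next address."""
    visited = 0
    i = start
    while i != -1:
        i = next_index[i]
        visited += 1
    return visited
```

The first pattern benefits from runahead-style techniques; the second is the case the abstract accelerates by executing the address-generating chain at the memory controller.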
One Size Doesn't Fit All: Improving Network QoS Through Preference-driven Web Caching
In order to combat Internet congestion, Web caches use replacement policies that attempt to keep the objects in a cache that are most likely to be requested in the future. We adopt the economic perspective that the objects with the greatest value to the users should be in a cache. Using trace-driven simulations, we implement an incentive-compatible, market-based Web cache in which servers push content into the cache. This system decentralizes the caching process, as servers provide information in the form of bids for space in the cache. The mechanism elicits truthful information from servers about object valuations and predicted hit rates. This information is used in filling the cache, which can provide increased aggregate value and differential quality of service to servers when compared to LFU and LRU.
http://deepblue.lib.umich.edu/bitstream/2027.42/50429/1/berlin.pd
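The greedy core of a bid-driven cache fill can be sketched as follows. This is an illustrative simplification: it ranks objects by declared bid per byte and omits the incentive-compatibility machinery the abstract refers to.

```python
# Toy market-based cache admission (illustrative sketch, not the paper's mechanism).
# bids: list of (object_id, bid_value, size_bytes) tuples supplied by servers.

def fill_cache(bids, capacity_bytes):
    """Admit objects in order of declared value density until the cache is full."""
    chosen, used = [], 0
    # Highest bid-per-byte first: a small, highly valued object beats a
    # large, lightly valued one for scarce cache space.
    for obj_id, bid, size in sorted(bids, key=lambda b: b[1] / b[2], reverse=True):
        if used + size <= capacity_bytes:
            chosen.append(obj_id)
            used += size
    return chosen
```

Contrast with LFU/LRU, which rank purely by observed access behavior and cannot account for how much a server actually values a hit.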
Minimally Invasive Solutions to Challenges Posed by Mobility Changes
Today, things have changed radically. As network technologies have proliferated and evolved, the components of, and participants in, computerized systems have become increasingly decoupled. Users travel and commute while connecting to their office computer or home media server. Hardware devices may be carried by users, move on their own, or reside in data centers, never to be seen or touched by end-users. Even operating systems (OSes) and applications may now migrate across the network while executing, thanks to advances in virtualization that are only just beginning to remake the computing landscape.

The decoupling of users, devices, and software has invalidated properties that enabled desired functionality, resulting in compromised function. Power interfaces utilize physical user interactions to determine when transitions between high and lower power states should occur; what happens when users are no longer physically present? Operating system execution often relies on components such as the CPU and local disk responding with tightly bounded delays; what should be done when the OS itself is in the process of migrating between two separate physical machines?

The fundamental question explored by this dissertation is: can we find highly adoptable solutions to restore desired functionality that has been lost because of changed mobility characteristics? Our emphasis on adoptability stems from pragmatic concerns: if a solution is difficult to adopt, it is highly unlikely to be used. Consequently, while many potential approaches may involve changes to the network itself, our work focuses on modifying end-point behavior. We show that practical solutions implemented solely in software and deployed only on network endpoints can be developed for a wide range of problems. We consider concrete challenges arising from user, device, and software mobility changes, affecting sub-disciplines spanning cloud computing, green computing, and wireless networks.
Cloud Computing: Users increasingly utilize virtual machine (VM) technology to migrate and replicate OSes and software amongst networked hosts. Traditional execution required one VM image copy on each host's local storage. By transitioning to networked execution, dozens, if not hundreds, of VM replicas may now be distributed from a single networked storage location to a commensurately large set of physical machines. As these systems expand, they have come to be plagued by boot storms (and similar problems), caused when networked access to storage becomes a major bottleneck, drastically delaying VM distribution and execution. Can we develop techniques that resolve this network bottleneck without the need for expensive hardware over-provisioning?

Green Computing: Remote access technologies have enabled users to travel while still interacting with computational machinery left in the office or home. Yet energy savings mechanisms have traditionally relied on the activity of attached peripherals to determine power usage. The shift to remote interaction, which bypasses physically attached peripherals, has effectively broken these energy savings mechanisms. Can we build an economic and practical system that provides energy efficiency without compromising the fluid remote interactions users have now come to expect?

Wireless Computing: Increasingly advanced mobile devices have provoked a shift towards heavy use of 3G and 4G bandwidth. Accordingly, the capacity of infrastructure wireless networks has become increasingly strained. Can we find a way of supplementing this relatively low-latency infrastructure with high-latency, high-bandwidth opportunistic content exchange?

In each scenario, we design a solution that aims to strike the proper balance between adoptability and technical efficiency, producing what we believe are rigorous, practical, and adoptable solutions.
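The boot-storm bottleneck in the cloud-computing scenario comes down to simple bandwidth arithmetic, sketched below. The model is a back-of-the-envelope illustration, not the dissertation's actual technique.

```python
import math

# Back-of-the-envelope VM image distribution times (illustrative model only).

def centralized_time_s(image_bits, clients, link_bps):
    """Single storage server: every replica crosses the same uplink,
    so total bytes served scale linearly with the client count."""
    return image_bits * clients / link_bps

def peer_assisted_time_s(image_bits, clients, link_bps):
    """Idealized peer exchange: each finished client reseeds others,
    so the replica count can roughly double per round."""
    rounds = math.ceil(math.log2(clients + 1))
    return image_bits / link_bps * rounds
```

For a 1 GB image, 100 clients, and a 1 Gb/s storage link, the centralized model takes 800 s while the idealized peer-assisted model takes 56 s, which is why relieving the storage uplink, rather than over-provisioning it, is the interesting lever.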
Identifying, Quantifying, Extracting and Enhancing Implicit Parallelism
The shift of the microprocessor industry towards multicore architectures has
placed a huge burden on the programmers by requiring explicit parallelization
for performance. Implicit Parallelization is an alternative that could ease the
burden on programmers by parallelizing applications "under the covers" while
maintaining sequential semantics externally. This thesis develops a novel
approach for thinking about parallelism, by casting the problem of
parallelization in terms of instruction criticality. Using this approach,
parallelism in a program region is readily identified when certain conditions
about fetch-criticality are satisfied by the region. The thesis formalizes this
approach by developing a criticality-driven model of task-based
parallelization. The model can accurately predict the parallelism that would be
exposed by potential task choices by capturing a wide set of sources of
parallelism as well as costs to parallelization.
The criticality-driven model enables the development of two key components for
Implicit Parallelization: a task selection policy, and a bottleneck analysis
tool. The task selection policy can partition a single-threaded program into
tasks that will profitably execute concurrently on a multicore architecture in
spite of the costs associated with enforcing data-dependences and with
task-related actions. The bottleneck analysis tool gives feedback to the
programmers about data-dependences that limit parallelism. In particular, there
are several "accidental dependences" that can be easily removed with large
improvements in parallelism. These tools combine into a systematic methodology
for performance tuning in Implicit Parallelization. Finally, armed with the
criticality-driven model, the thesis revisits several architectural design
decisions, and finds several encouraging ways forward to increase the scope of
Implicit Parallelization.
Unpublished, not peer reviewed.
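The idea of casting parallelism in terms of criticality can be illustrated with a toy dependence-graph computation (a sketch, not the thesis's criticality-driven model): the parallelism exposable in a region is bounded by its total work divided by its critical path.

```python
# Toy critical-path analysis over a dependence DAG (illustrative only).
# costs: node -> execution cost; deps: node -> list of predecessor nodes.
# Nodes are assumed to be listed in topological order.

def critical_path(costs, deps):
    """Length of the longest dependence chain through the region."""
    finish = {}
    for node, cost in costs.items():
        start = max((finish[p] for p in deps.get(node, [])), default=0)
        finish[node] = start + cost
    return max(finish.values())

def available_parallelism(costs, deps):
    """Upper bound on speedup: total work over the critical path."""
    return sum(costs.values()) / critical_path(costs, deps)
```

A dependence that lengthens the critical path without adding work, like the "accidental dependences" the abstract mentions, directly lowers this bound, which is exactly what a bottleneck analysis tool would surface to the programmer.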
The Second ICASE/LaRC Industry Roundtable: Session Proceedings
The second ICASE/LaRC Industry Roundtable was held October 7-9, 1996 at the Williamsburg Hospitality House, Williamsburg, Virginia. Like the first roundtable in 1994, this meeting had two objectives: (1) to expose ICASE and LaRC scientists to industrial research agendas; and (2) to acquaint industry with the capabilities and technology available at ICASE, LaRC and academic partners of ICASE. Nineteen sessions were held in three parallel tracks. Of the 170 participants, over one third were affiliated with various industries. Proceedings from the different sessions are summarized in this report