Optimizing Data Collection in Deep Reinforcement Learning
Reinforcement learning (RL) workloads take a notoriously long time to train
due to the large number of samples collected at run-time from simulators.
Unfortunately, cluster scale-up approaches remain expensive, and commonly used
CPU implementations of simulators induce high overhead when switching back and
forth between GPU computations. We explore two optimizations that increase RL
data collection efficiency by increasing GPU utilization: (1) GPU
vectorization: parallelizing simulation on the GPU for increased hardware
parallelism, and (2) simulator kernel fusion: fusing multiple simulation steps
to run in a single GPU kernel launch to reduce global memory bandwidth
requirements. We find that GPU vectorization can achieve a speedup
over commonly used CPU simulators. We profile the performance of
different implementations and show that for a simple simulator, ML compiler
implementations (XLA) of GPU vectorization outperform a DNN framework (PyTorch)
by reducing CPU overhead from repeated Python to DL backend API
calls. We show that simulator kernel fusion yields speedups with a simple
simulator, and that these speedups increase as simulator complexity
increases in terms of memory bandwidth requirements. We show that the speedups
from simulator kernel fusion are orthogonal and combinable with GPU
vectorization, leading to a multiplicative speedup.
Comment: MLBench 2022 (https://memani1.github.io/mlbench22/) camera-ready submission
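The two optimizations named in the abstract can be illustrated with a minimal JAX sketch. This is not the authors' code: the point-mass simulator, the `DT` constant, and the function names are hypothetical. `jax.vmap` stands in for GPU vectorization across environments, and `jax.lax.scan` inside one `jax.jit` call stands in for running several simulation steps per dispatch. Whether XLA lowers the scanned loop to a single GPU kernel depends on the compiler and the simulator, so treat the second part as an approximation of kernel fusion rather than a guarantee.

```python
import jax
import jax.numpy as jnp

# Hypothetical toy simulator: a point mass with state (position, velocity).
DT = 0.1  # illustrative time step, not taken from the paper


def step(state, action):
    """Advance a single environment by one time step."""
    pos, vel = state
    vel = vel + DT * action
    pos = pos + DT * vel
    return (pos, vel)


# (1) Vectorization: map the single-environment step over a batch of
# environments, so one call advances every environment at once.
batched_step = jax.jit(jax.vmap(step))


# (2) Multi-step execution: scan over a sequence of actions inside one
# jitted function, so Python dispatches one compiled program per rollout
# instead of one per simulation step.
def rollout(state, actions):
    """Apply actions of shape (num_steps, num_envs) and return the final state."""

    def body(carry, action):
        return jax.vmap(step)(carry, action), None

    final_state, _ = jax.lax.scan(body, state, actions)
    return final_state


fused_rollout = jax.jit(rollout)
```

Calling `batched_step` in a Python loop and calling `fused_rollout` once produce the same final state; the difference lies in how many times control returns to Python between steps.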
Collaboration and Multimedia Authoring on Mobile Devices
This paper introduces adaptation-aware editing and progressive update propagation, two novel mechanisms that enable authoring multimedia content and collaborative work on mobile devices. Adaptation-aware editing enables editing content that was adapted to reduce download time to the mobile device. Progressive update propagation reduces the time for propagating content generated at the mobile device by transmitting either a fraction of the modifications or transcoded versions thereof. With adaptation-aware editing and progressive update propagation, an object present at a mobile device is characterized not only by a particular version, as in conventional replication, but also by a particular fidelity. We demonstrate that replication models can be extended to account for fidelity independently of the mechanisms used for concurrency control and consistency maintenance. As a result, the two techniques described in this paper can easily be added to any replication protocol, whether optimistic or pessimistic. We report on our experience implementing adaptation-aware editing and progressive update propagation. Experiments with two multimedia applications, an email reader and a presentation software package, show that both mechanisms can be added with modest programming effort and achieve substantial reductions in upload and download latencies.
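The idea of characterizing a replica by both version and fidelity can be sketched as a small data structure. The sketch below is an assumption-laden illustration, not the paper's design: the `Replica` class, the encoding of fidelity as a single number between 0.0 and 1.0, and both helper functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """Hypothetical replica state: a version number plus a fidelity level.

    Fidelity is modeled as a float in [0.0, 1.0]; the abstract does not
    say how fidelity is represented, so this encoding is an assumption.
    """

    version: int
    fidelity: float


def needs_update(local: Replica, remote: Replica) -> bool:
    """Return True if the remote copy is newer, or equally new but higher fidelity."""
    if remote.version != local.version:
        return remote.version > local.version
    return remote.fidelity > local.fidelity


def propagate_progressively(replica: Replica, fidelity_levels):
    """Yield successive replica states as higher-fidelity data arrives.

    Levels at or below the current fidelity are skipped, since they
    would not improve the copy already held.
    """
    for level in sorted(fidelity_levels):
        if level > replica.fidelity:
            replica = Replica(replica.version, level)
            yield replica
```

Keeping the version comparison separate from the fidelity comparison mirrors the abstract's claim that fidelity can be tracked independently of whatever mechanism handles concurrency control.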
Collaboration and Document Editing on Bandwidth-Limited Devices
This paper presents the design of CoFi, a novel architecture for supporting document editing and collaborative work over bandwidth-limited clients. CoFi combines the previously disjoint notions of consistency and fidelity in a unified architecture. CoFi enables bandwidth-limited clients to edit documents that are only partially present at the client (because parts of the documents were lossily transcoded, or only a portion of the document was fetched), and to propagate modifications incrementally by progressively increasing their fidelity.
Reducing the Energy Usage of Office Applications
In this paper, we demonstrate how component-based middleware can reduce the energy usage of closed-source applications. We first describe how the Puppeteer system exploits well-defined interfaces exported by applications to modify their behavior. We then present a detailed study of the energy usage of Microsoft's PowerPoint application and show that adaptive policies can reduce energy expenditure by 49% in some instances. In addition, we use the results of the study to provide general advice to developers of applications and middleware that will enable them to create more energy-efficient software.
Revisiting the arguments for edge computing research
The first author is supported by a Royal Society Short Industry Fellowship. This article argues that low latency, high bandwidth, device proliferation, sustainable digital infrastructure, and data privacy and sovereignty continue to motivate the need for edge computing research even though its initial concepts were formulated more than a decade ago. Postprint. Peer reviewed.
Conducting visitor studies using smartphone-based location sensing
Visitor studies explore human experiences within museums, cultural heritage sites, and other informal learning settings to inform decisions. Smartphones offer novel opportunities for extending the depth and breadth of visitor studies while considerably reducing their cost and their demands on specialist human resources. By enabling the collection of significantly higher volumes of data, they also make possible the application of advanced machine-learning and visualization techniques, potentially leading to the discovery of new patterns and behaviors that cannot be captured by simple descriptive statistics. In this article, we present a principled approach to the use of smartphones for visitor studies, in particular proposing a structured methodology and associated methods that enable its effective use in this context. We discuss specific methodological considerations that have to be addressed for effective data collection, preprocessing, and analysis and identify the limitations in the applicability of these tools using family visits to the London Zoo as a case study. We conclude with a discussion of the wider opportunities afforded by the introduction of smartphones and related technologies and outline the steps toward establishing them as a standard tool for visitor studies.