1,550 research outputs found
Approximate Computing Survey, Part II: Application-Specific & Architectural Approximation Techniques and Applications
The challenging deployment of compute-intensive applications from domains
such Artificial Intelligence (AI) and Digital Signal Processing (DSP), forces
the community of computing systems to explore new design approaches.
Approximate Computing appears as an emerging solution, allowing to tune the
quality of results in the design of a system in order to improve the energy
efficiency and/or performance. This radical paradigm shift has attracted
interest from both academia and industry, resulting in significant research on
approximation techniques and methodologies at different design layers (from
system down to integrated circuits). Motivated by the wide appeal of
Approximate Computing over the last 10 years, we conduct a two-part survey to
cover key aspects (e.g., terminology and applications) and review the
state-of-the art approximation techniques from all layers of the traditional
computing stack. In Part II of our survey, we classify and present the
technical details of application-specific and architectural approximation
techniques, which both target the design of resource-efficient
processors/accelerators & systems. Moreover, we present a detailed analysis of
the application spectrum of Approximate Computing and discuss open challenges
and future directions.Comment: Under Review at ACM Computing Survey
One Size Does Not Fit All: Quantifying and Exposing the Accuracy-Latency Trade-off in Machine Learning Cloud Service APIs via Tolerance Tiers
Today's cloud service architectures follow a "one size fits all" deployment
strategy where the same service version instantiation is provided to the end
users. However, consumers are broad and different applications have different
accuracy and responsiveness requirements, which as we demonstrate renders the
"one size fits all" approach inefficient in practice. We use a production-grade
speech recognition engine, which serves several thousands of users, and an open
source computer vision based system, to explain our point. To overcome the
limitations of the "one size fits all" approach, we recommend Tolerance Tiers
where each MLaaS tier exposes an accuracy/responsiveness characteristic, and
consumers can programmatically select a tier. We evaluate our proposal on the
CPU-based automatic speech recognition (ASR) engine and cutting-edge neural
networks for image classification deployed on both CPUs and GPUs. The results
show that our proposed approach provides an MLaaS cloud service architecture
that can be tuned by the end API user or consumer to outperform the
conventional "one size fits all" approach.Comment: 2019 IEEE International Symposium on Performance Analysis of Systems
and Software (ISPASS
A Stochastic Model of Plausibility in Live-Virtual-Constructive Environments
Distributed live-virtual-constructive simulation promises a number of benefits for the test and evaluation community, including reduced costs, access to simulations of limited availability assets, the ability to conduct large-scale multi-service test events, and recapitalization of existing simulation investments. However, geographically distributed systems are subject to fundamental state consistency limitations that make assessing the data quality of live-virtual-constructive experiments difficult. This research presents a data quality model based on the notion of plausible interaction outcomes. This model explicitly accounts for the lack of absolute state consistency in distributed real-time systems and offers system designers a means of estimating data quality and fitness for purpose. Experiments with World of Warcraft player trace data validate the plausibility model and exceedance probability estimates. Additional experiments with synthetic data illustrate the model\u27s use in ensuring fitness for purpose of live-virtual-constructive simulations and estimating the quality of data obtained from live-virtual-constructive experiments
Bridging the Scalability Gap by Exploiting Error Tolerance for Emerging Applications
In recent years, there has been a surge in demand for intelligent applications. These emerging applications are powered by algorithms from domains such as computer vision, image processing, pattern recognition, and machine learning. Across these algorithms, there exist two key computational characteristics. First, the computational demands they place on computing infrastructure is large, with the potential to substantially outstrip existing compute resources. Second, they are necessarily resilient to errors due to their inputs and outputs being inherently noisy and imprecise.
Despite the staggering computational requirements and resilience of intelligent applications, current infrastructure uses conventional software and hardware methodologies. These systems needlessly consume resources for every bit of precision and arithmetic. To address this inefficiency and help bridge the performance gap caused by intelligent applications, this dissertation investigates exploiting error tolerance across the hardware-software stack. Specifically, we propose (1) statistical machinery to guarantee that accuracy is not compromised when removing work or precision, (2) a GPU optimization framework for work skipping and bottleneck mitigation, and (3) exploration of unconventional numerical representations to steer future hardware designs.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/144025/1/parkerhh_1.pd
- …