TFApprox: Towards a Fast Emulation of DNN Approximate Hardware Accelerators on GPU
Energy efficiency of hardware accelerators of deep neural networks (DNN) can
be improved by introducing approximate arithmetic circuits. To quantify the
error introduced by these circuits and to avoid expensive hardware
prototyping, a software emulator of the DNN accelerator is usually
executed on CPU or GPU. However, this emulation is typically two or three
orders of magnitude slower than a software DNN implementation running on CPU or
GPU and operating with standard floating point arithmetic instructions and
common DNN libraries. The reason is that there is no hardware support for
approximate arithmetic operations on common CPUs and GPUs and these operations
have to be expensively emulated. To address this issue, we propose an
efficient GPU-based emulation method for the approximate circuits used in a
given DNN accelerator. All relevant approximate circuits are
implemented as look-up tables and accessed through a texture memory mechanism
of CUDA capable GPUs. We exploit the fact that the texture memory is optimized
for irregular read-only access and in some GPU architectures is even
implemented as a dedicated cache. This technique allowed us to reduce the
inference time of the emulated DNN accelerator approximately 200 times with
respect to an optimized CPU version on complex DNNs such as ResNet. The
proposed approach extends the TensorFlow library and is available online at
https://github.com/ehw-fit/tf-approximate.
Comment: To appear at the 23rd Design, Automation and Test in Europe (DATE 2020). Grenoble, France.
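The look-up-table idea can be sketched on a CPU with NumPy. This is a toy illustration, not TFApprox's GPU texture-memory implementation; the truncating 8-bit multiplier and the vectors below are our own placeholders:

```python
import numpy as np

def build_lut(approx_mul):
    # Precompute the full 256x256 table of an 8-bit approximate multiplier,
    # capturing the circuit's behavior exhaustively.
    lut = np.empty((256, 256), dtype=np.int32)
    for a in range(256):
        for b in range(256):
            lut[a, b] = approx_mul(a, b)
    return lut

def truncated_mul(a, b):
    # Stand-in approximate circuit: drop the 4 low bits of the exact product.
    return (a * b) & ~0xF

LUT = build_lut(truncated_mul)

def approx_dot(x, w, lut):
    # Emulate a dot product under the approximate multiplier by indexing the
    # table instead of multiplying (TFApprox performs this lookup in GPU
    # texture memory for the multiplications inside convolutions).
    return int(lut[x, w].sum())

x = np.array([3, 7, 250], dtype=np.uint8)
w = np.array([5, 9, 2], dtype=np.uint8)
print(approx_dot(x, w, LUT))               # approximate result
print(int(x.astype(int) @ w.astype(int)))  # exact reference
```

A full 256x256 int32 table is only 256 KiB, which is what makes an exhaustive table practical for 8-bit operands.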
Can Who-Edits-What Predict Edit Survival?
As the number of contributors to online peer-production systems grows, it
becomes increasingly important to predict whether the edits that users make
will eventually be beneficial to the project. Existing solutions either rely on
a user reputation system or consist of a highly specialized predictor that is
tailored to a specific peer-production system. In this work, we explore a
different point in the solution space that goes beyond user reputation but does
not involve any content-based feature of the edits. We view each edit as a game
between the editor and the component of the project. We posit that the
probability that an edit is accepted is a function of the editor's skill, of
the difficulty of editing the component and of a user-component interaction
term. Our model is broadly applicable, as it only requires observing data about
who makes an edit, what the edit affects and whether the edit survives or not.
We apply our model on Wikipedia and the Linux kernel, two examples of
large-scale peer-production systems, and we seek to understand whether it can
effectively predict edit survival: in both cases, we provide a positive answer.
Our approach significantly outperforms those based solely on user reputation
and bridges the gap with specialized predictors that use content-based
features. It is simple to implement, computationally inexpensive, and it
additionally enables us to discover interesting structure in the data.
Comment: Accepted at KDD 201
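The probabilistic model described above can be sketched minimally; the parameter values below are illustrative, not fitted:

```python
import math

def survival_prob(skill, difficulty, interaction=0.0):
    # P(edit survives) as a logistic function of the editor's skill, the
    # difficulty of editing the component, and a user-component
    # interaction term.
    z = skill - difficulty + interaction
    return 1.0 / (1.0 + math.exp(-z))

# A skilled editor on an easy component vs. the same editor on a hard one.
p_easy = survival_prob(skill=2.0, difficulty=-1.0)
p_hard = survival_prob(skill=2.0, difficulty=4.0)
print(round(p_easy, 3), round(p_hard, 3))
```

In the actual model the parameters would be learned from (who, what, survived) observations; no content-based features of the edit are required.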
Reproductive Health Perspectives of Young Women With Perinatally and Behaviourally Acquired HIV: A Qualitative Study
INTRODUCTION: The aim of this study was to describe the sexual and reproductive goals of female adolescents with human immunodeficiency virus (HIV) in an urban cohort and to determine whether they vary depending on the mode of HIV acquisition.
METHODS: We conducted in-depth qualitative interviews with 25 Black and/or Hispanic/Latinx female adolescents living with HIV (14 perinatally, 11 behaviourally acquired) aged 17-25 years who have access to care and antiretroviral therapy at an urban public hospital (NYC, NY). Interviews were transcribed, coded and analysed using thematic analysis.
RESULTS: Interviews demonstrated that access to antiretroviral therapy and HIV disclosure to a sexual partner were critical aspects of sexual health for the majority of participants. Persons with perinatal HIV defined motherhood as a source of self-validation and were confident that antiretroviral therapy prevents HIV transmission. Persons with behaviourally acquired HIV viewed their status as an insurmountable barrier that will prevent them from attaining sexual intimacy with a partner and expressed persistent concerns about HIV transmission during pregnancy despite reassurance from medical providers.
CONCLUSION: Sexual and reproductive perspectives of adolescents/young women living with HIV are multifactorial, highly stigmatized, and likely influenced by the mode of HIV acquisition. This population may benefit from patient-centred care models, including sexual health counselling that addresses sexual agency, intimacy, parenting and transmission risk reduction.
Modeling Human Visual Search Performance on Realistic Webpages Using Analytical and Deep Learning Methods
Modeling visual search not only offers an opportunity to predict the
usability of an interface before actually testing it on real users, but also
advances scientific understanding about human behavior. In this work, we first
conduct a set of analyses on a large-scale dataset of visual search tasks on
realistic webpages. We then present a deep neural network that learns to
predict the scannability of webpage content, i.e., how easy it is for a user to
find a specific target. Our model leverages both heuristic-based features such
as target size and unstructured features such as raw image pixels. This
approach allows us to model complex interactions that might be involved in a
realistic visual search task, which cannot easily be achieved by traditional
analytical models. We analyze the model behavior to offer our insights into how
the salience map learned by the model aligns with human intuition and how the
learned semantic representation of each target type relates to its visual
search performance.
Comment: the 2020 CHI Conference on Human Factors in Computing Systems
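A minimal sketch of the two kinds of inputs being fused before prediction; the tiny tanh branch and random weights below are placeholders of ours, not the paper's network:

```python
import numpy as np

rng = np.random.default_rng(0)

def pixel_branch(patch, w):
    # Project the flattened raw-pixel patch to a small embedding.
    return np.tanh(patch.ravel() @ w)

def scannability_score(heuristics, patch, w_pix, w_out):
    # Fuse heuristic features (e.g. target size) with the pixel embedding
    # and map the joint representation to a scalar score.
    fused = np.concatenate([heuristics, pixel_branch(patch, w_pix)])
    return float(fused @ w_out)

patch = rng.random((8, 8))        # toy 8x8 grayscale target patch
heur = np.array([0.5, 1.2])       # e.g. normalized target size, eccentricity
w_pix = rng.normal(size=(64, 4))  # untrained placeholder weights
w_out = rng.normal(size=(6,))
print(scannability_score(heur, patch, w_pix, w_out))
```

Training such a fused model end to end is what lets it capture interactions between a target's appearance and its layout context that a purely analytical model would miss.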
Separate and Attend in Personal Email Search
In personal email search, user queries often impose different requirements on
different aspects of the retrieved emails. For example, the query "my recent
flight to the US" requires emails to be ranked based on both textual contents
and recency of the email documents, while other queries such as "medical
history" do not impose any constraints on the recency of the email. Recent deep
learning-to-rank models for personal email search often directly concatenate
dense numerical features (e.g., document age) with embedded sparse features
(e.g., n-gram embeddings). In this paper, we first show with a set of
experiments on synthetic datasets that direct concatenation of dense and sparse
features does not lead to the optimal search performance of deep neural ranking
models. To effectively incorporate both sparse and dense email features into
personal email search ranking, we propose a novel neural model, SepAttn.
SepAttn first builds two separate neural models to learn from sparse and dense
features respectively, and then applies an attention mechanism at the
prediction level to derive the final prediction from these two models. We
conduct a comprehensive set of experiments on a large-scale email search
dataset, and demonstrate that our SepAttn model consistently improves the
search quality over the baseline models.
Comment: WSDM 202
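The prediction-level attention idea can be sketched as follows; the branch scores and attention logits are illustrative placeholders, whereas SepAttn learns these quantities:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def fuse(sparse_score, dense_score, attn_logits):
    # Combine the two sub-models' predictions with attention weights
    # computed at the prediction level, not at the feature level.
    scores = np.array([sparse_score, dense_score])
    weights = softmax(np.asarray(attn_logits, dtype=float))
    return float(weights @ scores)

# For "my recent flight to the US", attention might favor the dense
# (recency) branch; for "medical history", the sparse (textual) branch.
print(fuse(sparse_score=0.2, dense_score=0.9, attn_logits=[0.0, 2.0]))
```

Keeping the branches separate until the final prediction avoids the degradation that direct concatenation of dense and sparse features exhibits in the paper's synthetic experiments.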
Comyco: Quality-Aware Adaptive Video Streaming via Imitation Learning
Learning-based Adaptive Bit Rate (ABR) methods, which aim to learn strong
strategies without handcrafted assumptions, have become a research hotspot in
adaptive streaming. However, they typically suffer from two issues: low
sample efficiency and a lack of awareness of video quality information. In
this paper, we propose Comyco, a video quality-aware ABR
approach that enormously improves the learning-based methods by tackling the
above issues. Comyco trains the policy by imitating expert trajectories given
by the instant solver, which can not only avoid redundant exploration but also
make better use of the collected samples. Meanwhile, Comyco attempts to pick
the chunk with higher perceptual video qualities rather than video bitrates. To
achieve this, we construct Comyco's neural network architecture, video datasets
and QoE metrics with video quality features. Using trace-driven and real-world
experiments, we demonstrate significant improvements in Comyco's sample
efficiency over prior work: it requires 1700x fewer samples and 16x less
training time.
Moreover, results illustrate that Comyco outperforms previously proposed
methods, with improvements in average QoE of 7.5% - 16.79%. In particular,
Comyco surpasses the state-of-the-art approach Pensieve by 7.37% in average
video quality under the same rebuffering time.
Comment: ACM Multimedia 201
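The imitation-learning core can be sketched as a behavioral-cloning update; the linear policy, features, and learning rate below are placeholders of ours, not Comyco's network or its instant solver:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cloning_step(theta, feats, expert_action, lr=0.1):
    # One gradient step on the cross-entropy between the policy's chunk
    # distribution and the expert's chosen chunk.
    probs = softmax(feats @ theta)
    grad_logits = probs.copy()
    grad_logits[expert_action] -= 1.0  # d(cross-entropy)/d(logits)
    return theta - lr * feats.T @ grad_logits

rng = np.random.default_rng(1)
feats = rng.random((4, 3))  # one feature row per candidate chunk quality
theta = np.zeros(3)
expert = 2                  # the instant solver picked quality level 2
before = softmax(feats @ theta)[expert]
for _ in range(50):
    theta = cloning_step(theta, feats, expert)
after = softmax(feats @ theta)[expert]
print(before < after)  # the policy now agrees with the expert more often
```

Because every visited state yields a supervised target from the solver, each sample carries a full gradient signal, which is where the sample-efficiency gain over reward-only reinforcement learning comes from.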
SparCML: High-Performance Sparse Communication for Machine Learning
Applying machine learning techniques to the quickly growing data in science
and industry requires highly scalable algorithms. Large datasets are most
commonly processed in a "data-parallel" fashion, distributed across many
nodes. Each node's contribution to the overall gradient is summed using a
global allreduce. This allreduce is the single communication step, and thus
the scalability bottleneck, for most machine learning workloads. We observe
that many gradient values are frequently (close to) zero, leading to sparse
or sparsifiable communications. To
exploit this insight, we analyze, design, and implement a set of
communication-efficient protocols for sparse input data, in conjunction with
efficient machine learning algorithms which can leverage these primitives. Our
communication protocols generalize standard collective operations, by allowing
processes to contribute arbitrary sparse input data vectors. Our generic
communication library, SparCML, extends MPI to support additional features,
such as non-blocking (asynchronous) operations and low-precision data
representations. As such, SparCML and its techniques will form the basis of
future highly scalable machine learning frameworks.
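The core primitive can be sketched in a single address space; real SparCML performs this reduction over MPI across processes, and the index-value pairs below are illustrative:

```python
import numpy as np

def sparse_allreduce(contributions, dim):
    # Sum sparse (indices, values) contributions into a dense result;
    # np.add.at accumulates correctly even if an index repeats.
    total = np.zeros(dim)
    for idx, vals in contributions:
        np.add.at(total, idx, vals)
    # In the distributed setting, every process receives this reduced vector.
    return total

# Two "processes" with mostly-zero gradients, overlapping at index 2.
p0 = (np.array([0, 2]), np.array([1.0, 0.5]))
p1 = (np.array([2, 7]), np.array([0.25, 3.0]))
print(sparse_allreduce([p0, p1], dim=8))
```

A communication-efficient implementation would exchange only the index-value pairs rather than dense vectors, so the message size tracks the number of nonzeros instead of the full gradient dimension.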
Skyline: Interactive In-Editor Computational Performance Profiling for Deep Neural Network Training
Training a state-of-the-art deep neural network (DNN) is a
computationally-expensive and time-consuming process, which incentivizes deep
learning developers to debug their DNNs for computational performance. However,
effectively performing this debugging requires intimate knowledge about the
underlying software and hardware systems---something that the typical deep
learning developer may not have. To help bridge this gap, we present Skyline: a
new interactive tool for DNN training that supports in-editor computational
performance profiling, visualization, and debugging. Skyline's key contribution
is that it leverages special computational properties of DNN training to
provide (i) interactive performance predictions and visualizations, and (ii)
directly manipulatable visualizations that, when dragged, mutate the batch size
in the code. As an in-editor tool, Skyline allows users to leverage these
diagnostic features to debug the performance of their DNNs during development.
An exploratory qualitative user study of Skyline produced promising results;
all the participants found Skyline to be useful and easy to use.
Comment: 14 pages, 5 figures. Appears in the proceedings of UIST'2
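Interactive predictions of this kind can exploit the approximately linear scaling of per-iteration cost with batch size, an assumption we make here for illustration; the profiled numbers below are invented:

```python
def fit_linear(b1, t1, b2, t2):
    # Fit time = slope * batch + intercept from two profiled batch sizes.
    slope = (t2 - t1) / (b2 - b1)
    return slope, t1 - slope * b1

# Hypothetical measurements: (batch size, ms per training iteration).
slope, intercept = fit_linear(16, 50.0, 64, 170.0)

def predict_iter_time(batch):
    return slope * batch + intercept

print(predict_iter_time(32))  # interpolated estimate: 90.0 ms
```

A model this cheap can be re-evaluated on every drag of a visualization, which is what makes direct manipulation of the batch size feel instantaneous.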