Improving Mobile Video Streaming with Mobility Prediction and Prefetching in Integrated Cellular-WiFi Networks
We present and evaluate a procedure that utilizes mobility and throughput
prediction to prefetch video streaming data in integrated cellular and WiFi
networks. The effective integration of such heterogeneous wireless technologies
will be significant for supporting high performance and energy efficient video
streaming in ubiquitous networking environments. Our evaluation is based on
trace-driven simulation considering empirical measurements and shows how
various system parameters influence the performance, in terms of the number of
paused video frames and the energy consumption; these parameters include the
number of video streams, the mobile, WiFi, and ADSL backhaul throughput, and
the number of WiFi hotspots. Also, we assess the procedure's robustness to time
and throughput variability. Finally, we present our initial prototype that
implements the proposed approach.
Comment: 7 pages, 15 figures
Undermining User Privacy on Mobile Devices Using AI
Over the past years, literature has shown that attacks exploiting the
microarchitecture of modern processors pose a serious threat to the privacy of
mobile phone users. This is because applications leave distinct footprints in
the processor, which can be used by malware to infer user activities. In this
work, we show that these inference attacks are considerably more practical when
combined with advanced AI techniques. In particular, we focus on profiling the
activity in the last-level cache (LLC) of ARM processors. We employ a simple
Prime+Probe based monitoring technique to obtain cache traces, which we
classify with Deep Learning methods including Convolutional Neural Networks. We
demonstrate our approach on an off-the-shelf Android phone by launching a
successful attack from an unprivileged, zero-permission App in well under a
minute. The App thereby detects running applications with an accuracy of 98%
and reveals opened websites and streaming videos by monitoring the LLC for at
most 6 seconds. This is possible because Deep Learning compensates for measurement
disturbances stemming from the inherently noisy LLC monitoring and unfavorable
cache characteristics such as random line replacement policies. In summary, our
results show that thanks to advanced AI techniques, inference attacks are
becoming alarmingly easy to implement and execute in practice. This once more
calls for countermeasures that confine microarchitectural leakage and protect
mobile phone applications, especially those valuing the privacy of their users.
Kerncraft: A Tool for Analytic Performance Modeling of Loop Kernels
Achieving optimal program performance requires deep insight into the
interaction between hardware and software. For software developers without an
in-depth background in computer architecture, understanding and fully utilizing
modern architectures is close to impossible. Analytic loop performance modeling
is a useful way to understand the relevant bottlenecks of code execution based
on simple machine models. The Roofline Model and the Execution-Cache-Memory
(ECM) model are proven approaches to performance modeling of loop nests. In
comparison to the Roofline model, the ECM model can also describe the
single-core performance and saturation behavior on a multicore chip. We give an
introduction to the Roofline and ECM models, and to stencil performance
modeling using layer conditions (LC). We then present Kerncraft, a tool that
can automatically construct Roofline and ECM models for loop nests by
performing the required code, data transfer, and LC analysis. The layer
condition analysis makes it possible to predict optimal spatial blocking factors for loop
nests. Together with the models it enables an ab-initio estimate of the
potential benefits of loop blocking optimizations and of useful block sizes. In
cases where LC analysis is not easily possible, Kerncraft supports a cache
simulator as a fallback option. Using a 25-point long-range stencil we
demonstrate the usefulness and predictive power of the Kerncraft tool.
Comment: 22 pages, 5 figures
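As a rough illustration of the Roofline model named in the abstract above, the sketch below computes the attainable performance of a loop kernel as the minimum of the peak compute rate and the product of arithmetic intensity and memory bandwidth. This is not Kerncraft's implementation or API. The function name, the machine numbers, and the triad-like example kernel are all hypothetical and chosen only for illustration.

```python
def roofline_performance(peak_flops, mem_bandwidth, arithmetic_intensity):
    """Attainable performance in FLOP/s under the basic Roofline model.

    peak_flops:           peak compute rate in FLOP/s
    mem_bandwidth:        sustained memory bandwidth in bytes/s
    arithmetic_intensity: FLOPs performed per byte moved to/from memory
    """
    return min(peak_flops, arithmetic_intensity * mem_bandwidth)


# Hypothetical machine parameters (assumptions, not measured values).
PEAK = 40e9  # 40 GFLOP/s
BW = 50e9    # 50 GB/s

# Hypothetical kernel: a[i] = b[i] + s * c[i] on doubles.
# 2 FLOPs per iteration; 3 arrays * 8 bytes = 24 bytes of traffic,
# ignoring write-allocate transfers for simplicity.
intensity = 2 / 24

perf = roofline_performance(PEAK, BW, intensity)
bound = "memory" if intensity * BW < PEAK else "compute"
print(f"{perf / 1e9:.2f} GFLOP/s ({bound}-bound)")
```

With these assumed numbers the kernel sits on the bandwidth-limited part of the roofline. A kernel with a much higher arithmetic intensity would instead be capped at the peak compute rate.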
Cache-aware Performance Modeling and Prediction for Dense Linear Algebra
Countless applications cast their computational core in terms of dense linear
algebra operations. These operations can usually be implemented by combining
the routines offered by standard linear algebra libraries such as BLAS and
LAPACK, and typically each operation can be obtained in many alternative ways.
Interestingly, identifying the fastest implementation -- without executing it
-- is a challenging task even for experts. An equally challenging task is that
of tuning each routine to performance-optimal configurations. Indeed, the
problem is so difficult that even the default values provided by the libraries
are often considerably suboptimal; as a solution, normally one has to resort to
executing and timing the routines, driven by some form of parameter search. In
this paper, we discuss a methodology to solve both problems: identifying the
best performing algorithm within a family of alternatives, and tuning
algorithmic parameters for maximum performance; in both cases, we do not
execute the algorithms themselves. Instead, our methodology relies on timing
and modeling the computational kernels underlying the algorithms, and on a
technique for tracking the contents of the CPU cache. In general, our
performance predictions allow us to tune dense linear algebra algorithms to within
a few percent of the best attainable results, thus allowing computational
scientists and code developers alike to efficiently optimize their linear
algebra routines and codes.
Comment: Submitted to PMBS1