PerfWeb: How to Violate Web Privacy with Hardware Performance Events
The browser history reveals highly sensitive information about users, such as
financial status, health conditions, or political views. Private browsing modes
and anonymity networks are consequently important tools to preserve the privacy
not only of regular users but in particular of whistleblowers and dissidents.
Yet, in this work we show how a malicious application can infer opened websites
from Google Chrome in Incognito mode and from Tor Browser by exploiting
hardware performance events (HPEs). In particular, we analyze the browsers'
microarchitectural footprint with the help of advanced Machine Learning
techniques: k-Nearest Neighbors, Decision Trees, Support Vector Machines,
and, in contrast to previous literature, also Convolutional Neural Networks. We
profile 40 different websites, 30 of the top Alexa sites and 10 whistleblowing
portals, on two machines featuring an Intel and an ARM processor. By monitoring
retired instructions, cache accesses, and bus cycles for at most 5 seconds, we
manage to classify the selected websites with a success rate of up to 86.3%.
The results show that hardware performance events can clearly undermine the
privacy of web users. We therefore propose mitigation strategies that impede
our attacks while still allowing legitimate use of HPEs.
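As an illustration of the classification pipeline the abstract describes, here is a minimal sketch in Python: per-load feature vectors of event counts fed to the three classical classifiers named above. All data shapes, sampling choices, and parameters are invented placeholders, not the authors' actual setup.

```python
# Hypothetical sketch: classifying websites from hardware-performance-event
# traces, in the spirit of the abstract. The data below is random stand-in data.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Stand-in data: one row per page load, columns are per-interval counts of
# retired instructions, cache accesses, and bus cycles sampled over <= 5 s.
n_loads, n_intervals, n_events = 400, 50, 3
X = rng.random((n_loads, n_intervals * n_events))
y = rng.integers(0, 40, n_loads)          # 40 website labels, as in the paper

for name, clf in [("kNN", KNeighborsClassifier(n_neighbors=5)),
                  ("SVM", SVC(kernel="rbf")),
                  ("DecisionTree", DecisionTreeClassifier())]:
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {acc:.3f}")
```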
Costing JIT Traces
Tracing JIT compilation generates units of compilation that
are easy to analyse and are known to execute frequently. The AJITPar
project aims to investigate whether the information in JIT traces can be
used to make better scheduling decisions or perform code transformations
to adapt the code for a specific parallel architecture. To achieve this goal,
a cost model must be developed to estimate the execution time of an
individual trace.
This paper presents the design and implementation of a system for extracting
JIT trace information from the Pycket JIT compiler. We define
three increasingly parametric cost models for Pycket traces. We perform
a search of the cost model parameter space using genetic algorithms to
identify the best weightings for those parameters. We test the accuracy
of these cost models for predicting the cost of individual traces on a set
of loop-based micro-benchmarks. We also compare the accuracy of the
cost models for predicting whole program execution time over the Pycket
benchmark suite. Our results show that the weighted cost model, using the
weightings found by the genetic algorithm search, has the best accuracy.
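A minimal sketch of the approach described above: a linear cost model that weights per-operation counts of a trace, with a simple genetic search over the weight vector. The operation categories, population size, and mutation scheme are invented for illustration and are not Pycket's or AJITPar's actual parameters.

```python
# Hypothetical sketch of a weighted linear cost model for JIT traces and a
# genetic-algorithm search for the weights; all data here is synthetic.
import numpy as np

rng = np.random.default_rng(1)

N_OPS = 6                                         # e.g. arith, guard, load, ...
counts = rng.integers(0, 100, (200, N_OPS))       # op counts per trace
true_w = rng.random(N_OPS)
times = counts @ true_w + rng.normal(0, 1, 200)   # "measured" trace times

def fitness(w):
    # Negative mean squared prediction error of the cost model counts @ w.
    return -np.mean((counts @ w - times) ** 2)

pop = rng.random((50, N_OPS))                     # initial population of weightings
for _ in range(200):
    scores = np.array([fitness(w) for w in pop])
    parents = pop[np.argsort(scores)[-10:]]       # keep the 10 fittest
    children = parents[rng.integers(0, 10, 40)] + rng.normal(0, 0.05, (40, N_OPS))
    pop = np.vstack([parents, children.clip(0)])  # weights stay non-negative

best = pop[np.argmax([fitness(w) for w in pop])]
print("recovered weights:", np.round(best, 2))
```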
MoonGen: A Scriptable High-Speed Packet Generator
We present MoonGen, a flexible high-speed packet generator. It can saturate
10 GbE links with minimum-sized packets using only a single CPU core by running
on top of the packet processing framework DPDK. Linear multi-core scaling
allows for even higher rates: we have tested MoonGen with up to 178.5 Mpps at
120 Gbit/s. We move the whole packet generation logic into user-controlled Lua
scripts to achieve the highest possible flexibility. In addition, we utilize
hardware features of Intel NICs that have not been used for packet generators
previously. A key feature is the measurement of latency with sub-microsecond
precision and accuracy by using hardware timestamping capabilities of modern
commodity NICs. We address timing issues with software-based packet generators
and apply methods to mitigate them with both hardware support on commodity NICs
and with a novel method to control the inter-packet gap in software. Features
that were previously only possible with hardware-based solutions are now
provided by MoonGen on commodity hardware. MoonGen is available as free
software under the MIT license at https://github.com/emmericp/MoonGen.
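To illustrate the timing issues with software-based packet generators that the abstract mentions, the following sketch paces transmissions by busy-waiting on a timestamp. It is a generic illustration, not MoonGen's actual mechanism (MoonGen's generation logic runs in Lua on DPDK), and send_packet() is a hypothetical stand-in.

```python
# Hypothetical sketch of naive software rate control: busy-wait until each
# packet's scheduled send time, counting how often the slot is missed.
import time

def send_packet(i):
    pass                                   # stand-in for the actual TX call

def paced_send(n_packets, rate_pps):
    gap = 1.0 / rate_pps                   # target inter-packet gap in seconds
    next_tx = time.perf_counter()
    late = 0
    for i in range(n_packets):
        # Busy-wait: sleep() is far too coarse for microsecond-scale gaps.
        while time.perf_counter() < next_tx:
            pass
        if time.perf_counter() - next_tx > gap:   # missed the slot entirely
            late += 1
        send_packet(i)
        next_tx += gap
    return late

print("late packets:", paced_send(10_000, 100_000))   # 100 kpps target
```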
GeantV: Results from the prototype of concurrent vector particle transport simulation in HEP
Full detector simulation was among the largest CPU consumers in all CERN
experiment software stacks for the first two runs of the Large Hadron Collider
(LHC). In the early 2010s, the projections were that simulation demands would
scale linearly with luminosity increase, compensated only partially by an
increase of computing resources. The extension of fast simulation approaches to
more use cases, covering a larger fraction of the simulation budget, is only
part of the solution due to intrinsic precision limitations. The remainder
corresponds to speeding-up the simulation software by several factors, which is
out of reach using simple optimizations on the current code base. In this
context, the GeantV R&D project was launched, aiming to redesign the legacy
particle transport codes in order to make them benefit from fine-grained
parallelism features such as vectorization, but also from increased code and
data locality. This paper presents in detail the results and achievements of
this R&D, as well as the conclusions and lessons learnt from the beta
prototype.
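As a rough illustration of the fine-grained parallelism the abstract refers to, the sketch below stores a basket of tracks in structure-of-arrays form and advances all of them in one vectorized step. The straight-line "physics" and all array names are invented placeholders, not GeantV code.

```python
# Hypothetical sketch of the data-layout idea behind vectorized transport:
# tracks stored structure-of-arrays so one propagation step applies to a
# whole basket of particles at once.
import numpy as np

rng = np.random.default_rng(2)
n = 1 << 16                                # one "basket" of tracks

# Structure-of-arrays track state: contiguous per-component arrays give
# unit-stride access, enabling SIMD and better cache locality.
pos = rng.random((3, n))
mom = rng.random((3, n))
step = np.full(n, 1e-3)

def propagate(pos, mom, step):
    # Straight-line transport of the whole basket in one vectorized
    # expression, instead of a per-particle loop over scattered objects.
    p_mag = np.linalg.norm(mom, axis=0)
    return pos + mom / p_mag * step

pos = propagate(pos, mom, step)
print(pos.shape)                           # (3, 65536): all tracks advanced
```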