Measuring Software Performance on Linux
Measuring and analyzing the performance of software has reached a high
complexity, caused by more advanced processor designs and the intricate
interaction between user programs, the operating system, and the processor's
microarchitecture. In this report, we summarize our experience of how
performance characteristics of software should be measured when running on a
Linux operating system and a modern processor. In particular, (1) we provide a
general overview of hardware and operating system features that may have a
significant impact on timing and how they interact, (2) we identify sources of
errors that need to be controlled in order to obtain unbiased measurement
results, and (3) we propose a measurement setup for Linux to minimize errors.
Although not the focus of this report, we describe the measurement process
using hardware performance counters, which can faithfully reflect the real
bottlenecks on a given processor. Our experiments confirm that our measurement
setup has a large impact on the results. More surprisingly, however, they also
suggest that the choice of setup can be negligible for certain analysis
methods. Furthermore, we found that our setup maintains significantly better
performance under background load, which means it can be used to improve
software in high-performance applications.
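The precautions described can be condensed into a minimal timing harness (a hedged Python illustration, not the authors' setup; `sched_setaffinity` is Linux-specific, and a real experiment would additionally control frequency scaling, interrupts, and address-space layout):

```python
import os
import statistics
import time

def measure(workload, runs=30, warmup=5, cpu=0):
    """Time `workload` repeatedly and report robust statistics.

    Pinning the process to one CPU avoids migration noise; warm-up runs
    populate caches before measuring; the median is less sensitive to
    interference spikes (e.g. interrupts) than the mean.
    """
    try:
        os.sched_setaffinity(0, {cpu})  # Linux-only; skip if unavailable
    except (AttributeError, OSError):
        pass
    for _ in range(warmup):
        workload()
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter_ns()
        workload()
        samples.append(time.perf_counter_ns() - t0)
    return statistics.median(samples), min(samples), max(samples)

median_ns, fastest, slowest = measure(lambda: sum(range(10_000)))
print(f"median {median_ns} ns (min {fastest}, max {slowest})")
```

Comparing min, median, and max of such a run already reveals how much the environment, rather than the code, contributes to the observed variance.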
A Valgrind Tool to Compute the Working Set of a Software Process
This paper introduces a new open-source tool for the dynamic analyzer
Valgrind. The tool measures the amount of memory that is actively being used by
a process at any given point in time. While there exist numerous tools to
measure the memory requirements of a process, the vast majority focus only on
metrics like resident or proportional set sizes, which include memory that was
once claimed but is currently unused. Consequently, such tools do not
permit drawing conclusions about how much cache or RAM a process actually
requires at each point in time, and thus cannot be used for performance
debugging. The few tools which do measure only actively used memory, however,
have limitations in temporal resolution and introspection. In contrast, our
tool offers an easy way to compute the memory that has recently been accessed
at any point in time, reflecting how cache and RAM requirements change over
time. In particular, this tool computes the set of memory references made
within a fixed time interval before any point in time, known as the working
set, and captures call stacks for interesting peaks in the working set size. We
first introduce the tool, then we run some examples comparing the output from
our tool with similar memory tools, and we close with a discussion of
limitations.
Comment: 8 pages
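The definition used here, the set of memory references made within a fixed time interval before any point in time, can be illustrated with a small sliding-window sketch (a simplified Python illustration, not the Valgrind tool itself; a real trace would use instruction counts or cache-line granularity rather than abstract timestamps):

```python
from collections import deque

def working_set_sizes(trace, tau):
    """For each access in `trace`, count the distinct addresses touched
    within the last `tau` time units (the working set size).

    `trace` is a list of (timestamp, address) pairs sorted by timestamp;
    addresses could be cache-line numbers to approximate cache demand.
    """
    window = deque()   # accesses inside the current time window
    counts = {}        # address -> number of occurrences in window
    sizes = []
    for t, addr in trace:
        window.append((t, addr))
        counts[addr] = counts.get(addr, 0) + 1
        # evict accesses older than tau
        while window and window[0][0] <= t - tau:
            _, old = window.popleft()
            counts[old] -= 1
            if counts[old] == 0:
                del counts[old]
        sizes.append((t, len(counts)))
    return sizes

trace = [(0, 0x10), (1, 0x20), (2, 0x10), (5, 0x30)]
print(working_set_sizes(trace, tau=3))  # [(0, 1), (1, 2), (2, 2), (5, 1)]
```

Peaks in the resulting size curve are exactly the points where a tool like this would capture a call stack for later inspection.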
Thermodynamics of FRW Universe With Chaplygin Gas Models
In this paper we have examined the validity of the generalized second law of
thermodynamics (GSLT) in an expanding Friedmann Robertson Walker (FRW) universe
filled with different variants of Chaplygin gases. Assuming that the universe
is a closed system bounded by the cosmological horizon, we first present the
general prescription for the rate of change of total entropy on the boundary.
In the subsequent part we have analyzed the validity of the generalized second
law of thermodynamics on the cosmological apparent horizon and the cosmological
event horizon for different Chaplygin gas models of the universe. The analysis
is supported with the help of suitable graphs to clarify the status of the GSLT
on the cosmological horizons. In the case of the cosmological apparent horizon
we have found that some of these models always obey the GSLT, whereas the
validity of the GSLT on the cosmological event horizon depends, for all these
models, on the choice of free parameters in the respective models.
Comment: 20 pages, 19 figures, final version published online in General
Relativity and Gravitation
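For context, the generalized Chaplygin gas and the GSLT condition can be written as follows (a standard formulation with horizon entropy proportional to area; the exact entropy and temperature expressions used in the paper may differ):

```latex
% Generalized Chaplygin gas equation of state
p = -\frac{A}{\rho^{\alpha}}, \qquad 0 < \alpha \le 1,
% which the continuity equation integrates to
\rho = \left( A + \frac{B}{a^{3(1+\alpha)}} \right)^{\frac{1}{1+\alpha}}.
% GSLT: total entropy (horizon plus fluid) must not decrease,
\frac{d}{dt}\left( S_{h} + S_{f} \right) \ge 0,
\qquad S_{h} = \frac{\pi R_{h}^{2}}{G},
% with the apparent-horizon radius
R_{A} = \frac{1}{\sqrt{H^{2} + k/a^{2}}}.
```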
How Reliable is Smartphone-based Electronic Contact Tracing for COVID-19?
Smartphone-based electronic contact tracing is currently considered an
essential tool towards easing lockdowns, curfews, and shelter-in-place orders
issued by most governments around the world in response to the 2020 novel
coronavirus (SARS-CoV-2) crisis. While the debate around smartphone-based
contact tracing applications, or apps, has centered on the privacy concerns
stemming from their use, an important question that has not received
sufficient attention is: How reliable will such smartphone-based electronic
contact tracing be?
This is a technical question related to how two smartphones reliably register
their mutual proximity. Here, we examine in detail the technical prerequisites
required for effective smartphone-based contact tracing. The underlying
mechanism that any contact tracing app relies on is called Neighbor Discovery
(ND), which involves smartphones transmitting and scanning for Bluetooth
signals to record their mutual presence whenever they are in close proximity.
The hardware support and the software protocols used for ND in smartphones,
however, were not designed for reliable contact tracing. In this paper, we
quantitatively evaluate how reliably smartphones can perform contact tracing. Our
results point towards the design of a wearable solution for contact tracing
that can overcome the shortcomings of a smartphone-based solution to provide
more reliable and accurate contact tracing. To the best of our knowledge, this
is the first study that quantifies both the suitability and the drawbacks of
smartphone-based contact tracing. Further, our results can be used
to parameterize an ND protocol to maximize the reliability of any contact
tracing app that uses it.
Slotless Protocols for Fast and Energy-Efficient Neighbor Discovery
In mobile ad-hoc networks, neighbor discovery protocols are used to find
surrounding devices and to establish a first contact between them. Since the
clocks of the devices are not synchronized and their energy-budgets are
limited, usually duty-cycled, asynchronous discovery protocols are applied.
Two devices can rendezvous only if they are awake at the same point in time.
Currently, time-slotted protocols, which subdivide time into multiple intervals
with equal lengths (slots), are considered to be the most efficient discovery
schemes. In this paper, we break away from the assumption of slotted time. We
propose a novel, continuous-time discovery protocol, which temporally decouples
beaconing and listening. Each device periodically sends packets with a certain
interval, and periodically listens for a given duration with a different
interval. By optimizing these interval lengths, we show that this scheme can,
to the best of our knowledge, significantly outperform all known protocols,
such as DISCO, U-Connect or Searchlight. For example, Searchlight takes up to
740% longer than our proposed technique to discover a device with the same
duty-cycle. Further, our proposed technique can also be applied to widely-used
asymmetric, purely interval-based protocols such as ANT or Bluetooth Low
Energy.
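The decoupled beaconing/listening scheme can be sketched as a small simulation (an illustrative Python sketch under simplifying assumptions: beacons are instantaneous, clocks do not drift, and the parameter values are invented):

```python
import random

def discovery_latency(adv_interval, scan_interval, scan_window, rng,
                      horizon=10_000.0):
    """Latency until a beacon first lands inside a scan window, with
    uniformly random initial phases for both devices.

    Device A sends an (instantaneous) beacon every adv_interval; device B
    listens for scan_window at the start of every scan_interval.
    """
    adv_phase = rng.uniform(0.0, adv_interval)
    scan_phase = rng.uniform(0.0, scan_interval)
    t = adv_phase
    while t < horizon:
        # the beacon's offset inside B's current scan period
        if (t - scan_phase) % scan_interval < scan_window:
            return t
        t += adv_interval
    return float("inf")

rng = random.Random(1)
lats = [discovery_latency(0.9, 1.0, 0.2, rng) for _ in range(200)]
print("mean latency:", sum(lats) / len(lats))
```

The scanner's duty cycle here is scan_window / scan_interval = 20%; sweeping the two intervals over such simulations reproduces the latency/duty-cycle trade-off that the protocol's interval optimization navigates.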
Highway traffic data: macroscopic, microscopic and criticality analysis for capturing relevant traffic scenarios and traffic modeling based on the highD data set
This work provides a comprehensive analysis on naturalistic driving behavior
for highways based on the highD data set. Two thematic fields are considered.
First, some macroscopic and microscopic traffic statistics are provided. These
include the traffic flow rate and the traffic density, as well as velocity,
acceleration and distance distributions. Additionally, their mutual
dependencies are examined and compared to related work. The second part investigates
the distributions of criticality measures. The Time-To-Collision, Time-Headway
and a third measure, which couples both, are analyzed. These measures are also
combined with other indicators. Scenarios, in which these measures reach a
critical level, are separately discussed. The results are compared to related
work as well. The two main contributions of this work can be stated as follows.
First, the analysis on the criticality measures can be used to find suitable
thresholds for rare traffic scenarios. Second, the statistics provided in this
work can also be utilized for traffic modeling, for example in simulation
environments.
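The two basic measures can be stated in a few lines (standard definitions sketched in Python; the exact variants and the coupled measure used with the highD data may differ):

```python
def time_headway(gap_m, v_follow_mps):
    """THW: time the following vehicle needs to cover the current gap."""
    return gap_m / v_follow_mps if v_follow_mps > 0 else float("inf")

def time_to_collision(gap_m, v_follow_mps, v_lead_mps):
    """TTC: time until impact if both vehicles hold their speeds;
    only defined while the follower is closing in on the leader."""
    closing = v_follow_mps - v_lead_mps
    return gap_m / closing if closing > 0 else float("inf")

# 25 m gap, follower at 30 m/s, leader at 25 m/s:
print(time_headway(25.0, 30.0))             # ~0.83 s
print(time_to_collision(25.0, 30.0, 25.0))  # 5.0 s
```

Thresholding these values (e.g. TTC below a few seconds) is one way to isolate the rare critical scenarios the analysis is after.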
Cost-effective Energy Monitoring of a Zynq-based Real-time System including dual Gigabit Ethernet
The ongoing integration of fine-grained power management features already
established in CPU-driven Systems-on-Chip (SoCs) enables both traditional Field
Programmable Gate Arrays (FPGAs) and, more recently, hybrid Programmable SoCs
(pSoCs) to reach more energy-sensitive application domains (e.g., automotive
and robotics). By combining a fixed-function multi-core SoC with
flexible, configurable FPGA fabric, the latter can be used to realize
heterogeneous Real-time Systems (RTSs) commonly implementing complex
application-specific architectures with high computation and communication
(I/O) densities. Their dynamic changes in workload, in the currently active
power-saving features, and thus in power consumption require precise voltage
and current sensing on all relevant supply rails to enable dependable
evaluation of the various power management techniques. In this paper, we
propose a low-cost
18-channel 16-bit-resolution measurement (sub-)system capable of 200 kSPS
(kilo-samples per second) for instrumentation of current pSoC development
boards. To this end, we join simultaneously sampling analog-to-digital
converters (ADCs) and analog voltage/current sensing circuitry with a Cortex M7
microcontroller using an SD card for storage. In addition, we propose to
include crucial I/O components such as Ethernet PHYs into the power monitoring
to gain a holistic view of the RTS's temporal behavior, covering not only
computation on FPGA and CPUs, but also communication in terms of, e.g.,
reception of sensor values and transmission of actuation signals. We present an
FMC-sized implementation of our measurement system combined with two Gigabit
Ethernet PHYs and one HDMI input. Paired with Xilinx' ZC702 development board,
we are able to synchronously acquire power traces of a Zynq pSoC and the two
PHYs precise enough to identify individual Ethernet frames.
Comment: 4 pages, 4 figures
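To see why an SD card can suffice as storage, consider the raw data rate (back-of-the-envelope arithmetic; whether the 200 kSPS figure is per channel or aggregate across the 18 channels is our assumption, so both cases are shown):

```python
channels = 18
bits_per_sample = 16
rate_sps = 200_000  # 200 kSPS, per the figures above

# Case 1: 200 kSPS is the aggregate rate across all channels.
aggregate_bps = rate_sps * bits_per_sample // 8
# Case 2: each of the 18 channels is sampled at 200 kSPS.
per_channel_bps = channels * rate_sps * bits_per_sample // 8

print(aggregate_bps)    # 400000 B/s  (0.4 MB/s)
print(per_channel_bps)  # 7200000 B/s (7.2 MB/s)
```

Even the larger figure (~7.2 MB/s) is within the sustained write rates of common SD cards, which makes the Cortex M7 plus SD card design plausible.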
Precise Energy Modeling for the Bluetooth Low Energy Protocol
Bluetooth Low Energy (BLE) is a wireless protocol well suited for
ultra-low-power sensors running on small batteries. BLE is described as a new
protocol in the official Bluetooth 4.0 specification. To design
energy-efficient devices, the protocol provides a number of parameters that
need to be optimized within an energy, latency and throughput design space. To
minimize power consumption, the protocol parameters have to be optimized for a
given application. Therefore, an energy model that can predict the energy
consumption of a BLE-based wireless device for different parameter settings is
needed. As BLE differs significantly from the original Bluetooth, models for
Bluetooth cannot easily be applied to the BLE protocol. Over the past year,
several energy models for BLE have been proposed. However, none of them can
model all the operating modes of the protocol. This paper presents a precise
energy model of the BLE protocol that allows the computation of a device's
power consumption in all possible operating modes. To
the best of our knowledge, our proposed model is not only one of the most
accurate ones known so far (because it accounts for all protocol parameters),
but it is also the only one that models all the operating modes of BLE.
Furthermore, we present a sensitivity analysis of the different parameters on
the energy consumption and evaluate the accuracy of the model using both
discrete event simulation and actual measurements. Based on this model,
guidelines for system designers are presented that help choose the right
parameters to optimize the energy consumption for a given application.
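The kind of model described can be sketched for a single operating mode (an illustrative Python sketch with invented numbers, not the paper's model, which covers every BLE operating mode):

```python
def average_current_ua(interval_ms, event_charge_uc, sleep_current_ua):
    """Average current of a duty-cycled BLE device in connected mode:
    one connection event (a fixed charge, in microcoulombs) per
    connection interval, plus a constant sleep-current baseline.
    """
    interval_s = interval_ms / 1000.0
    # microcoulombs / seconds = microamps
    return event_charge_uc / interval_s + sleep_current_ua

# Illustrative (invented) numbers: 10 uC per connection event, 1 uA sleep.
print(average_current_ua(1000.0, 10.0, 1.0))  # 11.0 uA at a 1 s interval
print(average_current_ua(500.0, 10.0, 1.0))   # 21.0 uA at a 500 ms interval
```

Doubling the connection interval roughly halves the event term, which is exactly the energy-versus-latency trade-off the parameter optimization navigates.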
Neighbor discovery latency in BLE-like duty-cycled protocols
Neighbor discovery is the procedure by which two wireless devices initiate
a first contact. In low power ad-hoc networks, radios are duty-cycled and the
latency until a packet meets a reception phase of another device is determined
by a random process. Most research considers slotted protocols, in which the
points in time for reception are temporally coupled to beacon transmissions. In
contrast, many recent protocols, such as ANT/ANT+ and Bluetooth Low Energy
(BLE) use a slotless, periodic-interval based scheme for neighbor discovery.
Here, one device periodically broadcasts packets, whereas the other device
periodically listens to the channel. Both periods are independent of each
other and drawn over continuous time. Such protocols provide 3 degrees of
freedom (viz., the intervals for advertising and scanning and the duration of
each scan phase). Though billions of existing BLE devices rely on these
protocols, neither their expected latencies nor beneficial configurations with
good latency-duty-cycle relations are known. Parametrizations for the
participating devices are usually determined based on a "good guess". In this
paper, we present, for the first time, a mathematical theory that can compute
the neighbor discovery latencies for all possible parametrizations. Further,
our theory shows that upper bounds on the latency can be guaranteed for all
parametrizations, except for a finite number of singularities. Therefore,
slotless, periodic interval-based protocols can be used in applications with
deterministic latency demands, which have been reserved for slotted protocols
until now. Our proposed theory can be used for analyzing the neighbor discovery
latencies, for tweaking protocol parameters, and for developing new protocols.
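The singularities mentioned can be demonstrated with a tiny phase simulation (an illustrative Python sketch, not the paper's theory; intervals and phases are invented): when the advertising and scan intervals coincide exactly, the beacon's offset inside the scan period never changes, so an unlucky phase pair is never discovered.

```python
import random

def first_discovery(adv_int, scan_int, scan_win, adv_phase, scan_phase,
                    horizon=500.0):
    """Time of the first (instantaneous) beacon falling inside a scan
    window, or infinity if none does before `horizon`."""
    t = adv_phase
    while t < horizon:
        if (t - scan_phase) % scan_int < scan_win:
            return t
        t += adv_int
    return float("inf")

rng = random.Random(42)
# Generic parametrization: the beacon offset drifts through the scan
# period, so every random phase pair is discovered eventually.
generic = [first_discovery(0.93, 1.0, 0.1,
                           rng.uniform(0, 0.93), rng.uniform(0, 1.0))
           for _ in range(100)]
# Singular parametrization (adv_int == scan_int): the offset is frozen,
# and this phase pair misses the scan window forever.
singular = first_discovery(1.0, 1.0, 0.1, adv_phase=0.5, scan_phase=0.0)
print(max(generic), singular)
```

Away from such singular interval ratios the latency stays bounded, which is what makes deterministic latency guarantees possible for slotless protocols.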
Density perturbation and cosmological evolution in the presence of magnetic field in gravity models
In this paper, we have investigated the density perturbations and
cosmological evolution in the FLRW universe in the presence of a cosmic magnetic
field, which may be assumed to mimic primordial magnetic fields. Such magnetic
fields have sufficient strength to influence galaxy formation and cluster
dynamics, thereby leaving an imprint on the CMB anisotropies. We have
considered the FLRW universe as a representative of the isotropic cosmological
model in the 1+3 covariant formalism for gravity. The propagation
equations have been determined and analyzed, where we have assumed that the
magnetic field is aligned uniformly along the -direction, resulting in a
diagonal shear tensor. Subsequently, the density perturbation evolution
equations have been studied and the results have been interpreted. We have also
indicated how these results change in the general relativistic case and briefly
mentioned the expected change in higher-order gravity theories.
Comment: 11 pages