30 research outputs found
A Penny a Function: Towards Cost Transparent Cloud Programming
Understanding and managing monetary cost factors is crucial when developing
cloud applications. However, the diverse range of factors influencing costs for
computation, storage, and networking in cloud applications poses a challenge
for developers who want to manage and minimize costs proactively. Existing
tools for understanding cost factors are often detached from source code,
causing opaqueness regarding the origin of costs. Moreover, existing cost
models for cloud applications focus on specific factors such as compute
resources and necessitate manual effort to create the models. This paper
presents initial work toward a cost model based on a directed graph that allows
deriving monetary cost estimations directly from code using static analysis.
Leveraging the cost model, we explore visualizations embedded in a code editor
that display costs close to the code causing them. This makes cost exploration
an integrated part of the developer experience, thereby removing the overhead
of external tooling for cost estimation of cloud applications at development
time.

Comment: Proceedings of the 2nd ACM SIGPLAN International Workshop on
Programming Abstractions and Interactive Notations, Tools, and Environments
(PAINT 2023), 10 pages, 5 figures.
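The directed-graph cost model described above can be sketched roughly as follows; all node names, prices, and multiplicities here are invented for illustration and are not taken from the paper:

```python
# Hypothetical sketch of a directed-graph cost model: nodes represent
# billable operations found by static analysis, edges carry call
# multiplicities. Prices and names are made up for illustration.
from dataclasses import dataclass, field

@dataclass
class CostNode:
    name: str                 # e.g. a source location or cloud API call
    unit_cost: float          # assumed price per single execution (USD)
    children: list = field(default_factory=list)  # (multiplier, CostNode)

def estimate(node: CostNode, multiplicity: float = 1.0) -> float:
    """Total cost of executing `node` `multiplicity` times, including
    everything it transitively triggers (assumes the graph is acyclic)."""
    total = multiplicity * node.unit_cost
    for factor, child in node.children:
        total += estimate(child, multiplicity * factor)
    return total

# Example: a handler that writes to storage three times per invocation.
write = CostNode("storage.put", unit_cost=0.000005)
handler = CostNode("handler", unit_cost=0.0000002, children=[(3, write)])

# Total estimated cost for 1,000,000 handler invocations.
print(round(estimate(handler, 1_000_000), 2))
```

Because edges carry multiplicities, re-pricing one upstream node (say, a cheaper storage tier) immediately re-prices every code location that transitively triggers it, which is what makes editor-embedded cost display per code location feasible.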
Locating Faults with Program Slicing: An Empirical Analysis
Statistical fault localization is an easily deployed technique for quickly
determining candidates for faulty code locations. If a human programmer has to
search for the fault beyond the top candidate locations, though, more traditional
techniques of following dependencies along dynamic slices may be better suited.
In a large study of 457 bugs (369 single faults and 88 multiple faults) in 46
open source C programs, we compare the effectiveness of statistical fault
localization against dynamic slicing. For single faults, we find that dynamic
slicing was eight percentage points more effective than the best performing
statistical debugging formula; for 66% of the bugs, dynamic slicing finds the
fault earlier than the best performing statistical debugging formula. In our
evaluation, dynamic slicing is more effective for programs with a single fault,
but statistical debugging performs better on multiple faults. Best results,
however, are obtained by a hybrid approach: If programmers first examine at
most the top five most suspicious locations from statistical debugging, and
then switch to dynamic slices, on average, they will need to examine 15% (30
lines) of the code. These findings hold for the 18 most effective statistical
debugging formulas, and our results are independent of the number of faults
(i.e., single or multiple faults) and the error type (i.e., artificial or real
errors).
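The hybrid strategy can be illustrated with a small sketch, assuming Ochiai as a representative statistical debugging formula; the study's exact formulas, ranking, and tie-breaking rules may differ:

```python
import math

def ochiai(failed_cov: int, passed_cov: int, total_failed: int) -> float:
    """Ochiai suspiciousness for one code location: failed_cov /
    sqrt(total_failed * (failed_cov + passed_cov))."""
    denom = math.sqrt(total_failed * (failed_cov + passed_cov))
    return failed_cov / denom if denom else 0.0

def hybrid_order(scores: dict, dynamic_slice: list, k: int = 5) -> list:
    """Examination order of the hybrid strategy: the top-k most suspicious
    locations from statistical debugging first, then the remaining
    locations of the dynamic slice in slice order."""
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    rest = [loc for loc in dynamic_slice if loc not in top_k]
    return top_k + rest

# Hypothetical scores and slice: examine 'a' and 'b' first, then the slice.
order = hybrid_order({"a": 0.9, "b": 0.8, "c": 0.1}, ["c", "a", "d"], k=2)
print(order)
```

With k = 5, this is the "top five, then switch to the slice" policy the abstract reports as requiring an average of 15% of the code to be examined.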
Matching LOFAR sources across radio bands
Aims. With the recent preliminary release of the LOFAR LBA Sky Survey
(LoLSS), the first wide-area, ultra-low frequency observations from LOFAR were
published. Our aim is to combine this data set with other surveys at higher
frequencies to study the spectral properties of a large sample of radio
sources. Methods. We present a new cross-matching algorithm taking into account
the sizes of the radio sources and apply it to the LoLSS-PR, LoTSS-DR1,
LoTSS-DR2 (all LOFAR), TGSS-ADR1 (GMRT), WENSS (WSRT) and NVSS (VLA)
catalogues. We then study the number of matched counterparts for LoLSS radio
sources and their spectral properties. Results. We find counterparts for 22 607
(89.5%) LoLSS sources. The remaining 2 640 sources (10.5%) are identified
either as an artefact in the LoLSS survey (3.6%) or flagged due to their
closeness to bright sources (6.9%). We find an average spectral index of
between LoLSS and NVSS. Between LoLSS and LoTSS-DR2
we find . The average spectral index is flux density
independent above mJy. Comparison of the spectral slopes from
LoLSS--LoTSS-DR2 with LoTSS-DR2--NVSS indicates that the probed population of
radio sources exhibits evidence for a negative spectral curvature.

Comment: 13 pages, 22 figures and 2 tables. Accepted for publication in A&A.
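A size-aware cross-match of the kind described can be sketched as follows; the matching rule (sum of the two source radii plus a fixed tolerance, on a flat-sky approximation) is a simplified stand-in for the paper's actual algorithm:

```python
import math

def separation_arcsec(ra1, dec1, ra2, dec2):
    """Approximate angular separation in arcsec for small offsets,
    using a flat-sky approximation (coordinates in degrees)."""
    dra = (ra1 - ra2) * math.cos(math.radians((dec1 + dec2) / 2))
    ddec = dec1 - dec2
    return math.hypot(dra, ddec) * 3600.0

def size_aware_match(src, candidates, tol=2.0):
    """Return the candidates whose separation from `src` lies within the
    sum of the two source radii plus a tolerance (sizes in arcsec).
    Illustrative only: the paper's algorithm is more elaborate."""
    matches = []
    for cand in candidates:
        sep = separation_arcsec(src["ra"], src["dec"],
                                cand["ra"], cand["dec"])
        if sep <= src["size"] / 2 + cand["size"] / 2 + tol:
            matches.append(cand)
    return matches
```

Folding the source sizes into the matching radius is what lets extended sources with slightly offset centroids in different surveys still be paired, where a fixed-radius match would miss them.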
Identifying the root causes of wait states in large-scale parallel applications
Driven by growing application requirements and accelerated by current trends in microprocessor design, the number of processor cores on modern supercomputers is increasing from generation to generation. However, load or communication imbalance prevents many codes from taking advantage of the available parallelism, as delays of single processes may spread wait states across the entire machine. Moreover, when employing complex point-to-point communication patterns, wait states may propagate along far-reaching cause-effect chains that are hard to track manually and that complicate an assessment of the actual costs of an imbalance. Building on earlier work by Meira Jr. et al., we present a scalable approach that identifies program wait states and attributes their costs in terms of resource waste to their original cause. By replaying event traces in parallel both in forward and backward direction, we can identify the processes and call paths responsible for the most severe imbalances even for runs with tens of thousands of processes.
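The backward attribution of wait states can be illustrated with a minimal sketch; the data layout and names are invented here, and the actual approach replays full event traces in parallel rather than a precomputed table:

```python
def attribute_wait_costs(waits, local_delay):
    """Charge each wait state to its root cause.

    waits:       {waiter: (sender, wait_seconds)} -- each blocked process
                 and the peer it waited for.
    local_delay: set of processes whose own imbalance is a direct delay.

    Walks the cause-effect chain backward (waiter -> sender -> ...) until
    a process with a local delay is found, and sums the wasted time onto
    that root-cause process. Assumes the chain is acyclic.
    """
    costs = {}
    for waiter, (sender, wait_seconds) in waits.items():
        cause = sender
        # Follow propagated wait states back to a locally delayed process.
        while cause not in local_delay and cause in waits:
            cause = waits[cause][0]
        costs[cause] = costs.get(cause, 0.0) + wait_seconds
    return costs

# p3 waited for p2, which itself waited for p1; p1 has the local delay,
# so both wait times are charged to p1.
print(attribute_wait_costs({"p2": ("p1", 1.0), "p3": ("p2", 0.5)}, {"p1"}))
```

This is the essence of attributing costs "in terms of resource waste to their original cause": downstream wait states are not blamed on their immediate sender but propagated back along the communication chain.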
Debugging Assumptions Artifact
Artifact for "Evaluating the Impact of Experimental Assumptions in Automated Fault Localization", accepted at ICSE 2023 Technical Track.
Scalasca analysis report for SPEC MPI 2007 benchmark 132.zeusmp2 on 512 processes in virtual-node mode on Blue Gene/P
A Cube3 performance analysis report written by the Scalasca 1.x parallel analyzer of a measurement of the SPEC MPI 2007 benchmark 132.zeusmp2 (ZeusMP/2) executed in virtual-node mode on 512 processes of the IBM Blue Gene/P system JUQUEEN, operated by Forschungszentrum Jülich GmbH.