Combining hardware and software instrumentation to classify program executions
Several research efforts have studied ways to infer properties of software systems from program spectra gathered from the running systems, usually with software-level instrumentation. While these efforts appear to produce accurate classifications, detailed understanding of their costs and potential cost-benefit tradeoffs is lacking. In this work we present a hybrid instrumentation approach which uses hardware performance counters to gather program spectra at very low cost. This underlying data is further augmented with data captured by minimal amounts of software-level instrumentation. We also
evaluate this hybrid approach by comparing it to other existing approaches. We conclude that these hybrid spectra can reliably distinguish failed executions from successful executions at a fraction of the runtime overhead cost of using software-based execution data.
A Practical Blended Analysis for Dynamic Features in JavaScript
The JavaScript Blended Analysis Framework is designed to
perform a general-purpose, practical combined static/dynamic
analysis of JavaScript programs, while handling dynamic
features such as run-time generated code and variadic functions.
The idea of blended analysis is to focus static analysis
on a dynamic calling structure collected at runtime in
a lightweight manner, and to refine the static analysis using
additional dynamic information. We perform blended
points-to analysis of JavaScript with our framework and
compare results with those computed by a pure static points-to
analysis. Using JavaScript codes from actual webpages
as benchmarks, we show that optimized blended analysis
for JavaScript obtains good coverage (86.6% on average per
website) of the pure static analysis solution and finds additional
points-to pairs (7.0% on average per website) contributed
by dynamically generated/loaded code.
PerfXplain: Debugging MapReduce Job Performance
While users today have access to many tools that assist in performing large
scale data analysis tasks, understanding the performance characteristics of
their parallel computations, such as MapReduce jobs, remains difficult. We
present PerfXplain, a system that enables users to ask questions about the
relative performances (i.e., runtimes) of pairs of MapReduce jobs. PerfXplain
provides a new query language for articulating performance queries and an
algorithm for generating explanations from a log of past MapReduce job
executions. We formally define the notion of an explanation together with three
metrics, relevance, precision, and generality, that measure explanation
quality. We present the explanation-generation algorithm based on techniques
related to decision-tree building. We evaluate the approach on a log of past
executions on Amazon EC2, and show that our approach can generate quality
explanations, outperforming two naive explanation-generation methods.
Comment: VLDB201
Statistics and runtime verification
The correctness of systems is becoming more crucial as computers control more of
our everyday activities. Various approaches have been
advocated and used for the verification of such correctness, with one of the more promising ones being
runtime verification. One important issue in runtime
verification is the logic used to specify properties,
since this influences both the overheads induced by
the monitors, and the applicability of the approach
to a particular domain. In this paper we propose
techniques for the expression and runtime monitoring
of statistical properties, enabling easier manipulation
and expression of non-functional requirements. The
logic is developed as an extension of the existing
runtime verification tool LARVA, and has been applied
to an ftp server implementation, adding a new layer of
probabilistic intrusion detection and system profiling.
peer-reviewed
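As a rough illustration of what monitoring a statistical property at runtime can look like, the sketch below checks that the ratio of failed events in a sliding window stays under a bound. It is not LARVA syntax and not the logic proposed in the paper; the class name `FailureRateMonitor`, the window size, and the threshold are all hypothetical choices made for this example.

```python
from collections import deque


class FailureRateMonitor:
    """Hypothetical monitor: flags a violation when the failure ratio
    over a sliding window of recent events exceeds a bound."""

    def __init__(self, window_size: int, max_failure_ratio: float) -> None:
        # Each entry is True for a failed event (e.g. a rejected login).
        self.events = deque(maxlen=window_size)
        self.max_failure_ratio = max_failure_ratio

    def observe(self, failed: bool) -> bool:
        """Record one event; return True if the property is violated."""
        self.events.append(failed)
        if len(self.events) < self.events.maxlen:
            return False  # window not full yet, so no verdict
        ratio = sum(self.events) / len(self.events)
        return ratio > self.max_failure_ratio


monitor = FailureRateMonitor(window_size=4, max_failure_ratio=0.5)
verdicts = [monitor.observe(f) for f in [False, True, True, True, False]]
```

A property of this kind could, for instance, back a simple intrusion heuristic on an FTP server: many failed logins in a short window trigger an alert.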
Seer: a lightweight online failure prediction approach
Online failure prediction aims to predict the manifestation of failures at runtime before the failures actually occur. Existing online failure prediction approaches typically operate on data which is either directly reported by the system under test or directly observable from outside system executions. These approaches generally refrain from collecting internal execution data that can further improve the prediction quality. One reason behind this general trend is the runtime overhead cost incurred by the measurement instruments that are required to collect the data. In this work we conjecture that large cost reductions in collecting internal execution data for online failure prediction can derive from reducing the cost of the measurement instruments, while still supporting acceptable levels of prediction quality. To evaluate this conjecture, we present a lightweight online failure prediction approach, called Seer. Seer uses fast hardware performance counters to perform most of the data collection work. The data is augmented with further data collected by a minimal amount of software instrumentation that is added to the system's software. We refer to the data collected in this manner as hybrid spectra. We applied the proposed approach to three widely used open source subject applications and evaluated it by comparing and contrasting three types of hybrid spectra and two types of traditional software spectra. At the lowest level of runtime overhead attained in the experiments, the hybrid spectra predicted the failures about halfway through the executions with an F-measure of 0.77 and a runtime overhead of 1.98%, on average. Comparing hybrid spectra to software spectra, we observed that, for comparable runtime overhead levels, the hybrid spectra provided significantly better prediction accuracies and earlier warnings for failures than the software spectra. Alternatively, for comparable accuracy levels, the hybrid spectra incurred significantly lower runtime overheads and provided earlier warnings.
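For readers unfamiliar with the F-measure quoted above, the following sketch computes the standard F1 score from prediction counts. The function name and the example counts are illustrative assumptions; they are not taken from the Seer experiments.

```python
def f_measure(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Standard F1 score: the harmonic mean of precision and recall."""
    if true_pos == 0:
        return 0.0  # no correct failure predictions at all
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 failures predicted correctly,
# 2 false alarms, 2 missed failures.
score = f_measure(true_pos=8, false_pos=2, false_neg=2)
```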
Task Runtime Prediction in Scientific Workflows Using an Online Incremental Learning Approach
Many algorithms in workflow scheduling and resource provisioning rely on the
performance estimation of tasks to produce a scheduling plan. A profiler that
is capable of modeling the execution of tasks and predicting their runtime
accurately, therefore, becomes an essential part of any Workflow Management
System (WMS). With the emergence of multi-tenant Workflow as a Service (WaaS)
platforms that use clouds for deploying scientific workflows, task runtime
prediction becomes more challenging because it requires the processing of a
significant amount of data in a near real-time scenario while dealing with the
performance variability of cloud resources. Hence, relying on methods such as
profiling tasks' execution data using basic statistical description (e.g.,
mean, standard deviation) or batch offline regression techniques to estimate
the runtime may not be suitable for such environments. In this paper, we
propose an online incremental learning approach to predict the runtime of tasks
in scientific workflows in clouds. To improve the performance of the
predictions, we harness fine-grained resources monitoring data in the form of
time-series records of CPU utilization, memory usage, and I/O activities that
reflect the unique characteristics of a task's execution. We compare our
solution to a state-of-the-art approach that exploits the resource monitoring
data using a regression-based machine learning technique. In our experiments, the
proposed strategy improves performance, measured by prediction error, by up to
29.89% compared to the state-of-the-art solutions.
Comment: Accepted for presentation at main conference track of 11th IEEE/ACM
International Conference on Utility and Cloud Computing
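The general idea of online incremental learning, as opposed to batch offline regression, can be sketched with a linear model that is updated one observation at a time. This is only a minimal illustration under assumed names (`OnlineLinearRegressor`, a single synthetic feature); the abstract does not specify the model the authors use, and this sketch should not be read as their method.

```python
class OnlineLinearRegressor:
    """Hypothetical incremental model: one gradient step per finished task."""

    def __init__(self, n_features: int, learning_rate: float = 0.1) -> None:
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features: list[float]) -> float:
        return self.bias + sum(w * x for w, x in zip(self.weights, features))

    def update(self, features: list[float], observed_runtime: float) -> float:
        """Adjust the model with a single sample; return the pre-update error."""
        error = self.predict(features) - observed_runtime
        self.weights = [
            w - self.learning_rate * error * x
            for w, x in zip(self.weights, features)
        ]
        self.bias -= self.learning_rate * error
        return error


# Synthetic stream (an assumption for the demo): runtime = 2 * load + 1.
model = OnlineLinearRegressor(n_features=1)
for step in range(3000):
    load = (step % 3) / 2.0  # cycles through 0.0, 0.5, 1.0
    model.update([load], 2.0 * load + 1.0)
```

Because each update touches only the current sample, the model can follow drifting cloud performance without re-training on the full history.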
Software Runtime Monitoring with Adaptive Sampling Rate to Collect Representative Samples of Execution Traces
Monitoring software systems at runtime is key for understanding workloads,
debugging, and self-adaptation. It typically involves collecting and storing
observable software data, which can be analyzed online or offline. Despite the
usefulness of collecting system data, it may significantly impact the system
execution by delaying response times and competing for system resources. The
typical approach to cope with this is to filter portions of the system to be
monitored and to sample data. Although these approaches are a step towards
achieving a desired trade-off between the amount of collected information and
the impact on the system performance, they focus on collecting data of a
particular type or may capture a sample that does not correspond to the actual
system behavior. In response, we propose an adaptive runtime monitoring process
to dynamically adapt the sampling rate while monitoring software systems. It
includes algorithms with statistical foundations to improve the
representativeness of collected samples without compromising the system
performance. Our evaluation targets five applications of a widely used
benchmark. It shows that the error (RMSE) of the samples collected with our
approach is 9-54% lower than that of the main alternative strategy (sampling rate
inversely proportional to the throughput), with a 1-6% higher performance impact.
Comment: in Journal of Systems and Software
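The baseline strategy named above, a sampling rate inversely proportional to the throughput, and the RMSE metric can be sketched as follows. The function names, the `scale` constant, and the clamping bounds are assumptions for illustration; the abstract gives neither the exact baseline formula nor the details of the authors' adaptive algorithm.

```python
import math


def inverse_throughput_sampling_rate(
    throughput: float,
    scale: float,
    min_rate: float = 0.01,
    max_rate: float = 1.0,
) -> float:
    """Baseline sketch: rate = scale / throughput, clamped to [min_rate, max_rate]."""
    if throughput <= 0:
        return max_rate  # idle system: sampling everything is affordable
    return max(min_rate, min(max_rate, scale / throughput))


def rmse(estimates: list[float], reference: list[float]) -> float:
    """Root-mean-square error between sampled estimates and reference values."""
    if not estimates or len(estimates) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - r) ** 2 for e, r in zip(estimates, reference)]
    return math.sqrt(sum(squared) / len(squared))


rate_busy = inverse_throughput_sampling_rate(throughput=100.0, scale=10.0)
rate_idle = inverse_throughput_sampling_rate(throughput=5.0, scale=10.0)
error = rmse([3.0, 4.0], [0.0, 0.0])
```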