1,084 research outputs found
Optimization Coaching for JavaScript
The performance of dynamic object-oriented programming languages such as JavaScript depends heavily on highly optimizing just-in-time compilers. Such compilers, like all compilers, can silently fall back to generating conservative, low-performance code during optimization. As a result, programmers may inadvertently cause performance issues on users' systems by making seemingly inoffensive changes to programs. This paper shows how to solve the problem of silent optimization failures. It specifically explains how to create a so-called optimization coach for an object-oriented just-in-time-compiled programming language. The development and evaluation build on the SpiderMonkey JavaScript engine, but the results should generalize to a variety of similar platforms.
A Comprehensive Survey on Database Management System Fuzzing: Techniques, Taxonomy and Experimental Comparison
Database Management System (DBMS) fuzzing is an automated testing technique
aimed at detecting errors and vulnerabilities in DBMSs by generating, mutating,
and executing test cases. It not only reduces the time and cost of manual
testing but also enhances detection coverage, providing valuable assistance in
developing commercial DBMSs. Existing fuzzing surveys mainly focus on
general-purpose software. However, DBMSs are different from them in terms of
internal structure, input/output, and test objectives, requiring specialized
fuzzing strategies. Therefore, this paper focuses on DBMS fuzzing and provides
a comprehensive review and comparison of the methods in this field. We first
introduce the fundamental concepts. Then, we systematically define a general
fuzzing procedure and decompose and categorize existing methods. Furthermore,
we classify existing methods from the testing objective perspective, covering
various components in DBMSs. For representative works, more detailed
descriptions are provided to analyze their strengths and limitations. To
objectively evaluate the performance of each method, we present an open-source
DBMS fuzzing toolkit, OpenDBFuzz. Based on this toolkit, we conduct a detailed
experimental comparative analysis of existing methods and finally discuss
future research directions. Comment: 34 pages, 22 figures
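The generate, mutate, and execute loop described in this abstract can be illustrated with a minimal sketch. It is not OpenDBFuzz and does not reproduce any method from the survey: the seed queries, the token list, and the `mutate` and `fuzz` helpers are all hypothetical, and SQLite (through Python's standard `sqlite3` module) merely stands in for the DBMS under test.

```python
import random
import sqlite3

# Hypothetical seed corpus and mutation vocabulary (illustrative only).
SEEDS = [
    "SELECT c0 FROM t0 WHERE c0 > 1",
    "SELECT COUNT(*) FROM t0",
    "SELECT c0 FROM t0 ORDER BY c0",
]
TOKENS = ["1", "-1", "NULL", "c0", "'a'", ">", "<", "=", "AND", "OR", "NOT"]


def mutate(query: str, rng: random.Random) -> str:
    """Replace one whitespace-separated token with a random token."""
    parts = query.split()
    parts[rng.randrange(len(parts))] = rng.choice(TOKENS)
    return " ".join(parts)


def fuzz(iterations: int = 200, seed: int = 0) -> dict:
    """Run mutated queries against an in-memory database and count outcomes."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t0 (c0 INTEGER)")
    conn.executemany("INSERT INTO t0 VALUES (?)", [(i,) for i in range(5)])
    stats = {"ok": 0, "rejected": 0}
    for _ in range(iterations):
        query = mutate(rng.choice(SEEDS), rng)
        try:
            conn.execute(query).fetchall()
            stats["ok"] += 1
        except sqlite3.Error:
            # The engine rejected the input cleanly; this is expected behavior.
            stats["rejected"] += 1
    conn.close()
    return stats


if __name__ == "__main__":
    print(fuzz())
```

This sketch only counts accepted and rejected inputs. A practical fuzzer would add an oracle (for example crash detection or result comparison) and feedback such as coverage to guide mutation.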
Actionable Program Analyses for Improving Software Performance
Nowadays, we have greater expectations of software than ever before. This is followed by constant pressure to run the same program on smaller and cheaper machines. To meet this demand, the application’s performance has become the essential concern in software development. Unfortunately, many applications still suffer from performance issues: coding or design errors that lead to performance degradation. However, finding performance issues is a challenging task: there is limited knowledge on how performance issues are discovered and fixed in practice, and current performance profilers report only where resources are spent, but not where resources are wasted. The goal of this dissertation is to investigate actionable performance analyses that help developers optimize their software by applying relatively simple code changes. To understand causes and fixes of performance issues in real-world software, we first present an empirical study of 98 issues in popular JavaScript projects. The study illustrates the prevalence of simple and recurring optimization patterns that lead to significant performance improvements. Then, to help developers optimize their code, we propose two actionable performance analyses that suggest optimizations based on reordering opportunities and method inlining. In this work, we focus on optimizations with four key properties. First, the optimizations are effective, that is, the changes suggested by the analysis lead to statistically significant performance improvements. Second, the optimizations are exploitable, that is, they are easy to understand and apply. Third, the optimizations are recurring, that is, they are applicable across multiple projects. Fourth, the optimizations are out-of-reach for compilers, that is, compilers cannot guarantee that a code transformation preserves the original semantics. To reliably detect optimization opportunities and measure their performance benefits, the code must be executed with sufficient test inputs. The last contribution complements state-of-the-art test generation techniques by proposing a novel automated approach for generating effective tests for higher-order functions. We implement our techniques in practical tools and evaluate their effectiveness on a set of popular software systems. The empirical evaluation demonstrates the potential of actionable analyses in improving software performance through relatively simple optimization opportunities.
Retromorphic Testing: A New Approach to the Test Oracle Problem
A test oracle serves as a criterion or mechanism to assess the correspondence
between software output and the anticipated behavior for a given input set. In
automated testing, black-box techniques, known for their non-intrusive nature
in test oracle construction, are widely used, including notable methodologies
like differential testing and metamorphic testing. Inspired by the mathematical
concept of inverse function, we present Retromorphic Testing, a novel black-box
testing methodology. It leverages an auxiliary program in conjunction with the
program under test, which establishes a dual-program structure consisting of a
forward program and a backward program. The input data is first processed by
the forward program and then its program output is reversed to its original
input format using the backward program. In particular, the auxiliary program
can operate as either the forward or backward program, leading to different
testing modes. The process concludes by examining the relationship between the
initial input and the transformed output within the input domain. For example,
to test the implementation of the sine function $\sin(x)$, we can employ its
inverse function, $\arcsin(x)$, and validate the equation $\sin(\arcsin(x)) = x$. In addition to the
high-level concept of Retromorphic Testing, this paper presents its three
testing modes with illustrative use cases across diverse programs, including
algorithms, traditional software, and AI applications.
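The sine example from this abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: `retromorphic_check`, `run_trials`, and `truncated_sine` are hypothetical names, `math.asin` plays the auxiliary (forward) program, and the sine implementation under test plays the backward program, so the check is restricted to inputs in [-1, 1].

```python
import math
import random
from typing import Callable, List


def retromorphic_check(sine_impl: Callable[[float], float], x: float,
                       tol: float = 1e-9) -> bool:
    """Check sine_impl(arcsin(x)) == x for x in [-1, 1], within a tolerance."""
    y = math.asin(x)          # forward: auxiliary program
    x_back = sine_impl(y)     # backward: program under test
    return math.isclose(x_back, x, abs_tol=tol)


def run_trials(sine_impl: Callable[[float], float], trials: int = 1000,
               seed: int = 0) -> List[float]:
    """Return the sampled inputs for which the round trip does not hold."""
    rng = random.Random(seed)
    inputs = [rng.uniform(-1.0, 1.0) for _ in range(trials)]
    return [x for x in inputs if not retromorphic_check(sine_impl, x)]


def truncated_sine(x: float) -> float:
    """Deliberately inaccurate sine (two Taylor terms), used as a faulty subject."""
    return x - x ** 3 / 6


if __name__ == "__main__":
    print(len(run_trials(math.sin)))        # failures for the library sine
    print(len(run_trials(truncated_sine)))  # failures for the faulty sine
```

No expected output for `math.sin` needs to be known in advance: the oracle is the round-trip relation between the original input and the reversed output.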
Finding Performance Issues in Database Engines via Cardinality Estimation Testing
Database Management Systems (DBMSs) process a given query by creating an
execution plan, which is subsequently executed, to compute the query's result.
Deriving an efficient query plan is challenging, and both academia and industry
have invested decades into researching query optimization. Despite this, DBMSs
are prone to performance issues, where a DBMS produces an inefficient query
plan that might lead to the slow execution of a query. Finding such issues is a
longstanding problem and inherently difficult, because no ground truth
information on an expected execution time exists. In this work, we propose
Cardinality Estimation Restriction Testing (CERT), a novel technique that
detects performance issues through the lens of cardinality estimation. Given a
query on a database, CERT derives a more restrictive query (e.g., by replacing
a LEFT JOIN with an INNER JOIN), whose estimated number of rows should not
exceed the number of estimated rows for the original query. CERT tests
cardinality estimators specifically, because they were shown to be the most
important component for query optimization; thus, we expect that finding and
fixing such issues might result in the highest performance gains. In addition,
we found that some other kinds of query optimization issues are exposed by the
unexpected cardinality estimation, which can also be detected by CERT. CERT is
a black-box technique that does not require access to the source code; DBMSs
expose query plans via the EXPLAIN statement. CERT eschews executing queries,
which is costly and prone to performance fluctuations. We evaluated CERT on
three widely used and mature DBMSs, MySQL, TiDB, and CockroachDB. CERT found 13
unique issues, of which 2 issues were fixed and 9 confirmed by the developers.
We expect that this new angle on finding performance bugs will help DBMS
developers in improving DBMSs' performance.
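The restriction rule stated in this abstract can be written as a small check. This is a sketch under assumptions, not the CERT tool: `violates_restriction` and `make_stub_estimator` are hypothetical names, the table names are invented, and the row estimates are made-up numbers for illustration. In a real setup the estimator would read the estimated row count from the DBMS's EXPLAIN output instead of a lookup table.

```python
from typing import Callable, Dict

# An estimator maps a query string to the optimizer's estimated row count.
Estimator = Callable[[str], float]


def violates_restriction(estimate: Estimator, original: str,
                         restricted: str) -> bool:
    """True if the more restrictive query is estimated to return MORE rows."""
    return estimate(restricted) > estimate(original)


def make_stub_estimator(table: Dict[str, float]) -> Estimator:
    """Hypothetical stand-in for parsing estimated rows out of EXPLAIN."""
    return lambda query: table[query]


ORIGINAL = "SELECT * FROM t0 LEFT JOIN t1 ON t0.c0 = t1.c0"
RESTRICTED = "SELECT * FROM t0 INNER JOIN t1 ON t0.c0 = t1.c0"

# Made-up estimates, for illustration only.
consistent = make_stub_estimator({ORIGINAL: 100.0, RESTRICTED: 40.0})
suspicious = make_stub_estimator({ORIGINAL: 100.0, RESTRICTED: 250.0})

if __name__ == "__main__":
    print(violates_restriction(consistent, ORIGINAL, RESTRICTED))
    print(violates_restriction(suspicious, ORIGINAL, RESTRICTED))
```

The check compares estimates only and never executes either query, which matches the abstract's point that the approach avoids costly and noisy query execution.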
Towards Implicit Parallel Programming for Systems
Multi-core processors require a program to be decomposable into independent parts that can execute in parallel in order to scale performance with the number of cores. But parallel programming is hard, especially when the program requires state, which many system programs use for optimization, such as a cache to reduce disk I/O. Most prevalent parallel programming models do not support a notion of state and require the programmer to synchronize state access manually, i.e., outside the realm of an associated optimizing compiler. This prevents the compiler from introducing parallelism automatically and requires the programmer to optimize the program manually.
In this dissertation, we propose a programming language/compiler co-design to provide a new programming model for implicit parallel programming with state and a compiler that can optimize the program for a parallel execution.
We define the notion of stateful functions along with their composition and control structures. An example implementation of a highly scalable server shows that stateful functions smoothly integrate into existing programming language concepts, such as object-oriented programming and programming with structs. Our programming model is also highly practical and allows existing code bases to be adapted gradually. As a case study, we implemented a new data processing core for the Hadoop Map/Reduce system to overcome existing performance bottlenecks. Our lambda-calculus-based compiler automatically extracts parallelism without changing the program's semantics. We added further domain-specific semantic-preserving transformations that reduce I/O calls for microservice programs. The runtime format of a program is a dataflow graph that can be executed in parallel, performs concurrent I/O and allows for non-blocking live updates.