Convolutional Neural Networks over Tree Structures for Programming Language Processing
Programming language processing (similar to natural language processing) is a
hot research topic in the field of software engineering; it has also aroused
growing interest in the artificial intelligence community. Unlike a natural
language sentence, however, a program contains rich, explicit, and
complicated structural information, so traditional NLP models may be
ill-suited to programs. In this paper, we propose a novel tree-based
convolutional neural network (TBCNN) for programming language processing, in
which a convolution kernel is designed over programs' abstract syntax trees to
capture structural information. TBCNN is a generic architecture for programming
language processing; our experiments show its effectiveness in two different
program analysis tasks: classifying programs according to functionality, and
detecting code snippets of certain patterns. TBCNN outperforms baseline
methods, including several neural models for NLP. Comment: Accepted at AAAI-1
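The core idea, a convolution window that combines an AST node's feature vector with its children's, can be sketched as follows. This is a minimal illustrative simplification with made-up shapes: the paper's actual kernel additionally uses position-dependent weight matrices (the "continuous binary tree"), which are omitted here.

```python
import math

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def tree_conv(node_feat, child_feats, W_node, W_child, bias):
    """One tree-convolution window over an AST node and its children.
    Illustrative only: the real TBCNN interpolates per-position weight
    matrices instead of sharing a single W_child across children."""
    out = [a + b for a, b in zip(matvec(W_node, node_feat), bias)]
    for c in child_feats:
        out = [a + b for a, b in zip(out, matvec(W_child, c))]
    return [math.tanh(x) for x in out]

# Tiny worked example with identity weights and zero bias.
identity = [[1.0, 0.0], [0.0, 1.0]]
feat = tree_conv([0.5, -0.5], [[0.25, 0.25]], identity, identity, [0.0, 0.0])
# feat == [tanh(0.75), tanh(-0.25)]
```

In the full architecture, such windows slide over every node of the tree, followed by dynamic pooling to handle trees of varying size.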
What Causes My Test Alarm? Automatic Cause Analysis for Test Alarms in System and Integration Testing
Driven by new software development processes and cloud-based testing, system
and integration testing nowadays tends to produce an enormous number of alarms.
Such test alarms place an almost unbearable burden on software testing engineers,
who have to manually analyze the causes of these alarms. The causes are
critical because they determine which stakeholders are responsible for fixing the bugs
detected during testing. In this paper, we present a novel approach that
aims to relieve the burden by automating the procedure. Our approach, called
Cause Analysis Model, exploits information retrieval techniques to efficiently
infer test alarm causes based on test logs. We have developed a prototype and
evaluated our tool on two industrial datasets with more than 14,000 test
alarms. Experiments on the two datasets show that our tool achieves an accuracy
of 58.3% and 65.8%, respectively, which outperforms the baseline algorithms by
up to 13.3%. Our algorithm is also extremely efficient, spending about 0.1s per
cause analysis. Encouraged by these experimental results, our industrial
partner, a leading information and communication technology company in the
world, has deployed the tool; it achieves an average accuracy of 72% after
two months of running, nearly three times more accurate than a previous
strategy based on regular expressions. Comment: 12 page
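The abstract describes an information-retrieval approach over test logs. A toy version of that idea, classifying a new alarm by TF-IDF cosine similarity to labeled historical logs, might look like the sketch below. All names and the weighting scheme are assumptions for illustration; the actual Cause Analysis Model's features and ranking differ.

```python
import math
from collections import Counter

def tfidf(docs):
    """Build TF-IDF vectors for tokenized logs (toy IR pipeline)."""
    df = Counter()
    for d in docs:
        df.update(set(d))           # document frequency per token
    n = len(docs)
    return [{t: tf / len(d) * math.log((1 + n) / (1 + df[t]))
             for t, tf in Counter(d).items()} for d in docs]

def cosine(a, b):
    num = sum(w * b.get(t, 0.0) for t, w in a.items())
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def predict_cause(new_log, history):
    """history: list of (tokenized_log, cause_label) pairs.
    Returns the cause label of the most similar historical alarm."""
    docs = [log for log, _ in history] + [new_log]
    vecs = tfidf(docs)
    query = vecs[-1]
    best = max(range(len(history)), key=lambda i: cosine(query, vecs[i]))
    return history[best][1]

history = [(["connection", "timeout", "database"], "environment issue"),
           (["assertion", "failed", "expected", "actual"], "product bug")]
label = predict_cause(["database", "connection", "refused"], history)
```

A real deployment would of course tokenize raw logs, weight rare error lines more heavily, and rank over thousands of historical alarms.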
Continuous Integration Build Failures in Practice
Automated software testing is a popular method of quality control that aims to detect bugs before software is released to the end user. Unfortunately, writing, maintaining, and executing automated test suites is expensive and not necessarily cost effective.
To gain a better understanding of the return on investment and cost of automated testing, we studied the continuous integration build results of 61 open source projects. We found that 19.4% of build failures are resolved by a change to only test code. These failures do not find bugs in the code under test, but rather bugs in the test suites themselves, and represent a maintenance cost. We also found that 12.8% of the build failures we studied are due to non-deterministic tests that can be difficult to debug. To gain a better understanding of what leads to test maintenance, we manually inspected both valuable and costly tests and found a variety of reasons for test maintenance.
Our empirical analysis of real projects quantifies the cost of maintaining test suites and shows that reducing maintenance costs could make automated testing much more affordable.
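The 19.4% figure rests on deciding whether the commit that fixes a build touches only test code. A path-based heuristic for that classification could be sketched as below; the naming rules are assumptions for illustration, and the study's actual classification is richer.

```python
def is_test_only_change(changed_paths):
    """Classify a fix commit as test-only if every changed file lives in
    a test directory or matches common test-file naming conventions
    (heuristic sketch, not the study's actual rules)."""
    def looks_like_test(path):
        parts = path.replace("\\", "/").lower().split("/")
        name = parts[-1]
        return ("test" in parts[:-1] or "tests" in parts[:-1]
                or name.startswith("test_") or name.endswith("test.java"))
    return all(looks_like_test(p) for p in changed_paths)

# A commit touching only test files counts toward test-maintenance cost.
print(is_test_only_change(["src/test/java/org/FooTest.java", "tests/util.py"]))
```

Runs of such a classifier over each failure-resolving commit, aggregated across the 61 projects, yield the maintenance-cost proportion the abstract reports.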
Determining flaky tests from test failures
Automated regression testing is widely used in modern software development. Whenever a developer pushes some changes to a repository, tests are run to check whether the changes broke some functionality. When previously passing tests fail, the most recent changes are typically suspected, and developers invest time and effort to debug those changes. Unfortunately, new test failures may not be due to the latest changes but due to non-deterministic tests, popularly called flaky tests, that can pass or fail even without changes to the code under test. Many projects have such flaky tests, which can cause developers to lose confidence in test results. Therefore, developers need techniques that can help them determine whether a test failure is due to their latest changes and warrants their debugging, or whether it is due to a flaky test that should be potentially debugged by someone else.
The most widely used technique for determining whether a test failure is due to a flaky test is to rerun the failing test multiple times immediately after it fails: if some rerun does pass, the test is definitely flaky, but if all reruns still fail, the status is unknown. This thesis proposes three improvements to this basic technique: (1) postponing the reruns, (2) rerunning in a new runtime environment (e.g., a new JVM for Java tests), and (3) intersecting the test coverage with the latest changes. The thesis evaluates the cost of (1) and (2) and evaluates the applicability of (3) on 15 projects with a total of 2715 test classes, 10 of which contain previously known flaky tests. The results show that the proposed improvements are highly applicable and would be able to determine that more failures are due to flaky tests at the same or a somewhat higher cost than rerunning failing tests immediately after failure.
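The baseline rerun technique reduces to a small loop. The sketch below uses a hypothetical `run_test` callable returning True on pass; a real harness implementing improvement (2) would instead launch each rerun in a fresh process or JVM.

```python
def classify_failure(run_test, reruns=5):
    """Baseline rerun heuristic for flaky-test detection: after a test
    fails, rerun it up to `reruns` times. Any passing rerun proves the
    test flaky; if every rerun fails, the verdict stays unknown.
    `run_test` is a zero-argument callable returning True on pass
    (hypothetical interface for illustration)."""
    for _ in range(reruns):
        if run_test():
            return "flaky"
    return "unknown"

# Simulate a non-deterministic test: fails twice, then passes.
outcomes = iter([False, False, True, False, False])
verdict = classify_failure(lambda: next(outcomes))
# verdict == "flaky"
```

Note the asymmetry the thesis targets: a pass is conclusive evidence of flakiness, but all-fail reruns prove nothing, which is why postponed reruns, fresh runtime environments, and coverage intersection are needed to resolve more failures.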
Understanding and Generating Patches for Bugs Introduced by Third-party Library Upgrades
During software development, developers rely heavily on third-party libraries to enable functionalities and features in their projects. However, developers face the challenge of managing dependencies as a project evolves. One of the most difficult problems is handling issues caused by dependency upgrades.
To better understand the issues caused by Third-party Library Upgrades (TLU), in this thesis, we conduct a comprehensive study of the bugs caused by dependency upgrades. The study is conducted on a collection of 8,952 open-source Java projects from GitHub and 304 Java projects on Apache Software Foundation (ASF) JIRA systems. We collect 83 bugs caused by inappropriate TLUs in total. Our inspection shows that TLUs are conducted for various reasons. The most popular reason is that the project is preparing for release and wants to keep its dependencies up-to-date (62.3%). Another popular reason is that the older version of a dependency is not compatible with other dependencies (15.3%). Our inspection also indicates that the problems introduced by inappropriate dependency upgrades can be categorized into statically detectable and dynamically detectable program failures. Then, we investigate developers' efforts to repair bugs caused by inappropriate TLUs. We notice that 32.53% of these bugs can be fixed by only modifying the build scripts (which we call TLU-build bugs), 20.48% can be fixed by merely modifying the source code (TLU-code bugs), and 16.87% require modifications in multiple sources. We explore TLU-build bugs and TLU-code bugs, the two most common types, in more depth.
For TLU-code bugs, we summarize the common ways used to fix them. Furthermore, we study whether current repair techniques can fix TLU-code bugs efficiently. For the 14 TLU-code bugs that cause test failures and runtime failures, the study shows that existing automated program repair tools can only work on 6 of the 14 investigated bugs. Each tool fixes only a subset of those 6 bugs, but their union fixes 5 of the 6.
For TLU-build bugs, by leveraging the knowledge from our study, we summarize common patterns to fix build scripts and propose a technique to automatically fix them. Our evaluation shows that the proposed technique can successfully fix 9 out of 14 TLU-build bugs.
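One flavor of build-script fix pattern is pinning an explicit version on a dependency whose resolved version changed after an upgrade. A toy pattern of this kind, applied to a Maven POM via a regular expression, might look like the sketch below; it is a hypothetical illustration, far simpler than the fix patterns the thesis derives.

```python
import re

def pin_dependency_version(pom_xml, group_id, artifact_id, version):
    """Rewrite a Maven POM fragment so the given dependency carries an
    explicit <version> element, replacing any existing one (toy fix
    pattern for illustration; real POM edits should use an XML parser)."""
    dep = re.compile(
        r"(<dependency>\s*<groupId>" + re.escape(group_id) + r"</groupId>\s*"
        r"<artifactId>" + re.escape(artifact_id) + r"</artifactId>\s*)"
        r"(?:<version>[^<]*</version>)?")
    return dep.sub(lambda m: m.group(1) + f"<version>{version}</version>",
                   pom_xml)

pom = ("<dependency><groupId>com.example</groupId>"
       "<artifactId>util</artifactId><version>1.0</version></dependency>")
fixed = pin_dependency_version(pom, "com.example", "util", "2.0")
```

An automated repair technique would select which pattern to apply by matching the observed failure (e.g., a version conflict reported at resolution time) against the patterns mined from historical TLU-build fixes.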
Enforcing Abstract Immutability
Researchers have recently proposed a number of systems for expressing,
verifying, and inferring immutability declarations. These systems are
often rigid, and do not support "abstract immutability". An abstractly
immutable object is an object o that is immutable from the point of
view of any external method. The C++ programming language is not
rigid: it allows developers to express intent by adding immutability
declarations to methods. Abstract immutability allows for performance
improvements such as caching, even in the presence of writes to object
fields. This dissertation presents a system to enforce abstract
immutability.
First, we explore abstract immutability in real-world systems. We
found that developers often use abstract immutability incorrectly,
perhaps because no programming language helps them implement it
correctly; we believe this omission leads to incorrect usages.
Specifically, we wrote a dynamic analysis that
reports any writes through immutability declarations. To our knowledge,
this work was the first to explore how objects implement abstract
immutability (or fail to implement it). Our novel study found three
uses of abstract immutability: caching, delayed initialization, and
unit testing. Unit testing was a surprising application of abstract
immutability, and we believe that the ability to modify state is
needed for proper unit testing.
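Caching, the first of the three uses found, is the canonical example of abstract immutability: a method writes to a private field yet remains observationally immutable. The sketch below is a Python analogue for illustration only; the dissertation itself studies C++, where the same pattern appears as a `const` method writing a `mutable` field.

```python
class Polygon:
    """Abstractly immutable: area() writes to a cache field on first
    call, but no external caller can distinguish it from a pure method
    (Python analogue of a C++ const method with a mutable cache)."""

    def __init__(self, vertices):
        self._vertices = tuple(vertices)  # (x, y) pairs, never mutated
        self._area = None                 # concrete write target, abstractly hidden

    def area(self):
        if self._area is None:            # delayed computation, cached once
            xs = self._vertices
            s = sum(x1 * y2 - x2 * y1
                    for (x1, y1), (x2, y2) in zip(xs, xs[1:] + xs[:1]))
            self._area = abs(s) / 2       # the write a rigid system would reject
        return self._area

rect = Polygon([(0, 0), (4, 0), (4, 3), (0, 3)])
# rect.area() == 12.0 on the first call and every call after
```

A rigid immutability system would reject the assignment to `self._area` outright; an abstract-immutability system must instead verify that such writes never become externally observable.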
Next, we explore developers' revealed needs for immutability in the
source code. We found that the majority of classes contain a mix of
immutable and mutable methods, with a majority of the overall methods
being immutable. Immutability systems that support only all-immutable or
all-mutating classes are therefore insufficient: developers need immutability
declarations at method granularity. Our study then combined developer
immutability declarations with results from a static analysis to
estimate the true number of immutable methods. The static analysis
checked that no transitive writes to a receiver object occurred. Our
results indicated the need for a sophisticated analysis to check that
these apparently abstractly immutable methods were indeed abstractly
immutable.
Finally, we created a novel static analysis which checks that
developers follow abstract immutability. Specifically, we define
abstract immutability to mean that a class's set of immutable methods
is collectively free of writes to exposed fields. Our analysis found
incorrect usages of abstract immutability, such as incorrect caching.
This analysis is particularly valuable in the context of code
evolution, whereby subsequent programmers may make changes that break
previously-correct cache implementations, for instance. Our work
allows developers to trust that their code is abstractly immutable.
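The flavor of such a check, flagging writes to the receiver's fields inside methods declared immutable, can be sketched with Python's `ast` module. The `@immutable` marker and single-method scope here are assumptions for illustration: the dissertation's analysis targets C++ and verifies that a class's immutable methods are *collectively* free of writes to exposed fields.

```python
import ast

def immutable_violations(source):
    """Report (class, method, field) triples where a method marked with a
    hypothetical @immutable decorator assigns to a field of self.
    Sketch only: no transitive or exposed-field reasoning is done."""
    violations = []
    for cls in [n for n in ast.walk(ast.parse(source))
                if isinstance(n, ast.ClassDef)]:
        for fn in [n for n in cls.body if isinstance(n, ast.FunctionDef)]:
            marked = any(isinstance(d, ast.Name) and d.id == "immutable"
                         for d in fn.decorator_list)
            if not marked:
                continue
            for node in ast.walk(fn):          # scan the method body
                if isinstance(node, ast.Assign):
                    for t in node.targets:
                        if (isinstance(t, ast.Attribute)
                                and isinstance(t.value, ast.Name)
                                and t.value.id == "self"):
                            violations.append((cls.name, fn.name, t.attr))
    return violations

sample = """
class Config:
    @immutable
    def get(self):
        self.cache = 42
        return self.cache
    def set(self, v):
        self.cache = v
"""
found = immutable_violations(sample)
# found == [("Config", "get", "cache")]
```

Note that a naive checker like this would also flag the benign caching write shown earlier; distinguishing abstractly immutable writes from genuine violations is precisely what the dissertation's more sophisticated analysis contributes.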