3 research outputs found

    Automated Decomposition of Build Targets

    A (build) target specifies the information that is needed to automatically build a software artifact. Managing the dependencies between the targets of a large code base is challenging. This paper focuses on underutilized targets, an important dependency problem that we identified at Google. An underutilized target is one with files not needed by some of its dependents. Underutilized targets result in less modular code, overly large artifacts, slow builds, and unnecessary build and test triggers. To mitigate these problems, programmers decompose underutilized targets into smaller targets. However, manually decomposing a target is tedious and error-prone. Although we prove that finding the best target decomposition is NP-hard, we introduce a greedy algorithm that proposes a decomposition through iterative unification of the strongly connected components of the target. Our tool found 19,994 decomposable targets in a set of 40,000 Java library targets at Google; a decomposable target is one that can be decomposed into at least two targets. Our tool also found that decomposing any of 5,129 decomposable targets would save at least one build or test trigger. The evaluation results show that our tool is (1) efficient, because on average it analyzes a target in two minutes, and (2) effective, because for each of 1,010 targets it would save more than 50% of the total execution time of the tests triggered by the target.
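    The abstract names the approach but not its details, so the following is a loose sketch of the underlying idea, not the paper's algorithm: files that depend on each other cyclically (a strongly connected component) must stay in one target, and the SCCs are then greedily unified, here by grouping SCCs that serve the same set of external dependents. The graph representation, the dependents map, and the grouping criterion are all illustrative assumptions.

```python
# Illustrative sketch of SCC-based target decomposition; NOT the paper's
# exact greedy algorithm, whose unification criterion is not given here.
import networkx as nx

def propose_decomposition(file_deps: nx.DiGraph,
                          dependents: dict[str, set[str]]) -> list[set[str]]:
    """Propose a partition of a target's files into smaller targets.

    file_deps: edge (a, b) means file a depends on file b.
    dependents: maps each file to the external targets that need it.
    """
    # Files in a dependency cycle must stay together, because build
    # targets cannot depend on each other cyclically; each SCC is
    # therefore an indivisible unit of the decomposition.
    condensed = nx.condensation(file_deps)  # DAG whose nodes are SCCs

    # Greedy unification (assumed criterion): merge SCCs whose files are
    # needed by the same set of dependents, since splitting those apart
    # would not remove any unnecessary build or test trigger.
    groups: dict[frozenset[str], set[str]] = {}
    for scc in condensed.nodes:
        files = condensed.nodes[scc]["members"]
        users = frozenset().union(*(dependents.get(f, set()) for f in files))
        groups.setdefault(users, set()).update(files)
    return list(groups.values())

# A.java and B.java form a cycle, so they stay together; C.java has a
# different dependent and can become its own target.
g = nx.DiGraph([("A.java", "B.java"), ("B.java", "A.java"), ("C.java", "A.java")])
print(propose_decomposition(g, {"A.java": {"//lib:x"}, "B.java": {"//lib:x"},
                                "C.java": {"//app:y"}}))
```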

    Improving regression testing efficiency and reliability via test-suite transformations

    As software becomes more important and ubiquitous, high-quality software also becomes crucial. Developers constantly make changes to improve software, and they rely on regression testing, the process of running tests after every change, to ensure that changes do not break existing functionality. Regression testing is widely used both in industry and in open source, but it suffers from two main challenges. (1) Regression testing is costly. Developers run a large number of tests in the test suite after every change, and changes happen very frequently. The cost lies both in the time developers spend waiting for the tests to finish running, so that they know whether the changes break existing functionality, and in the monetary cost of running the tests on machines. (2) Regression test suites contain flaky tests, which nondeterministically pass or fail when run on the same version of code, regardless of any changes. Flaky test failures can mislead developers into believing that their changes break existing functionality, even though those tests can fail without any changes. Developers therefore waste time trying to debug nonexistent faults in their changes.

    This dissertation proposes three lines of work that address these challenges of regression testing through test-suite transformations, which modify test suites to make them more efficient or more reliable. Specifically, two lines of work explore how to reduce the cost of regression testing, and one line of work explores how to fix existing flaky tests.

    First, this dissertation investigates the effectiveness of test-suite reduction (TSR), a traditional test-suite transformation that removes tests deemed redundant with respect to other tests in the test suite, based on heuristics. TSR outputs a smaller, reduced test suite to be run in the future. However, TSR risks removing tests that could detect faults in future changes. While TSR was proposed over two decades ago, it was always evaluated using program versions with seeded faults. Such evaluations do not precisely predict the effectiveness of the reduced test suite on future changes. This dissertation evaluates TSR in a real-world setting, using real software evolution with real test failures. The results show that TSR techniques proposed in the past are not as effective as suggested by traditional TSR metrics, and those same metrics do not predict how effective a reduced test suite is in the future. Researchers need to propose either new TSR techniques that produce more effective reduced test suites or better metrics for predicting the effectiveness of reduced test suites.

    Second, this dissertation proposes a new transformation that improves regression testing cost under a modern build system by optimizing the placement of tests, implemented in a technique called TestOptimizer. Modern build systems treat a software project as a group of inter-dependent modules, including test modules that contain only tests. When developers make a change, the build system can use a developer-specified dependency graph among modules to determine which test modules are affected by the changed modules, and run only the tests in those affected test modules. However, suboptimal placement of tests, where developers place tests in a module that has more dependencies than the tests actually need, leads to running more tests than necessary after a change.
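    To make that module-level selection concrete, here is a minimal sketch under assumed names: the dependency graph is a plain adjacency list, and `affected_test_modules` is an illustrative helper, not the API of any real build system.

```python
# Minimal sketch of module-level test selection in a modern build system.
from collections import deque

def affected_test_modules(deps: dict[str, list[str]],
                          test_modules: set[str],
                          changed: set[str]) -> set[str]:
    """Return the test modules that transitively depend on a changed module."""
    # Invert the graph: module -> modules that directly depend on it.
    rdeps: dict[str, list[str]] = {}
    for mod, ds in deps.items():
        for d in ds:
            rdeps.setdefault(d, []).append(mod)

    # BFS over reverse edges from the changed modules finds everything
    # affected; only tests in the affected test modules need to run.
    seen, queue = set(changed), deque(changed)
    while queue:
        for dependent in rdeps.get(queue.popleft(), ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen & test_modules

# "tests_all" depends on both "core" and "ui", so ANY change to "ui"
# triggers it, even for tests that only exercise "core"; moving those
# tests into a narrower module shrinks the affected set.
deps = {"tests_all": ["core", "ui"], "ui": ["core"], "core": []}
print(affected_test_modules(deps, {"tests_all"}, {"ui"}))  # {'tests_all'}
```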
    TestOptimizer analyzes a project and proposes moving tests to reduce the number of test executions triggered over time by developer changes. Evaluation of TestOptimizer on five large proprietary projects at Microsoft shows that the suggested test movements can eliminate 21.7 million test executions (17.1%) across all evaluation projects. Developers accepted and intend to implement 84.4% of the reported suggestions.

    Third, to make regression testing more reliable, this dissertation proposes iFixFlakies, a framework for fixing a prominent kind of flaky test: order-dependent tests. Order-dependent tests pass or fail depending on the order in which the tests are run. Intuitively, an order-dependent test fails either because it needs another test to set up the state for it to pass, or because some other test pollutes the state before it runs and the polluted state makes it fail. The key insight behind iFixFlakies is that test suites often already have tests, which we call helpers, that contain the logic for setting or resetting the state needed for order-dependent tests to pass. iFixFlakies searches a test suite for these helpers and then recommends patches for order-dependent tests using code from the helpers. Evaluation of iFixFlakies on 137 truly order-dependent tests from a public dataset shows that 81 of them have helpers, and iFixFlakies can fix all 81. Furthermore, among our GitHub pull requests for 78 of these order-dependent tests (3 of the 81 had already been fixed), developers accepted 38; the remaining ones are still pending, and none have been rejected so far.
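    As a rough illustration of the helper search, the sketch below reruns the order-dependent test after each candidate test and keeps the candidates that make it pass; `run_tests` is an assumed harness hook, and the patch-recommendation step of iFixFlakies is omitted.

```python
# Hypothetical sketch of an iFixFlakies-style helper search for an
# order-dependent test that fails when run on its own.
from typing import Callable, Sequence

def find_helpers(od_test: str,
                 suite: Sequence[str],
                 run_tests: Callable[[Sequence[str]], bool]) -> list[str]:
    """Find tests whose prior execution lets an order-dependent test pass.

    run_tests(order) runs the given tests in that order and returns True
    iff all of them pass; it stands in for a real test harness.
    """
    helpers = []
    for candidate in suite:
        if candidate == od_test:
            continue
        # If running the candidate first makes the order-dependent test
        # pass, the candidate contains the missing setup (or cleanup)
        # logic; its code is what a fix would be extracted from.
        if run_tests([candidate, od_test]):
            helpers.append(candidate)
    return helpers
```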

    Vanishing Point: Where Infrastructures, Architectures, and Processes of Software Engineering Meet

    In software project management, there exists a triangle-like relation connecting the time required to deliver a certain scope of software features at a certain cost. If one of these three is affected, the others must compensate. For example, a faster delivery means either a costlier product or fewer features, or both. Long delivery times are usually unacceptable in any case, as the business environment is changing fast.

    To deal with this, contemporary software is mostly produced with Agile methods, which emphasise developing small increments to deliver a constant stream of value to the customer. A small piece of software is easier to produce and test, and rapid feedback can be gained through tight co-operation with the customer. As the increments have become almost infinitesimally small, the working software can be constantly improved, since changes can be delivered to the end user almost instantly. However, this is only possible when the increments are reliably tested and the delivery itself is rapid; thus, automation in these crucial parts is a must. Furthermore, the customer is not able to comment on every change in person, so the collection of feedback must be automated. Finally, the software product itself has to support continuous delivery. There exists a certain relation between these aspects, namely the tool infrastructure, the processes, and the architecture, reminiscent of the project triangle of time, cost, and scope.

    In this thesis, we examine the crucial properties of these aspects in the context of increasing the speed of delivery, up to continuous delivery and deployment combined with the idea of continuous feedback. We also study the ramifications of rapid software delivery. The research is carried out through interviews and related methods, such as surveys, to gain data from companies involved in software development. Some quantitative analysis is also used to back up the findings. As a result, a model based on the research is introduced; it can be used to explore the aspects and their interrelationships. We present a set of key enablers for increasing delivery speed and a set of side effects that have to be considered. These can serve as guidelines for companies striving to hasten their delivery pace. Additionally, a comparison of various companies based on their delivery speed is presented.