
    Automatically Documenting Software Artifacts

    Software artifacts, such as database schemas and unit test cases, constantly change during the evolution and maintenance of software systems. Co-evolution of code and DB schemas in Database-Centric Applications (DCAs) often leads to two types of challenging scenarios for developers, where (i) changes to the DB schema need to be incorporated in the source code, and (ii) maintenance of a DCA's code requires understanding how features are implemented by relying on DB operations and the corresponding schema constraints. The number of unit test cases, in turn, often grows as new functionality is introduced into the system, and maintaining these unit tests is important to reduce the introduction of regression bugs caused by outdated tests. One critical artifact that developers therefore need to maintain during the evolution and maintenance of software systems is up-to-date and complete documentation. To understand developer practices regarding documenting and maintaining these software artifacts, we designed two empirical studies, each composed of (i) an online survey of contributors to open source projects and (ii) a mining-based analysis of method comments in those projects. We observed that documenting methods with database accesses and unit test cases is not a common practice. Motivated by the findings of the studies, we proposed three novel approaches: (i) DBScribe automatically documents database usages and schema constraints, (ii) UnitTestScribe automatically documents test cases, and (iii) TeStereo tags unit tests with stereotypes and generates HTML reports to improve the comprehension and browsing of unit tests in a large test suite. We evaluated our tools in case studies with industrial developers and graduate students. In general, developers indicated that the descriptions generated by the tools are complete, concise, and easy to read. The reports are useful for source code comprehension tasks as well as other tasks, such as code smell detection and source code navigation.
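    To make the idea concrete, the sketch below shows the kind of natural-language description such a tool could attach to a method that accesses a database. The method, table, schema constraints, and generated comment are all invented for illustration; they do not reproduce DBScribe's actual output.

        import java.sql.Connection;
        import java.sql.PreparedStatement;
        import java.sql.SQLException;

        public class UserDao {

            private final Connection connection;

            public UserDao(Connection connection) {
                this.connection = connection;
            }

            /**
             * (Hypothetical auto-generated description.) This method modifies
             * the table 'users': it updates the column 'email', which the
             * schema declares NOT NULL and UNIQUE. Callers must pass a
             * non-null, unused email address, or the statement violates a
             * schema constraint at run time.
             */
            public void updateEmail(long userId, String email) throws SQLException {
                String sql = "UPDATE users SET email = ? WHERE id = ?";
                try (PreparedStatement stmt = connection.prepareStatement(sql)) {
                    stmt.setString(1, email);
                    stmt.setLong(2, userId);
                    stmt.executeUpdate();
                }
            }
        }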

    Which Software Faults Are Tests Not Detecting?

    Context: Software testing plays an important role in assuring the reliability of systems, yet assessing the efficacy of testing remains challenging, with few established test effectiveness metrics. Those metrics that have been used (e.g., coverage and mutation analysis) have been criticised for insufficiently differentiating between the faults detected by tests. Objective: We investigate how effective tests are at detecting different types of faults and whether some types of fault evade tests more than others. Our aim is to suggest to developers specific ways in which their tests need to be improved to increase fault detection. Method: We investigate seven fault types and analyse how often each goes undetected in 10 open source systems. We statistically examine the relationship between the test sets and the faults. Results: Our results suggest that the fault detection rates of unit tests are relatively low, typically finding only about half of all faults. In addition, conditional boundary and method call removals are less well detected by tests than other fault types. Conclusions: We conclude that the testing of these open source systems needs to be improved across the board. In addition, despite boundary cases being long known to attract faults, tests covering boundaries need particular improvement. Overall, we recommend that developers not rely only on code coverage and mutation score to measure the effectiveness of their tests.
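    To illustrate the two fault types that evade tests most often, the invented Java class below marks where the corresponding mutants (in the mutation analysis sense) would be seeded; the class and its comments are ours, not the paper's.

        import java.util.ArrayList;
        import java.util.List;

        public class Basket {

            private final List<Double> prices = new ArrayList<>();

            public void add(double price) {
                prices.add(price);
                normalize(); // a "method call removal" mutant deletes this call
            }

            public boolean qualifiesForDiscount() {
                // A "conditional boundary" mutant changes > to >=, moving the
                // boundary from "strictly above 100" to "100 or above".
                return total() > 100.0;
            }

            double total() {
                return prices.stream().mapToDouble(Double::doubleValue).sum();
            }

            private void normalize() {
                // drop invalid, non-positive prices
                prices.removeIf(p -> p <= 0.0);
            }
        }

    A test that only checks totals such as 50 and 150 passes on both the original code and the boundary mutant; only an assertion at a total of exactly 100.0 distinguishes them, which is one concrete way tests covering boundaries can be improved.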

    Symbolic Execution for Runtime Error Detection and Investigation of Refactoring Activities Based on a New Dataset

    Producing large, reliable, and robust software systems is a big challenge in software engineering. In industry, developers typically have to focus on solving problems quickly, and under time pressure code quality frequently becomes secondary. However, software code quality is very important, because overly complex, hard-to-maintain code results in more bugs and makes further development more expensive. The research work behind this thesis is inspired by the wish to develop high-quality software systems in industry more effectively and more easily, to make the lives of customers and eventually end users more comfortable. The thesis consists of two main topics: the utilization of symbolic execution for runtime error detection and the investigation of practical refactoring activity. Both topics address the area of program source code quality. Symbolic execution is a program analysis technique that explores the possible execution paths of a program by handling its inputs as unknown variables (symbolic variables). The main uses of symbolic execution are generating inputs that cause program failure and generating high-coverage test cases. It is able to expose defects that would be very difficult and time-consuming to find through manual testing, and that would be exponentially more costly to fix if they were not detected until runtime. In this work, we focus on runtime error detection (such as null pointer dereference, bad array indexing, division by zero, etc.) by discovering critical execution paths in Java programs. One of the greater challenges in symbolic execution is the very large number of possible execution paths, which increases exponentially. Our research proposes approaches to handling this path explosion problem by applying symbolic execution at the level of methods. We also investigated the limitations of this state space, together with the development of efficient search heuristics. To make the detection of runtime errors more accurate, we propose a novel algorithm that keeps track of the conditions over symbolic variables during the analysis (a minimal illustration is sketched at the end of this abstract).

    Source code refactoring is a popular and powerful technique for improving the internal structure of software systems. The concept of refactoring was introduced by Martin Fowler, who originally proposed that detecting code smells should be the primary technique for identifying refactoring opportunities in the code. However, we lack empirical research results on how, when, and why refactoring is used in everyday software development, and what its effects are on short- and long-term maintainability and costs. Answers to these questions would show how developers refactor code in practice, which would help propose new methods and tools aligned with their current habits, leading to more effective software engineering methodologies in industry. To help further empirical investigations of code refactoring, we propose a publicly available refactoring dataset consisting of refactorings and source code metrics of open source Java systems. We subjected the dataset to an analysis of the effects of code refactoring on source code metrics and maintainability, which are primary quality attributes in software development.
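    The sketch below, an invented example rather than one from the thesis, shows the kind of method-level analysis described above: treating the parameters as symbolic variables, an engine explores each branch, collects the path condition along it, and asks a solver whether a condition that triggers a runtime error is satisfiable.

        public class PathExample {

            // With symbolic inputs a and b, the engine explores two paths:
            //   path 1: a <= 10                    -> returns 0, no error
            //   path 2: a > 10 && a - b - 10 == 0  -> division by zero
            // Path 2 is satisfiable (e.g. a = 11, b = 1), so the analysis
            // can report a division-by-zero error with that concrete input.
            public static int compute(int a, int b) {
                if (a > 10) {
                    int divisor = a - b - 10;
                    return 100 / divisor;
                }
                return 0;
            }
        }

    Each additional branch can double the number of paths, which is the exponential path explosion that motivates analyzing one method at a time.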

    Improving regression testing efficiency and reliability via test-suite transformations

    As software becomes more important and ubiquitous, high-quality software becomes crucial. Developers constantly make changes to improve software, and they rely on regression testing—the process of running tests after every change—to ensure that changes do not break existing functionality. Regression testing is widely used both in industry and in open source, but it suffers from two main challenges. (1) Regression testing is costly. Developers run a large number of tests in the test suite after every change, and changes happen very frequently. The cost lies both in the time developers spend waiting for the tests to finish running, so that they know whether the changes break existing functionality, and in the monetary cost of running the tests on machines. (2) Regression test suites contain flaky tests, which nondeterministically pass or fail when run on the same version of code, regardless of any changes. Flaky test failures can mislead developers into believing that their changes break existing functionality, even though those tests can fail without any changes, so developers waste time trying to debug nonexistent faults in their changes. This dissertation proposes three lines of work that address these challenges of regression testing through test-suite transformations, which modify test suites to make them more efficient or more reliable. Specifically, two lines of work explore how to reduce the cost of regression testing and one explores how to fix existing flaky tests. First, this dissertation investigates the effectiveness of test-suite reduction (TSR), a traditional test-suite transformation that removes tests deemed redundant with respect to other tests in the suite, based on heuristics. TSR outputs a smaller, reduced test suite to be run in the future; however, it risks removing tests that could detect faults in future changes. While TSR was proposed over two decades ago, it was always evaluated using program versions with seeded faults, and such evaluations do not precisely predict the effectiveness of the reduced test suite on future changes. This dissertation evaluates TSR in a real-world setting, using real software evolution with real test failures. The results show that TSR techniques proposed in the past are not as effective as suggested by traditional TSR metrics, and those same metrics do not predict how effective a reduced test suite will be in the future. Researchers need to either propose new TSR techniques that produce more effective reduced test suites or better metrics for predicting the effectiveness of reduced test suites. Second, this dissertation proposes a new transformation, implemented in a technique called TestOptimizer, that improves regression testing cost under a modern build system by optimizing the placement of tests. Modern build systems treat a software project as a group of inter-dependent modules, including test modules that contain only tests. When developers make a change, the build system can use a developer-specified dependency graph among modules to determine which test modules are affected by the changed modules and run only the tests in those affected test modules. However, wasteful test executions are a problem when using build systems this way: suboptimal placements of tests, where developers place some tests in a module that has more dependencies than the tests actually need, lead to running more tests than necessary after a change.
TestOptimizer analyzes a project and proposes moving tests to reduce the number of test executions triggered over time by developer changes. Evaluation of TestOptimizer on five large proprietary projects at Microsoft shows that the suggested test movements can reduce test executions by 21.7 million (17.1%) across all evaluation projects. Developers accepted and intend to implement 84.4% of the reported suggestions. Third, to make regression testing more reliable, this dissertation proposes iFixFlakies, a framework for fixing a prominent kind of flaky test: order-dependent tests, which pass or fail depending on the order in which the tests are run. Intuitively, an order-dependent test fails either because it needs another test to set up the state for it to pass, or because some other test pollutes the state before it runs and the polluted state makes it fail. The key insight behind iFixFlakies is that test suites often already have tests, which we call helpers, that contain the logic for setting or resetting the state that order-dependent tests need to pass. iFixFlakies searches a test suite for these helpers and then recommends patches for order-dependent tests using code from the helpers. Evaluation of iFixFlakies on 137 truly order-dependent tests from a public dataset shows that 81 of them have helpers, and iFixFlakies can fix all 81. Furthermore, among our GitHub pull requests for 78 of these order-dependent tests (3 of the 81 had already been fixed), developers accepted 38; the remaining ones are still pending, and none have been rejected so far.
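    The invented JUnit 4 fragment below, a minimal sketch rather than iFixFlakies output, shows the pattern the framework exploits: an order-dependent test that reads shared state, and a helper elsewhere in the suite that already contains the code to set that state up.

        import static org.junit.Assert.assertEquals;

        import java.util.HashMap;
        import java.util.Map;

        import org.junit.Test;

        public class CacheTest {

            // state shared across tests: the root cause of order dependence
            static final Map<String, String> CACHE = new HashMap<>();

            @Test
            public void populatesCache() { // helper: sets up the shared state
                CACHE.put("key", "value");
                assertEquals(1, CACHE.size());
            }

            @Test
            public void readsValue() { // order-dependent: fails if run first
                assertEquals("value", CACHE.get("key"));
            }
        }

    A fix in the spirit described above copies the state-setting statement from the helper into readsValue (or into a setup method), so the test passes regardless of execution order.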

    Unit testing methods for Internet of Things Mbed OS operating system

    Abstract. Embedded operating systems for the Internet of Things are responsible for managing hardware and software in these systems. Of the many IoT operating system projects available, some are backed by large companies or institutes, while others are developed entirely by the open source community. IoT operating system testing focuses on the key characteristics of IoT, such as networking and limited resources. In this thesis, problems in the testing methods of the Mbed OS operating system are identified and a unit testing solution is implemented. The implemented unit testing framework allows developers to write and run unit tests, and it is integrated into Mbed OS continuous integration to increase test coverage. The thesis shows that functional testing and unit testing are the most common types of testing in open source embedded operating system projects. Results from the Mbed OS unit testing framework show that running tests on PC platforms is faster than running them on IoT devices. The framework also enables developers to write unit tests more freely and improves the Mbed OS development process. The implemented unit testing framework solved issues in Mbed OS testing, but more in-depth research is needed to improve testing methods further.

    To the attention of mobile software developers: Guess what, test your app!

    Software testing is an important phase in the software development life-cycle because it helps identify bugs in a software system before it is shipped into the hands of its end users. There are numerous studies on how developers test general-purpose software applications. The idiosyncrasies of mobile software applications, however, set mobile apps apart from general-purpose systems (e.g., desktop applications, stand-alone applications, web services). This paper investigates the working habits and challenges of mobile software developers with respect to testing. A key finding of our exhaustive study of 1000 Android apps is that mobile apps are still tested in a very ad hoc way, if tested at all. However, we show that, as in other types of software, testing increases the quality of apps (demonstrated in user ratings and number of code issues). Furthermore, we find evidence that tests are essential when it comes to engaging the community to contribute to mobile open source software. We discuss reasons for and potential directions to address our findings. Yet another relevant finding of our study is that Continuous Integration and Continuous Deployment (CI/CD) pipelines are rare in the mobile apps world (only 26% of the apps are developed in projects employing CI/CD); we argue that one of the main reasons is the lack of exhaustive and automatic testing. Comment: Journal of Empirical Software Engineering.