
    Augmenting IDEs with Runtime Information for Software Maintenance

    Object-oriented language features such as inheritance, abstract types, late binding, and polymorphism lead to distributed and scattered code, rendering a software system hard to understand and maintain. The integrated development environment (IDE), the primary tool developers use to maintain software systems, usually operates purely on static source code and does not reveal dynamic relationships between distributed source artifacts, which makes it difficult for developers to understand and navigate software systems. Another shortcoming of today's IDEs is the large amount of information with which they typically overwhelm developers. Large software systems encompass several thousand source artifacts such as classes and methods. These static artifacts are presented by IDEs in views such as trees or source editors. To gain an understanding of a system, developers have to open many such views, which leads to a workspace cluttered with different windows or tabs. Navigating through the code or maintaining a working context is thus difficult for developers working on large software systems. In this dissertation we address the question of how to augment IDEs with dynamic information to better navigate scattered code while not overwhelming developers with even more information in the IDE views. We claim that by first reducing the amount of information developers have to deal with, we are subsequently able to embed dynamic information in the familiar source perspectives of IDEs to better comprehend and navigate large software spaces. We propose means to reduce this information by highlighting relevant source elements, by explicitly representing the working context, and by automatically housekeeping the workspace in the IDE. We then improve navigation of scattered code by explicitly representing dynamic collaboration and software features in the static source perspectives of IDEs. We validate our claim by conducting empirical experiments with developers and by analyzing recorded development sessions.
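
    To make this concrete, the following minimal sketch (ours, not code from the dissertation; the classes and names are hypothetical) shows how one kind of dynamic information can be gathered: Python's tracing hook records caller-callee pairs at runtime, capturing late-bound calls that a purely static source view cannot reveal.

```python
import sys
from collections import defaultdict

# Dynamic caller -> callee pairs observed while the program runs.
calls = defaultdict(set)

def trace(frame, event, arg):
    # Record, for every function call, which function invoked it.
    if event == "call" and frame.f_back is not None:
        calls[frame.f_back.f_code.co_name].add(frame.f_code.co_name)
    return trace

class Shape:
    def area(self):
        raise NotImplementedError

class Circle(Shape):
    def __init__(self, r):
        self.r = r
    def area(self):
        return 3.14159 * self.r ** 2

def total_area(shapes):
    total = 0.0
    for s in shapes:
        total += s.area()  # late-bound: the receiver type is known only at runtime
    return total

sys.settrace(trace)
total_area([Circle(1.0), Circle(2.0)])
sys.settrace(None)

for caller, callees in sorted(calls.items()):
    print(f"{caller} -> {sorted(callees)}")
```

    An IDE enriched with such data could, for example, navigate from total_area directly to the area implementations actually exercised at runtime.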

    Programming tools for intelligent systems

    Programming tools are computer programs which help humans program computers. Tools come in all shapes and forms, from editors and compilers to debuggers and profilers. Each of these tools facilitates a core task in the programming workflow that consumes cognitive resources when performed manually. In this thesis, we explore several tools that facilitate the process of building intelligent systems, and which reduce the cognitive effort required to design, develop, test, and deploy intelligent software systems. First, we introduce an integrated development environment (IDE) for programming Robot Operating System (ROS) applications, called Hatchery (Chapter 2). Second, we describe Kotlin∇, a language and type system for differentiable programming, an emerging paradigm in machine learning (Chapter 3). Third, we propose a new algorithm for automatically testing differentiable programs, drawing inspiration from techniques in adversarial and metamorphic testing (Chapter 4), and demonstrate its empirical efficiency in the regression setting. Fourth, we explore a container infrastructure based on Docker, which enables reproducible deployment of ROS applications on the Duckietown platform (Chapter 5). Finally, we reflect on the current state of programming tools for these applications and speculate what intelligent systems programming might look like in the future (Chapter 6).
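
    To give a flavor of what testing a differentiable program without a hand-written oracle can look like, here is a minimal sketch in Python (our illustration; the thesis's algorithm targets Kotlin∇ and is more sophisticated): a program's claimed gradient is checked against a central finite-difference estimate at randomly sampled inputs.

```python
import random

def f(x):
    # Program under test: a differentiable function...
    return 3.0 * x ** 2 + 2.0 * x

def grad_f(x):
    # ...and its claimed derivative, standing in for autodiff output.
    return 6.0 * x + 2.0

def finite_difference(g, x, h=1e-6):
    # Central difference: a numerical estimate of the true derivative.
    return (g(x + h) - g(x - h)) / (2.0 * h)

# Property-based check: the analytic gradient should agree with the
# numerical estimate, up to tolerance, at randomly sampled inputs.
for _ in range(100):
    x = random.uniform(-10.0, 10.0)
    assert abs(grad_f(x) - finite_difference(f, x)) < 1e-3, f"mismatch at x={x}"
print("gradient check passed on 100 random inputs")
```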

    Thinking interactively with visualization

    Interaction is becoming an integral part of using visualization for analysis. When interaction is tightly and appropriately coupled with visualization, it can transform the visualization from displaying static imagery to assisting comprehensive analysis of data at all scales. In this relationship, a deeper understanding of the role of interaction, its effects, and how visualization relates to interaction is necessary for designing systems in which the two components complement each other. This thesis approaches interaction in visualization from three different perspectives. First, it considers the cost of maintaining interaction when manipulating visualizations of large datasets: large datasets often require a simplification process for the visualization to remain interactive, and this thesis examines how simplification affects the resulting visualization. Second, example interactive visual analytic systems are presented to demonstrate how interactivity can be applied in visualization. Specifically, four fully developed systems for four distinct problem domains are discussed to determine the common role interactivity plays in making these systems successful. Lastly, this thesis presents evidence that interaction is important for analytical tasks using visualizations. Interaction logs of financial analysts using a visualization were collected, coded, and examined to determine the extent to which analysis strategies were contained within the interaction logs. The findings support the benefits of high interactivity in analytical tasks when using a visualization. The example visualizations used to support these three perspectives are diverse in their goals and features, yet they share similar design guidelines and visualization principles. Based on their characteristics, this thesis groups these visualizations into urban visualization, visual analytic systems, and interaction capture, and discusses each group in terms of lessons learned and future directions.
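
    As a small illustration of working with such logs, the sketch below (hypothetical log and action names, not the thesis's coding scheme) counts action frequencies and adjacent action pairs, a simple first step toward spotting recurring analysis strategies.

```python
from collections import Counter

# Hypothetical interaction log: (timestamp, action) events captured
# from a visualization session.
log = [
    (0.0, "filter"), (2.1, "zoom"), (3.4, "zoom"),
    (5.0, "select"), (6.2, "filter"), (8.9, "select"),
]

actions = [action for _, action in log]
print(Counter(actions))                     # how often each action occurs
print(Counter(zip(actions, actions[1:])))   # common two-action sequences
```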

    A systematic literature review on source code similarity measurement and clone detection: techniques, applications, and challenges

    Measuring and evaluating source code similarity is a fundamental software engineering activity that embraces a broad range of applications, including but not limited to code recommendation, duplicate code detection, plagiarism detection, malware detection, and smell detection. This paper presents a systematic literature review and meta-analysis of code similarity measurement and evaluation techniques to shed light on the existing approaches and their characteristics in different applications. We initially found over 10,000 articles by querying four digital libraries and ended up with 136 primary studies in the field. The studies were classified according to their methodology, programming languages, datasets, tools, and applications. A deep investigation reveals 80 software tools, working with eight different techniques across five application domains. Nearly 49% of the tools work on Java programs and 37% support C and C++, while many programming languages have no support at all. A noteworthy point is the existence of 12 datasets related to source code similarity measurement and duplicate code, of which only eight were publicly accessible. The lack of reliable datasets, empirical evaluations, hybrid methods, and attention to multi-paradigm languages are the main challenges in the field. Emerging applications of code similarity measurement concentrate on the development phase in addition to the maintenance phase.
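
    To illustrate one family of techniques covered by such reviews, the sketch below (an illustrative textbook baseline, not any specific tool from this study) measures similarity as the Jaccard index over k-token shingles, the core idea behind many token-based clone detectors.

```python
import re

def tokens(code):
    # Crude lexer: identifiers, integer literals, or single symbols.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)

def shingles(toks, k=3):
    # Overlapping k-token windows, the comparison unit for the two fragments.
    return {tuple(toks[i:i + k]) for i in range(len(toks) - k + 1)}

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 1.0

code1 = "def add(a, b): return a + b"
code2 = "def plus(x, y): return x + y"
print(f"{jaccard(shingles(tokens(code1)), shingles(tokens(code2))):.2f}")
```

    Normalizing identifiers to a single placeholder before shingling would additionally catch renamed (Type-2) clones.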

    Studying the lives of software bugs

    For as long as people have made software, they have made mistakes in that software. Software bugs are widespread, and the maintenance required to fix them has a major impact on the cost of software and how developers' time is spent. Reducing this maintenance time would lower the cost of software and allow developers to spend more time on new features, improving the software for end-users. Bugs are hugely diverse and have a complex life cycle. This makes them difficult to study, and research is often carried out on synthetic bugs or toy programs. However, a better understanding of the bug life cycle would greatly aid in developing tools to reduce the time spent on maintenance. This thesis studies the life cycle of bugs and develops such an understanding. Overall, it examines over 3000 real bugs, from real projects, concentrating on three of the most important points in the life cycle: origin, reporting, and fix. Firstly, two existing techniques for discovering the origin of a bug are compared. A number of improvements are evaluated, and the most effective approach is found to be a combination of the techniques. Furthermore, the behaviour of developers is found to have a major impact on the accuracy of the techniques. Secondly, a large number of bugs are analysed to determine what information is provided when users report bugs. For most bugs, much important information is missing or inaccurate. Most importantly, there appears to be a considerable gap between what users provide and what developers actually want. Finally, a number of novel alterations to techniques used to determine the location of bug fixes are evaluated. Compared to existing techniques, these alterations successfully increase the number of bugs that can be usefully localised, aiding developers in removing the bugs.
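
    A widely used heuristic for locating a bug's origin (in the spirit of the techniques compared here, though not necessarily identical to them) is SZZ-style blaming: for each line deleted by the bug-fixing commit, ask version control which earlier commit last touched it. A minimal sketch, assuming a git repository; the repository path, commit, and file in the example are hypothetical:

```python
import subprocess

def blame_origins(repo, fix_commit, path, deleted_lines):
    """For each 1-based line number deleted by a bug-fixing commit, run
    git blame on the parent of the fix to find the commit that last
    modified that line -- a candidate bug-introducing commit."""
    origins = set()
    for line in deleted_lines:
        out = subprocess.run(
            ["git", "-C", repo, "blame", "--porcelain",
             f"-L{line},{line}", f"{fix_commit}^", "--", path],
            capture_output=True, text=True, check=True,
        ).stdout
        origins.add(out.split()[0])  # first token is the blamed commit hash
    return origins

# Hypothetical usage:
# print(blame_origins(".", "abc1234", "src/module.py", [42, 43]))
```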

    Improving regression testing efficiency and reliability via test-suite transformations

    As software becomes more important and ubiquitous, high quality software also becomes crucial. Developers constantly make changes to improve software, and they rely on regression testing—the process of running tests after every change—to ensure that changes do not break existing functionality. Regression testing is widely used both in industry and in open source, but it suffers from two main challenges. (1) Regression testing is costly. Developers run a large number of tests in the test suite after every change, and changes happen very frequently. The cost lies both in the time developers spend waiting for the tests to finish running before they know whether the changes break existing functionality, and in the monetary cost of running the tests on machines. (2) Regression test suites contain flaky tests, which nondeterministically pass or fail when run on the same version of code, regardless of any changes. Flaky test failures can mislead developers into believing that their changes break existing functionality, even though those tests can fail without any changes. Developers therefore waste time trying to debug nonexistent faults in their changes. This dissertation proposes three lines of work that address these challenges of regression testing through test-suite transformations that modify test suites to make them more efficient or more reliable. Specifically, two lines of work explore how to reduce the cost of regression testing and one line of work explores how to fix existing flaky tests. First, this dissertation investigates the effectiveness of test-suite reduction (TSR), a traditional test-suite transformation that removes tests deemed redundant with respect to other tests in the test suite, based on heuristics. TSR outputs a smaller, reduced test suite to be run in the future. However, TSR risks removing tests that could detect faults in future changes. While TSR was proposed over two decades ago, it was always evaluated using program versions with seeded faults; such evaluations do not precisely predict the effectiveness of the reduced test suite on future changes. This dissertation evaluates TSR in a real-world setting using real software evolution with real test failures. The results show that TSR techniques proposed in the past are not as effective as suggested by traditional TSR metrics, and those same metrics do not predict how effective a reduced test suite is in the future. Researchers need to propose either new TSR techniques that produce more effective reduced test suites or better metrics for predicting the effectiveness of reduced test suites. Second, this dissertation proposes a new transformation, implemented in a technique called TestOptimizer, that improves regression testing cost under a modern build system by optimizing the placement of tests. Modern build systems treat a software project as a group of inter-dependent modules, including test modules that contain only tests. When developers make a change, the build system can use a developer-specified dependency graph among modules to determine which test modules are affected by the changed modules and to run only the tests in those affected test modules. However, wasteful test executions are a problem when using build systems this way: suboptimal placements of tests, where developers place tests in a module that has more dependencies than the tests actually need, lead to running more tests than necessary after a change. TestOptimizer analyzes a project and proposes moving tests to reduce the number of test executions triggered over time by developer changes. An evaluation of TestOptimizer on five large proprietary projects at Microsoft shows that the suggested test movements can eliminate 21.7 million test executions (17.1%) across all evaluation projects. Developers accepted and intend to implement 84.4% of the reported suggestions. Third, to make regression testing more reliable, this dissertation proposes iFixFlakies, a framework for fixing a prominent kind of flaky test: order-dependent tests, which pass or fail depending on the order in which the tests are run. Intuitively, an order-dependent test fails either because it needs another test to set up the state for it to pass, or because some other test pollutes the state before it runs and the polluted state makes it fail. The key insight behind iFixFlakies is that test suites often already contain tests, which we call helpers, whose logic sets or resets the state needed for order-dependent tests to pass. iFixFlakies searches a test suite for these helpers and then recommends patches for order-dependent tests using code from the helpers. An evaluation of iFixFlakies on 137 truly order-dependent tests from a public dataset shows that 81 of them have helpers, and iFixFlakies can fix all 81. Furthermore, among our GitHub pull requests for 78 of these order-dependent tests (3 of the 81 had already been fixed), developers accepted 38; the remaining ones are still pending, and none have been rejected so far.
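
    For context on what a TSR heuristic looks like, the classic greedy approach repeatedly keeps the test covering the most not-yet-covered requirements. The sketch below is our minimal illustration of that heuristic (made-up test names and requirements), not the dissertation's evaluation infrastructure.

```python
def greedy_reduce(coverage):
    """Greedy test-suite reduction: keep picking the test that covers the
    most still-uncovered requirements until everything the original suite
    covered is covered again. coverage maps test name -> set of covered
    requirements (statements, branches, ...)."""
    remaining = set().union(*coverage.values())
    pool = dict(coverage)
    reduced = []
    while remaining:
        best = max(pool, key=lambda t: len(pool[t] & remaining))
        reduced.append(best)
        remaining -= pool.pop(best)
    return reduced

suite = {
    "t1": {"r1", "r2"},
    "t2": {"r2", "r3"},
    "t3": {"r1", "r2", "r3"},
    "t4": {"r4"},
}
print(greedy_reduce(suite))  # ['t3', 't4']: t1 and t2 are deemed redundant
```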

    Test Generation and Dependency Analysis for Web Applications

    In web application testing, existing model-based web test generators derive test paths from a navigation model of the web application, completed with either manually or randomly generated inputs. Test path extraction and input generation are handled separately, ignoring the fact that generating inputs for test paths is difficult or even impossible if such paths are infeasible. In this thesis, we propose three directions to mitigate the path infeasibility problem. The first direction uses a search-based approach that defines a novel set of genetic operators supporting the joint generation of test inputs and feasible test paths. Results show that such a search-based approach can achieve a higher level of model coverage than existing approaches. Second, we propose a novel web test generation algorithm that pre-selects the most promising candidate test cases based on their diversity from previously generated tests. Results of our empirical evaluation show that promoting diversity is beneficial not only to a thorough exploration of the web application's behaviours but also to the feasibility of the automatically generated test cases. Moreover, the diversity-based approach achieves higher coverage of the navigation model significantly faster than crawling-based and search-based approaches. The third approach uses a web crawler as a test generator. As such, the generated tests are concrete, so their navigations among the web application's states are feasible by construction. However, the crawling trace cannot easily be turned into a minimal test suite that achieves the same coverage, due to test dependencies. Such dependencies are undesirable in the context of regression testing, since they prevent the adoption of testing optimization techniques that assume tests to be independent. In this thesis, we propose the first approach to detect test dependencies in a given web test suite by leveraging the information available both in the web test code and on the client side of the web application. Results of our empirical validation show that our approach can effectively and efficiently detect test dependencies, enabling dependency-aware formulations of test parallelization and test minimization.
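
    A brute-force way to expose such dependencies is to re-run tests in different orders and watch for outcome flips; the approach proposed in this thesis instead leverages test code and client-side information, but the sketch below (toy tests and shared state, hypothetical names) conveys what a test dependency is.

```python
import itertools

def find_dependencies(tests, run):
    """Flag (a, b) as dependent when running a before b changes b's
    outcome relative to running b alone. run(sequence) -> {test: passed}."""
    deps = set()
    for a, b in itertools.permutations(tests, 2):
        if run([a, b])[b] != run([b])[b]:
            deps.add((a, b))
    return deps

# Toy example: t_write pollutes shared state that t_read asserts on.
state = {}
def run(seq):
    state.clear()  # each run starts from a fresh state
    results = {}
    for t in seq:
        if t == "t_write":
            state["x"] = 1
            results[t] = True
        elif t == "t_read":
            results[t] = state.get("x", 0) == 0  # fails if polluted
    return results

print(find_dependencies(["t_write", "t_read"], run))  # {('t_write', 't_read')}
```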