142 research outputs found

    Developing App from User Feedback using Deep Learning

    Get PDF

    PreciseBugCollector: Extensible, Executable and Precise Bug-fix Collection

    Full text link
    Bug datasets are vital for enabling deep learning techniques to address software maintenance tasks related to bugs. However, existing bug datasets suffer from precise and scale limitations: they are either small-scale but precise with manual validation or large-scale but imprecise with simple commit message processing. In this paper, we introduce PreciseBugCollector, a precise, multi-language bug collection approach that overcomes these two limitations. PreciseBugCollector is based on two novel components: a) A bug tracker to map the codebase repositories with external bug repositories to trace bug type information, and b) A bug injector to generate project-specific bugs by injecting noise into the correct codebases and then executing them against their test suites to obtain test failure messages. We implement PreciseBugCollector against three sources: 1) A bug tracker that links to the national vulnerability data set (NVD) to collect general-wise vulnerabilities, 2) A bug tracker that links to OSS-Fuzz to collect general-wise bugs, and 3) A bug injector based on 16 injection rules to generate project-wise bugs. To date, PreciseBugCollector comprises 1057818 bugs extracted from 2968 open-source projects. Of these, 12602 bugs are sourced from bug repositories (NVD and OSS-Fuzz), while the remaining 1045216 project-specific bugs are generated by the bug injector. Considering the challenge objectives, we argue that a bug injection approach is highly valuable for the industrial setting, since project-specific bugs align with domain knowledge, share the same codebase, and adhere to the coding style employed in industrial projects.Comment: Accepted at the industry challenge track of ASE 202

    Automated Testing and Bug Reproduction of Android Apps

    Get PDF
    The large demand of mobile devices creates significant concerns about the quality of mobile applications (apps). The corresponding increase in app complexity has made app testing and maintenance activities more challenging. During app development phase, developers need to test the app in order to guarantee its quality before releasing it to the market. During the deployment phase, developers heavily rely on bug reports to reproduce failures reported by users. Because of the rapid releasing cycle of apps and limited human resources, it is difficult for developers to manually construct test cases for testing the apps or diagnose failures from a large number of bug reports. However, existing automated test case generation techniques are ineffective in exploring most effective events that can quickly improve code coverage and fault detection capability. In addition, none of existing techniques can reproduce failures directly from bug reports. This dissertation provides a framework that employs artifact intelligence (AI) techniques to improve testing and debugging of mobile apps. Specifically, the testing approach employs a Q-network that learns a behavior model from a set of existing apps and the learned model can be used to explore and generate tests for new apps. The framework is able to capture the fine-grained details of GUI events (e.g., visiting times of events, text on the widgets) and use them as features that are fed into a deep neural network, which acts as the agent to guide the app exploration. The debugging approach focuses on automatically reproducing crashes from bug reports for mobile apps. The approach uses a combination of natural language processing (NLP), deep learning, and dynamic GUI exploration to synthesize event sequences with the goal of reproducing the reported crash

    Avatud lähtekoodiga tarkvaraprojektide vearaportite ja tehniliste sõltuvuste haldamise analüüsimine

    Get PDF
    Nüüdisaegses tarkvaraarenduses kasutatakse avatud lähtekoodiga tarkvara komponente, et vähendada korratava töö hulka. Tarkvaraarendajad lisavad vaba lähtekoodiga komponente oma projektidesse, omamata ülevaadet kasutatud komponentide arendamisest ja hooldamisest. Selle töö eesmärk on analüüsida tarkvaraprojektide vearaporteid ja sõltuvuste haldamist ning arendada välja kohased meetodid. Tarkvaraprojektides kasutatakse töö organiseerimiseks veahaldussüsteeme, mille abil hallatakse tööülesandeid, vearaporteid ja uusi kasutajanõudeid. Enamat kui 4000 avatud lähtekoodiga projekti analüüsides selgus, et paljud vearaportid jäävad pikaks ajaks lahendamata. Muu hulgas võib nii ka mõni kriitiline turvaviga parandamata jääda. Doktoritöös arendatakse välja meetod, mis võimaldab automaatselt hinnata vearaporti lahendamiseks kuluvat aega. Meetod põhineb veahaldussüsteemi talletunud andmete analüüsil. Vearaporti eluaja hindamine aitab projektiosalistel prioriseerida tööülesandeid ja planeerida ressursse. Töö teises osas uuritakse, kuidas avatud lähtekoodiga projektide koodis kolmanda poole komponente kasutatakse. Tarkvaraarendajad kasutavad varem väljaarendatud komponente, et kiirendada arendust ja vähendada korratava töö hulka. Samamoodi kasutavad spetsiifilised komponendid veel omakorda teisi komponente, misläbi moodustub komponentide vaheliste seoste kaudu sõltuvuslik võrgustik. Selles doktoritöös analüüsitakse sõltuvuste võrgustikku populaarsete programmeerimiskeelte näidetel. Töö käigus arendatud meetod on rakendatav sõltuvuste võrgustiku struktuuri ja kasvu analüüsimiseks. Töös demonstreeritakse, kuidas võrgustiku struktuuri analüüsi abil saab hinnata tarkvaraprojektide riski hõlmata sõltuvusahela kaudu mõni turvaviga. Doktoritöös arendatud meetodid ja tulemused aitavad avatud lähtekoodiga projektide vearaportite ja tehniliste sõltuvuste haldamise praktikat läbipaistvamaks muuta.Modern software development relies on open-source software to facilitate reuse and reduce redundant work. Software developers use open-source packages in their projects without having insights into how these components are being developed and maintained. The aim of this thesis is to develop approaches for analyzing issue and dependency management in software projects. Software projects organize their work with issue trackers, tools for tracking issues such as development tasks, bug reports, and feature requests. By analyzing issue handling in more than 4,000 open-source projects, we found that many issues are left open for long periods of time, which can result in bugs and vulnerabilities not being fixed in a timely manner. This thesis proposes a method for predicting the amount of time it takes to resolve an issue by using the historical data available in issue trackers. Methods for predicting issue lifetime can help software project managers to prioritize issues and allocate resources accordingly. Another problem studied in this thesis is how software dependencies are used. Software developers often include third-party open-source software packages in their project code as a dependency. The included dependencies can also have their own dependencies. A complex network of dependency relationships exists among open-source software packages. This thesis analyzes the structure and the evolution of dependency networks of three popular programming languages. We propose an approach to measure the growth and the evolution of dependency networks. This thesis demonstrates that dependency network analysis can quantify what is the likelihood of acquiring vulnerabilities through software packages and how it changes over time. The approaches and findings developed here could help to bring transparency into open-source projects with respect to how issues are handled, or dependencies are updated

    Neural Embeddings for Web Testing

    Full text link
    Web test automation techniques employ web crawlers to automatically produce a web app model that is used for test generation. Existing crawlers rely on app-specific, threshold-based, algorithms to assess state equivalence. Such algorithms are hard to tune in the general case and cannot accurately identify and remove near-duplicate web pages from crawl models. Failing to retrieve an accurate web app model results in automated test generation solutions that produce redundant test cases and inadequate test suites that do not cover the web app functionalities adequately. In this paper, we propose WEBEMBED, a novel abstraction function based on neural network embeddings and threshold-free classifiers that can be used to produce accurate web app models during model-based test generation. Our evaluation on nine web apps shows that WEBEMBED outperforms state-of-the-art techniques by detecting near-duplicates more accurately, inferring better web app models that exhibit 22% more precision, and 24% more recall on average. Consequently, the test suites generated from these models achieve higher code coverage, with improvements ranging from 2% to 59% on an app-wise basis and averaging at 23%.Comment: 12 pages; in revisio

    Holistic recommender systems for software engineering

    Get PDF
    The knowledge possessed by developers is often not sufficient to overcome a programming problem. Short of talking to teammates, when available, developers often gather additional knowledge from development artifacts (e.g., project documentation), as well as online resources. The web has become an essential component in the modern developer’s daily life, providing a plethora of information from sources like forums, tutorials, Q&A websites, API documentation, and even video tutorials. Recommender Systems for Software Engineering (RSSE) provide developers with assistance to navigate the information space, automatically suggest useful items, and reduce the time required to locate the needed information. Current RSSEs consider development artifacts as containers of homogeneous information in form of pure text. However, text is a means to represent heterogeneous information provided by, for example, natural language, source code, interchange formats (e.g., XML, JSON), and stack traces. Interpreting the information from a pure textual point of view misses the intrinsic heterogeneity of the artifacts, thus leading to a reductionist approach. We propose the concept of Holistic Recommender Systems for Software Engineering (H-RSSE), i.e., RSSEs that go beyond the textual interpretation of the information contained in development artifacts. Our thesis is that modeling and aggregating information in a holistic fashion enables novel and advanced analyses of development artifacts. To validate our thesis we developed a framework to extract, model and analyze information contained in development artifacts in a reusable meta- information model. We show how RSSEs benefit from a meta-information model, since it enables customized and novel analyses built on top of our framework. The information can be thus reinterpreted from an holistic point of view, preserving its multi-dimensionality, and opening the path towards the concept of holistic recommender systems for software engineering

    Understanding and Generating Patches for Bugs Introduced by Third-party Library Upgrades

    Get PDF
    During the process of software development, developers rely heavily on third-party libraries to enable functionalities and features in their projects. However, developers are faced with challenges of managing dependency messes when a project evolves. One of the most challenging problems is to handle issues caused by dependency upgrades. To better understand the issues caused by Third-party Library Upgrades (TLU), in this thesis, we conduct a comprehensive study of the bugs caused by dependency upgrades. The study is conducted on a collection of 8,952 open-source Java projects from GitHub and 304 Java projects on Apache Software Foundation (ASF) JIRA systems. We collect 83 bugs caused by inappropriate TLUs in total. Our inspection shows that TLUs are conducted out of different reasons. The most popular reason is that the project is preparing for release and wants to keep its dependencies up-to-date (62.3%). Another popular reason is that the older version of a dependency is not compatible with other dependencies (15.3%). Our inspection also indicates that the problems introduced by inappropriate dependency upgrades can be categorized into different types, i.e., program failures that are detectable statically and dynamically. Then, we investigate developers’ efforts on repairing bugs caused by inappropriate TLUs. We notice that 32.53% of these bugs can be fixed by only modifying the build scripts (which we call TLU-build bugs), 20.48% of them can be fixed by merely modifying the source code (which is called TLU-code bugs), and 16.87% of them require modifications in multiple sources. TLU-build bugs and TLU-code bugs as the two most popular types, are explored more by us. For TLU-code bugs, we summarize the common ways used to fix them. Furthermore, we study whether current repair techniques can fix TLU-code bugs efficiently. For the 14 TLU-code bugs that cause test failures and runtime failures, the study shows that existing automated program repair tools can only work on 6 of the 14 investigated bugs. Each of them can only fix a limited amount of the 6 bugs, but the union of them can finally fix 5 out of 6 bugs. For TLU-build bugs, by leveraging the knowledge from our study, we summarize common patterns to fix build scripts, and propose a technique to automatically fix them. Our evaluation shows the proposed technique can successfully fix 9 out of 14 TLU-build bugs
    corecore