866 research outputs found

    Some issues in the 'archaeology' of software evolution

    Get PDF
    During a software project's lifetime, the software goes through many changes, as components are added, removed and modified to fix bugs and add new features. This paper is intended as a lightweight introduction to some of the issues arising from an `archaeological' investigation of software evolution. We use our own work to look at some of the challenges faced, techniques used, findings obtained, and lessons learnt when measuring and visualising the historical changes that happen during the evolution of software

    Finding Bugs in Web Applications Using Dynamic Test Generation and Explicit State Model Checking

    Get PDF
    Web script crashes and malformed dynamically-generated web pages are common errors, and they seriously impact the usability of web applications. Current tools for web-page validation cannot handle the dynamically generated pages that are ubiquitous on today's Internet. We present a dynamic test generation technique for the domain of dynamic web applications. The technique utilizes both combined concrete and symbolic execution and explicit-state model checking. The technique generates tests automatically, runs the tests capturing logical constraints on inputs, and minimizes the conditions on the inputs to failing tests, so that the resulting bug reports are small and useful in finding and fixing the underlying faults. Our tool Apollo implements the technique for the PHP programming language. Apollo generates test inputs for a web application, monitors the application for crashes, and validates that the output conforms to the HTML specification. This paper presents Apollo's algorithms and implementation, and an experimental evaluation that revealed 302 faults in 6 PHP web applications

    Using multiclass classification algorithms to improve text categorization tool:NLoN

    Get PDF
    Abstract. Natural language processing (NLP) and machine learning techniques have been widely utilized in the mining software repositories (MSR) field in recent years. Separating natural language from source code is a pre-processing step that is needed in both NLP and the MSR domain for better data quality. This paper presents the design and implementation of a multi-class classification approach that is based on the existing open-source R package Natural Language or Not (NLoN). This article also reviews the existing literature on MSR and NLP. The review classified the information sources and approaches of MSR in detail, and also focused on the text representation and classification tasks of NLP. In addition, the design and implementation methods of the original paper are briefly introduced. Regarding the research methodology, since the research goal is technology-oriented, i.e., to improve the design and implementation of existing technologies, this article adopts the design science research methodology and also describes how the methodology was adopted. This research implements an open-source Python library, namely NLoN-PY. This is an open-source library hosted on GitHub, and users can also directly use the tools published to the PyPI library. Since NLoN has achieved comparable performance on two-class classification tasks with the Lasso regression model, this study evaluated other multi-class classification algorithms, i.e., Naive Bayes, k-Nearest Neighbours, and Support Vector Machine. Using 10-fold cross-validation, the expanded classifier achieved AUC performance of 0.901 for the 5-class classification task and the AUC performance of 0.92 for the 2-class task. Although the design of this study did not show a significant performance improvement compared to the original design, the impact of unbalanced data distribution on performance was detected and the category of the classification problem was also refined in the process. These findings on the multi-class classification design can provide a research foundation or direction for future research

    Comparing infrared and webcam eye tracking in the Visual World Paradigm

    Get PDF
    Visual World eye tracking is a temporally fine-grained method of monitoring attention, making it a popular tool in the study of online sentence processing. Recently, while infrared eye tracking was mostly unavailable, various web-based experiment platforms have rapidly developed webcam eye tracking functionalities, which are now in urgent need of testing and evaluation. We replicated a recent Visual World study on the incremental processing of verb aspect in English using ‘out of the box’ webcam eye tracking software (jsPsych; de Leeuw, 2015) and crowdsourced participants, and fully replicated both the offline and online results of the original study. We furthermore discuss factors influencing the quality and interpretability of webcam eye tracking data, particularly with regards to temporal and spatial resolution; and conclude that remote webcam eye tracking can serve as an affordable and accessible alternative to lab-based infrared eye tracking, even for questions probing the time-course of language processing

    On the Concerns of Developers When Using GitHub Copilot

    Full text link
    With the recent advancement of Artificial Intelligence (AI) and the emergence of Large Language Models (LLMs), AI-based code generation tools have achieved significant progress and become a practical solution for software development. GitHub Copilot, referred to as AI pair programmer, utilizes machine learning models that are trained on a large corpus of code snippets to generate code suggestions or auto-complete code using natural language processing. Despite its popularity, there is little empirical evidence on the actual experiences of software developers who work with Copilot. To this end, we conducted an empirical study to understand the issues and challenges that developers face when using Copilot in practice, as well as their underlying causes and potential solutions. We collected data from 476 GitHub issues, 706 GitHub discussions, and 184 Stack Overflow posts, and identified the issues, causes that trigger the issues, and solutions that resolve the issues when using Copilot. Our results reveal that (1) Usage Issue and Compatibility Issue are the most common problems faced by Copilot users, (2) Copilot Internal Issue, Network Connection Issue, and Editor/IDE Compatibility Issue are identified as the most frequent causes, and (3) Bug Fixed by Copilot, Modify Configuration/Setting, and Use Suitable Version are the predominant solutions. Based on the results, we delve into the main challenges users encounter when implementing Copilot in practical development, the possible impact of Copilot on the coding process, aspects in which Copilot can be further enhanced, and potential new features desired by Copilot users
    • …
    corecore