
    Computer Games Are Serious Business and so is their Quality: Particularities of Software Testing in Game Development from the Perspective of Practitioners

    Over the last several decades, computer games have come to have a significant impact on society. Although a computer game is a type of software, the process of conceptualizing, producing and delivering a game can involve unusual features. In software testing, for instance, studies have demonstrated the hesitance of professionals to use automated testing techniques with games, due to the constant changes in requirements and design, and have pointed out the need for testing tools that account for the flexibility required by the game development process. Goal. This study aims to improve the current body of knowledge regarding software testing in game development and to point out the particularities observed when testing a computer game. Method. A mixed-method approach based on a case study and an opinion survey was applied to collect quantitative and qualitative data from software professionals regarding the particularities of software testing in game development. Results. We analyzed over 70 messages posted on three well-established question-and-answer communities related to software engineering, software testing and game development, received answers from 38 professionals discussing the differences between testing a computer game and testing general-purpose software, and identified important aspects to be observed by practitioners when planning, performing and reporting tests in this context. Conclusion. For computer games, software testing must focus not only on the aspects common to software in general, but also track and investigate issues related to game balance, game physics and entertainment-related aspects to guarantee the quality of computer games and a successful testing process.
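
    As a minimal illustration of the kind of automated game-physics check the practitioners discuss, the following pytest-style sketch verifies a simple invariant of a projectile model; the model, values, and tolerance are illustrative assumptions, not taken from the study.

        # Hypothetical projectile model, used only to illustrate an automated game-physics check.
        def projectile_height(v0: float, t: float, g: float = 9.81) -> float:
            """Height (m) of a projectile launched straight up at v0 m/s, after t seconds."""
            return v0 * t - 0.5 * g * t * t

        def test_projectile_returns_to_ground():
            # Physics invariant: the projectile is back at height ~0 at t = 2 * v0 / g.
            v0 = 12.0
            t_flight = 2 * v0 / 9.81
            assert abs(projectile_height(v0, t_flight)) < 1e-6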

    Vulnerable Open Source Dependencies: Counting Those That Matter

    BACKGROUND: Vulnerable dependencies are a known problem in today's open-source software ecosystems because OSS libraries are highly interconnected and developers do not always update their dependencies. AIMS: In this paper we present a precise methodology that combines the code-based analysis of patches with information on build, test, and update dates, and group information extracted from the code repository itself, and therefore caters to the needs of industrial practice for the correct allocation of development and audit resources. METHOD: To understand the industrial impact of the proposed methodology, we considered the 200 most popular OSS Java libraries used by SAP in its own software. Our analysis included 10,905 distinct GAVs (group, artifact, version) when considering all the library versions. RESULTS: We found that about 20% of the dependencies affected by a known vulnerability are not deployed, and therefore do not represent a danger to the analyzed library because they cannot be exploited in practice. Developers of the analyzed libraries are able to fix (and are actually responsible for) 82% of the deployed vulnerable dependencies. The vast majority (81%) of vulnerable dependencies may be fixed by simply updating to a new version, while 1% of the vulnerable dependencies in our sample are halted, and therefore potentially require a costly mitigation strategy. CONCLUSIONS: Our case study shows that correct counting allows software development companies to receive actionable information about their library dependencies and, therefore, to correctly allocate costly development and audit resources, which would otherwise be spent inefficiently in the case of distorted measurements. Comment: This is a pre-print of the paper that appears, with the same title, in the proceedings of the 12th International Symposium on Empirical Software Engineering and Measurement, 2018.
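
    The counting the paper argues for can be pictured with a small bookkeeping sketch; the GAV strings and flags below are made up for illustration and do not come from the SAP study or its tooling.

        from dataclasses import dataclass

        @dataclass
        class Dependency:
            gav: str              # group:artifact:version
            vulnerable: bool      # affected by a known vulnerability
            deployed: bool        # actually shipped with the library
            fixed_in_newer: bool  # a fixed version exists; otherwise the dependency is halted

        deps = [
            Dependency("org.example:parser:1.2", vulnerable=True, deployed=False, fixed_in_newer=True),
            Dependency("org.example:http:2.0",   vulnerable=True, deployed=True,  fixed_in_newer=True),
            Dependency("org.example:crypto:0.9", vulnerable=True, deployed=True,  fixed_in_newer=False),
        ]

        # Only deployed vulnerable dependencies matter in practice; split them by whether
        # a simple version update fixes them or a costlier mitigation is needed.
        deployed_vuln = [d for d in deps if d.vulnerable and d.deployed]
        fixable_by_update = [d for d in deployed_vuln if d.fixed_in_newer]
        halted = [d for d in deployed_vuln if not d.fixed_in_newer]
        print(len(deployed_vuln), len(fixable_by_update), len(halted))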

    Assessing the Effect of Data Transformations on Test Suite Compilation


    An Empirical Analysis of Vulnerabilities in Python Packages for Web Applications

    This paper examines software vulnerabilities in common Python packages used particularly for web development. The empirical dataset is based on the PyPI package repository and the so-called Safety DB used to track vulnerabilities in selected packages within the repository. The methodological approach builds on a release-based time series analysis of the conditional probabilities that the releases of the packages are vulnerable. According to the results, many of the Python vulnerabilities observed seem to be only modestly severe; input validation and cross-site scripting have been the most typical vulnerabilities. In terms of the time series analysis based on the release histories, only the recent past is observed to be relevant for statistical predictions; the classical Markov property holds. Comment: Forthcoming in: Proceedings of the 9th International Workshop on Empirical Software Engineering in Practice (IWESEP 2018), Nara, IEEE.
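
    The release-based analysis rests on simple conditional probabilities; a minimal sketch, assuming a per-release boolean vulnerability history (the data below is invented, not taken from Safety DB), shows the first-order (Markov) estimates involved.

        from collections import Counter

        # Illustrative release history for one package: True = release known to be vulnerable.
        history = [False, True, True, False, True, True, True, False]

        # First-order conditional probabilities P(vulnerable_t | state_{t-1}),
        # estimated from consecutive release pairs.
        pairs = Counter(zip(history, history[1:]))
        p_vuln_given_vuln = pairs[(True, True)] / (pairs[(True, True)] + pairs[(True, False)])
        p_vuln_given_clean = pairs[(False, True)] / (pairs[(False, True)] + pairs[(False, False)])
        print(f"P(vuln | prev vuln)  = {p_vuln_given_vuln:.2f}")
        print(f"P(vuln | prev clean) = {p_vuln_given_clean:.2f}")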

    20-MAD -- 20 Years of Issues and Commits of Mozilla and Apache Development

    Data from long-lived and high-profile projects is valuable for research on successful software engineering in the wild. Having a dataset that links the different software repositories of such projects enables deeper investigations. This paper presents 20-MAD, a dataset linking the commit and issue data of Mozilla and Apache projects. It includes over 20 years of information about 765 projects, 3.4M commits, 2.3M issues, and 17.3M issue comments, and its compressed size is over 6 GB. The data contains all the typical information about source code commits (e.g., lines added and removed, message, and commit time) and issues (status, severity, votes, and summary). The issue comments have been pre-processed for natural language processing and sentiment analysis, including emoticons and valence and arousal scores. Linking code repository and issue tracker information allows studying individuals across the two types of repositories and also provides more accurate time zone information for issue trackers. To our knowledge, this is the largest linked dataset, in size and in project lifetime, that is not based on GitHub. Comment: 17th International Conference on Mining Software Repositories, 2020.
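
    A minimal sketch of the kind of commit-issue linking the dataset enables; the column names and toy rows below are assumptions for illustration, not 20-MAD's actual schema.

        import pandas as pd

        # Toy stand-ins for the commit and issue tables; the real dataset is far larger.
        commits = pd.DataFrame({
            "project": ["httpd", "httpd"],
            "issue_id": ["APX-101", "APX-102"],
            "lines_added": [42, 7],
            "commit_time": pd.to_datetime(["2015-03-01", "2015-03-05"]),
        })
        issues = pd.DataFrame({
            "issue_id": ["APX-101", "APX-102"],
            "severity": ["major", "minor"],
            "status": ["resolved", "open"],
        })

        # Join commits to the issues they reference, so individuals and time zones
        # can be studied across both repository types.
        linked = commits.merge(issues, on="issue_id", how="left")
        print(linked[["project", "issue_id", "severity", "lines_added"]])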

    Quality measurement in agile and rapid software development: A systematic mapping

    Context: Despite agile and rapid software development (ARSD) being researched and applied extensively, managing quality requirements (QRs) is still challenging. As ARSD processes produce a large amount of data, measurement has become a strategy to facilitate QR management. Objective: This study aims to survey the literature related to QR management through metrics in ARSD, focusing on: bibliometrics, QR metrics, and quality-related indicators used in quality management. Method: The study design includes the definition of research questions, selection criteria, and snowballing as the search strategy. Results: We selected 61 primary studies (2001-2019). Despite a large body of knowledge and standards, there is no consensus regarding QR measurement. Terminology varies, as do the measurement models. However, seemingly different measurement models do contain similarities. Conclusion: The industrial relevance of the primary studies shows that practitioners have a need to improve quality measurement. Our collection of measures and data sources can serve as a starting point for practitioners to include quality measurement in their decision-making processes. Researchers could benefit from the identified similarities to start building a common framework for quality measurement. In addition, this could help researchers identify which quality aspects need more focus, e.g., security and usability, for which few metrics are reported. This work has been funded by the European Union’s Horizon 2020 research and innovation program through the Q-Rapids project (grant no. 732253). This research was also partially supported by the Spanish Ministerio de Economía, Industria y Competitividad through the DOGO4ML project (grant PID2020-117191RB-I00). Silverio Martínez-Fernández worked at Fraunhofer IESE before January 2020.

    Snoring: A Noise in Defect Prediction Datasets

    Defect prediction aims at identifying software artifacts that are likely to exhibit a defect. The main purpose of defect prediction is to reduce the cost of testing and code review by letting developers focus on specific artifacts. Several researchers have worked on improving the accuracy of defect estimation models using techniques such as tuning, re-balancing, or feature selection. Ultimately, the reliability of a prediction model depends on the quality of the dataset. Therefore, effort has been spent on identifying sources of noise in the datasets and on how to deal with them, including defect misclassification and defect origin. A key component of defect prediction approaches is the attribution of a defect to a project's release. Although developers might be able to attribute a defect to a specific release, in most cases a defect is attributed to the release after which the defect has been discovered. However, in many circumstances, a defect is only discovered several releases after its introduction. This might introduce a bias in the dataset, i.e., treating the intermediate releases as defect-free and only the latter release as defect-prone. We call this phenomenon a “sleeping defect”. We call “snoring” the phenomenon in which classes are affected only by sleeping defects and would therefore be treated as defect-free until the defect is discovered. In this work, we analyze data from more than 4,000 bugs and 600 releases of 20 open-source projects from the Apache ecosystem to investigate: 1) the magnitude of sleeping defects, 2) the magnitude of snoring classes, 3) whether snoring impacts the evaluation of classifiers, 4) whether snoring impacts classifier accuracy, and 5) whether removing the last releases of data is beneficial in reducing the negative impact of snoring noise on classifier accuracy. Our results show that, on average across projects: 1) most of the defects in a project slept for more than 19% of the existing releases, 2) the missing rate is more than 50% unless we remove more than 20% of the releases, 3) the relative error in measuring classifier accuracy on a dataset affected by snoring is about 100% in all accuracy metrics other than AUC, 4) the presence of snoring decreases the accuracy of each of the 15 classifiers in each of the 6 accuracy metrics; for instance, Recall, F1, Kappa, and Matthews decrease by about 80%, and 5) removing one release of data is better than removing no data in all accuracy metrics; for instance, Recall, F1, Kappa, and Matthews increase by about 30%.
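
    The mitigation evaluated in the paper, dropping the most recent releases before training because their “defect-free” labels may just be snoring, can be sketched as below; the release labels are invented for illustration.

        # Per-release defect labels for one class, ordered oldest to newest (1 = defective).
        # A defect introduced in r3 but found only after r5 would make r3-r5 look defect-free.
        releases = ["r1", "r2", "r3", "r4", "r5"]
        labels   = [0,    1,    0,    0,    0]   # labels as recorded at data-collection time

        def drop_recent_releases(releases, labels, k=1):
            """Drop the last k releases, whose labels are the most likely to be snoring noise."""
            return releases[:-k], labels[:-k]

        train_releases, train_labels = drop_recent_releases(releases, labels, k=1)
        print(train_releases, train_labels)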