3,258 research outputs found
Analysis and Detection of Information Types of Open Source Software Issue Discussions
Most modern Issue Tracking Systems (ITSs) for open source software (OSS)
projects allow users to add comments to issues. Over time, these comments
accumulate into discussion threads embedded with rich information about the
software project, which can potentially satisfy the diverse needs of OSS
stakeholders. However, discovering and retrieving relevant information from the
discussion threads is a challenging task, especially when the discussions are
lengthy and the number of issues in ITSs are vast. In this paper, we address
this challenge by identifying the information types presented in OSS issue
discussions. Through qualitative content analysis of 15 complex issue threads
across three projects hosted on GitHub, we uncovered 16 information types and
created a labeled corpus containing 4656 sentences. Our investigation of
supervised, automated classification techniques indicated that, when prior
knowledge about the issue is available, Random Forest can effectively detect
most sentence types using conversational features such as the sentence length
and its position. When classifying sentences from new issues, Logistic
Regression can yield satisfactory performance using textual features for
certain information types, while falling short on others. Our work represents a
nontrivial first step towards tools and techniques for identifying and
obtaining the rich information recorded in the ITSs to support various software
engineering activities and to satisfy the diverse needs of OSS stakeholders.Comment: 41st ACM/IEEE International Conference on Software Engineering
(ICSE2019
Influence of developer factors on code quality: a data study
© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes,creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.Automatic source-code inspection tools help to assess,
monitor and improve code quality. Since these tools only
examine the software project’s codebase, they overlook other
possible factors that may impact code quality and the assessment of the technical debt (TD). Our initial hypothesis is that human factors associated with the software developers, like coding expertise, communication skills, and experience in the project have some measurable impact on the code quality. In this exploratory study, we test this hypothesis on two large open source repositories, using TD as a code quality metric and the data that may be inferred from the version control systems. The preliminary results of our statistical analysis suggest that the level of participation of the developers and their experience in the project have a positive correlation with the amount of TD
that they introduce. On the contrary, communication skills have
barely any impact on TD.Peer ReviewedPostprint (author's final draft
- …