    Cross-Dataset Design Discussion Mining

    Being able to identify software discussions that are primarily about design, which we call design mining, can improve documentation and maintenance of software systems. Existing design mining approaches have good classification performance using natural language processing (NLP) techniques, but the conclusion stability of these approaches is generally poor. A classifier trained on a given dataset of software projects has so far not worked well on different artifacts or different datasets. In this study, we replicate and synthesize these earlier results in a meta-analysis. We then apply recent work in transfer learning for NLP to the problem of design mining. However, for our datasets, these deep transfer learning classifiers perform no better than less complex classifiers. We conclude by discussing some reasons behind the transfer learning approach to design mining.
    Comment: accepted for SANER 2020, Feb, London, ON. 12 pages. Replication package: https://doi.org/10.5281/zenodo.359012

    Tools, processes and factors influencing of code review

    Code review is the most effective quality assurance strategy in software development, in which reviewers aim to identify defects and improve the quality of the source code of both commercial and open-source software. Ultimately, the main purpose of code review activities is to produce better software products. Review comments are the building blocks of code review. There are many approaches to conducting reviews and analyzing source code, such as pair programming, informal inspections, and formal inspections. Reviewers are responsible for providing comments and suggestions to improve the quality of the proposed source code modifications. This work aims to succinctly describe the code review process, giving a framework of the tools and factors influencing code review to aid reviewers and authors in the code review stages and to help them choose a suitable code review tool.

    Toward Effective Secure Code Reviews: An Empirical Study of Security-Related Coding Weaknesses

    Identifying security issues early is encouraged to reduce the latent negative impacts on software systems. Code review is a widely used method that allows developers to manually inspect modified code, catching security issues during a software development cycle. However, existing code review studies often focus on known vulnerabilities, neglecting coding weaknesses, which can introduce real-world security issues that are more visible through code review. The practices of code reviews in identifying such coding weaknesses are not yet fully investigated. To better understand this, we conducted an empirical case study in two large open-source projects, OpenSSL and PHP. Based on 135,560 code review comments, we found that reviewers raised security concerns in 35 out of 40 coding weakness categories. Surprisingly, some coding weaknesses related to past vulnerabilities, such as memory errors and resource management, were discussed less often than the vulnerabilities. Developers attempted to address raised security concerns in many cases (39%-41%), but a substantial portion was merely acknowledged (30%-36%), and some went unfixed due to disagreements about solutions (18%-20%). This highlights that coding weaknesses can slip through code review even when identified. Our findings suggest that reviewers can identify various coding weaknesses leading to security issues during code reviews. However, these results also reveal shortcomings in current code review practices, indicating the need for more effective mechanisms or support for increasing awareness of security issue management in code reviews.