
    A survey on software coupling relations and tools

    Context: Coupling relations reflect the dependencies between software entities and can be used to assess the quality of a program. For this reason, a large number of them have been developed, together with tools to compute the related metrics. However, this abundance makes it challenging to find the coupling measures suitable for a given application. Goals: The first objective of this work is to provide a classification of the different kinds of coupling relations, together with the metrics that measure them. The second is to present an overview of the tools proposed so far by the software engineering research community to extract these metrics. Method: This work is a systematic literature review in software engineering. The referenced publications were retrieved from publicly available scientific research databases, queried with keywords related to software coupling. We included publications from 2002 to 2017 as well as highly cited earlier publications, and used a snowballing technique to retrieve further related material. Results: Four groups of coupling relations were found: structural, dynamic, semantic, and logical. A fifth set includes approaches too recent to be considered an independent group, along with measures developed for specific environments. The investigation also retrieved tools that extract the metrics belonging to each coupling group. Conclusion: This study shows the directions followed by research on software coupling, e.g., developing metrics for specific environments. Concerning the metric tools, three trends have emerged in recent years: the use of visualization techniques, extensibility, and scalability. Finally, some applications of coupling metrics are presented (e.g., code smell detection), indicating possible future research directions. Public preprint [https://doi.org/10.5281/zenodo.2002001]
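
    As a concrete illustration of the structural group, the following is a minimal sketch of a CBO-style (Coupling Between Objects) count computed over a toy, hand-written class dependency map; the class names and dependencies are illustrative assumptions and the snippet is not taken from any of the surveyed tools.

        # Toy sketch: a CBO-like structural coupling count.
        # The dependency map below is invented for illustration.
        from collections import defaultdict

        # each entry: class -> set of classes it references (outgoing dependencies)
        dependencies = {
            "OrderService":    {"OrderRepository", "PaymentGateway", "Logger"},
            "OrderRepository": {"Logger"},
            "PaymentGateway":  {"Logger"},
            "Logger":          set(),
        }

        def cbo(deps):
            """Number of distinct classes a class is coupled to, counting
            both outgoing and incoming references."""
            coupled = defaultdict(set)
            for cls, targets in deps.items():
                for target in targets:
                    coupled[cls].add(target)   # outgoing coupling
                    coupled[target].add(cls)   # incoming coupling
            return {cls: len(coupled[cls]) for cls in deps}

        print(cbo(dependencies))  # {'OrderService': 3, 'OrderRepository': 2, 'PaymentGateway': 2, 'Logger': 3}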

    Metrics Dashboard Services: A Framework for Analyzing Free/Open Source Team Repositories

    Software engineering as practiced today (especially in industry) is no longer about the stereotypical monolithic life-cycle processes (e.g., waterfall, spiral) found in most software engineering textbooks. These heavyweight methods have historically impeded progress for small- and medium-sized development teams owing to their inherent complexity and to the rather limited data collection strategies that predominated from the 1980s until relatively recently, in the mid-2000s. The discipline and practice of software engineering includes software quality, which has an established theoretical foundation for software metrics. Software metrics are a critical tool that provides continuous insight into products and processes and helps build reliable software in mission-critical environments. Using software metrics, we can perform calculations that help assess the effectiveness of the underlying software or process. The metrics relevant to our work are in-process metrics, which focus on a higher-level view of software quality, measuring information that can provide insight into the underlying software development process. In this thesis, we aim to develop and evaluate a metrics dashboard to support Computational Science and Engineering (CSE) software development projects. This task requires us to perform the following activities: (1) assess how metrics are used and which general classes/types of metrics will be useful in CSE projects; (2) develop a metrics dashboard that will work for teams using sites such as GitHub, Bitbucket, etc.; and (3) assess the effectiveness of the dashboard in terms of project success and developer attitudes towards metrics and process. The challenge is to cherry-pick the most effective practices from a large suite of tools and incorporate them into existing cloud-based workflows. As part of this thesis, we have developed a metrics dashboard based on the currently identified metric types. The tools we are developing can be incorporated into any existing GitHub-based workflow without requiring the developers to install anything.
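
    As a minimal sketch of the kind of in-process metric such a dashboard could pull from a hosted repository, the snippet below queries the public GitHub REST API for a page of recent commits; the target repository and the choice of metric (size of the most recent commit page as a crude activity proxy) are illustrative assumptions, not part of the thesis.

        # Hedged sketch: fetch a crude activity metric via the GitHub REST API.
        import requests

        def recent_commit_count(owner, repo, per_page=100):
            """Return the number of commits on the first page of the commit
            listing (at most `per_page`), a rough proxy for recent activity."""
            url = f"https://api.github.com/repos/{owner}/{repo}/commits"
            resp = requests.get(url, params={"per_page": per_page}, timeout=10)
            resp.raise_for_status()
            return len(resp.json())

        if __name__ == "__main__":
            # hypothetical target repository
            print(recent_commit_count("apache", "commons-lang"))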

    Detecting Communities of Methods Using Dynamic Analysis Data


    Snoring: A Noise in Defect Prediction Datasets

    Defect prediction aims at identifying software artifacts that are likely to exhibit a defect. The main purpose of defect prediction is to reduce the cost of testing and code review by letting developers focus on specific artifacts. Several researchers have worked on improving the accuracy of defect estimation models using techniques such as tuning, re-balancing, or feature selection. Ultimately, the reliability of a prediction model depends on the quality of the dataset. Therefore, effort has been spent on identifying sources of noise in the datasets and on how to deal with them, including defect misclassification and defect origin. A key component of defect prediction approaches is the attribution of a defect to a project's release. Although developers might be able to attribute a defect to a specific release, in most cases a defect is attributed to the release after which it was discovered. However, in many circumstances a defect is only discovered several releases after its introduction. This might introduce a bias in the dataset, i.e., treating the intermediate releases as defect-free and only the release of discovery as defect-prone. We call this phenomenon a “sleeping defect”. We call “snoring” the phenomenon in which classes are affected only by sleeping defects and would therefore be treated as defect-free until the defect is discovered. In this work, we analyze data from more than 4,000 bugs and 600 releases of 20 open-source projects from the Apache ecosystem to investigate: 1) the magnitude of sleeping defects, 2) the magnitude of snoring classes, 3) whether snoring impacts the evaluation of classifiers, 4) whether snoring impacts classifier accuracy, and 5) whether removing the last releases of data is beneficial in reducing the negative impact of snoring noise on classifier accuracy. Our results show that, on average across projects: 1) most of the defects in a project sleep for more than 19% of the existing releases, 2) the missing rate is more than 50% unless we remove more than 20% of the releases, 3) the relative error in measuring classifier accuracy on a dataset with snoring is about 100% for all accuracy metrics other than AUC, 4) the presence of snoring decreases the accuracy of each of the 15 classifiers on each of the 6 accuracy metrics (for instance, Recall, F1, Kappa, and Matthews decrease by about 80%), and 5) removing one release of data is better than removing no data for all accuracy metrics (for instance, Recall, F1, Kappa, and Matthews increase by about 30%).
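
    As a hedged sketch of the mitigation examined in point 5), the snippet below drops the most recent release(s) from a per-class, per-release defect dataset before training, since the newest releases are the ones most likely to contain snoring classes; the column names and the toy data are illustrative assumptions, not the datasets used in the study.

        # Toy sketch: reduce snoring noise by discarding the newest releases.
        import pandas as pd

        rows = pd.DataFrame({
            "class_name": ["A", "B", "A", "B", "A", "B"],
            "release":    [1, 1, 2, 2, 3, 3],
            "defective":  [0, 1, 0, 0, 1, 0],  # labels as known at dataset-creation time
        })

        def drop_last_releases(df, n_to_drop=1):
            """Remove the n most recent releases before training a classifier."""
            releases = sorted(df["release"].unique())
            keep = releases[:-n_to_drop] if n_to_drop else releases
            return df[df["release"].isin(keep)]

        train = drop_last_releases(rows, n_to_drop=1)  # keeps releases 1 and 2 only
        print(train)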