
    Measuring and assessing maintainability at the end of high level design

    Software architecture appears to be one of the main factors affecting software maintainability. Therefore, in order to predict and assess maintainability early in the development process, we need to be able to measure the high-level design characteristics that affect the change process. To this end, we propose a measurement approach that is based on precise assumptions derived from the change process, is grounded in Object-Oriented Design principles, and is partially language independent. We define metrics for cohesion, coupling, and visibility in order to capture the difficulty of isolating, understanding, designing, and validating changes.
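    The kind of high-level design metrics the abstract refers to can be illustrated on a small class-dependency graph. The sketch below is not the paper's actual definitions: the class names, the fan-in/fan-out coupling count, and the pairwise cohesion ratio are all illustrative assumptions.

```python
from itertools import combinations

# Hypothetical high-level design: each class maps to the classes it references.
design = {
    "Parser": {"Lexer", "AST"},
    "Lexer":  {"Token"},
    "AST":    {"Token"},
    "Token":  set(),
}

def coupling(cls: str) -> int:
    """Coupling of a class: outgoing references (fan-out) plus incoming (fan-in)."""
    fan_out = len(design[cls])
    fan_in = sum(1 for refs in design.values() if cls in refs)
    return fan_out + fan_in

def cohesion(module: set[str]) -> float:
    """Cohesion of a candidate module: fraction of class pairs inside it
    that reference each other, out of all possible pairs."""
    pairs = list(combinations(sorted(module), 2))
    linked = sum(1 for a, b in pairs if b in design[a] or a in design[b])
    return linked / len(pairs) if pairs else 1.0

print(coupling("Token"))                     # 2: referenced by Lexer and AST
print(cohesion({"Parser", "Lexer", "AST"}))  # 2/3: two of three pairs are linked
```

    High coupling and low cohesion both make a change harder to isolate, which is the intuition the paper's metrics are meant to capture at the design level.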

    An empirical evaluation of the “cognitive complexity” measure as a predictor of code understandability

    Background: Code that is difficult to understand is also difficult to inspect and maintain, and ultimately causes increased costs. Therefore, it would be greatly beneficial to have source code measures that are related to code understandability. Many "traditional" source code measures, including for instance Lines of Code and McCabe's Cyclomatic Complexity, have been used to identify hard-to-understand code. In addition, the "Cognitive Complexity" measure was introduced in 2018 with the specific goal of improving the ability to evaluate code understandability. Aims: The goals of this paper are to assess whether (1) "Cognitive Complexity" is better correlated with code understandability than traditional measures, and (2) the availability of the "Cognitive Complexity" measure improves the performance (i.e., the accuracy) of code understandability prediction models. Method: We carried out an empirical study, in which we reused code understandability measures used in several previous studies. We first built Support Vector Regression models of understandability vs. code measures, and we then compared the performance of models that use "Cognitive Complexity" against the performance of models that do not. Results: "Cognitive Complexity" appears to be correlated to code understandability approximately as much as traditional measures, and the performance of models that use "Cognitive Complexity" is extremely close to the performance of models that use only traditional measures. Conclusions: The "Cognitive Complexity" measure does not appear to fulfill the promise of being a significant improvement over previously proposed measures, as far as code understandability prediction is concerned.
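    The key difference between the two measures compared here is that Cyclomatic Complexity counts branches flatly, while Cognitive Complexity adds a penalty for nesting. The sketch below is deliberately simplified: the real Cognitive Complexity definition has further increment rules (boolean operator sequences, recursion, jumps), and here only `if`/`for`/`while` are counted.

```python
import ast
import textwrap

class ComplexityVisitor(ast.NodeVisitor):
    """Toy visitor: Cyclomatic counts decision points; the simplified
    Cognitive score adds the current nesting depth on top of each +1."""
    def __init__(self):
        self.cyclomatic = 1   # one path exists even with no branches
        self.cognitive = 0
        self._nesting = 0

    def _branch(self, node):
        self.cyclomatic += 1                   # flat +1 per branch
        self.cognitive += 1 + self._nesting    # +1, plus a nesting penalty
        self._nesting += 1
        self.generic_visit(node)
        self._nesting -= 1

    visit_If = visit_For = visit_While = _branch

src = textwrap.dedent("""
    def f(xs):
        for x in xs:
            if x > 0:
                if x % 2:
                    print(x)
""")
v = ComplexityVisitor()
v.visit(ast.parse(src))
print(v.cyclomatic, v.cognitive)  # 4 6: same branches, but nesting inflates Cognitive
```

    On this function the two measures diverge only because of the nesting term, which is exactly the component the paper's empirical results suggest does not buy much extra predictive power for understandability.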

    Property-based Software Engineering Measurement

    Little theory exists in the field of software system measurement. Concepts such as complexity, coupling, cohesion, or even size are very often subject to interpretation and appear to have inconsistent definitions in the literature. As a consequence, there is little guidance provided to the analyst attempting to define proper measures for specific problems. Many controversies in the literature are simply misunderstandings and stem from the fact that some people talk about different measurement concepts under the same label (complexity is the most common case). There is a need to define unambiguously the most important measurement concepts used in the measurement of software products. One way of doing so is to define precisely what mathematical properties characterize these concepts, regardless of the specific software artifacts to which these concepts are applied. Such a mathematical framework could generate a consensus in the software engineering community and provide a means for better communication among researchers, better guidelines for analysts, and better evaluation methods for commercial static analyzers for practitioners. In this paper, we propose a mathematical framework which is generic, because it is not specific to any particular software artifact, and rigorous, because it is based on precise mathematical concepts. This framework defines several important measurement concepts (size, length, complexity, cohesion, coupling). It does not intend to be complete or fully objective; other frameworks could have been proposed and different choices could have been made. However, we believe that the formalisms and properties we introduce are convenient and intuitive. In addition, we have reviewed the literature on this subject and compared it with our work. This framework contributes constructively to a firmer theoretical ground of software measurement. (Also cross-referenced as UMIACS-TR-94-119.)
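    A small numeric check makes the property-based idea concrete. One property commonly required of a "size" measure is additivity over disjoint modules: Size(m1 ∪ m2) = Size(m1) + Size(m2). The choice of Lines of Code as the candidate measure below is my illustration, not necessarily the paper's own example.

```python
def loc(module: str) -> int:
    """A candidate size measure: count of non-blank lines of code."""
    return sum(1 for line in module.splitlines() if line.strip())

m1 = "a = 1\nb = 2\n"
m2 = "def f():\n    return a + b\n"

# Size additivity for disjoint modules: Size(m1 + m2) == Size(m1) + Size(m2).
# LOC satisfies it, so LOC qualifies as a size measure under such a framework;
# a measure like maximum nesting depth would violate it and would not qualify.
print(loc(m1), loc(m2), loc(m1 + m2))  # 2 2 4
```

    Checking candidate measures against properties like this is exactly the kind of guidance for analysts that the abstract says is otherwise missing.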

    Defining and Validating High-Level Design Metrics

    The availability of significant metrics in the early phases of the software development process allows for better management of the later phases, and more effective quality assessment when software quality can still be easily affected by preventive or corrective actions. In this paper, we introduce and compare four strategies for defining high-level design metrics. They are based on different sets of assumptions (about the design process) related to a well-defined experimental goal they help reach: identifying error-prone software parts. In particular, we define ratio-scale metrics for cohesion and coupling that show interesting properties. An in-depth experimental validation, conducted on large-scale projects, demonstrates the usefulness of the metrics we define. (Also cross-referenced as UMIACS-TR-94-75.)

    An Investigation of the users’ perception of OSS quality

    The quality of Open Source Software (OSS) is generally much debated. Some state that it is generally higher than that of closed-source counterparts, while others are more skeptical. The authors have systematically collected users' opinions concerning the quality of 44 OSS products, so that it is now possible to present the actual opinions of real users about the quality of OSS products. Among the results reported in the paper are: the distribution of trustworthiness of OSS based on our survey; a comparison of the trustworthiness of the surveyed products with respect to both open- and closed-source competitors; and the identification of the qualities that affect the perception of trustworthiness, based on rigorous statistical analysis.

    Using Logistic Regression to Estimate the Number of Faulty Software Modules

    Background. The evaluation of the accuracy of an estimation model for software fault-proneness is carried out by using the model with data collected on a set of software modules and classifying the modules in the set as either estimated faulty or estimated non-faulty. This classification usually involves setting a fault-proneness threshold: software modules whose fault-proneness is above that threshold are classified as estimated faulty and the others as estimated non-faulty. The selection of the threshold value is to some extent subjective and arbitrary, and different threshold values may lead to very different results in terms of classification accuracy. Objective. With our proposal, the accuracy of a fault-proneness model can be evaluated without fixing a threshold. Method. We first derive a property of Binary Logistic Regression fault-proneness estimation models. We show that the number of actually faulty software modules in the training set used to build a model is equal to the number of modules estimated faulty in that set, i.e., estimation is perfect on the training set. Then, we use the model on a different set, the test set, and estimate the number of faulty modules. We also estimate the number of faulty modules in the test set by using a more conventional approach with five different fault-proneness thresholds, and we finally compare the estimates with the estimates obtained via our approach. We carried out the empirical validation on a data set from NASA hosted on the PROMISE repository, by using a technique similar to the one used in K-fold cross validation. Results. In the empirical validation we carried out, the approach we propose is able to estimate the number of faulty modules in the test sets better than the threshold-based ones, in a statistically significant way. Conclusions. Our approach seems to have the potential to be practically used to accurately estimate the number of faulty modules without having to set specific fault-proneness thresholds.
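    The property the paper exploits is a classical fact about maximum-likelihood Binary Logistic Regression with an intercept: at the optimum, the sum of the fitted probabilities equals the number of actual positives in the training set. It can be checked numerically; the sketch below fits a one-feature model by plain gradient descent, with data and learning rate as illustrative assumptions, not taken from the paper.

```python
import math

# Toy training set: one metric value per module, and whether it was faulty.
xs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
ys = [0,   0,   0,   1,   0,   1,   1,   1]    # 4 actually faulty modules

w0 = w1 = 0.0  # intercept and slope
for _ in range(20000):
    ps = [1.0 / (1.0 + math.exp(-(w0 + w1 * x))) for x in xs]
    # Gradient of the negative log-likelihood:
    g0 = sum(p - y for p, y in zip(ps, ys))               # intercept component
    g1 = sum((p - y) * x for p, y, x in zip(ps, ys, xs))  # slope component
    w0 -= 0.3 * g0
    w1 -= 0.3 * g1

# At the optimum g0 == 0, i.e. sum(ps) == sum(ys): the estimated number of
# faulty modules on the training set equals the actual number, threshold-free.
print(round(sum(ps), 4))  # 4.0
```

    This is why no fault-proneness threshold is needed on the training set; the paper's contribution is to carry the same sum-of-probabilities estimate over to an independent test set.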