14,183 research outputs found

    Should I Bug You? Identifying Domain Experts in Software Projects Using Code Complexity Metrics

    Full text link
    In any sufficiently complex software system there are experts, having a deeper understanding of parts of the system than others. However, it is not always clear who these experts are and which particular parts of the system they can provide help with. We propose a framework to elicit the expertise of developers and recommend experts by analyzing complexity measures over time. Furthermore, teams can detect those parts of the software for which currently no, or only few experts exist and take preventive actions to keep the collective code knowledge and ownership high. We employed the developed approach at a medium-sized company. The results were evaluated with a survey, comparing the perceived and the computed expertise of developers. We show that aggregated code metrics can be used to identify experts for different software components. The identified experts were rated as acceptable candidates by developers in over 90% of all cases

    How Scale Affects Structure in Java Programs

    Full text link
    Many internal software metrics and external quality attributes of Java programs correlate strongly with program size. This knowledge has been used pervasively in quantitative studies of software through practices such as normalization on size metrics. This paper reports size-related super- and sublinear effects that have not been known before. Findings obtained on a very large collection of Java programs -- 30,911 projects hosted at Google Code as of Summer 2011 -- unveils how certain characteristics of programs vary disproportionately with program size, sometimes even non-monotonically. Many of the specific parameters of nonlinear relations are reported. This result gives further insights for the differences of "programming in the small" vs. "programming in the large." The reported findings carry important consequences for OO software metrics, and software research in general: metrics that have been known to correlate with size can now be properly normalized so that all the information that is left in them is size-independent.Comment: ACM Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA), October 2015. (Preprint

    Evaluation of Software Product Quality Metrics

    Full text link
    Computing devices and associated software govern everyday life, and form the backbone of safety critical systems in banking, healthcare, automotive and other fields. Increasing system complexity, quickly evolving technologies and paradigm shifts have kept software quality research at the forefront. Standards such as ISO's 25010 express it in terms of sub-characteristics such as maintainability, reliability and security. A significant body of literature attempts to link these subcharacteristics with software metric values, with the end goal of creating a metric-based model of software product quality. However, research also identifies the most important existing barriers. Among them we mention the diversity of software application types, development platforms and languages. Additionally, unified definitions to make software metrics truly language-agnostic do not exist, and would be difficult to implement given programming language levels of variety. This is compounded by the fact that many existing studies do not detail their methodology and tooling, which precludes researchers from creating surveys to enable data analysis on a larger scale. In our paper, we propose a comprehensive study of metric values in the context of three complex, open-source applications. We align our methodology and tooling with that of existing research, and present it in detail in order to facilitate comparative evaluation. We study metric values during the entire 18-year development history of our target applications, in order to capture the longitudinal view that we found lacking in existing literature. We identify metric dependencies and check their consistency across applications and their versions. At each step, we carry out comparative evaluation with existing research and present our results.Comment: Published in: Molnar AJ., Neam\c{t}u A., Motogna S. (2020) Evaluation of Software Product Quality Metrics. In: Damiani E., Spanoudakis G., Maciaszek L. (eds) Evaluation of Novel Approaches to Software Engineering. ENASE 2019. Communications in Computer and Information Science, vol 1172. Springer, Cham. https://doi.org/10.1007/978-3-030-40223-5_

    Bayesian Hierarchical Modelling for Tailoring Metric Thresholds

    Full text link
    Software is highly contextual. While there are cross-cutting `global' lessons, individual software projects exhibit many `local' properties. This data heterogeneity makes drawing local conclusions from global data dangerous. A key research challenge is to construct locally accurate prediction models that are informed by global characteristics and data volumes. Previous work has tackled this problem using clustering and transfer learning approaches, which identify locally similar characteristics. This paper applies a simpler approach known as Bayesian hierarchical modeling. We show that hierarchical modeling supports cross-project comparisons, while preserving local context. To demonstrate the approach, we conduct a conceptual replication of an existing study on setting software metrics thresholds. Our emerging results show our hierarchical model reduces model prediction error compared to a global approach by up to 50%.Comment: Short paper, published at MSR '18: 15th International Conference on Mining Software Repositories May 28--29, 2018, Gothenburg, Swede

    An Extended Stable Marriage Problem Algorithm for Clone Detection

    Full text link
    Code cloning negatively affects industrial software and threatens intellectual property. This paper presents a novel approach to detecting cloned software by using a bijective matching technique. The proposed approach focuses on increasing the range of similarity measures and thus enhancing the precision of the detection. This is achieved by extending a well-known stable-marriage problem (SMP) and demonstrating how matches between code fragments of different files can be expressed. A prototype of the proposed approach is provided using a proper scenario, which shows a noticeable improvement in several features of clone detection such as scalability and accuracy.Comment: 20 pages, 10 figures, 6 table

    On the State of the Art of Coupling and Cohesion Measures for Service-Oriented System Design

    Get PDF
    Service-oriented computing has encountered an increasing importance for enterprises over the last years. With Web services, the major underlying technical basis is already in an advanced state. The service design area, on the other hand, still provides several research gaps such as the field of service identification and in particular the determination of an optimal granularity level for services. Granularity, assessed through coupling and cohesion considerations, is yet a rather unexplored domain when it comes to service-orientation, although several results from earlier design principles are available. In this paper we summarize the current state of the art in granularity measures and identify the implications emerging for practice and re-search. As we reveal, several existing measures for other paradigms, which might be adapted for service-orientation, are left unconsidered. Further research gaps, as the mainly missing empirical evaluation or a tighter inclusion in the development process, are also detected

    Principles and Concepts of Agent-Based Modelling for Developing Geospatial Simulations

    Get PDF
    The aim of this paper is to outline fundamental concepts and principles of the Agent-Based Modelling (ABM) paradigm, with particular reference to the development of geospatial simulations. The paper begins with a brief definition of modelling, followed by a classification of model types, and a comment regarding a shift (in certain circumstances) towards modelling systems at the individual-level. In particular, automata approaches (e.g. Cellular Automata, CA, and ABM) have been particularly popular, with ABM moving to the fore. A definition of agents and agent-based models is given; identifying their advantages and disadvantages, especially in relation to geospatial modelling. The potential use of agent-based models is discussed, and how-to instructions for developing an agent-based model are provided. Types of simulation / modelling systems available for ABM are defined, supplemented with criteria to consider before choosing a particular system for a modelling endeavour. Information pertaining to a selection of simulation / modelling systems (Swarm, MASON, Repast, StarLogo, NetLogo, OBEUS, AgentSheets and AnyLogic) is provided, categorised by their licensing policy (open source, shareware / freeware and proprietary systems). The evaluation (i.e. verification, calibration, validation and analysis) of agent-based models and their output is examined, and noteworthy applications are discussed.Geographical Information Systems (GIS) are a particularly useful medium for representing model input and output of a geospatial nature. However, GIS are not well suited to dynamic modelling (e.g. ABM). In particular, problems of representing time and change within GIS are highlighted. Consequently, this paper explores the opportunity of linking (through coupling or integration / embedding) a GIS with a simulation / modelling system purposely built, and therefore better suited to supporting the requirements of ABM. This paper concludes with a synthesis of the discussion that has proceeded. The aim of this paper is to outline fundamental concepts and principles of the Agent-Based Modelling (ABM) paradigm, with particular reference to the development of geospatial simulations. The paper begins with a brief definition of modelling, followed by a classification of model types, and a comment regarding a shift (in certain circumstances) towards modelling systems at the individual-level. In particular, automata approaches (e.g. Cellular Automata, CA, and ABM) have been particularly popular, with ABM moving to the fore. A definition of agents and agent-based models is given; identifying their advantages and disadvantages, especially in relation to geospatial modelling. The potential use of agent-based models is discussed, and how-to instructions for developing an agent-based model are provided. Types of simulation / modelling systems available for ABM are defined, supplemented with criteria to consider before choosing a particular system for a modelling endeavour. Information pertaining to a selection of simulation / modelling systems (Swarm, MASON, Repast, StarLogo, NetLogo, OBEUS, AgentSheets and AnyLogic) is provided, categorised by their licensing policy (open source, shareware / freeware and proprietary systems). The evaluation (i.e. verification, calibration, validation and analysis) of agent-based models and their output is examined, and noteworthy applications are discussed.Geographical Information Systems (GIS) are a particularly useful medium for representing model input and output of a geospatial nature. However, GIS are not well suited to dynamic modelling (e.g. ABM). In particular, problems of representing time and change within GIS are highlighted. Consequently, this paper explores the opportunity of linking (through coupling or integration / embedding) a GIS with a simulation / modelling system purposely built, and therefore better suited to supporting the requirements of ABM. This paper concludes with a synthesis of the discussion that has proceeded
    • …
    corecore