The relevance of application domains in empirical findings

Abstract

The term 'software ecosystem' refers to a collection of software systems that are related in some way. Researchers have used different levels of aggregation to define an ecosystem: grouping systems under a common named project (e.g., the Apache ecosystem), or considering all the projects contained in an online repository (e.g., the GoogleCode ecosystem). In this paper we propose a definition of ecosystem based on application domains: software systems are in the same ecosystem if they share the same application domain, as described by a similar technological scope, context or objective. As an example, all projects implementing networking capabilities to trade Bitcoin and other virtual currencies can be considered part of the same "cryptocurrency" ecosystem. Utilising a sample of 100 Java software systems, we derive their application domains using the Latent Dirichlet Allocation (LDA) approach. We then evaluate a suite of object-oriented (OO) metrics per ecosystem, and test the null hypothesis that 'the OO metrics of all ecosystems come from the same population'. Our results show that the null hypothesis is rejected for most of the metrics chosen: the ecosystems that we extracted, based on application domains, show different structural properties. For interested stakeholders, this could mean that the health of a software system depends on domain-dependent factors, which could be common to the projects in the same domain-based ecosystem.
