Search CORE

26 research outputs found

Bayesian Hierarchical Modelling for Tailoring Metric Thresholds

Author: Bettenburg N.
Bob Carpenter
McIlreath Richard
Oliveira P.
Panichella A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 06/04/2018
Field of study

Software is highly contextual. While there are cross-cutting `global' lessons, individual software projects exhibit many `local' properties. This data heterogeneity makes drawing local conclusions from global data dangerous. A key research challenge is to construct locally accurate prediction models that are informed by global characteristics and data volumes. Previous work has tackled this problem using clustering and transfer learning approaches, which identify locally similar characteristics. This paper applies a simpler approach known as Bayesian hierarchical modeling. We show that hierarchical modeling supports cross-project comparisons, while preserving local context. To demonstrate the approach, we conduct a conceptual replication of an existing study on setting software metrics thresholds. Our emerging results show our hierarchical model reduces model prediction error compared to a global approach by up to 50%.Comment: Short paper, published at MSR '18: 15th International Conference on Mining Software Repositories May 28--29, 2018, Gothenburg, Swede

arXiv.org e-Print Archive

Crossref

How reliable are systematic reviews in empirical software engineering?

Author: Kitchenham BA
MacDonell SG
Mendes E
Shepperd MJ
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/2010
Field of study

BACKGROUND – the systematic review is becoming a more commonly employed research instrument in empirical software engineering. Before undue reliance is placed on the outcomes of such reviews it would seem useful to consider the robustness of the approach in this particular research context. OBJECTIVE – the aim of this study is to assess the reliability of systematic reviews as a research instrument. In particular we wish to investigate the consistency of process and the stability of outcomes. METHOD – we compare the results of two independent reviews under taken with a common research question. RESULTS – the two reviews ﬁnd similar answers to the research question, although the means of arriving at those answers vary. CONCLUSIONS – in addressing a well-bounded research question, groups of researchers with similar domain experience can arrive at the same review outcomes, even though they may do so in different ways. This provides evidence that, in this context at least, the systematic review is a robust research method

Crossref

AUT Scholarly Commons

Brunel University Research Archive

Comparative analysis of meta-analysis methods: when to use which?

Author: Dieste Tubio Oscar
Fernández Enrique
García Martínez Ramón
Juristo Juzgado Natalia
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 01/01/2011
Field of study

Background: Several meta-analysis methods can be used to quantitatively combine the results of a group of experiments, including the weighted mean difference, statistical vote counting, the parametric response ratio and the non-parametric response ratio. The software engineering community has focused on the weighted mean difference method. However, other meta-analysis methods have distinct strengths, such as being able to be used when variances are not reported. There are as yet no guidelines to indicate which method is best for use in each case. Aim: Compile a set of rules that SE researchers can use to ascertain which aggregation method is best for use in the synthesis phase of a systematic review. Method: Monte Carlo simulation varying the number of experiments in the meta analyses, the number of subjects that they include, their variance and effect size. We empirically calculated the reliability and statistical power in each case Results: WMD is generally reliable if the variance is low, whereas its power depends on the effect size and number of subjects per meta-analysis; the reliability of RR is generally unaffected by changes in variance, but it does require more subjects than WMD to be powerful; NPRR is the most reliable method, but it is not very powerful; SVC behaves well when the effect size is moderate, but is less reliable with other effect sizes. Detailed tables of results are annexed. Conclusions: Before undertaking statistical aggregation in software engineering, it is worthwhile checking whether there is any appreciable difference in the reliability and power of the methods. If there is, software engineers should select the method that optimizes both parameters

Crossref

Archivo Digital UPM

Cross-Dataset Design Discussion Mining

Author: Ernst Neil A.
Mahadi Alvi
Tongay Karan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 06/01/2020
Field of study

Being able to identify software discussions that are primarily about design, which we call design mining, can improve documentation and maintenance of software systems. Existing design mining approaches have good classification performance using natural language processing (NLP) techniques, but the conclusion stability of these approaches is generally poor. A classifier trained on a given dataset of software projects has so far not worked well on different artifacts or different datasets. In this study, we replicate and synthesize these earlier results in a meta-analysis. We then apply recent work in transfer learning for NLP to the problem of design mining. However, for our datasets, these deep transfer learning classifiers perform no better than less complex classifiers. We conclude by discussing some reasons behind the transfer learning approach to design mining.Comment: accepted for SANER 2020, Feb, London, ON. 12 pages. Replication package: https://doi.org/10.5281/zenodo.359012

arXiv.org e-Print Archive

Crossref

Investigating probabilistic sampling approaches for large-scale surveys in software engineering

Author
Publication venue: Springer
Publication date: 10/06/2015
Field of study

Springer - Publisher Connector

Understanding the Role and Methods of Meta-Analysis in IS Research

Author: He Jun
King William R.
Publication venue: AIS Electronic Library (AISeL)
Publication date: 22/10/2005
Field of study

Four methods for reviewing a body of research literature - narrative review, descriptive review, vote-counting, and meta-analysis - are compared. Meta-analysis as a formalized, systematic review method is discussed in detail in terms of its history, current status, advantages, common analytic methods, and recent developments

AIS Electronic Library (AISeL)

Reporting experiments to satisfy professionals information needs

Author: Jedlitschka Andreas
Juristo Juzgado Natalia
Rombach Dieter
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013
Field of study

Although the aim of empirical software engineering is to provide evidence for selecting the appropriate technology, it appears that there is a lack of recognition of this work in industry. Results from empirical research only rarely seem to find their way to company decision makers. If information relevant for software managers is provided in reports on experiments, such reports can be considered as a source of information for them when they are faced with making decisions about the selection of software engineering technologies. To bridge this communication gap between researchers and professionals, we propose characterizing the information needs of software managers in order to show empirical software engineering researchers which information is relevant for decision-making and thus enable them to make this information available. We empirically investigated decision makers? information needs to identify which information they need to judge the appropriateness and impact of a software technology. We empirically developed a model that characterizes these needs. To ensure that researchers provide relevant information when reporting results from experiments, we extended existing reporting guidelines accordingly.We performed an experiment to evaluate our model with regard to its effectiveness. Software managers who read an experiment report according to the proposed model judged the technology?s appropriateness significantly better than those reading a report about the same experiment that did not explicitly address their information needs. Our research shows that information regarding a technology, the context in which it is supposed to work, and most importantly, the impact of this technology on development costs and schedule as well as on product quality is crucial for decision makers

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM

Understanding replication of experiments in software engineering: a classification

Author: Gómez Gómez Omar Salvador
Juristo Juzgado Natalia
Vegas Hernández Sira
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014
Field of study

Context: Replication plays an important role in experimental disciplines. There are still many uncertain- ties about how to proceed with replications of SE experiments. Should replicators reuse the baseline experiment materials? How much liaison should there be among the original and replicating experiment- ers, if any? What elements of the experimental configuration can be changed for the experiment to be considered a replication rather than a new experiment? Objective: To improve our understanding of SE experiment replication, in this work we propose a classi- fication which is intend to provide experimenters with guidance about what types of replication they can perform. Method: The research approach followed is structured according to the following activities: (1) a litera- ture review of experiment replication in SE and in other disciplines, (2) identification of typical elements that compose an experimental configuration, (3) identification of different replications purposes and (4) development of a classification of experiment replications for SE. Results: We propose a classification of replications which provides experimenters in SE with guidance about what changes can they make in a replication and, based on these, what verification purposes such a replication can serve. The proposed classification helped to accommodate opposing views within a broader framework, it is capable of accounting for less similar replications to more similar ones regarding the baseline experiment. Conclusion: The aim of replication is to verify results, but different types of replication serve special ver- ification purposes and afford different degrees of change. Each replication type helps to discover partic- ular experimental conditions that might influence the results. The proposed classification can be used to identify changes in a replication and, based on these, understand the level of verification

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM

Do software models based on the UML aid in source-code comprehensibility? Aggregating evidence from 12 controlled experiments

Author: Cruz-Lemus J. A.
Dodero G.
Genero M.
Gravino C.
Risi M.
Scanniello G.
Tortora G.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018
Field of study

In this paper, we present the results of long-term research conducted in order to study the contribution made by software models based on the Unified Modeling Language (UML) to the comprehensibility of Java source-code deprived of comments. We have conducted 12 controlled experiments in different experimental contexts and on different sites with participants with different levels of expertise (i.e., Bachelor’s, Master’s, and PhD students and software practitioners from Italy and Spain). A total of 333 observations were obtained from these experiments. The UML models in our experiments were those produced in the analysis and design phases. The models produced in the analysis phase were created with the objective of abstracting the environment in which the software will work (i.e., the problem domain), while those produced in the design phase were created with the goal of abstracting implementation aspects of the software (i.e., the solution/application domain). Source-code comprehensibility was assessed with regard to correctness of understanding, time taken to accomplish the comprehension tasks, and efficiency as regards accomplishing those tasks. In order to study the global effect of UML models on source-code comprehensibility, we aggregated results from the individual experiments using a meta-analysis. We made every effort to account for the heterogeneity of our experiments when aggregating the results obtained from them. The overall results suggest that the use of UML models affects the comprehensibility of source-code, when it is deprived of comments. Indeed, models produced in the analysis phase might reduce source-code comprehensibility, while increasing the time taken to complete comprehension tasks. That is, browsing source code and this kind of models together negatively impacts on the time taken to complete comprehension tasks without having a positive effect on the comprehensibility of source code. One plausible justification for this is that the UML models produced in the analysis phase focus on the problem domain. That is, models produced in the analysis phase say nothing about source code and there should be no expectation that they would, in any way, be beneficial to comprehensibility. On the other hand, UML models produced in the design phase improve source-code comprehensibility. One possible justification for this result is that models produced in the design phase are more focused on implementation details. Therefore, although the participants had more material to read and browse, this additional effort was paid back in the form of an improved comprehension of source code

Crossref

Archivio della Ricerca - Università della Basilicata

Archivio della Ricerca - Università di Salerno