
    A Method to Assess and Argue for Practical Significance in Software Engineering

    A key goal of empirical research in software engineering is to assess practical significance, which answers whether the observed effects of some compared treatments show a relevant difference in practice in realistic scenarios. Even though plenty of standard techniques exist to assess statistical significance, connecting it to practical significance is not straightforward or routinely done; indeed, only a few empirical studies in software engineering assess practical significance in a principled and systematic way. In this paper, we argue that Bayesian data analysis provides suitable tools to assess practical significance rigorously. We demonstrate our claims in a case study comparing different test techniques. The case study's data was previously analyzed (Afzal et al., 2015) using standard techniques focusing on statistical significance. Here, we build a multilevel model of the same data, which we fit and validate using Bayesian techniques. Our method applies cumulative prospect theory on top of the statistical model to quantitatively connect the statistical analysis output to a practically meaningful context, which then serves as the basis both for assessing and for arguing for practical significance. Our study demonstrates that Bayesian analysis provides a technically rigorous yet practical framework for empirical software engineering. A substantial side effect is that any uncertainty in the underlying data is propagated through the statistical model, and its effects on practical significance are made clear. Thus, in combination with cumulative prospect theory, Bayesian analysis supports seamlessly assessing practical significance in an empirical software engineering context, potentially clarifying and extending the relevance of research for practitioners.
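    To make the cumulative prospect theory (CPT) idea concrete, the following sketch implements the classic Tversky–Kahneman value and probability-weighting functions and scores a prospect. This is an illustration of CPT's building blocks, not the paper's actual model; the default parameter values are the widely cited 1992 estimates, and the utility function uses a simplified separable form rather than the full rank-dependent cumulative weighting.

```python
def cpt_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Prospect-theory value function: concave for gains, steeper for losses (loss aversion)."""
    return x ** alpha if x >= 0 else -lam * ((-x) ** beta)

def cpt_weight(p, gamma=0.61):
    """Inverse-S probability weighting: small probabilities are overweighted."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

def cpt_utility(prospect):
    """Subjective worth of a prospect given as (probability, payoff) pairs.

    Note: full CPT weights *cumulative* (rank-ordered) probabilities;
    this separable version is a simplification for illustration.
    """
    return sum(cpt_weight(p) * cpt_value(x) for p, x in prospect)
```

    For example, if a test technique's effect is expressed as minutes of debugging effort gained or lost, a 50/50 gamble between gaining and losing 10 minutes scores negative under loss aversion, which is exactly the kind of practitioner-facing asymmetry a purely statistical comparison would miss.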

    Evolution of statistical analysis in empirical software engineering research: Current state and steps forward

    Software engineering research is evolving, and papers are increasingly based on empirical data from a multitude of sources, using statistical tests to determine whether, and to what degree, empirical evidence supports their hypotheses. To investigate the practices and trends of statistical analysis in empirical software engineering (ESE), this paper presents a review of a large pool of papers from top-ranked software engineering journals. First, we manually reviewed 161 papers; in the second phase of our method, we conducted a more extensive semi-automatic classification of 5,196 papers spanning the years 2001--2015. Results from both review steps were used to: i) identify and analyze the predominant practices in ESE (e.g., using the t-test or ANOVA), as well as relevant trends in the usage of specific statistical methods (e.g., nonparametric tests and effect size measures), and ii) develop a conceptual model for a statistical analysis workflow, with suggestions on how to apply different statistical methods as well as guidelines to avoid pitfalls. Lastly, we confirm existing claims that current ESE practice lacks a standard for reporting the practical significance of results. We illustrate how practical significance can be discussed in terms of both the statistical analysis and the practitioner's context.
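    One of the trends the review tracks is pairing nonparametric tests with effect size measures. As an illustration of the latter, here is a minimal implementation of Cliff's delta, a nonparametric effect size commonly reported alongside rank-based tests; the interpretation thresholds are the conventional ones attributed to Romano et al., used here for illustration only.

```python
def cliffs_delta(a, b):
    """Cliff's delta: P(x > y) minus P(x < y) over all pairs; ranges over [-1, 1]."""
    gt = sum(x > y for x in a for y in b)
    lt = sum(x < y for x in a for y in b)
    return (gt - lt) / (len(a) * len(b))

def magnitude(d):
    """Conventional interpretation thresholds (Romano et al.)."""
    d = abs(d)
    if d < 0.147:
        return "negligible"
    if d < 0.33:
        return "small"
    if d < 0.474:
        return "medium"
    return "large"
```

    Reporting `magnitude(cliffs_delta(treatment, control))` next to a p-value is one simple way to move a result from "statistically significant" towards "practically meaningful".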

    The Unfulfilled Potential of Data-Driven Decision Making in Agile Software Development

    With the general trend towards data-driven decision making (DDDM), organizations are looking for ways to use DDDM to improve their decisions. However, few studies have examined practitioners' views of DDDM, in particular in agile organizations. In this paper we investigated experiences of using DDDM and how data can improve decision making. A questionnaire was emailed to 124 industry practitioners in agile software development companies, of whom 84 answered. The results show that few practitioners indicated widespread use of DDDM in their current decision-making practices. The practitioners were more positive about its future use for higher-level and more general decision making, fairly positive about its use for requirements elicitation and prioritization decisions, and less positive about its future use at the team level. The practitioners thus see considerable potential for DDDM in an agile context; that potential, however, is currently unfulfilled.

    Feature selection methods for solving the reference class problem

    Probabilistic inference from frequencies, such as "Most Quakers are pacifists; Nixon is a Quaker, so probably Nixon is a pacifist", suffers from the problem that an individual is typically a member of many "reference classes" (such as Quakers, Republicans, Californians, etc.) in which the frequency of the target attribute varies. How should one choose the best class, or combine the information? The article argues that the problem can be solved by the feature selection methods used in contemporary big data science: the correct reference class is the one determined by the features relevant to the target, where relevance is measured by correlation (that is, a feature is relevant if it makes a difference to the frequency of the target).
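    The relevance criterion can be sketched directly: a candidate reference class is relevant exactly when membership changes the frequency of the target attribute. The toy data below is invented to mirror the Nixon example (the numbers are illustrative, not from the article).

```python
def relevance(data, feature, target):
    """A feature (candidate reference class) is relevant to the target
    if class membership changes the target attribute's frequency."""
    inside = [d[target] for d in data if d[feature]]
    outside = [d[target] for d in data if not d[feature]]
    if not inside or not outside:
        return 0.0  # the feature draws no distinction in this sample
    return abs(sum(inside) / len(inside) - sum(outside) / len(outside))

# Hypothetical sample: being a Quaker shifts the frequency of pacifism;
# being a Californian does not.
people = [
    {"quaker": 1, "californian": 1, "pacifist": 1},
    {"quaker": 1, "californian": 0, "pacifist": 1},
    {"quaker": 1, "californian": 1, "pacifist": 1},
    {"quaker": 1, "californian": 0, "pacifist": 0},
    {"quaker": 0, "californian": 1, "pacifist": 0},
    {"quaker": 0, "californian": 0, "pacifist": 1},
    {"quaker": 0, "californian": 1, "pacifist": 0},
    {"quaker": 0, "californian": 0, "pacifist": 0},
]
```

    In this sample `relevance(people, "quaker", "pacifist")` is 0.5 while `relevance(people, "californian", "pacifist")` is 0.0, so the Quaker class, not the Californian one, is the reference class to use for Nixon.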

    Data quality: Some comments on the NASA software defect datasets

    Background: Self-evidently, empirical analyses rely upon the quality of their data. Likewise, replications rely upon accurate reporting and upon using the same, rather than merely similar, versions of datasets. In recent years, there has been much interest in using machine learners to classify software modules into defect-prone and not defect-prone categories. The publicly available NASA datasets have been extensively used as part of this research. Objective: This short note investigates the extent to which published analyses based on the NASA defect datasets are meaningful and comparable. Method: We analyze the five studies published in the IEEE Transactions on Software Engineering since 2007 that have utilized these datasets, and we compare the two versions of the datasets currently in use. Results: We find important differences between the two versions of the datasets, implausible values in one dataset, and generally insufficient detail documented on dataset preprocessing. Conclusions: It is recommended that researchers 1) indicate the provenance of the datasets they use, 2) report any preprocessing in sufficient detail to enable meaningful replication, and 3) invest effort in understanding the data prior to applying machine learners.
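    The recommendation to understand the data before applying learners can be operationalized as a simple audit pass over the records. The sketch below flags implausible and duplicate module records; the metric names (`loc`, `comment_lines`) and plausibility rules are illustrative assumptions, not the NASA schema or the paper's actual checks.

```python
def audit(modules):
    """Flag implausible or duplicate records before feeding data to a learner.

    `modules` is a list of dicts of static metrics; the metric names and
    rules here are illustrative only, not the NASA datasets' schema.
    """
    issues, seen = [], set()
    for i, m in enumerate(modules):
        if m["loc"] <= 0:
            issues.append((i, "non-positive loc"))
        if m.get("comment_lines", 0) > m["loc"]:
            issues.append((i, "more comment lines than total lines"))
        key = tuple(sorted(m.items()))
        if key in seen:
            issues.append((i, "exact duplicate of an earlier module"))
        seen.add(key)
    return issues
```

    Running such a pass, and reporting which records it removed or repaired, is precisely the kind of documented preprocessing the note argues is needed for meaningful replication.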

    A decision support methodology to enhance the competitiveness of the Turkish automotive industry

    This is the post-print (final draft post-refereeing) version of the article. Copyright @ 2013 Elsevier B.V. All rights reserved.

    Three levels of competitiveness affect the success of business enterprises in a globally competitive environment: the competitiveness of the company, the competitiveness of the industry in which the company operates, and the competitiveness of the country where the business is located. This study analyses the competitiveness of the automotive industry in association with the national competitiveness perspective, using a methodology based on Bayesian Causal Networks. First, we structure the competitiveness problem of the automotive industry through a synthesis of expert knowledge in the light of the World Economic Forum's competitiveness indicators. Second, we model the relationships among the variables identified in the problem structuring stage and analyse these relationships using a Bayesian Causal Network. Third, we develop policy suggestions under various scenarios to enhance the national competitive advantages of the automotive industry. We present an analysis of the Turkish automotive industry as a case study. The policy suggestions developed for the Turkish automotive industry can be generalised to the automotive industries of other developing countries where country and industry competitiveness levels are similar to those of Turkey.
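    Scenario analysis in a causal network ultimately reduces to conditional probability updates propagated through the graph. A minimal two-node sketch of that core step, by Bayes' rule, is shown below; the node names and numbers in the usage note are hypothetical, not taken from the study.

```python
def posterior(prior, likelihood_true, likelihood_false):
    """P(cause | effect observed) for two binary nodes, by Bayes' rule.

    prior:            P(cause)
    likelihood_true:  P(effect | cause)
    likelihood_false: P(effect | not cause)
    """
    num = likelihood_true * prior
    return num / (num + likelihood_false * (1 - prior))
```

    For instance, with a hypothetical prior of 0.5 that national infrastructure is strong, and the industry observed to be competitive with likelihoods 0.8 (strong) versus 0.2 (weak), the posterior belief in strong infrastructure rises to 0.8; a full Bayesian Causal Network chains many such updates across expert-elicited links.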

    TRIDEnT: Building Decentralized Incentives for Collaborative Security

    Sophisticated mass attacks, especially those exploiting zero-day vulnerabilities, have the potential to cause destructive damage to organizations and critical infrastructure. To detect and contain such attacks in time, collaboration among the defenders is critical. By correlating real-time detection information (alerts) from multiple sources (collaborative intrusion detection), defenders can detect attacks and take the appropriate defensive measures in time. However, although the technical tools to facilitate collaboration exist, real-world adoption of such collaborative security mechanisms is still underwhelming. This is largely due to a lack of trust and of participation incentives for companies and organizations. This paper proposes TRIDEnT, a novel collaborative platform that aims to enable and incentivize parties to exchange network alert data, thus increasing their overall detection capabilities. TRIDEnT allows parties that may be in a competitive relationship to selectively advertise, sell, and acquire security alerts in the form of (near) real-time peer-to-peer streams. To validate the basic principles behind TRIDEnT, we present an intuitive game-theoretic model of alert sharing, which is of independent interest, and show that collaboration is bound to take place infinitely often. Furthermore, to demonstrate the feasibility of our approach, we instantiate our design in a decentralized manner using Ethereum smart contracts and provide a fully functional prototype.
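    The intuition behind sustained collaboration in repeated interactions can be sketched with a standard repeated prisoner's-dilemma simulation: reciprocal strategies keep sharing profitable, while free-riding pays once and then collapses into mutual silence. This is a generic textbook model, not TRIDEnT's actual game-theoretic formulation, and the payoff numbers are hypothetical.

```python
def play(strat_a, strat_b, rounds=10):
    """Repeated alert-sharing game; action 1 = share alerts, 0 = free-ride.

    Hypothetical prisoner's-dilemma payoffs: mutual sharing (3,3),
    one-sided sharing (0,5), mutual silence (1,1).
    """
    payoff = {(1, 1): (3, 3), (1, 0): (0, 5), (0, 1): (5, 0), (0, 0): (1, 1)}
    hist_a, hist_b, score = [], [], [0, 0]
    for _ in range(rounds):
        a, b = strat_a(hist_b), strat_b(hist_a)
        pa, pb = payoff[(a, b)]
        score[0] += pa
        score[1] += pb
        hist_a.append(a)
        hist_b.append(b)
    return score

def tit_for_tat(opponent_history):
    """Share first, then mirror the opponent's last move."""
    return 1 if not opponent_history else opponent_history[-1]

def free_rider(opponent_history):
    """Never share."""
    return 0
```

    Two reciprocal sharers each earn 30 over ten rounds, while a free-rider facing a reciprocator earns only 14: once repetition and reputation matter, sharing dominates, which is the incentive structure TRIDEnT's market mechanism aims to make explicit.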

    Why Simpler Computer Simulation Models Can Be Epistemically Better for Informing Decisions

    For computer simulation models to usefully inform climate risk management, uncertainties in model projections must be explored and characterized. Because doing so requires running the model many ti…

    Ownership versus Management Effects on Performance in Family and Founder Companies: A Bayesian Analysis

    There are ongoing debates in the literature concerning the performance of family firms: some studies find superior performance among these companies, while others find negative or neutral performance effects. In this research we employ agency theory to argue that the effects of family ownership and family management are quite different: the former is expected to contribute positively to performance, while the latter is argued to erode performance. Previous studies, due to problems of omitted variables or multicollinearity, have been unable to distinguish these effects. Using a Bayesian approach that avoids these problems, we find that whereas family and founder ownership are associated with superior performance, the results for family management, and even founder management, are far more ambiguous. Our results have implications regarding the ownership and management of lone founder and family firms.

    Keywords: family firms; lone founder firms; performance; Bayesian analysis; agency theory.