976 research outputs found

    Uncovering Causal Relationships between Software Metrics and Bugs

    Bug prediction is an important challenge for software engineering research. It consists of looking for possible early indicators of the presence of bugs in a software system. Despite the relevance of the issue, most experiments designed to evaluate bug prediction only investigate whether there is a linear relation between the predictor and the presence of bugs. However, it is well known that standard regression models cannot filter out spurious relations. Therefore, in this paper we describe an experiment to discover more robust evidence of causality between software metrics (as predictors) and the occurrence of bugs. For this purpose, we rely on the Granger Causality Test to evaluate whether past changes in a given time series are useful to forecast changes in another series. As its name suggests, the Granger Test is a better indication of causality between two variables. We present and discuss the results of experiments on four real-world systems evaluated over a time frame of almost four years. In particular, we have been able to discover in the history of metrics the causes, in the terms of the Granger Test, for 64% to 93% of the defects reported for the systems considered in our experiment.
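As a rough illustration of the idea behind the Granger test mentioned in the abstract above, the sketch below compares two least-squares models of a defect series: one that uses only the series' own past, and one that also uses the past of a metric series. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the lag of 1, and the synthetic data are all hypothetical, and it returns only the F statistic (to be compared against an F(lag, dof) critical value) rather than a p-value.

```python
import random

def ols_rss(rows, targets):
    """Least-squares fit via the normal equations; returns the residual sum of squares."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting (assumes X^T X is non-singular).
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))

def granger_f(cause, effect, lag=1):
    """F statistic for 'past values of cause help forecast effect'."""
    n = len(effect)
    targets = effect[lag:]
    # Restricted model: intercept plus the effect series' own lags.
    own = [[1.0] + [effect[t - i] for i in range(1, lag + 1)] for t in range(lag, n)]
    # Unrestricted model: additionally includes lags of the candidate cause.
    full = [row + [cause[t - i] for i in range(1, lag + 1)]
            for row, t in zip(own, range(lag, n))]
    rss_restricted = ols_rss(own, targets)
    rss_unrestricted = ols_rss(full, targets)
    dof = len(targets) - (2 * lag + 1)
    return ((rss_restricted - rss_unrestricted) / lag) / (rss_unrestricted / dof)

# Synthetic data (hypothetical): defects follow metric changes with a one-step delay.
rng = random.Random(0)
metric_changes = [rng.gauss(0, 1) for _ in range(200)]
defect_changes = [0.0] + [0.8 * metric_changes[t - 1] + 0.3 * rng.gauss(0, 1)
                          for t in range(1, 200)]

f_forward = granger_f(metric_changes, defect_changes)   # metrics -> defects
f_backward = granger_f(defect_changes, metric_changes)  # defects -> metrics
```

On data built this way, the forward statistic should be far larger than the backward one, which is the asymmetry the test looks for. In practice a statistics library would normally be used in place of the hand-written regression.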

    BugMaps-Granger: a tool for visualizing and predicting bugs using Granger causality tests

    Background: Despite the increasing number of bug analysis tools for exploring bugs in software systems, there are no tools supporting the investigation of causality relationships between internal quality metrics and bugs. In this paper, we propose an extension of the BugMaps tool called BugMaps-Granger that allows the analysis of source code properties that are more likely to cause bugs. For this purpose, we relied on the Granger Causality Test to evaluate whether past changes to a given time series of source code metrics can be used to forecast changes in a time series of defects. Our tool extracts source code versions from version control platforms, calculates source code metrics and defects time series, computes Granger Test results, and provides interactive visualizations for causal analysis of bugs. Results: We provide an example of use of BugMaps-Granger involving data from the Equinox Framework and Eclipse JDT Core systems collected over three years. For these systems, the tool was able to identify the modules with more bugs, the average lifetime and complexity of the bugs, and the source code properties that are more likely to cause bugs. Conclusions: With the results provided by the tool in hand, a maintainer can perform at least two main software quality assurance activities: (a) refactoring the source code properties that Granger-caused bugs and (b) improving unit test coverage in classes with more bugs.

    Activity Report 2012. Project-Team RMOD. Analyses and Languages Constructs for Object-Oriented Application Evolution


    Fairness Testing: A Comprehensive Survey and Analysis of Trends

    Unfair behaviors of Machine Learning (ML) software have garnered increasing attention and concern among software engineers. To tackle this issue, extensive research has been dedicated to conducting fairness testing of ML software, and this paper offers a comprehensive survey of existing studies in this field. We collect 100 papers and organize them based on the testing workflow (i.e., how to test) and testing components (i.e., what to test). Furthermore, we analyze the research focus, trends, and promising directions in the realm of fairness testing. We also identify widely adopted datasets and open-source tools for fairness testing.

    MAAT: A Novel Ensemble Approach to Addressing Fairness and Performance Bugs for Machine Learning Software

    Machine Learning (ML) software can lead to unfair and unethical decisions, making software fairness bugs an increasingly significant concern for software engineers. However, addressing fairness bugs often comes at the cost of introducing more ML performance (e.g., accuracy) bugs. In this paper, we propose MAAT, a novel ensemble approach to improving the fairness-performance trade-off for ML software. Conventional ensemble methods combine different models with identical learning objectives. MAAT, instead, combines models optimized for different objectives: fairness and ML performance. We conduct an extensive evaluation of MAAT with 5 state-of-the-art methods, 9 software decision tasks, and 15 fairness-performance measurements. The results show that MAAT significantly outperforms the state of the art. In particular, MAAT beats the trade-off baseline constructed by a recent benchmarking tool in 92.2% of the overall cases evaluated, 12.2 percentage points more than the best technique currently available. Moreover, the superiority of MAAT over the state of the art holds on all the tasks and measurements that we study. We have made the code and data of this work publicly available to allow for future replication and extension.
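To make the idea of an ensemble over models with different objectives concrete, here is a minimal sketch. The abstract does not say how the two models' outputs are combined, so the simple probability averaging below, the function name `combine_predictions`, and the 0.5 threshold are all assumptions made for illustration, not a description of MAAT itself.

```python
def combine_predictions(prob_performance, prob_fairness, threshold=0.5):
    """Average the positive-class probabilities of two models and threshold the result.

    prob_performance: probabilities from a model tuned for ML performance (assumed input).
    prob_fairness: probabilities from a model tuned for fairness (assumed input).
    Returns one 0/1 label per instance.
    """
    if len(prob_performance) != len(prob_fairness):
        raise ValueError("both models must score the same instances")
    averaged = [(p + q) / 2 for p, q in zip(prob_performance, prob_fairness)]
    return [1 if a >= threshold else 0 for a in averaged]

# Hypothetical scores for three instances from the two models.
labels = combine_predictions([0.9, 0.4, 0.2], [0.7, 0.7, 0.1])
```

In this toy input the second instance is where the two models disagree, and the averaged score decides the label.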

    Causal impact analysis for app releases in Google Play

    App developers would like to understand the impact of their own and their competitors' software releases. To address this, we introduce Causal Impact Release Analysis for app stores, and our tool, CIRA, that implements this analysis. We mined 38,858 popular Google Play apps over a period of 12 months. For these apps, we identified 26,339 releases for which there was adequate prior and posterior time series data to facilitate causal impact analysis. We found that 33% of these releases caused a statistically significant change in user ratings. We use our approach to reveal important characteristics that distinguish causal significance in Google Play. To explore the actionability of causal impact analysis, we elicited the opinions of app developers: 56 companies responded, and 78% concurred with the causal assessment, of which 33% claimed that their company would consider changing its app release strategy as a result of our findings.

    Use and misuse of the term "Experiment" in mining software repositories research

    The significant momentum and importance of Mining Software Repositories (MSR) in Software Engineering (SE) has fostered new opportunities and challenges for extensive empirical research. However, MSR researchers seem to struggle to characterize the empirical methods they use within the existing empirical SE body of knowledge. This is especially the case for MSR experiments. To provide evidence on the special characteristics of MSR experiments and their differences from experiments traditionally acknowledged in SE so far, we elicited the hallmarks that differentiate an experiment from other types of empirical studies and characterized the hallmarks and types of experiments in MSR. We analyzed MSR literature obtained from a small-scale systematic mapping study to assess the use of the term experiment in MSR. We found that 19% of the papers claiming to be an experiment are in fact not experiments at all but observational studies, so they use the term in a misleading way. Of the remaining 81% of the papers, only one refers to a genuine controlled experiment, while the others are experiments with limited control. MSR researchers tend to overlook such limitations, compromising the interpretation of the results of their studies. We provide recommendations and insights to support the improvement of MSR experiments. This work has been partially supported by the Spanish project: MCI PID2020-117191RB-I00. Peer reviewed. Postprint (author's final draft).

    Software Runtime Data: Visualization and Integration with Development Data – A Case Study

    Software quality is one of the key aspects of the software development process. Although software development and usage (runtime) processes produce different types of data, there is little support for companies to obtain insightful and actionable information from data at the right time. Practitioners face a challenge in identifying software problems during the early software development stages. The goal of the master's thesis was to provide actionable real-time information about runtime errors and crashes during the usage of software systems and to explore its integration with development data. This work was done within the Q-Rapids project at the Fraunhofer Institute for Experimental Software Engineering (IESE). The selected case is the internal smart village project Digitale Dörfer (DD). The main contributions of the thesis are: a) collecting available runtime data from the DD project; b) creating dashboards to support decisions during sprint planning; c) applying the CRISP-DM method to the integration of software runtime and development data. The provided connectors and integration scripts are reusable. Reported challenges and lessons learned from the integration of software runtime and development data may be used for further research.

    Predicting Software Defects with Causality Tests

    In this paper, we propose a defect prediction approach centered on more robust evidence of causality between source code metrics (as predictors) and the occurrence of defects. More specifically, we rely on the Granger Causality Test to evaluate whether past variations in source code metric values can be used to forecast changes in a time series of defects. Our approach triggers alarms when changes made to the source code of a target system have a high chance of producing defects. We evaluated our approach in several life stages of four Java-based systems. We reached an average precision of 50% in three out of the four systems we evaluated. Moreover, our approach achieved better precision than baselines that are not based on causality tests.

    The global vulnerability discovery and disclosure system: a thematic system dynamics approach

    Vulnerabilities within software are the fundamental issue that provides both the means and the opportunity for malicious threat actors to compromise critical IT systems (Younis et al., 2016). Consequently, the reduction of vulnerabilities within software should be of paramount importance; however, it is argued that software development practitioners have historically failed to reduce the risks associated with software vulnerabilities. This failure is illustrated by the growth of software vulnerabilities over the past 20 years. This increase, which is both unprecedented and unwelcome, has led to an acknowledgement that novel and radical approaches are needed both to understand the vulnerability discovery and disclosure system (VDDS) and to mitigate the risks associated with software vulnerabilities (Bradbury, 2015; Marconato et al., 2012). The findings from this research show that whilst technological mitigations are vital, the social and economic features of the VDDS are of critical importance. For example, hitherto unknown systemic themes identified by this research are key and include: Perception of Punishment; Vendor Interactions; Disclosure Stance; Ethical Considerations; Economic Factors for Discovery and Disclosure; and Emergence of New Vulnerability Markets. Each theme uniquely impacts the system and, ultimately, the scale of vulnerability-based risks. Within the research, each theme within the VDDS is represented by several key variables which interact and shape the system, specifically: Vendor Sentiment; Vulnerability Removal Rate; Time to Fix; Market Share; Participants within VDDS; Full and Coordinated Disclosure Ratio; and Participant Activity. Each variable is quantified and explored, defining both the parameter space and progression over time. These variables are utilised within a system dynamics model to simulate differing policy strategies and assess the impact of these policies upon the VDDS. Three simulated vulnerability disclosure futures are hypothesised and presented, characterised as depletion, steady and exponential, with each scenario dependent upon the parameter space within the key variables.
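The three futures named in the abstract above can be illustrated with a toy stock-and-flow simulation. This is only a sketch under strong assumptions: the abstract does not give the model's structure, so the single stock, the proportional inflow and outflow, and the parameter values below are hypothetical and stand in for the much richer set of variables the thesis describes.

```python
def simulate_stock(initial, discovery_rate, removal_rate, steps, dt=1.0):
    """Euler integration of one stock with proportional inflow and outflow.

    initial: starting number of known vulnerabilities (hypothetical units).
    discovery_rate: fractional inflow per time step (assumed parameter).
    removal_rate: fractional outflow per time step (assumed parameter).
    Returns the stock level at every step, including the initial one.
    """
    stock = initial
    history = [stock]
    for _ in range(steps):
        net_flow = (discovery_rate - removal_rate) * stock
        stock += net_flow * dt
        history.append(stock)
    return history

# One parameter setting per scenario; the values are illustrative only.
depletion = simulate_stock(1000.0, discovery_rate=0.05, removal_rate=0.10, steps=20)
steady = simulate_stock(1000.0, discovery_rate=0.10, removal_rate=0.10, steps=20)
exponential = simulate_stock(1000.0, discovery_rate=0.15, removal_rate=0.10, steps=20)
```

Which regime appears depends only on whether the inflow rate is below, equal to, or above the outflow rate, which mirrors the abstract's point that each scenario depends on the parameter space of the key variables.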