3,335 research outputs found
An Automatically Created Novel Bug Dataset and its Validation in Bug Prediction
Bugs are inescapable during software development due to frequent code
changes, tight deadlines, etc.; therefore, it is important to have tools to
find these errors. One way to identify bugs is to analyze the characteristics
of buggy source code elements from the past and to predict, based on the same
characteristics, whether present elements are buggy, using, e.g., machine
learning models. To support model building, code elements and their
characteristics are collected in so-called bug datasets which serve as the
input for learning.
We present the \emph{BugHunter Dataset}: a novel kind of automatically
constructed and freely available bug dataset containing code elements (files,
classes, methods) with a wide set of code metrics and bug information. Other
available bug datasets follow the traditional approach of gathering the
characteristics of all source code elements (buggy and non-buggy) at one or a
few pre-selected release versions of the code. Our approach, on the other
hand, captures the buggy and the fixed states of the same source code elements
from the narrowest timeframe we can identify for a bug's presence, regardless
of release versions. To show the usefulness of the new dataset, we built and
evaluated bug prediction models and achieved F-measure values over 0.74.
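As a rough illustration of the model-building workflow such a dataset supports, here is a minimal Python sketch that trains a classifier on code metrics and evaluates it with the F-measure. The file name and metric columns are invented for illustration and are not the actual BugHunter schema.

```python
# Hypothetical sketch of the bug-prediction setup described above:
# train a classifier on static code metrics to flag buggy methods.
# The file name and columns ("loc", ..., "buggy") are illustrative,
# not the actual BugHunter schema.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

# Each row is one code element (file/class/method) with its metrics
# and a label indicating whether a bug was later fixed in it.
data = pd.read_csv("method_level_dataset.csv")        # hypothetical file
X = data[["loc", "complexity", "coupling", "churn"]]  # assumed metric columns
y = data["buggy"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# F-measure on held-out elements, analogous to the paper's evaluation.
print("F1:", f1_score(y_test, model.predict(X_test)))
```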
Massive Multi-Agent Data-Driven Simulations of the GitHub Ecosystem
Simulating and predicting planetary-scale techno-social systems poses heavy
computational and modeling challenges. The DARPA SocialSim program set the
challenge to model the evolution of GitHub, a large collaborative
software-development ecosystem, using massive multi-agent simulations. We
describe our best performing models and our agent-based simulation framework,
which we are currently extending to allow simulating other planetary-scale
techno-social systems. The challenge problem measured participants' ability,
given 30 months of metadata on user activity on GitHub, to predict the next
months' activity as measured by a broad range of metrics applied to ground
truth, using agent-based simulation. The challenge required scaling to a
simulation of roughly 3 million agents producing a combined 30 million actions,
acting on 6 million repositories, on commodity hardware. It was also important
to use the data optimally to predict the agents' next moves. We describe the
agent framework and the data analysis employed by one of the winning teams in
the challenge. Six different agent models were tested based on a variety of
machine learning and statistical methods. While no single method proved the
most accurate on every metric, the most broadly successful models sampled from
a stationary probability distribution of actions and repositories for each agent.
Two reasons for the success of these agents were their distinct
characterization of each individual user, and the fact that GitHub users change
their behavior relatively slowly.
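A minimal sketch of that winning idea, not the authors' actual framework: each agent draws its next (action, repository) pair from a stationary distribution estimated from its own activity history. The event names below are illustrative.

```python
# Minimal sketch (not the authors' framework): each agent samples its
# next (action, repository) pair from an empirical, per-agent
# stationary distribution estimated from its observed history.
import random
from collections import Counter

class GitHubAgent:
    def __init__(self, history):
        # history: list of (action, repo) events observed for this user,
        # e.g. [("PushEvent", "org/repo1"), ("IssuesEvent", "org/repo2")]
        counts = Counter(history)
        total = sum(counts.values())
        self.events = list(counts)
        self.weights = [c / total for c in counts.values()]

    def step(self):
        # Draw the next move from the fixed (stationary) distribution;
        # this exploits the observation that users change behavior slowly.
        return random.choices(self.events, weights=self.weights, k=1)[0]

agent = GitHubAgent([("PushEvent", "a/x")] * 8 + [("ForkEvent", "b/y")] * 2)
print([agent.step() for _ in range(5)])
```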
The effects of change decomposition on code review -- a controlled experiment
Background: Code review is a cognitively demanding and time-consuming
process. Previous qualitative studies hinted at how decomposing change sets
into multiple yet internally coherent ones would improve the reviewing process.
So far, the literature has provided no quantitative analysis of this hypothesis.
Aims: (1) Quantitatively measure the effects of change decomposition on the
outcome of code review (in terms of the number of defects found, wrongly reported
issues, suggested improvements, time, and understanding); (2) Qualitatively
analyze how subjects approach the review and navigate the code, building
knowledge and addressing existing issues, in large vs. decomposed changes.
Method: Controlled experiment using the pull-based development model
involving 28 software developers, both professionals and graduate students.
Results: Change decomposition leads to fewer wrongly reported issues,
influences how subjects approach and conduct the review activity (by increasing
context-seeking), yet affects neither the understanding of the change rationale
nor the number of defects found.
Conclusions: Change decomposition not only reduces the noise for subsequent
data analyses but also significantly supports the tasks of the developers in
charge of reviewing the changes. As such, commits belonging to different
concepts should be kept separate, and this should be adopted as a best practice
in software engineering.
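For illustration only, a sketch of how the outcome of such a controlled experiment could be compared across conditions with a nonparametric test; the per-subject counts below are made up and are not the paper's data.

```python
# Illustrative analysis sketch (invented numbers, not the paper's data):
# compare the count of wrongly reported issues between the large-change
# and decomposed-change groups with a nonparametric test.
from scipy.stats import mannwhitneyu

wrong_issues_large      = [4, 3, 5, 2, 4, 3, 6]   # hypothetical per-subject counts
wrong_issues_decomposed = [1, 2, 1, 0, 2, 1, 1]

stat, p = mannwhitneyu(wrong_issues_large, wrong_issues_decomposed,
                       alternative="two-sided")
print(f"U = {stat}, p = {p:.4f}")
```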
The Role of Data Filtering in Open Source Software Ranking and Selection
Faced with over 100M open source projects, most empirical investigations
select a subset. Most research papers in leading venues filter projects by
some measure of popularity, with explicit or implicit arguments that unpopular
projects are not of interest, may not even represent "real" software projects,
or are simply not worthy of study. However, such filtering may have enormous
effects on the results of a study if, and precisely because, the sought-out
response or prediction is in any way related to the filtering criteria.
We exemplify the impact of this practice on research outcomes: how filtering
of projects listed on GitHub affects the assessment of their popularity. We
randomly sample over 100,000 repositories and use multiple regression to model
the number of stars (a proxy for popularity) based on the number of commits,
the duration of the project, the number of authors, and the number of core
developers. Comparing a control model fit on the entire dataset with a model
fit on a filtered sample (projects having ten or more authors), we find that,
while certain characteristics of the repository consistently predict
popularity, the filtering process significantly alters the relationships
between these characteristics and the
response. The number of commits exhibited a positive correlation with
popularity in the control sample but showed a negative correlation in the
filtered sample. These findings highlight the potential biases introduced by
data filtering and emphasize the need for careful sample selection in empirical
research of mining software repositories. We recommend that empirical work
should either analyze complete datasets such as World of Code, or employ
stratified random sampling from a complete dataset to ensure that filtering is
not biasing the results.
Comment: International Workshop on Methodological Issues with Empirical
Studies in Software Engineering (WSESE 2024)
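A hedged sketch of the comparison described above: fit the same regression of stars on repository characteristics once on the full sample and once on the filtered subsample (ten or more authors), then inspect how the coefficients change. The file and column names are assumptions, not the paper's exact variables.

```python
# Hedged sketch of full-sample vs. filtered-sample regression; the
# CSV and column names are assumptions, not the paper's variables.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

repos = pd.read_csv("github_sample.csv")  # hypothetical sampled repositories

formula = ("np.log1p(stars) ~ np.log1p(commits) + np.log1p(duration_days) + "
           "np.log1p(authors) + np.log1p(core_devs)")

full_model     = smf.ols(formula, data=repos).fit()
filtered_model = smf.ols(formula, data=repos[repos.authors >= 10]).fit()

# The sign of the commits coefficient can differ between the two fits,
# mirroring the positive-vs-negative correlation reported above.
print(full_model.params["np.log1p(commits)"],
      filtered_model.params["np.log1p(commits)"])
```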