Opinion Mining for Software Development: A Systematic Literature Review
Opinion mining, sometimes referred to as sentiment analysis, has gained increasing attention in software engineering (SE) studies.
SE researchers have applied opinion mining techniques in various contexts, such as identifying developers’ emotions expressed in
code comments and extracting users’ criticisms of mobile apps. Given the large number of relevant studies available, it can take
considerable time for researchers and developers to figure out which approaches they can adopt in their own studies and what perils
these approaches entail.
We conducted a systematic literature review involving 185 papers. More specifically, we present 1) well-defined categories of opinion
mining-related software development activities, 2) available opinion mining approaches, whether they are evaluated when adopted in
other studies, and how their performance is compared, 3) available datasets for performance evaluation and tool customization, and 4)
concerns or limitations SE researchers might need to take into account when applying/customizing these opinion mining techniques.
The results of our study serve as references to choose suitable opinion mining tools for software development activities, and provide
critical insights for the further development of opinion mining techniques in the SE domain.
Determining the Intrinsic Structure of Public Software Development History
Background. Collaborative software development has produced a wealth of
version control system (VCS) data that can now be analyzed in full. Little is
known about the intrinsic structure of the entire corpus of publicly available
VCS as an interconnected graph. Understanding its structure is needed to
determine the best approach to analyze it in full and to avoid methodological
pitfalls when doing so. Objective. We intend to determine the most salient
network topology properties of public software development history as captured
by VCS. We will explore: degree distributions, determining whether they are
scale-free or not; distribution of connected component sizes; distribution of
shortest path lengths. Method. We will use Software Heritage, which is the
largest corpus of public VCS data, compress it using webgraph compression
techniques, and analyze it in-memory using classic graph algorithms. Analyses
will be performed both on the full graph and on relevant subgraphs.
Limitations. The study is exploratory in nature; as such, no hypotheses on the
findings are stated at this time. Chosen graph algorithms are expected to scale
to the corpus size, but this will need to be confirmed experimentally. External
validity will depend on how representative Software Heritage is of the software
commons. Comment: MSR 2020 - 17th International Conference on Mining Software
Repositories, Oct 2020, Seoul, South Korea
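The three quantities named in the abstract above (degree distribution, connected component sizes, shortest path lengths) can be illustrated on a toy graph. This is only a minimal sketch under stated assumptions: the graph is treated as undirected, the edge list `TOY_EDGES` and all function names are hypothetical, and nothing here reflects the compressed in-memory representation or the actual tooling the authors plan to use.

```python
from collections import Counter, deque


def build_undirected(edges):
    """Build an adjacency map (node -> set of neighbors) from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_distribution(adj):
    """Map each degree value to the number of nodes with that degree."""
    return Counter(len(neighbors) for neighbors in adj.values())


def bfs_distances(adj, source):
    """Return hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def component_sizes(adj):
    """Return the sizes of all connected components, largest first."""
    seen = set()
    sizes = []
    for node in adj:
        if node not in seen:
            reached = bfs_distances(adj, node)
            seen.update(reached)
            sizes.append(len(reached))
    return sorted(sizes, reverse=True)


def shortest_path_length_distribution(adj):
    """Count unordered node pairs by their shortest path length."""
    counts = Counter()
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if source < target:  # count each unordered pair once
                counts[d] += 1
    return counts


# Hypothetical toy data: one path a-b-c-d and one separate edge e-f.
TOY_EDGES = [("a", "b"), ("b", "c"), ("c", "d"), ("e", "f")]
TOY_GRAPH = build_undirected(TOY_EDGES)

print(degree_distribution(TOY_GRAPH))
print(component_sizes(TOY_GRAPH))
print(shortest_path_length_distribution(TOY_GRAPH))
```

Running one breadth-first search per node is quadratic in the worst case, so a corpus-scale analysis would presumably need sampling or more specialized algorithms; the sketch only shows what is being measured.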
A survey on software defect prediction using deep learning
Defect prediction is one of the key challenges in software development and programming language research for improving software quality and reliability. The problem in this area is to properly identify the defective source code with high accuracy. Developing a fault prediction model is a challenging problem, and many approaches have been proposed throughout history. The recent breakthrough in machine learning technologies, especially the development of deep learning techniques, has led to many problems being solved by these methods. Our survey focuses on the deep learning techniques for defect prediction. We analyse the recent works on the topic, study the methods for automatic learning of the semantic and structural features from the code, discuss the open problems and present the recent trends in the field. © 2021 by the authors. Licensee MDPI, Basel, Switzerland
Logram: Efficient Log Parsing Using n-Gram Dictionaries
Software systems usually record important runtime information in their logs.
Logs help practitioners understand system runtime behaviors and diagnose field
failures. As logs are usually very large in size, automated log analysis is
needed to assist practitioners in their software operation and maintenance
efforts. Typically, the first step of automated log analysis is log parsing,
i.e., converting unstructured raw logs into structured data. However, log
parsing is challenging, because logs are produced by static templates in the
source code (i.e., logging statements) yet the templates are usually
inaccessible when parsing logs. Prior work proposed automated log parsing
approaches that have achieved high accuracy. However, as the volume of logs
grows rapidly in the era of cloud computing, efficiency becomes a major concern
in log parsing. In this work, we propose an automated log parsing approach,
Logram, which leverages n-gram dictionaries to achieve efficient log parsing.
We evaluated Logram on 16 public log datasets and compared Logram with five
state-of-the-art log parsing approaches. We found that Logram achieves
parsing accuracy similar to the best existing approaches while outperforming
these approaches in efficiency (i.e., 1.8 to 5.1 times faster than the
second-fastest approaches). Furthermore, we deployed Logram on Spark and we found that
Logram scales out efficiently with the number of Spark nodes (e.g., with
near-linear scalability) without sacrificing parsing accuracy. In addition, we
demonstrated that Logram can support effective online parsing of logs,
achieving parsing results and efficiency similar to those of the offline mode. Comment: 13 pages, IEEE journal format
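The general idea of using an n-gram dictionary to separate static template text from dynamic values can be sketched as follows. This is a deliberately simplified illustration and not the Logram algorithm itself: it uses only 2-grams, a hypothetical fixed `threshold`, whitespace tokenization, and made-up log lines, all of which are assumptions.

```python
from collections import Counter


def build_bigram_dictionary(log_lines):
    """Count how often each pair of adjacent tokens occurs across all lines."""
    counts = Counter()
    for line in log_lines:
        tokens = line.split()
        counts.update(zip(tokens, tokens[1:]))
    return counts


def parse_line(line, bigram_counts, threshold=2):
    """Replace tokens that only appear in rare 2-grams with a wildcard.

    A token is kept as static text if at least one 2-gram containing it
    (with its left or right neighbor) occurs `threshold` times or more.
    """
    tokens = line.split()
    template = []
    for i, token in enumerate(tokens):
        left = bigram_counts[(tokens[i - 1], token)] if i > 0 else 0
        right = bigram_counts[(token, tokens[i + 1])] if i + 1 < len(tokens) else 0
        is_static = max(left, right) >= threshold
        template.append(token if is_static else "<*>")
    return " ".join(template)


# Hypothetical log lines that share one template and differ in one value.
LOGS = [
    "Connected to 10.0.0.1 port 22",
    "Connected to 10.0.0.2 port 22",
    "Connected to 10.0.0.3 port 22",
]

dictionary = build_bigram_dictionary(LOGS)
for line in LOGS:
    print(parse_line(line, dictionary))
```

Because the dictionary is a plain counter, counts from separate batches of logs can be merged by addition, which gives some intuition for why a dictionary-based design lends itself to distributed and online processing.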
A Survey on Query-based API Recommendation
Application Programming Interfaces (APIs) are designed to help developers
build software more effectively. Recommending the right APIs for specific tasks
has gained increasing attention among researchers and developers in recent
years. To comprehensively understand this research domain, we survey and
analyze API recommendation studies published in the last 10 years. Our study
begins with an overview of the structure of API recommendation tools.
Subsequently, we systematically analyze prior research and pose four key
research questions. For RQ1, we examine the volume of published papers and the
venues in which these papers appear within the API recommendation field. In
RQ2, we categorize and summarize the prevalent data sources and collection
methods employed in API recommendation research. In RQ3, we explore the types
of data and common data representations utilized by API recommendation
approaches. We also investigate the typical data extraction procedures and
collection approaches employed by the existing approaches. RQ4 delves into the
modeling techniques employed by API recommendation approaches, encompassing
both statistical and deep learning models. Additionally, we compile an overview
of the prevalent ranking strategies and evaluation metrics used for assessing
API recommendation tools. Drawing from our survey findings, we identify current
challenges in API recommendation research that warrant further exploration,
along with potential avenues for future research.
Machine Learning practices and infrastructures
Machine Learning (ML) systems, particularly when deployed in high-stakes
domains, are deeply consequential. They can exacerbate existing inequities,
create new modes of discrimination, and reify outdated social constructs.
Accordingly, the social context (i.e. organisations, teams, cultures) in which
ML systems are developed is a site of active research for the field of AI
ethics, and intervention for policymakers. This paper focuses on one aspect of
social context that is often overlooked: interactions between practitioners and
the tools they rely on, and the role these interactions play in shaping ML
practices and the development of ML systems. In particular, through an
empirical study of questions asked on the Stack Exchange forums, the use of
interactive computing platforms (e.g. Jupyter Notebook and Google Colab) in ML
practices is explored. I find that interactive computing platforms are used in
a host of learning and coordination practices, which constitutes an
infrastructural relationship between interactive computing platforms and ML
practitioners. I describe how ML practices are co-evolving alongside the
development of interactive computing platforms, and highlight how this risks
making invisible aspects of the ML life cycle that AI ethics researchers have
demonstrated to be particularly salient for the societal impact of deployed ML
systems.
“Won’t we fix this issue?” : qualitative characterization and automated identification of wontfix issues on GitHub
Context: Addressing user requests in the form of bug reports and GitHub issues represents a crucial task of any successful software project. However, user-submitted issue reports tend to differ widely in their quality, and developers spend a considerable amount of time handling them.
Objective: By collecting a dataset of around 6,000 issues from 279 GitHub projects, we observe that developers take significant time (i.e., about five months, on average) before labeling an issue as a wontfix. For this reason, in this paper, we empirically investigate the nature of wontfix issues and methods to facilitate the issue management process.
Method: We first manually analyze a sample of 667 wontfix issues, extracted from heterogeneous projects, investigating the common reasons behind a “wontfix decision”, the main characteristics of wontfix issues and the potential factors that could be connected with the time to close them. Furthermore, we experiment with approaches enabling the prediction of wontfix issues by analyzing the titles and descriptions of reported issues when submitted.
Results and conclusion: Our investigation sheds some light on the characteristics of wontfix issues, as well as the potential factors that may affect the time required to make a “wontfix decision”. Our results also demonstrate that it is possible to predict wontfix issues with high average values of precision, recall, and F-measure (90%-93%).
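For readers unfamiliar with the three metrics reported above, the sketch below shows how precision, recall, and F-measure are computed for a binary classifier. The label vectors are invented for illustration (1 standing for "wontfix"), and the function name is hypothetical; none of this reproduces the paper's data or results.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Invented labels: 1 = issue ended up as wontfix, 0 = it did not.
Y_TRUE = [1, 1, 1, 0, 0, 0, 1, 0]
Y_PRED = [1, 1, 0, 0, 0, 1, 1, 0]

precision, recall, f1 = precision_recall_f1(Y_TRUE, Y_PRED)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this invented example there are three true positives, one false positive, and one false negative, so all three metrics come out at 0.75.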