207 research outputs found
Software Evolution Understanding: Automatic Extraction of Software Identifiers Map for Object-Oriented Software Systems
Software companies usually develop a set of product variants within the same family that share certain functions and differ in others. Variations across software variants occur to meet different customer requirements. Thus, software product variants evolve overtime to cope with new requirements. A software engineer who deals with this family may find it difficult to understand the evolution scenarios that have taken place over time. In addition, software identifier names are important resources to understand the evolution scenarios in this family. This paper introduces an automatic approach called Juana’s approach to detect the evolution scenario across two product variants at the source code level and identifies the common and unique software identifier names across software variants source code. Juana’s approach refers to common and unique identifier names as a software identifiers map and computes it by comparing software variants to each other. Juana considers all software identifier names such as package, class, attribute, and method. The novelty of this approach is that it exploits common and unique identifier names across the source code of software variants, to understand the evolution scenarios across software family in an efficient way. For validity, Juana was applied on ArgoUML and Mobile Media software variants. The results of this evaluation validate the relevance and the performance of the approach as all evolution scenarios were correctly detected via a software identifiers map
Requirements Traceability: Recovering and Visualizing Traceability Links Between Requirements and Source Code of Object-oriented Software Systems
Requirements traceability is an important activity to reach an effective
requirements management method in the requirements engineering.
Requirement-to-Code Traceability Links (RtC-TLs) shape the relations between
requirement and source code artifacts. RtC-TLs can assist engineers to know
which parts of software code implement a specific requirement. In addition,
these links can assist engineers to keep a correct mental model of software,
and decreasing the risk of code quality degradation when requirements change
with time mainly in large sized and complex software. However, manually
recovering and preserving of these TLs puts an additional burden on engineers
and is error-prone, tedious, and costly task. This paper introduces YamenTrace,
an automatic approach and implementation to recover and visualize RtC-TLs in
Object-Oriented software based on Latent Semantic Indexing (LSI) and Formal
Concept Analysis (FCA). The originality of YamenTrace is that it exploits all
code identifier names, comments, and relations in TLs recovery process.
YamenTrace uses LSI to find textual similarity across software code and
requirements. While FCA employs to cluster similar code and requirements
together. Furthermore, YamenTrace gives a visualization of recovered TLs. To
validate YamenTrace, it applied on three case studies. The findings of this
evaluation prove the importance and performance of YamenTrace proposal as most
of RtC-TLs were correctly recovered and visualized.Comment: 17 pages, 14 figure
Opinion Mining for Software Development: A Systematic Literature Review
Opinion mining, sometimes referred to as sentiment analysis, has gained increasing attention in software engineering (SE) studies.
SE researchers have applied opinion mining techniques in various contexts, such as identifying developers’ emotions expressed in
code comments and extracting users’ critics toward mobile apps. Given the large amount of relevant studies available, it can take
considerable time for researchers and developers to figure out which approaches they can adopt in their own studies and what perils
these approaches entail.
We conducted a systematic literature review involving 185 papers. More specifically, we present 1) well-defined categories of opinion
mining-related software development activities, 2) available opinion mining approaches, whether they are evaluated when adopted in
other studies, and how their performance is compared, 3) available datasets for performance evaluation and tool customization, and 4)
concerns or limitations SE researchers might need to take into account when applying/customizing these opinion mining techniques.
The results of our study serve as references to choose suitable opinion mining tools for software development activities, and provide
critical insights for the further development of opinion mining techniques in the SE domain
Personalized First Issue Recommender for Newcomers in Open Source Projects
Many open source projects provide good first issues (GFIs) to attract and
retain newcomers. Although several automated GFI recommenders have been
proposed, existing recommenders are limited to recommending generic GFIs
without considering differences between individual newcomers. However, we
observe mismatches between generic GFIs and the diverse background of
newcomers, resulting in failed attempts, discouraged onboarding, and delayed
issue resolution. To address this problem, we assume that personalized first
issues (PFIs) for newcomers could help reduce the mismatches. To justify the
assumption, we empirically analyze 37 newcomers and their first issues resolved
across multiple projects. We find that the first issues resolved by the same
newcomer share similarities in task type, programming language, and project
domain. These findings underscore the need for a PFI recommender to improve
over state-of-the-art approaches. For that purpose, we identify features that
influence newcomers' personalized selection of first issues by analyzing the
relationship between possible features of the newcomers and the characteristics
of the newcomers' chosen first issues. We find that the expertise preference,
OSS experience, activeness, and sentiment of newcomers drive their personalized
choice of the first issues. Based on these findings, we propose a Personalized
First Issue Recommender (PFIRec), which employs LamdaMART to rank candidate
issues for a given newcomer by leveraging the identified influential features.
We evaluate PFIRec using a dataset of 68,858 issues from 100 GitHub projects.
The evaluation results show that PFIRec outperforms existing first issue
recommenders, potentially doubling the probability that the top recommended
issue is suitable for a specific newcomer and reducing one-third of a
newcomer's unsuccessful attempts to identify suitable first issues, in the
median.Comment: The 38th IEEE/ACM International Conference on Automated Software
Engineering (ASE 2023
Automating Method Naming with Context-Aware Prompt-Tuning
Method names are crucial to program comprehension and maintenance. Recently,
many approaches have been proposed to automatically recommend method names and
detect inconsistent names. Despite promising, their results are still
sub-optimal considering the three following drawbacks: 1) These models are
mostly trained from scratch, learning two different objectives simultaneously.
The misalignment between two objectives will negatively affect training
efficiency and model performance. 2) The enclosing class context is not fully
exploited, making it difficult to learn the abstract function of the method. 3)
Current method name consistency checking methods follow a generate-then-compare
process, which restricts the accuracy as they highly rely on the quality of
generated names and face difficulty measuring the semantic consistency.
In this paper, we propose an approach named AUMENA to AUtomate MEthod NAming
tasks with context-aware prompt-tuning. Unlike existing deep learning based
approaches, our model first learns the contextualized representation(i.e.,
class attributes) of PL and NL through the pre-training model, then fully
exploits the capacity and knowledge of large language model with prompt-tuning
to precisely detect inconsistent method names and recommend more accurate
names. To better identify semantically consistent names, we model the method
name consistency checking task as a two-class classification problem, avoiding
the limitation of previous similarity-based consistency checking approaches.
The experimental results reflect that AUMENA scores 68.6%, 72.0%, 73.6%, 84.7%
on four datasets of method name recommendation, surpassing the state-of-the-art
baseline by 8.5%, 18.4%, 11.0%, 12.0%, respectively. And our approach scores
80.8% accuracy on method name consistency checking, reaching an 5.5%
outperformance. All data and trained models are publicly available.Comment: Accepted by ICPC-202
A Survey of Performance Optimization for Mobile Applications
Nowadays there is a mobile application for almost everything a user may think of, ranging from paying bills and gathering information to playing games and watching movies. In order to ensure user satisfaction and success of applications, it is important to provide high performant applications. This is particularly important for resource constraint systems such as mobile devices. Thereby, non-functional performance characteristics, such as energy and memory consumption, play an important role for user satisfaction. This paper provides a comprehensive survey of non-functional performance optimization for Android applications. We collected 155 unique publications, published between 2008 and 2020, that focus on the optimization of non-functional performance of mobile applications. We target our search at four performance characteristics, in particular: responsiveness, launch time, memory and energy consumption. For each performance characteristic, we categorize optimization approaches based on the method used in the corresponding publications. Furthermore, we identify research gaps in the literature for future work
Causality-based Neural Network Repair
Neural networks have had discernible achievements in a wide range of
applications. The wide-spread adoption also raises the concern of their
dependability and reliability. Similar to traditional decision-making programs,
neural networks can have defects that need to be repaired. The defects may
cause unsafe behaviors, raise security concerns or unjust societal impacts. In
this work, we address the problem of repairing a neural network for desirable
properties such as fairness and the absence of backdoor. The goal is to
construct a neural network that satisfies the property by (minimally) adjusting
the given neural network's parameters (i.e., weights). Specifically, we propose
CARE (\textbf{CA}usality-based \textbf{RE}pair), a causality-based neural
network repair technique that 1) performs causality-based fault localization to
identify the `guilty' neurons and 2) optimizes the parameters of the identified
neurons to reduce the misbehavior. We have empirically evaluated CARE on
various tasks such as backdoor removal, neural network repair for fairness and
safety properties. Our experiment results show that CARE is able to repair all
neural networks efficiently and effectively. For fairness repair tasks, CARE
successfully improves fairness by on average. For backdoor removal
tasks, CARE reduces the attack success rate from over to less than
. For safety property repair tasks, CARE reduces the property violation
rate to less than . Results also show that thanks to the causality-based
fault localization, CARE's repair focuses on the misbehavior and preserves the
accuracy of the neural networks
- …