Search CORE

165 research outputs found

Opinion Mining for Software Development: A Systematic Literature Review

Author: Alexander Serebrenik
Bin Lin
Gabriele Bavota
Michele Lanza
Nathan Cassee
Nicole Novielli
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2022
Field of study

Opinion mining, sometimes referred to as sentiment analysis, has gained increasing attention in software engineering (SE) studies. SE researchers have applied opinion mining techniques in various contexts, such as identifying developers’ emotions expressed in code comments and extracting users’ critics toward mobile apps. Given the large amount of relevant studies available, it can take considerable time for researchers and developers to figure out which approaches they can adopt in their own studies and what perils these approaches entail. We conducted a systematic literature review involving 185 papers. More specifically, we present 1) well-defined categories of opinion mining-related software development activities, 2) available opinion mining approaches, whether they are evaluated when adopted in other studies, and how their performance is compared, 3) available datasets for performance evaluation and tool customization, and 4) concerns or limitations SE researchers might need to take into account when applying/customizing these opinion mining techniques. The results of our study serve as references to choose suitable opinion mining tools for software development activities, and provide critical insights for the further development of opinion mining techniques in the SE domain

Pure OAI Repository

Archivio istituzionale della ricerca - Università di Bari

An NLP-based tool for software artifacts analysis

Author: Canfora Gerardo
Di Penta Massimiliano
Di Sorbo Andrea
Panichella Sebastiano
Visaggio Corrado A.
Publication venue: ZHAW Zürcher Hochschule für Angewandte Wissenschaften
Publication date: 01/01/2021
Field of study

Software developers rely on various repositories and communication channels to exchange relevant information about their ongoing tasks and the status of overall project progress. In this context, semi-structured and unstructured software artifacts have been leveraged by researchers to build recommender systems aimed at supporting developers in different tasks, such as transforming user feedback in maintenance and evolution tasks, suggesting experts, or generating software documentation. More specifically, Natural Language (NL) parsing techniques have been successfully leveraged to automatically identify (or extract) the relevant information embedded in unstructured software artifacts. However, such techniques require the manual identification of patterns to be used for classification purposes. To reduce such a manual effort, we propose an NL parsingbased tool for software artifacts analysis named NEON that can automate the mining of such rules, minimizing the manual effort of developers and researchers. Through a small study involving human subjects with NL processing and parsing expertise, we assess the performance of NEON in identifying rules useful to classify app reviews for software maintenance purposes. Our results show that more than one-third of the rules inferred by NEON are relevant for the proposed task. Demo webpage: https://github.com/adisorbo/NEON too

ZHAW digitalcollection

API Usage Recommendation via Multi-View Heterogeneous Graph Representation Learning

Author: Chen Yujia
Gao Cuiyun
Lyu Michael R.
Peng Yun
Ren Xiaoxue
Xia Xin
Publication venue
Publication date: 03/08/2022
Field of study

Developers often need to decide which APIs to use for the functions being implemented. With the ever-growing number of APIs and libraries, it becomes increasingly difficult for developers to find appropriate APIs, indicating the necessity of automatic API usage recommendation. Previous studies adopt statistical models or collaborative filtering methods to mine the implicit API usage patterns for recommendation. However, they rely on the occurrence frequencies of APIs for mining usage patterns, thus prone to fail for the low-frequency APIs. Besides, prior studies generally regard the API call interaction graph as homogeneous graph, ignoring the rich information (e.g., edge types) in the structure graph. In this work, we propose a novel method named MEGA for improving the recommendation accuracy especially for the low-frequency APIs. Specifically, besides call interaction graph, MEGA considers another two new heterogeneous graphs: global API co-occurrence graph enriched with the API frequency information and hierarchical structure graph enriched with the project component information. With the three multi-view heterogeneous graphs, MEGA can capture the API usage patterns more accurately. Experiments on three Java benchmark datasets demonstrate that MEGA significantly outperforms the baseline models by at least 19% with respect to the Success Rate@1 metric. Especially, for the low-frequency APIs, MEGA also increases the baselines by at least 55% regarding the Success Rate@1

arXiv.org e-Print Archive

Parasol: Efficient Parallel Synthesis of Large Model Spaces

Author: Bagheri Hamid
Stevens Clay
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 03/09/2022
Field of study

Formal analysis is an invaluable tool for software engineers, yet state-of-the-art formal analysis techniques suffer from well-known limitations in terms of scalability. In particular, some software design domains—such as tradeoff analysis and security analysis—require systematic exploration of potentially huge model spaces, which further exacerbates the problem. Despite this present and urgent challenge, few techniques exist to support the systematic exploration of large model spaces. This paper introduces Parasol, an approach and accompanying tool suite, to improve the scalability of large-scale formal model space exploration. Parasol presents a novel parallel model space synthesis approach, backed with unsupervised learning to automatically derive domain knowledge, guiding a balanced partitioning of the model space. This allows Parasol to synthesize the models in each partition in parallel, significantly reducing synthesis time and making large-scale systematic model space exploration for real-world systems more tractable. Our empirical results corroborate that Parasol substantially reduces (by 460% on average) the time required for model space synthesis, compared to state-of-the-art model space synthesis techniques relying on both incremental and parallel constraint solving technologies as well as competing, non-learning-based partitioning methods

DigitalCommons@University of Nebraska

Automatic Performance Testing using Input-Sensitive Profiling

Author: Cohen W. W.
Emilio Coppa I. F.
Law J.
Luo Q.
Publication venue: W&M ScholarWorks
Publication date: 01/01/2016
Field of study

During performance testing, software engineers commonly perform application profiling to analyze an application\u27s traces with different inputs to understand performance behaviors, such as time and space consumption. However, a non-trivial application commonly has a large number of inputs, and it is mostly manual to identify the specific inputs leading to performance bottlenecks. Thus, it is challenge is to automate profiling and find these specific inputs. To solve these problems, we propose novel approaches, FOREPOST, GA-Prof and PerfImpact, which automatically profile applications for finding the specific combinations of inputs triggering performance bottlenecks, and further analyze the corresponding traces to identify problematic methods. Specially, our approaches work in two different types of real-world scenarios of performance testing: i) a single-version scenario, in which performance bottlenecks are detected in a single software release, and ii) a two-version scenario, in which code changes responsible for performance regressions are detected by considering two consecutive software releases

Crossref

College of William & Mary: W&M Publish

Input-Sensitive Performance Testing

Author: Cohen W. W.
Emilio Coppa I. F.
Law J.
Luo Q.
Schwaber C.
Publication venue: W&M ScholarWorks
Publication date: 01/01/2016
Field of study

One goal of performance testing is to find specific test input data for exposing performance bottlenecks and identify the methods responsible for these performance bottlenecks. A big and important challenges of performance testing is how to deeply understand the performance behaviors of a non-trivial software system in terms of test input data to properly select the specific test input values for finding the problematic methods. Thus, we propose this research program to automatically analyze performance behaviors in software and link these behaviors with test input data for selecting the specific ones that can expose performance bottlenecks. In addition, this research further examines the corresponding execution traces of selected inputs for targeting the problematic methods

Crossref

College of William & Mary: W&M Publish

The Three Pillars of Machine Programming

Author: Albarghouthi Aws
Alur Rajeev
Babble Labble Chris Re.
Bielik Pavol
Clinton Whaley R.
D’Antoni Loris
Frigo Matteo
Inala Jeevana Priya
Iyer Srinivasan
Le Xuan-Bach D.
Lei Tao
Meng Na
Peleg Hila
Singh Rishabh
Thien Nguyen Hoang Duong
Zheng Wenting
Publication venue: ScholarlyCommons
Publication date: 01/01/2018
Field of study

In this position paper, we describe our vision of the future of machine programming through a categorical examination of three pillars of research. Those pillars are:(i) intention,(ii) invention, and (iii) adaptation. Intention emphasizes advancements in the human-to-computer and computer-to-machine-learning interfaces. Invention emphasizes the creation or refinement of algorithms or core hardware and software building blocks through machine learning (ML). Adaptation emphasizes advances in the use of ML-based constructs to autonomously evolve software

Crossref

ScholarlyCommons@Penn