454 research outputs found
Improving Traceability Link Recovery Using Fine-grained Requirements-to-Code Relations
Traceability information is a fundamental prerequisite for many essential software maintenance and evolution tasks, such as change impact and software reusability analyses. However, manually generating traceability information is costly and error-prone. Therefore, researchers have developed automated approaches that utilize textual similarities between artifacts to establish trace links. These approaches tend to achieve low precision at reasonable recall levels, as they are not able to bridge the semantic gap between high-level natural language requirements and code. We propose to overcome this limitation by leveraging fine-grained, method and sentence level, similarities between the artifacts for traceability link recovery. Our approach uses word embeddings and a Word Mover\u27s Distance-based similarity to bridge the semantic gap. The fine-grained similarities are aggregated according to the artifacts structure and participate in a majority vote to retrieve coarse-grained, requirement-to-class, trace links. In a comprehensive empirical evaluation, we show that our approach is able to outperform state-of-the-art unsupervised traceability link recovery approaches. Additionally, we illustrate the benefits of fine-grained structural analyses to word embedding-based trace link generation
Semantically Enhanced Software Traceability Using Deep Learning Techniques
In most safety-critical domains the need for traceability is prescribed by
certifying bodies. Trace links are generally created among requirements,
design, source code, test cases and other artifacts, however, creating such
links manually is time consuming and error prone. Automated solutions use
information retrieval and machine learning techniques to generate trace links,
however, current techniques fail to understand semantics of the software
artifacts or to integrate domain knowledge into the tracing process and
therefore tend to deliver imprecise and inaccurate results. In this paper, we
present a solution that uses deep learning to incorporate requirements artifact
semantics and domain knowledge into the tracing solution. We propose a tracing
network architecture that utilizes Word Embedding and Recurrent Neural Network
(RNN) models to generate trace links. Word embedding learns word vectors that
represent knowledge of the domain corpus and RNN uses these word vectors to
learn the sentence semantics of requirements artifacts. We trained 360
different configurations of the tracing network using existing trace links in
the Positive Train Control domain and identified the Bidirectional Gated
Recurrent Unit (BI-GRU) as the best model for the tracing task. BI-GRU
significantly out-performed state-of-the-art tracing methods including the
Vector Space Model and Latent Semantic Indexing.Comment: 2017 IEEE/ACM 39th International Conference on Software Engineering
(ICSE
What have we learnt from the challenges of (semi-) automated requirements traceability? A discussion on blockchain applicability.
Over the last 3 decades, researchers have attempted to shed light into the requirements traceability problem by introducing tracing tools, techniques, and methods with the vision of achieving ubiquitous traceability. Despite the technological advances, requirements traceability remains problematic for researchers and practitioners. This study aims to identify and investigate the main challenges in implementing (semi-)automated requirements traceability, as reported in the recent literature. A systematic literature review was carried out based on the guidelines for systematic literature reviews in software engineering, proposed by Kitchenham. We retrieved 4530 studies by searching five major bibliographic databases and selected 70 primary studies. These studies were analysed and classified according to the challenges they present and/or address. Twenty-one challenges were identified and were classified into five categories. Findings reveal that the most frequent challenges are technological challenges, in particular, low accuracy of traceability recovery methods. Findings also suggest that future research efforts should be devoted to the human facet of tracing, to explore traceability practices in organisational settings, and to develop traceability approaches that support agile and DevOps practices. Finally, it is recommended that researchers leverage blockchain technology as a suitable technical solution to ensure the trustworthiness of traceability information in interorganisational software projects.publishedVersio
Large Language Models for Software Engineering: A Systematic Literature Review
Large Language Models (LLMs) have significantly impacted numerous domains,
notably including Software Engineering (SE). Nevertheless, a well-rounded
understanding of the application, effects, and possible limitations of LLMs
within SE is still in its early stages. To bridge this gap, our systematic
literature review takes a deep dive into the intersection of LLMs and SE, with
a particular focus on understanding how LLMs can be exploited in SE to optimize
processes and outcomes. Through a comprehensive review approach, we collect and
analyze a total of 229 research papers from 2017 to 2023 to answer four key
research questions (RQs). In RQ1, we categorize and provide a comparative
analysis of different LLMs that have been employed in SE tasks, laying out
their distinctive features and uses. For RQ2, we detail the methods involved in
data collection, preprocessing, and application in this realm, shedding light
on the critical role of robust, well-curated datasets for successful LLM
implementation. RQ3 allows us to examine the specific SE tasks where LLMs have
shown remarkable success, illuminating their practical contributions to the
field. Finally, RQ4 investigates the strategies employed to optimize and
evaluate the performance of LLMs in SE, as well as the common techniques
related to prompt optimization. Armed with insights drawn from addressing the
aforementioned RQs, we sketch a picture of the current state-of-the-art,
pinpointing trends, identifying gaps in existing research, and flagging
promising areas for future study
Fundamental Approaches to Software Engineering
This open access book constitutes the proceedings of the 25th International Conference on Fundamental Approaches to Software Engineering, FASE 2022, which was held during April 4-5, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 17 regular papers presented in this volume were carefully reviewed and selected from 64 submissions. The proceedings also contain 3 contributions from the Test-Comp Competition. The papers deal with the foundations on which software engineering is built, including topics like software engineering as an engineering discipline, requirements engineering, software architectures, software quality, model-driven development, software processes, software evolution, AI-based software engineering, and the specification, design, and implementation of particular classes of systems, such as (self-)adaptive, collaborative, AI, embedded, distributed, mobile, pervasive, cyber-physical, or service-oriented applications
Recovering from a Decade: A Systematic Mapping of Information Retrieval Approaches to Software Traceability
Engineers in large-scale software development have to manage large amounts of information, spread across many artifacts. Several researchers have proposed expressing retrieval of trace links among artifacts, i.e. trace recovery, as an Information Retrieval (IR) problem. The objective of this study is to produce a map of work on IR-based trace recovery, with a particular focus on previous evaluations and strength of evidence. We conducted a systematic mapping of IR-based trace recovery. Of the 79 publications classified, a majority applied algebraic IR models. While a set of studies on students indicate that IR-based trace recovery tools support certain work tasks, most previous studies do not go beyond reporting precision and recall of candidate trace links from evaluations using datasets containing less than 500 artifacts. Our review identified a need of industrial case studies. Furthermore, we conclude that the overall quality of reporting should be improved regarding both context and tool details, measures reported, and use of IR terminology. Finally, based on our empirical findings, we present suggestions on how to advance research on IR-based trace recovery
- …