18 research outputs found

    Investigating ChatGPT's Potential to Assist in Requirements Elicitation Processes

    Natural Language Processing (NLP) for Requirements Engineering (RE), known as NLP4RE, seeks to apply NLP tools, techniques, and resources to the RE process to increase the quality of requirements. There is little research on the use of generative AI-based NLP tools and techniques for requirements elicitation. Large Language Models (LLMs) such as ChatGPT have recently gained significant recognition due to their notably improved performance on NLP tasks. To explore the potential of ChatGPT to assist in requirements elicitation processes, we formulated six questions to elicit requirements using ChatGPT. Using the same six questions, we conducted interview-based surveys with five RE experts from academia and industry and collected 30 responses containing requirements. The quality of these 36 responses (human-formulated + ChatGPT-generated) was evaluated over seven requirements quality attributes by another five RE experts in a second round of interview-based surveys. Comparing the quality of requirements generated by ChatGPT with those formulated by human experts, we found that ChatGPT-generated requirements are highly Abstract, Atomic, Consistent, Correct, and Understandable. Based on these results, we present the most pressing issues related to LLMs and what future research should focus on to leverage the emergent behaviour of LLMs more effectively in natural language-based RE activities. Comment: Accepted at SEAA 2023. 8 pages, 5 figures.
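
    The paper does not publish its six elicitation questions or prompting setup. Purely as an illustration, here is a minimal sketch of posing one elicitation question to a chat-based LLM via the OpenAI Python client; the question text, system prompt, and model name below are hypothetical stand-ins, not the study's materials.

```python
# Hypothetical sketch: posing one elicitation question to a chat LLM.
# The paper's actual six questions and prompting setup are not reproduced here.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

question = (  # hypothetical stand-in for one of the study's six questions
    "What are the functional requirements for a mobile app that lets "
    "university students reserve library study rooms?"
)

response = client.chat.completions.create(
    model="gpt-4",  # model choice is an assumption
    messages=[
        {"role": "system",
         "content": "You are a requirements engineer. Answer with a "
                    "numbered list of atomic, testable requirements."},
        {"role": "user", "content": question},
    ],
)
print(response.choices[0].message.content)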

    Supporting the Development of Cyber-Physical Systems with Natural Language Processing: A Report

    Software has become the driving force for innovation in any technical system that observes the environment with sensors and influences it by controlling actuators; such systems are nowadays called Cyber-Physical Systems (CPSs). The development of such systems is inherently interdisciplinary, and they often comprise a number of independent subsystems. Due to this diversity, the majority of development information is expressed in natural language artifacts of all kinds. In this paper, we report on recent results our group has developed to support engineers of CPSs in working with the large amount of information expressed in natural language. We cover the topics of automatic knowledge extraction, expert systems, and automatic requirements classification. Furthermore, we envision that natural language processing will be a key component in connecting requirements with simulation models and in explaining tool-based decisions. We see both areas as promising for supporting engineers of CPSs in the future.
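
    Of the topics listed, automatic requirements classification is the most directly reproducible in a few lines. A minimal sketch, assuming scikit-learn; the example requirements and the functional/non-functional label scheme are hypothetical, not the report's data.

```python
# Minimal requirements-classification sketch (hypothetical training data).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny illustrative corpus: F = functional, NF = non-functional.
reqs = [
    "The system shall log every sensor reading with a timestamp.",
    "The controller shall close the valve when pressure exceeds 5 bar.",
    "The user interface shall respond to input within 200 ms.",
    "The system shall be available 99.9% of the time.",
]
labels = ["F", "F", "NF", "NF"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(reqs, labels)

print(clf.predict(["The gateway shall encrypt all outbound messages."]))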

    Improving Requirements Completeness: Automated Assistance through Large Language Models

    Natural language (NL) is arguably the most prevalent medium for expressing systems and software requirements. Detecting incompleteness in NL requirements is a major challenge. One approach to identify incompleteness is to compare requirements with external sources. Given the rise of large language models (LLMs), an interesting question arises: Are LLMs useful external sources of knowledge for detecting potential incompleteness in NL requirements? This article explores this question by utilizing BERT. Specifically, we employ BERT's masked language model (MLM) to generate contextualized predictions for filling masked slots in requirements. To simulate incompleteness, we withhold content from the requirements and assess BERT's ability to predict terminology that is present in the withheld content but absent in the disclosed content. BERT can produce multiple predictions per mask. Our first contribution is determining the optimal number of predictions per mask, striking a balance between effectively identifying omissions in requirements and mitigating noise in the predictions. Our second contribution involves designing a machine learning-based filter to post-process BERT's predictions and further reduce noise. We conduct an empirical evaluation using 40 requirements specifications from the PURE dataset. Our findings indicate that: (1) BERT's predictions effectively highlight terminology that is missing from requirements, (2) BERT outperforms simpler baselines in identifying relevant yet missing terminology, and (3) our filter significantly reduces noise in the predictions, enhancing BERT's effectiveness as a tool for completeness checking of requirements. Comment: Submitted to Requirements Engineering Journal (REJ) - REFSQ'23 Special Issue. arXiv admin note: substantial text overlap with arXiv:2302.0479
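
    A minimal sketch of the underlying MLM step, assuming the Hugging Face transformers library and bert-base-uncased; the example requirement is hypothetical, and the article's tuned number of predictions per mask and its post-processing filter are not reproduced here.

```python
# Fill-mask sketch: let BERT propose terminology for a withheld slot.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Hypothetical requirement with one withheld (masked) slot.
requirement = "The system shall [MASK] all user credentials before storage."

for pred in fill_mask(requirement, top_k=5):  # top_k trades recall vs. noise
    print(f"{pred['token_str']:>12}  score={pred['score']:.3f}")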

    Knowledge Extraction from Natural Language Requirements into a Semantic Relation Graph

    Knowledge extraction and representation aim to identify information and transform it into a machine-readable format. Knowledge representations support information retrieval tasks such as searching for single statements, documents, or metadata. Requirements specifications of complex systems such as automotive software systems are usually divided into different subsystem specifications. Nevertheless, there are semantic relations between individual documents of the separated subsystems which have to be considered in further processes (e.g. dependencies). If requirements engineers or other developers are not aware of these relations, this can lead to inconsistencies or malfunctions in the overall system. Therefore, there is a strong need for tool support to detect semantic relations in a set of large natural language requirements specifications. In this work we present a knowledge extraction approach based on an explicit knowledge representation of the content of natural language requirements as a semantic relation graph. Our approach is fully automated and includes an NLP pipeline to transform unrestricted natural language requirements into a graph. We split the natural language into different parts and relate them to each other based on their semantic relation. In addition to semantic relations, other relationships can also be included in the graph. We envision using a semantic search algorithm such as spreading activation to allow users to search for different semantic relations in the graph.
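
    A minimal sketch of such a pipeline, assuming spaCy for parsing and NetworkX for the graph. The subject-verb-object extraction below is a simplification of the paper's approach, and the example requirements are hypothetical.

```python
# Simplified sketch: extract subject-verb-object triples and build a graph.
import spacy
import networkx as nx

nlp = spacy.load("en_core_web_sm")
graph = nx.MultiDiGraph()

requirements = [  # hypothetical subsystem requirements
    "The battery controller reports the charge level to the gateway.",
    "The gateway forwards the charge level to the dashboard.",
]

for req in requirements:
    for token in nlp(req):
        if token.pos_ == "VERB":
            subjects = [c for c in token.children if c.dep_ == "nsubj"]
            objects = [c for c in token.children if c.dep_ in ("dobj", "obj")]
            for s in subjects:
                for o in objects:
                    graph.add_edge(s.lemma_, o.lemma_, relation=token.lemma_)

print(list(graph.edges(data=True)))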

    Automated Handling of Anaphoric Ambiguity in Requirements: A Multi-solution Study

    Ambiguity is a pervasive issue in natural-language requirements. A common source of ambiguity in requirements is when a pronoun is anaphoric. In requirements engineering, anaphoric ambiguity occurs when a pronoun can plausibly refer to different entities and thus be interpreted differently by different readers. In this paper, we develop an accurate and practical automated approach for handling anaphoric ambiguity in requirements, addressing both ambiguity detection and anaphora interpretation. In view of the multiple competing natural language processing (NLP) and machine learning (ML) technologies that one can utilize, we simultaneously pursue six alternative solutions, empirically assessing each using a collection of ~1,350 industrial requirements. The alternative solution strategies that we consider are natural choices induced by the existing technologies; these choices frequently arise in other automation tasks involving natural-language requirements. A side-by-side empirical examination of these choices helps develop insights about the usefulness of different state-of-the-art NLP and ML technologies for addressing requirements engineering problems. For the ambiguity detection task, we observe that supervised ML outperforms both a large-scale language model, SpanBERT (a variant of BERT), and a solution assembled from off-the-shelf NLP coreference resolvers. In contrast, for anaphora interpretation, SpanBERT yields the most accurate solution. In our evaluation, (1) the best solution for anaphoric ambiguity detection has an average precision of ~60% and a recall of 100%, and (2) the best solution for anaphora interpretation (resolution) has an average success rate of ~98%.
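
    The study compares six solutions; as one illustrative route, here is a sketch of off-the-shelf coreference resolution using AllenNLP's public SpanBERT-based model. The model URL is AllenNLP's published checkpoint; the example requirement is hypothetical.

```python
# Sketch: resolve an anaphoric pronoun with an off-the-shelf coref model.
from allennlp.predictors.predictor import Predictor

predictor = Predictor.from_path(
    "https://storage.googleapis.com/allennlp-public-models/"
    "coref-spanbert-large-2021.03.10.tar.gz"
)

text = ("The gateway shall forward the alarm to the operator console. "
        "It shall log the event.")  # "It" is the anaphoric pronoun

result = predictor.predict(document=text)
tokens = result["document"]
for cluster in result["clusters"]:  # each cluster: list of [start, end] spans
    mentions = [" ".join(tokens[s:e + 1]) for s, e in cluster]
    print(" <-> ".join(mentions))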

    NLG4RE: How NL generation can support validation in RE

    Context and motivation: All too frequently, functional requirements (FRs) for a (software) system are unclear. Written in natural language, FRs are underspecified for software developers; written in formal language, FRs are insufficiently comprehensible for users. This is a well-known problem in RE. As long as this either/or dichotomy exists, FRs cannot be a "basis for common agreement among all parties involved", as Barry Boehm puts it.

    Question/problem: On the one hand, FRs should unambiguously specify the functional behaviour of the system to be written or adapted, and on the other hand be fully understandable by the customer who must agree with them. What is required to achieve this goal?

    Principal ideas/results: A specification must describe the Statics as well as the Dynamics. In our approach it consists of a Conceptual Data Model (the data structure, i.e., the Statics) plus a set of System Sequence Descriptions (SSDs) representing the processes (i.e., the Dynamics). SSDs schematically depict the interactions between the primary actor (user), the system (as a black box), and other actors (if any), including the messages between them. We provide a set of rules to generate natural language expressions from both the Conceptual Data Model and the SSDs that are understandable by the user ('informalisation of formal requirements'). Generating understandable representations of a specification is relevant for requirements validation tasks.

    Contribution to validation: We introduce a form of Natural Language Generation (the NLG in the title) by defining a grammar and mapping rules to precise and unambiguous expressions in natural language, in order to improve understandability of the FRs and the data model by the user community.
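
    The paper's grammar and mapping rules are not reproduced here. As an illustration of the informalisation idea only, a hypothetical template-based mapping from one SSD message to an English sentence; the SSDMessage structure and the rule itself are invented for this sketch.

```python
# Hypothetical sketch: informalising an SSD message via one mapping rule.
from dataclasses import dataclass

@dataclass
class SSDMessage:  # one interaction in a System Sequence Description
    sender: str
    receiver: str
    message: str
    arguments: list[str]

def informalise(msg: SSDMessage) -> str:
    """Map an SSD message to a natural-language sentence (toy rule)."""
    args = " and ".join(msg.arguments) if msg.arguments else "no data"
    return (f"The {msg.sender} sends the {msg.receiver} a "
            f"'{msg.message}' request, providing {args}.")

msg = SSDMessage("customer", "system", "placeOrder", ["item id", "quantity"])
print(informalise(msg))
# -> The customer sends the system a 'placeOrder' request,
#    providing item id and quantity.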