Natural language (NL) is arguably the most prevalent medium for expressing
systems and software requirements. Detecting incompleteness in NL requirements
is a major challenge. One approach to identifying incompleteness is to compare
requirements with external sources. Given the rise of large language models
(LLMs), an interesting question arises: Are LLMs useful external sources of
knowledge for detecting potential incompleteness in NL requirements? This
article explores this question using BERT. Specifically, we employ
BERT's masked language model (MLM) to generate contextualized predictions for
filling masked slots in requirements. To simulate incompleteness, we withhold
content from the requirements and assess BERT's ability to predict terminology
that is present in the withheld content but absent in the disclosed content.
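
To make the setup concrete, the following is a minimal sketch of how such
contextualized predictions can be obtained from BERT's MLM via the Hugging
Face Transformers library; the model checkpoint, the example requirement, and
the number of predictions shown here are illustrative assumptions, not the
exact configuration used in our study.

    from transformers import pipeline

    # Minimal sketch (assumed setup, not the study's exact pipeline):
    # query BERT's masked language model for candidate terminology to
    # fill a masked slot in a requirement.
    fill_mask = pipeline("fill-mask", model="bert-base-uncased")

    # Hypothetical requirement with one withheld (masked) slot.
    requirement = "The system shall log every failed [MASK] attempt."

    # top_k controls how many predictions BERT returns per mask.
    for p in fill_mask(requirement, top_k=5):
        print(f"{p['token_str']:>12}  score={p['score']:.3f}")
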
BERT can produce multiple predictions per mask. Our first contribution is
determining the optimal number of predictions per mask, striking a balance
between effectively identifying omissions in requirements and mitigating noise
present in the predictions. Our second contribution involves designing a
machine learning-based filter to post-process BERT's predictions and further
reduce noise. We conduct an empirical evaluation using 40 requirements
specifications from the PURE dataset. Our findings indicate that: (1) BERT's
predictions effectively highlight terminology that is missing from
requirements, (2) BERT outperforms simpler baselines in identifying relevant
yet missing terminology, and (3) our filter significantly reduces noise in the
predictions, enhancing BERT's effectiveness as a tool for completeness checking
of requirements.
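
As a complement to the second contribution, the following is a minimal sketch
of the filtering idea: a standard classifier trained to separate relevant
predictions from noise. The feature set (MLM confidence, corpus frequency,
prediction rank), the toy training data, and the choice of a random forest
are assumptions made here for illustration, not the exact filter design
described in the article.

    from sklearn.ensemble import RandomForestClassifier
    import numpy as np

    # Illustrative sketch of post-processing BERT's predictions with a
    # machine-learning filter. Feature columns (assumed for this example):
    # [MLM confidence score, corpus frequency of the token, prediction rank].
    X_train = np.array([
        [0.42, 120.0, 1],
        [0.03,   5.0, 9],
        [0.31,  80.0, 2],
        [0.01,   2.0, 10],
    ])
    y_train = np.array([1, 0, 1, 0])  # 1 = relevant terminology, 0 = noise

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X_train, y_train)

    # Apply the filter: keep only the predictions classified as relevant.
    candidates = np.array([[0.27, 60.0, 3], [0.02, 1.0, 8]])
    kept = candidates[clf.predict(candidates) == 1]
    print(kept)
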