
    Natural Language Processing: Emerging Neural Approaches and Applications

    This Special Issue highlights the most recent research being carried out in the NLP field and discusses related open issues, with a particular focus on emerging approaches for language learning, understanding, production, and grounding, learned interactively or autonomously from data in cognitive and neural systems, as well as on their potential or real applications in different domains.

    Why and How to Extract Conditional Statements From Natural Language Requirements

    Functional requirements often describe system behavior by relating events to each other, e.g., "If the system detects an error (e_1), an error message shall be shown (e_2)". Such conditionals consist of two parts: the antecedent (e_1) and the consequent (e_2), which convey strong semantic information about the intended behavior of a system. Automatically extracting conditionals from texts enables several analytical disciplines and is already used for information retrieval and question answering. We found that automated conditional extraction can also provide added value to Requirements Engineering (RE) by facilitating the automatic derivation of acceptance tests from requirements. However, the potential of extracting conditionals has not yet been leveraged for RE. We are convinced that this has two principal reasons: 1) The extent, form, and complexity of conditional statements in RE artifacts are not well understood. We do not know how conditionals are formulated and logically interpreted by RE practitioners. This hinders the development of suitable approaches for extracting conditionals from RE artifacts. 2) Existing methods fail to extract conditionals from unrestricted natural language (NL) in fine-grained form. That is, they do not consider the combinatorics between antecedents and consequents, and they do not allow them to be split into finer-grained text fragments (e.g., variable and condition), rendering the extracted conditionals unsuitable for RE downstream tasks such as test case derivation. This thesis contributes to both areas.

    In Part I, we present empirical results on the prevalence and logical interpretation of conditionals in RE artifacts. Our case study corroborates that conditionals are widely used in both traditional and agile requirements such as acceptance criteria. We found that conditionals in requirements mainly occur in explicit, marked form and may include up to three antecedents and two consequents. Hence, the extraction approach needs to understand conjunctions, disjunctions, and negations to fully capture the relation between antecedents and consequents. We also found that conditionals are a source of ambiguity and that there is not just one way to interpret them formally. This affects any automated analysis that builds upon formalized requirements (e.g., inconsistency checking) and may also influence guidelines for writing requirements.

    Part II presents our tool-supported approach CiRA, which is capable of detecting conditionals in NL requirements and extracting them in fine-grained form. For the detection, CiRA uses syntactically enriched BERT embeddings combined with a softmax classifier and outperforms existing methods (macro-F1: 82%). Our experiments show that a sigmoid classifier built on RoBERTa embeddings is best suited to extract conditionals in fine-grained form (macro-F1: 86%). We disclose our code, data sets, and trained models to facilitate replication. CiRA is available at http://www.cira.bth.se/demo/.

    In Part III, we highlight how the extraction of conditionals from requirements can help to create acceptance tests automatically. First, we motivate this use case in an empirical study and demonstrate that the lack of adequate acceptance tests is one of the major problems in agile testing. Second, we show how extracted conditionals can be mapped to a Cause-Effect-Graph from which test cases can be derived automatically. We demonstrate the feasibility of our approach in a case study with three industry partners. In our study, 71.8% of 578 manually created test cases could be generated automatically, and our approach discovered 80 relevant test cases that were missed in manual test case design. At the end of this thesis, the reader will have an understanding of (1) the notion of conditionals in RE artifacts, (2) how to extract them in fine-grained form, and (3) the added value that the extraction of conditionals can provide to RE.
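
    As an illustration of the detection step described above, the sketch below classifies a sentence as conditional or non-conditional with a BERT sequence classifier and a softmax head. It is a minimal sketch, not the CiRA implementation: the model name, label mapping, and threshold are assumptions, the syntactic enrichment is omitted, and the classifier head here is untrained (in practice the authors' released models would be loaded).

```python
# Minimal sketch (not the CiRA implementation) of conditional-sentence
# detection with a BERT sequence classifier and a softmax head.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # assumption; CiRA's syntactic enrichment is omitted

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

def is_conditional(sentence: str) -> bool:
    """Classify a requirement sentence as conditional vs. non-conditional."""
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)  # softmax over the two classes
    # NOTE: with an untrained head the prediction is arbitrary; load the
    # authors' released CiRA models for meaningful output.
    return bool(probs[0, 1] > 0.5)  # assumption: index 1 = "conditional"

print(is_conditional("If the system detects an error, an error message shall be shown."))
```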

    Signaling coherence relations in text generation: A case study of German temporal discourse markers

    This thesis addresses the question of discourse marker choice in automatic (multilingual) text generation (MLG), in particular the issue of signaling temporal coherence relations on the linguistic surface by means of discourse markers such as nachdem, als, and bevor. Current text generation systems do not pay attention to the fine-grained differences in meaning (semantic and pragmatic) between similar discourse markers. Yet choosing the appropriate marker in a given context requires detailed knowledge of the function and form of a wide range of discourse markers, and a generation architecture that integrates discourse marker choice into the overall generation process. This thesis makes contributions to these two distinct areas of research.

    (1) Linguistic description and representation: The thesis provides a comprehensive analysis of the semantic, pragmatic, and syntactic properties of German temporal discourse markers. The results are merged into a functional classification of German temporal conjunctive relations (following the Systemic Functional Linguistics (SFL) approach to language). This classification is compared to existing accounts for English and Dutch. Further, the thesis addresses the question of the nature of coherence relations and proposes a paradigmatic description of coherence relations along three dimensions (ideational, interpersonal, textual), yielding composite coherence relations.

    (2) Discourse marker choice in text generation: The thesis proposes a discourse marker lexicon as a generic resource for storing discourse marker meaning and usage, and defines the shape of individual lexicon entries and the global organisation of the lexicon. Sample entries for German and English temporal discourse markers are given. Finally, a computational model for automatic discourse marker choice that exploits the discourse marker lexicon is presented.
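
    To make the idea of a discourse marker lexicon concrete, here is a minimal sketch of what an entry and a lookup-based choice function could look like. The feature names and values are simplified assumptions for illustration; they do not reproduce the thesis's actual lexicon schema or its SFL-based classification.

```python
# Illustrative sketch of a discourse marker lexicon: entries pair a marker
# with simplified semantic/pragmatic features used at generation time.
from dataclasses import dataclass
from typing import Optional

@dataclass
class MarkerEntry:
    marker: str
    language: str
    temporal_relation: str  # assumption: "anteriority" | "simultaneity" | "posteriority"
    register: str           # assumption: "neutral" | "formal"

LEXICON = [
    MarkerEntry("nachdem", "de", "anteriority", "neutral"),
    MarkerEntry("als", "de", "simultaneity", "neutral"),
    MarkerEntry("bevor", "de", "posteriority", "neutral"),
]

def choose_marker(language: str, temporal_relation: str,
                  register: str = "neutral") -> Optional[str]:
    """Pick the first lexicon entry whose features match the generation context."""
    for entry in LEXICON:
        if (entry.language, entry.temporal_relation, entry.register) == (
                language, temporal_relation, register):
            return entry.marker
    return None

print(choose_marker("de", "anteriority"))  # -> "nachdem"
```

    A real generation system would match on many more dimensions (semantic, pragmatic, syntactic constraints of the clause pair), but the lookup pattern stays the same: the lexicon decouples marker knowledge from the generation architecture that consults it.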

    Essential Speech and Language Technology for Dutch: Results by the STEVIN-programme

    Computational Linguistics; Germanic Languages; Artificial Intelligence (incl. Robotics); Computing Methodologies

    How does rumination impact cognition? A first mechanistic model.

    Rumination is a process of uncontrolled, narrowly-focused negative thinking that is often self-referential, and that is a hallmark of depression. Despite its importance, little is known about its cognitive mechanisms. Rumination can be thought of as a specific, constrained form of mind-wandering. Here, we introduce a cognitive model of rumination that we developed on the basis of our existing model of mind-wandering. The rumination model implements the hypothesis that rumination is caused by maladaptive habits of thought. These habits of thought are modelled by adjusting the number of memory chunks and their associative structure, which changes the sequence of memories that are retrieved during mind-wandering, such that during rumination the same set of negative memories is retrieved repeatedly. The implementation of habits of thought was guided by empirical data from an experience sampling study in healthy and depressed participants. On the basis of this empirically-derived memory structure, our model naturally predicts the declines in cognitive task performance that are typically observed in depressed patients. This study demonstrates how we can use cognitive models to better understand the cognitive mechanisms underlying rumination and depression.
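
    The following toy sketch illustrates the core idea described above: an associative memory structure in which negative memories strongly cue each other, so that a retrieval chain keeps cycling through the same small set. The memory contents and weights are invented for illustration; the actual model is built on the authors' existing cognitive model of mind-wandering, not on this simple random walk.

```python
# Toy sketch (not the authors' model) of how associative structure can trap
# retrieval in a loop: negative memories strongly cue each other.
import random

MEMORIES = ["failure at work", "argument", "holiday", "good grade"]

# Row i gives the association strength from memory i to each candidate memory.
# Illustrative values: the two negative memories form a tight cycle.
RUMINATIVE_WEIGHTS = [
    [0.0, 0.8, 0.1, 0.1],  # "failure at work" strongly cues "argument"
    [0.8, 0.0, 0.1, 0.1],  # "argument" strongly cues "failure at work"
    [0.3, 0.3, 0.0, 0.4],
    [0.3, 0.3, 0.4, 0.0],
]

def retrieve_chain(start: int, steps: int, rng: random.Random) -> list[str]:
    """Follow association-weighted retrievals from a starting memory."""
    chain, current = [MEMORIES[start]], start
    for _ in range(steps):
        current = rng.choices(range(len(MEMORIES)),
                              weights=RUMINATIVE_WEIGHTS[current])[0]
        chain.append(MEMORIES[current])
    return chain

# Mostly cycles between the two negative memories, mimicking rumination.
print(retrieve_chain(0, 8, random.Random(42)))
```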

    Automated code compliance checking in the construction domain using semantic natural language processing and logic-based reasoning

    Construction projects must comply with various regulations. The manual process of checking compliance with regulations is costly, time-consuming, and error-prone. With the advancement in computing technology, there have been many research efforts to automate the compliance checking process, and many software development efforts led by industry bodies/associations, software companies, and/or government organizations to develop automated compliance checking (ACC) systems. However, there are two main gaps in the existing ACC efforts: (1) manual effort is needed for extracting requirements from regulatory documents and encoding these requirements in a computer-processable rule format; and (2) there is a lack of a semantic representation for supporting automated compliance reasoning that is non-proprietary, non-hidden, and user-understandable and testable.

    To address these gaps, this thesis proposes a new ACC method that: (1) utilizes semantic natural language processing (NLP) techniques to automatically extract regulatory information from building codes and design information from building information models (BIMs); and (2) utilizes a semantic, logic-based representation to represent and reason about the extracted regulatory information and design information for compliance checking. The proposed method is composed of four main methods/algorithms that are combined in one computational framework: (1) a semantic, rule-based method and algorithm that leverage NLP techniques to automatically extract regulatory information from building codes and represent the extracted information as semantic tuples; (2) a semantic, rule-based method and algorithm that leverage NLP techniques to automatically transform the extracted regulatory information into logic rules to prepare for automated reasoning; (3) a semantic, rule-based information extraction and transformation method and algorithm to automatically extract design information from BIMs and transform the extracted information into logic facts to prepare for automated reasoning; and (4) a logic-based information representation and compliance reasoning schema to represent regulatory and design information for enabling the automated compliance reasoning process.

    To test the proposed method, a building information model test case was developed based on the Duplex Apartment Project from the buildingSMART alliance of the National Institute of Building Sciences. The test case was checked for compliance with a randomly selected chapter, Chapter 19, of the International Building Code 2009. Compared with a manually developed gold standard, the method achieved 87.6% precision and 98.7% recall in noncompliance detection on the testing data.
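
    A much-simplified sketch of the pipeline described above: a regulatory requirement represented as a semantic tuple is checked against design facts extracted from a BIM. The tuple schema, the rule content, and the fact format are illustrative assumptions; the thesis uses semantic NLP extraction and a logic-based reasoning schema rather than plain Python.

```python
# Simplified sketch of the tuple -> rule -> check pipeline. The provision and
# fact schema below are invented for illustration, not taken from the thesis.

# Regulatory requirement as a semantic tuple:
# (subject, attribute, comparator, quantity, unit)
rule = ("slab", "thickness", ">=", 0.09, "m")  # hypothetical code provision

# Design facts extracted from a BIM, one dict per element instance.
design_facts = [
    {"subject": "slab", "id": "S1", "thickness": 0.10},
    {"subject": "slab", "id": "S2", "thickness": 0.08},
]

COMPARATORS = {">=": lambda a, b: a >= b, "<=": lambda a, b: a <= b}

def check_compliance(rule, facts):
    """Return (element id, compliant?) for every fact the rule applies to."""
    subject, attribute, comparator, quantity, _unit = rule
    results = []
    for fact in facts:
        if fact["subject"] == subject and attribute in fact:
            results.append((fact["id"],
                            COMPARATORS[comparator](fact[attribute], quantity)))
    return results

print(check_compliance(rule, design_facts))  # [('S1', True), ('S2', False)]
```

    In the thesis the reasoning step is logic-based (rules and facts in a formal logic representation), which keeps the encoded requirements non-proprietary and inspectable; the Python dictionary check above only mirrors the control flow.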

    The role of background knowledge in reading comprehension of subject-specific texts

    This thesis investigates the impact of background knowledge on L2 reading comprehension of subject-specific texts, in particular its interaction with grammar knowledge. It explores how different levels of discipline-related background knowledge, grammar knowledge, and self-reported familiarity affect individual differences in L2 reading comprehension in terms of its outcomes and process. A number of studies have made assumptions about readers' knowledge based on their study disciplines or on readers' own reports; this study therefore also explores the difference between two operationalizations: tested background knowledge and self-reported familiarity.

    A mixed-methods approach was used, combining two studies: a testing study and a think-aloud study. Altogether 404 students of the School of Economics and Business, University of Ljubljana, Slovenia took part: 22 in the piloting study and 382 in the main study, of whom 358 were engaged in the testing study and 24 in the think-aloud study. The quantitative and qualitative datasets were obtained from five research instruments: a grammar test, a test of discipline-related background knowledge, a reading comprehension test based on three finance texts, a post-reading questionnaire, and think-aloud verbal protocols.

    The results of multiple regression revealed that tested background knowledge was a significant, medium-strength predictor of reading comprehension, slightly stronger than grammar knowledge. In contrast, self-reported familiarity was not found to impact reading comprehension and was not its predictor. This evidence casts doubt on self-reporting as an operationalization of knowledge in L2 reading. Apart from having a facilitative effect on L2 reading comprehension, background knowledge was found to have compensatory and additive roles when interacting with grammar knowledge. The findings showed that readers with higher discipline-related background knowledge could use it to make up for lower grammar knowledge and vice versa, suggesting a compensation effect between the two variables. In addition, the results revealed that readers were able to use their background knowledge regardless of their level of grammar knowledge, albeit slightly less at higher levels of grammar knowledge. This finding suggests that the threshold hypothesis could not be supported. Finally, the group of students with both high background knowledge and high grammar knowledge outperformed the other groups in reading comprehension, which suggests that the two variables affect reading comprehension in an additive way.

    The qualitative data from verbal protocols in the think-aloud study and readers' scores from the testing study were used to compare the processing patterns and strategies of readers with high and low background knowledge. Although both groups were found to use the same types of strategies, they differed in the frequency of their use. While the high background knowledge group used more correct paraphrases, elaboration, inferences, and evaluating, the low knowledge group adopted a more local-level approach, paying more attention to individual words, phrases, and sentences and reporting various comprehension problems and an inability to see the bigger picture. The results suggest differences between the groups with different levels of background knowledge with regard to semantic and pragmatic processing at the local and global level. Analysis of the verbal protocol and post-questionnaire data revealed that specialist vocabulary was the main source of difficulty in L2 reading comprehension of subject-specific texts.
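
    For readers who want to see the shape of the analysis, the sketch below runs the kind of multiple regression the study reports: reading comprehension regressed on tested background knowledge and grammar knowledge, with an interaction term to probe the compensation effect. The data are synthetic and the coefficients invented; the sketch does not reproduce the study's results.

```python
# Hedged sketch of a multiple regression with an interaction term, on
# synthetic data (not the study's data or effect sizes).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 358  # matches the testing-study sample size, for flavour only
background = rng.normal(0, 1, n)
grammar = rng.normal(0, 1, n)
# Invented coefficients: positive main effects plus a small negative
# interaction, the pattern a compensation effect would produce.
reading = (0.4 * background + 0.3 * grammar
           - 0.1 * background * grammar + rng.normal(0, 1, n))

df = pd.DataFrame({"reading": reading,
                   "background": background,
                   "grammar": grammar})
# "background * grammar" expands to both main effects and the interaction.
model = smf.ols("reading ~ background * grammar", data=df).fit()
print(model.summary())
```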