
    Why and How to Extract Conditional Statements From Natural Language Requirements

    Functional requirements often describe system behavior by relating events to each other, e.g., "If the system detects an error (e_1), an error message shall be shown (e_2)". Such conditionals consist of two parts, the antecedent (e_1) and the consequent (e_2), which convey strong semantic information about the intended behavior of a system. Automatically extracting conditionals from text enables several analytical disciplines and is already used for information retrieval and question answering. We found that automated conditional extraction can also add value to Requirements Engineering (RE) by facilitating the automatic derivation of acceptance tests from requirements. However, the potential of extracting conditionals has not yet been leveraged for RE. We are convinced that this has two principal reasons: 1) The extent, form, and complexity of conditional statements in RE artifacts are not well understood. We do not know how conditionals are formulated and logically interpreted by RE practitioners, which hinders the development of suitable approaches for extracting conditionals from RE artifacts. 2) Existing methods fail to extract conditionals from unrestricted natural language (NL) in fine-grained form. That is, they do not consider the combinatorics between antecedents and consequents, nor do they allow splitting them into finer-grained text fragments (e.g., variable and condition), rendering the extracted conditionals unsuitable for RE downstream tasks such as test case derivation. This thesis contributes to both areas.

    In Part I, we present empirical results on the prevalence and logical interpretation of conditionals in RE artifacts. Our case study corroborates that conditionals are widely used in both traditional and agile requirements such as acceptance criteria. We found that conditionals in requirements mainly occur in explicit, marked form and may include up to three antecedents and two consequents. Hence, the extraction approach needs to understand conjunctions, disjunctions, and negations to fully capture the relation between antecedents and consequents. We also found that conditionals are a source of ambiguity and that there is not just one way to interpret them formally. This affects any automated analysis that builds upon formalized requirements (e.g., inconsistency checking) and may also influence guidelines for writing requirements.

    Part II presents our tool-supported approach CiRA, capable of detecting conditionals in NL requirements and extracting them in fine-grained form. For the detection, CiRA uses syntactically enriched BERT embeddings combined with a softmax classifier and outperforms existing methods (macro-F_1: 82%). Our experiments show that a sigmoid classifier built on RoBERTa embeddings is best suited to extract conditionals in fine-grained form (macro-F_1: 86%). We disclose our code, data sets, and trained models to facilitate replication. CiRA is available at http://www.cira.bth.se/demo/.

    In Part III, we highlight how the extraction of conditionals from requirements can help to create acceptance tests automatically. First, we motivate this use case in an empirical study and demonstrate that the lack of adequate acceptance tests is one of the major problems in agile testing. Second, we show how extracted conditionals can be mapped to a Cause-Effect-Graph, from which test cases can be derived automatically. We demonstrate the feasibility of our approach in a case study with three industry partners. In our study, 71.8% of 578 manually created test cases could be generated automatically. Furthermore, our approach discovered 80 relevant test cases that had been missed in manual test case design. At the end of this thesis, the reader will have an understanding of (1) the notion of conditionals in RE artifacts, (2) how to extract them in fine-grained form, and (3) the added value that the extraction of conditionals can provide to RE.
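
    To make the Cause-Effect-Graph step more concrete, the following is a minimal sketch of how an extracted conditional with multiple antecedents might be expanded into a decision table of test cases. The data model and all names are hypothetical illustrations, not CiRA's actual representation.

```python
from itertools import product

# Hypothetical data model for one extracted conditional; CiRA's actual
# representation differs. Each antecedent is a (variable, condition, negated)
# fragment; antecedents are joined by AND or OR.
antecedents = [
    ("the system", "detects an error", False),  # e_1
    ("logging", "is enabled", False),           # invented second antecedent
]
combinator = "AND"
consequent = ("an error message", "shall be shown")  # e_2

def antecedents_hold(values):
    """Apply per-antecedent negation, then combine with the conjunction type."""
    effective = [v != neg for v, (_, _, neg) in zip(values, antecedents)]
    return all(effective) if combinator == "AND" else any(effective)

# Enumerate every truth-value combination of the antecedents (the columns of
# a cause-effect decision table) and derive each test's expected outcome.
for values in product([True, False], repeat=len(antecedents)):
    outcome = "occurs" if antecedents_hold(values) else "does not occur"
    given = ", ".join(
        f"'{var} {cond}' is {val}"
        for (var, cond, _), val in zip(antecedents, values)
    )
    print(f"Test: given {given}, '{consequent[0]} {consequent[1]}' {outcome}")
```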

    DTCRSKG: A Deep Travel Conversational Recommender System Incorporating Knowledge Graph

    In the era of information explosion, it is difficult for people to obtain the information they want effectively. In tourism, travel recommender systems based on big travel data have developed rapidly over the last decade. However, most work focuses on click logs, visit history, or ratings, and dynamic preference prediction is absent. As a result, there are significant gaps in both datasets and recommender models. To address these gaps, we first constructed two linked, human-annotated datasets for travel conversational recommendation: an interaction-sequence dataset and a dialogue dataset. The former is used to explore users' static preference characteristics, while the latter captures dynamic changes in user preferences. We then proposed and evaluated BERT-based baseline models for the travel conversational recommender system and compared them with several representative non-conversational and conversational recommender system models. Extensive experiments demonstrated the effectiveness and robustness of our approach on conversational recommendation tasks. Our work extends the scope of travel conversational recommender systems, and our annotated data can facilitate related research.
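
    As a rough illustration of what a BERT-based baseline for this task can look like, the sketch below scores candidate travel items against a dialogue history with a Hugging Face sequence classifier. The model name, the dialogue/item pairing, and the toy data are assumptions; the paper's baselines are trained on the annotated datasets, whereas this untuned model yields arbitrary scores.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical baseline: encode the dialogue history and a candidate item
# together and let a BERT classifier score their relevance. Without
# fine-tuning on the annotated data, the classification head is randomly
# initialized and the scores below are meaningless placeholders.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # relevant / not relevant
)
model.eval()

dialogue = "User: I want a quiet beach holiday. Bot: Any region? User: Southeast Asia."
candidates = ["Phu Quoc island resort", "Bangkok night-market tour"]

with torch.no_grad():
    for item in candidates:
        # Sentence-pair input: (dialogue history, candidate item description)
        inputs = tokenizer(dialogue, item, return_tensors="pt", truncation=True)
        logits = model(**inputs).logits
        score = torch.softmax(logits, dim=-1)[0, 1].item()  # P(relevant)
        print(f"{item}: {score:.3f}")
```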

    On reliability of patch correctness assessment


    Survey over Existing Query and Transformation Languages

    A widely acknowledged obstacle to realizing the vision of the Semantic Web is the inability of many current Semantic Web approaches to cope with data available in such divergent representation formalisms as XML, RDF, or Topic Maps. A common query language is the first step toward transparent access to data in any of these formats. To further the understanding of the requirements and approaches proposed for query languages in the conventional as well as the Semantic Web, this report surveys a large number of query languages for accessing XML, RDF, or Topic Maps. It is the first systematic survey to consider query languages from all these areas. From the detailed survey, a common classification scheme is derived that is useful for understanding and differentiating languages within and among all three areas.
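
    To see why transparent access across formats is non-trivial, here is the same simple question ("all book titles") asked against XML via XPath and against RDF via SPARQL, embedded in Python using the lxml and rdflib libraries. The data is invented for illustration; the report itself surveys the languages rather than any particular toolkit.

```python
from lxml import etree
from rdflib import Graph

# The same query expressed in two format-specific languages.
xml_doc = etree.fromstring(
    "<library><book><title>Semantic Web Primer</title></book></library>"
)
print(xml_doc.xpath("//book/title/text()"))  # XPath over XML

rdf_data = """
@prefix dc: <http://purl.org/dc/elements/1.1/> .
<urn:book1> dc:title "Semantic Web Primer" .
"""
g = Graph().parse(data=rdf_data, format="turtle")
query = "SELECT ?t WHERE { ?s <http://purl.org/dc/elements/1.1/title> ?t }"
for row in g.query(query):  # SPARQL over RDF
    print(row.t)
```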

    Learning Code Change Semantics for Patch Correctness Assessment in Program Repair

    State-of-the-art Automated Program Repair (APR) techniques currently produce patches that manual evaluation reveals to be overfitting, and these overfitting patches often worsen the original program, leading to negative effects such as introducing security vulnerabilities or removing useful features. This obstructs the development of APR techniques that rely on feedback from correctly generated patches, and the expense of developers' manual debugging has shifted to evaluating patch correctness. Automated assessment of patch correctness has the potential to reduce patch validation costs and accelerate the identification of practically correct patches, making it easier for developers to adopt APR techniques. While the approaches proposed in the literature have been demonstrated to be effective, several challenges remain unexplored and warrant further investigation. This thesis begins with an empirical analysis of a prevalent hypothesis concerning patch correctness, leading to a patch correctness prediction framework based on representation learning. Second, we propose to validate correct patches with a novel heuristic on the relationship between patches and their associated failing test cases. Lastly, we present a novel perspective that assesses patch correctness with natural language processing. Our contributions to the research field through this thesis are as follows: 1) assessing the feasibility of utilizing advancements in deep representation learning to generate patch embeddings suitable for reasoning about correctness; consequently, we establish Leopard, a supervised learning-based patch correctness prediction framework. 2) comparing code embeddings and engineered features for patch correctness prediction, and investigating their combination in Panther (an upgraded version of Leopard) for more accurate classification; additionally, we use the SHAP explainability model to reveal the essential aspects of patch correctness by interpreting the underlying causes of prediction performance across features and classifiers. 3) presenting and validating a key hypothesis: when different programs fail to pass similar test cases, it is likely that these programs require similar code changes. Based on this heuristic, we propose BATS, an approach that predicts patch correctness by statically comparing generated patches against previous correct patches for bugs that failed on similar tests. 4) proposing a novel perspective on patch correctness assessment: a correct patch implements changes that address the issue caused by the buggy behavior. By leveraging bug reports as an explicit description of the bug, we build Quatrain, a supervised learning approach that uses a deep NLP model to predict the relevance between a bug report and a patch description.
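
    Of these contributions, the BATS heuristic is perhaps the easiest to sketch: retrieve historical bugs whose failing tests resemble the new failing test, then judge the generated patch by its similarity to the patches that correctly fixed those bugs. The rendering below is simplified and hypothetical; the embedding functions, threshold, and data layout are placeholders, not the thesis's actual models.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def predict_correctness(new_patch, failing_test, history,
                        embed_test, embed_patch, sim_threshold=0.8):
    """BATS-style heuristic (simplified).

    history: list of (failing_test, correct_patch) pairs from past repairs.
    embed_test / embed_patch: placeholder callables standing in for learned
    representation models; the threshold value is likewise invented.
    """
    t_new = embed_test(failing_test)
    # 1) Retrieve past bugs whose failing tests are similar to the new one.
    similar = [(t, p) for t, p in history
               if cosine(t_new, embed_test(t)) >= sim_threshold]
    if not similar:
        return None  # heuristic not applicable: no comparable history
    # 2) If the generated patch resembles the patches that correctly fixed
    #    those similar bugs, predict that it is correct.
    p_new = embed_patch(new_patch)
    best = max(cosine(p_new, embed_patch(p)) for _, p in similar)
    return best >= sim_threshold
```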

    Flakify: A Black-Box, Language Model-Based Predictor for Flaky Tests

    Software testing assures that code changes do not adversely affect existing functionality. However, a test case can be flaky, i.e., passing and failing across executions even for the same version of the source code. Flaky test cases introduce overhead to software development as they can lead to unnecessary attempts to debug production or testing code. State-of-the-art ML-based flaky test case predictors rely on pre-defined sets of features that are either project-specific or require access to production code, which is not always available to software test engineers. Therefore, in this paper, we propose Flakify, a black-box, language model-based predictor for flaky test cases. Flakify relies exclusively on the source code of test cases, thus requiring neither (a) access to production code (black-box), (b) rerunning test cases, nor (c) pre-defined features. To this end, we employed CodeBERT, a pre-trained language model, and fine-tuned it to predict flaky test cases using the source code of test cases. We evaluated Flakify on two publicly available datasets of flaky test cases (FlakeFlagger and IDoFT) and compared our technique with the FlakeFlagger approach using two different evaluation procedures: cross-validation and per-project validation. Flakify achieved high F1-scores on both datasets under both procedures and surpassed FlakeFlagger by 10 and 18 percentage points in precision and recall, respectively, when evaluated on the FlakeFlagger dataset, thus reducing by the same margins the cost otherwise wasted on unnecessarily debugging test cases and production code. Flakify also achieved significantly higher prediction results when used to predict test cases in new projects, suggesting better generalizability than FlakeFlagger. Our results further show that a black-box version of FlakeFlagger is not a viable option for predicting flaky test cases.
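
    As a sketch of the core idea, the snippet below fine-tunes CodeBERT as a binary classifier over raw test-case source code, reduced to a single training step on toy data. The hyperparameters and examples are illustrative assumptions, not Flakify's actual training setup.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Fine-tune CodeBERT on test-case source labeled flaky (1) / not flaky (0).
# The classification head is newly initialized on top of the pre-trained model.
tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/codebert-base", num_labels=2
)

# Toy examples only; real training would use the FlakeFlagger/IDoFT datasets.
test_sources = [
    "def test_timeout(): assert fetch(url, timeout=0.01) is not None",  # flaky
    "def test_add(): assert add(2, 2) == 4",                            # stable
]
labels = torch.tensor([1, 0])

inputs = tokenizer(test_sources, return_tensors="pt",
                   padding=True, truncation=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss = model(**inputs, labels=labels).loss  # cross-entropy over two classes
loss.backward()
optimizer.step()
```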

    Automatic Generation of Personalized Recommendations in eCoaching

    This thesis concerns eCoaching for personalized, real-time lifestyle support using information and communication technology. The challenge is to design, develop, and technically evaluate a prototype of an intelligent eCoach that automatically generates personalized, evidence-based recommendations for a healthier lifestyle. The developed solution focuses on improving physical activity. The prototype uses wearable medical activity sensors; the collected data are represented semantically, and artificial-intelligence algorithms automatically generate meaningful, personalized, and context-based recommendations for reducing sedentary time. The thesis applies the well-established design science research methodology to develop theoretical foundations and practical implementations. Overall, this research focuses on technological verification rather than clinical evaluation.
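
    The thesis's pipeline is semantic and sensor-driven; as a loose, hypothetical illustration of its simplest ingredient, the rule below nudges a user once a sedentary streak in a step-count stream grows too long. All names and thresholds are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class ActivitySample:
    minute: int  # timestamp, in minutes from the start of the stream
    steps: int   # steps counted by the wearable sensor in that minute

def sedentary_nudges(samples, max_sedentary_minutes=60, step_threshold=10):
    """Yield a recommendation whenever a sedentary streak gets too long."""
    streak = 0
    for s in samples:
        streak = streak + 1 if s.steps < step_threshold else 0
        if streak >= max_sedentary_minutes:
            yield (s.minute, "You have been inactive for an hour - "
                             "try a short walk or some light stretching.")
            streak = 0  # reset after nudging so we do not spam the user

# Two hours of complete inactivity triggers two nudges.
stream = [ActivitySample(m, 0) for m in range(120)]
for minute, message in sedentary_nudges(stream):
    print(minute, message)
```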

    Supporting learning object versioning

    A current popular paradigm in e-learning is that of the "learning object". Broadly defined, a learning object is a reusable piece of educational material intended to be strung together with other learning objects to form larger educational units such as activities, lessons, or whole courses. This aggregating of learning objects is a recursive process: small objects can be combined to form medium-sized objects, medium-sized objects can be combined to form large objects, and so on. Once objects have been combined appropriately, they are generally serialized into content packages and deployed into an online course for delivery to learners.

    Learning objects are often stored in distributed and decentralized repositories throughout the Internet. This poses unique challenges when managing the history of such an object, as traditional versioning techniques (e.g., CVS, RCS) rely on centralized management. These challenges have been largely ignored by the educational technology community, but are becoming more important as sharing of learning objects increases.

    This thesis explores these issues by providing a formal version model for learning objects, a set of data bindings for this model, and a prototype authoring environment which implements these bindings. In addition, the work explores the potential benefits of version control by implementing a visualization of a learning object revision tree. This visualization includes the relationships between objects and their aggregates, the structural history of an object, and the semantic changes that an object has undergone.
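
    As a rough illustration of the kind of version model involved (not the thesis's formal model or its data bindings), the sketch below records each revision's parent revisions and the objects it aggregates, which is enough to reconstruct both an object's structural history and its relationship to aggregates.

```python
from dataclasses import dataclass, field

@dataclass
class Revision:
    """One revision of a learning object in a decentralized repository."""
    object_id: str   # globally unique identifier, e.g. a URI
    version: str
    parents: list = field(default_factory=list)     # previous revisions
    components: list = field(default_factory=list)  # aggregated child objects

# A quiz is revised, then aggregated into a course (IDs are invented).
quiz_v1 = Revision("urn:lo:quiz-1", "1.0")
quiz_v2 = Revision("urn:lo:quiz-1", "1.1", parents=[quiz_v1])
course_v1 = Revision("urn:lo:course-9", "1.0", components=[quiz_v2])

def history(rev, depth=0):
    """Walk a revision's lineage backwards, printing one line per revision."""
    print("  " * depth + f"{rev.object_id} @ {rev.version}")
    for parent in rev.parents:
        history(parent, depth + 1)

history(quiz_v2)  # prints 1.1, then its ancestor 1.0
```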

    Software Engineering 2021: Conference, 22-26 February 2021, Braunschweig/virtual
