7 research outputs found

    Recovering Transitive Traceability Links among Software Artifacts

    Abstract: Although many methods have been suggested to automatically recover traceability links in software development, they do not cover all link combinations (e.g., links between source code and test cases) because they rely on specific documents or artifact features (e.g., log documents and the structure of source code). In this paper, we propose a method called the Connecting Links Method (CLM) to recover transitive traceability links between two artifacts using a third artifact. Because CLM uses that third artifact as an intermediate document, it can be applied to various kinds of data. CLM recovers traceability links using the Vector Space Model (VSM), an Information Retrieval (IR) method. For example, by connecting the links between A and B and between B and C, CLM retrieves the link between A and C transitively. In this way, CLM can recover transitive traceability links where existing methods cannot. We demonstrate that CLM can effectively recover links that VSM alone cannot, using Open Source Software projects.
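    A minimal sketch of the transitive idea described above, assuming TF-IDF vectors with cosine similarity as the VSM and source code as the pivot artifact. The rule for combining the A-to-B and B-to-C scores (product of similarities, maximized over intermediates) and the threshold are assumptions; the abstract does not spell out CLM's exact scoring.

    # Hypothetical sketch of transitive trace recovery in the spirit of CLM.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def vsm_similarity(docs_a, docs_b):
        """Cosine similarity between two artifact sets under a shared TF-IDF model."""
        vectorizer = TfidfVectorizer().fit(docs_a + docs_b)
        return cosine_similarity(vectorizer.transform(docs_a), vectorizer.transform(docs_b))

    def transitive_links(reqs, code, tests, threshold=0.3):
        """Recover requirement-to-test links through source code as the pivot artifact."""
        sim_rc = vsm_similarity(reqs, code)    # requirements -> code
        sim_ct = vsm_similarity(code, tests)   # code -> tests
        links = []
        for r in range(len(reqs)):
            for t in range(len(tests)):
                # Score the A-to-C link by the best intermediate artifact B (assumed rule).
                score = max(sim_rc[r, c] * sim_ct[c, t] for c in range(len(code)))
                if score >= threshold:
                    links.append((r, t, score))
        return links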

    Datasets Used in Fifteen Years of Automated Requirements Traceability Research

    Datasets are crucial to advancing automated software traceability research. Acquiring such datasets comes at a high cost and requires expert knowledge to collect and validate them manually. Obtaining such software development datasets has been one of the most frequently reported barriers for researchers in the software engineering domain in general. The problem is even more acute in the field of requirements traceability, which plays a crucial role in safety-critical and highly regulated systems. Therefore, the main motivation behind this work is to analyze the current state of the art of datasets used in the field of software traceability. This work presents a first-of-its-kind literature study to review and assess the datasets that have been used in software traceability research over the last fifteen years. It articulates several attributes related to these datasets, such as their characteristics, threats, and diversity. First, 202 primary studies (see Appendix A) were identified for the purpose of this study, from which 73 unique datasets were derived. These 73 datasets were studied in depth and several attributes (size, type, domain, availability, artifacts) were extracted (see Appendix B). Based on the analysis of the primary studies, a threat-to-validity reference model tailored to software traceability datasets was derived (see Figure 4.4). Furthermore, to shed light on the dataset diversity trend in the software traceability community, a metric called the Dataset Diversity Ratio was derived for the 38 authors (see Figure 4.5) who have published more than one publication in the field of software traceability.
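    One plausible reading of the Dataset Diversity Ratio is sketched below, under the assumption that it is the number of unique datasets an author has used divided by that author's publication count; the study's actual definition is not given in this abstract, so treat the function as illustrative only.

    # Hypothetical Dataset Diversity Ratio: unique datasets used / number of publications.
    def dataset_diversity_ratio(author_pubs):
        """author_pubs maps author -> list of publications, each a list of dataset names."""
        ratios = {}
        for author, pubs in author_pubs.items():
            if len(pubs) <= 1:          # the study only considers authors with >1 publication
                continue
            unique_datasets = {d for pub in pubs for d in pub}
            ratios[author] = len(unique_datasets) / len(pubs)
        return ratios

    # Toy usage: author_a reuses the same dataset, author_b draws on three distinct ones.
    print(dataset_diversity_ratio({
        "author_a": [["EasyClinic"], ["EasyClinic"]],
        "author_b": [["EasyClinic"], ["eTour", "iTrust"]],
    }))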

    Information Retrieval-based requirement traceability recovery approaches: A systematic literature review

    Abstract: Traceability is an important concept in software development. It enables software engineers to trace requirements from their origin to fulfillment. Maintaining traceability manually is a time-consuming and expensive job. Information Retrieval (IR) methods provide a means of automating requirements traceability. A visible number of IR-based traceability techniques have been proposed in the literature, but their adoption in industry is limited. In this paper, we examine IR-based traceability recovery approaches through a systematic literature review and present a synthesis of these techniques. We also identify challenges that potentially limit the adoption of IR-based traceability recovery approaches. We conclude that term mismatch is a major barrier faced by IR-based approaches, and we classify the approaches that attempt to solve the term mismatch problem.
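    A toy illustration of the term-mismatch barrier mentioned above: a requirement phrased in terms of "log in" and a password, and a class that speaks of authentication and secrets, share no vocabulary, so a plain VSM score is zero. The synonym expansion step sketches one mitigation idea; the tiny thesaurus and the example texts are invented for illustration and are not taken from the reviewed studies.

    # Term mismatch in IR-based trace recovery, with a hypothetical synonym expansion.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    SYNONYMS = {"log": "authenticate", "password": "secret"}   # invented thesaurus

    def expand(text):
        # Append a synonym after every token that has one in the thesaurus.
        return " ".join(tok + " " + SYNONYMS.get(tok, "") for tok in text.lower().split())

    requirement = "users must be able to log in with a password"
    code_doc = "class Authenticator authenticate credentials secret"

    for label, req in [("plain VSM", requirement), ("with expansion", expand(requirement))]:
        vec = TfidfVectorizer().fit([req, code_doc])
        sim = cosine_similarity(vec.transform([req]), vec.transform([code_doc]))[0, 0]
        print(f"{label}: similarity = {sim:.2f}")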

    Exploring Organizations' Software Quality Assurance Strategies

    Poor software quality leads to lost profits and even loss of life. U.S. organizations lose billions of dollars annually because of poor software quality. The purpose of this multiple case study was to explore the strategies that quality assurance (QA) leaders in small software development organizations used for successful software quality assurance (SQA) processes. A case study provided the best research design to allow for the exploration of organizational and managerial processes. The target population group was the QA leaders of 3 small software development organizations that successfully implemented SQA processes, located in Saint John, New Brunswick, Canada. The conceptual framework that grounded this study was total quality management (TQM), established by Deming in 1980. Face-to-face semistructured interviews with 2 QA leaders from each organization, along with documentation including process and training materials, provided all the data for analysis. NVivo software aided a qualitative analysis of all collected data using a process of disassembling the data into common codes, reassembling the data into themes, interpreting the meaning, and drawing conclusions. The resulting major themes were Agile practices, documentation, testing, and lost profits. The results contrasted with the main themes discovered in the literature review, although there was some overlap. The implications for positive social change include the potential to provide QA leaders with strategies to improve SQA processes, thereby allowing for improved profits, contributing to the organizations' longevity in business, and strengthening the local economy.

    Code Patterns for Automatically Validating Requirements-to-Code Traces

    Traces between requirements and code reveal where requirements are implemented. Such traces are essential for code understanding and change management. Unfortunately, traces are known to be error prone. This paper introduces a novel approach for validating requirements-to-code traces through calling relationships within the code. As input, the approach requires an executable system, the corresponding requirements, and the requirements-to-code traces that need validating. As output, the approach identifies likely incorrect or missing traces by investigating patterns of traces with calling relationships. The empirical evaluation of four case study systems covering 150 KLOC and 59 requirements demonstrates that the approach detects most errors with 85-95% precision and 82-96% recall and is able to handle traces of varying levels of correctness and completeness. The approach is fully automated, tool supported, and scalable.
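    The abstract does not list the paper's concrete trace patterns, so the sketch below illustrates only the general idea under one assumed rule: a method that is called from methods traced to a requirement, but is not traced to that requirement itself, is reported as a likely missing trace.

    # Hypothetical missing-trace detector based on calling relationships.
    def flag_missing_traces(call_graph, traces):
        """call_graph: method -> set of called methods; traces: requirement -> set of traced methods."""
        suspicious = []
        for req, traced in traces.items():
            for caller in traced:
                for callee in call_graph.get(caller, set()):
                    if callee not in traced:       # traced code calls untraced code: candidate
                        suspicious.append((req, callee))
        return suspicious

    # Toy usage: 'save' is reached only from methods traced to "persist data" yet is untraced.
    call_graph = {"handle_submit": {"validate", "save"}, "validate": set(), "save": set()}
    traces = {"persist data": {"handle_submit", "validate"}}
    print(flag_missing_traces(call_graph, traces))   # [('persist data', 'save')]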

    Towards an Intelligent System for Software Traceability Datasets Generation

    Software datasets and artifacts play a crucial role in advancing automated software traceability research. They can be used by researchers in different ways to develop or validate new automated approaches. Software artifacts, other than source code and issue tracking entities, can also provide a great deal of insight into a software system and facilitate knowledge sharing and information reuse. The diversity and quality of the datasets and artifacts within a research community have a significant impact on the accuracy, generalizability, and reproducibility of the results, and consequently on the usefulness and practicality of the techniques under study. Collecting and assessing the quality of such datasets are not trivial tasks and have been reported as an obstacle by many researchers in the domain of software engineering. In this dissertation, we report our empirical work that aims to automatically generate such datasets and assess their quality. Our goal is to introduce an intelligent system that can help researchers in the domain of software traceability obtain high-quality “training sets”, “testing sets”, or appropriate “case studies” from open source repositories based on their needs. In the first project, we present a first-of-its-kind study to review and assess the datasets that have been used in software traceability research over the last fifteen years. It presents and articulates the current status of these datasets, their characteristics, and their threats to validity. Second, this dissertation introduces a Traceability-Dataset Quality Assessment (T-DQA) framework to categorize software traceability datasets and assist researchers in selecting appropriate datasets for their research based on different characteristics of the datasets and the context in which those datasets will be used. Third, we present the results of an empirical study, with limited scope, that generates datasets using three baseline approaches for the creation of training data: (i) Expert-Based; (ii) Automated Web-Mining, which generates training sets by automatically mining tactic's APIs from technical programming websites; and (iii) Automated Big-Data Analysis, which mines ultra-large-scale code repositories to generate training sets. We compare the trace-link creation accuracy achieved using each of these three baseline approaches and discuss the costs and benefits associated with them. Additionally, in a separate study, we investigate the impact of training set size on the accuracy of recovering trace links. Fourth, we conduct a large-scale study to identify which types of software artifacts are produced by a wide variety of open-source projects at different levels of granularity, and we propose an automated approach based on Machine Learning techniques to identify various types of software artifacts. Through a set of experiments, we report and compare the performance of these algorithms when applied to software artifacts. Finally, we conduct a study to understand how software traceability experts and practitioners evaluate the quality of their datasets and to gather experts' opinions on all quality attributes and metrics proposed by T-DQA.
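    As a rough companion to the artifact-type identification step, here is a minimal sketch of a bag-of-words classifier. The dissertation abstract does not name its features or algorithms, so the TF-IDF plus logistic regression pipeline and the toy labels below are assumptions for illustration only.

    # Illustrative artifact-type classifier (assumed pipeline, not the dissertation's method).
    from sklearn.pipeline import make_pipeline
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    # Tiny hypothetical training set: (artifact text, artifact type).
    artifacts = [
        ("the system shall encrypt all stored passwords", "requirement"),
        ("assert that login fails after three bad attempts", "test_case"),
        ("class UserRepository persists accounts to the database", "source_code"),
        ("fix crash when the session token expires", "issue"),
    ]
    texts, labels = zip(*artifacts)

    classifier = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    classifier.fit(texts, labels)
    print(classifier.predict(["verify that passwords are rejected when too short"]))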

    Preventive Maintenance of Ajax Web Applications Focusing on Interactions (相互作用に着目したAjax Webアプリケーションの予防保守)

    Degree type: Doctoral (course-based). Dissertation committee: (Chair) Professor Masami Hagiya (The University of Tokyo), Professor Reiji Suda (The University of Tokyo), Professor Naoki Kobayashi (The University of Tokyo), Lecturer Ichiro Hasuo (The University of Tokyo), Professor Shigeru Chiba (The University of Tokyo). University of Tokyo (東京大学)