727 research outputs found

    Use of IBM Collaborative Lifecycle Management Solution to Demonstrate Traceability for Small, Real-World Software Development Project

    Get PDF
    The Standish Group Study of 1994 showed that 53 percent of software projects failed outright and another 31 percent were challenged by extreme budget and/or time overrun. Since then different responses to the high rate of software project failures have been proposed. SEI’s CMMI, the ISO’s 9001:2000 for software development, and the IEEE’s JSTD-016 are some examples of such responses. Traceability is the one common feature that these software development standards impose. Over the last decade, software and system engineering communities have been researching subjects such as developing more sophisticated tooling, applying information retrieval techniques capable of semi-automating the trace creation and maintenance process, developing new trace query languages and visualization techniques that use trace links, applying traceability in specific domains such as Model Driven Development, product line systems and agile project environment. These efforts have not been in vain. The 2012 CHAOS results show an increase in project success rate of 39% (delivered on time, on budget, with required features and functions), and a decrease of 18% in the number of failures (cancelled prior to completion or delivered and never used). Since research has shown traceability can improve a project’s success rate, the main purpose of this thesis is to demonstrate traceability for a small, real-world software development project using IBM Collaborative Lifecycle Management. The objective of this research was fulfilled since the case study of traceability was described in detail as applied to the design and development of the Value Adjustment Board Project (VAB) of City of Jacksonville using the scrum development approach within the IBM Rational Collaborative Lifecycle Management Solution. The results may benefit researchers and practitioners who are looking for evidence to use the IBM CLM solution to trace artifacts in a small project

    Preserving the Quality of Architectural Tactics in Source Code

    Get PDF
    In any complex software system, strong interdependencies exist between requirements and software architecture. Requirements drive architectural choices while also being constrained by the existing architecture and by what is economically feasible. This makes it advisable to concurrently specify the requirements, to devise and compare alternative architectural design solutions, and ultimately to make a series of design decisions in order to satisfy each of the quality concerns. Unfortunately, anecdotal evidence has shown that architectural knowledge tends to be tacit in nature, stored in the heads of people, and lost over time. Therefore, developers often lack comprehensive knowledge of underlying architectural design decisions and inadvertently degrade the quality of the architecture while performing maintenance activities. In practice, this problem can be addressed through preserving the relationships between the requirements, architectural design decisions and their implementations in the source code, and then using this information to keep developers aware of critical architectural aspects of the code. This dissertation presents a novel approach that utilizes machine learning techniques to recover and preserve the relationships between architecturally significant requirements, architectural decisions and their realizations in the implemented code. Our approach for recovering architectural decisions includes the two primary stages of training and classification. In the first stage, the classifier is trained using code snippets of different architectural decisions collected from various software systems. During this phase, the classifier learns the terms that developers typically use to implement each architectural decision. These ``indicator terms\u27\u27 represent method names, variable names, comments, or the development APIs that developers inevitably use to implement various architectural decisions. A probabilistic weight is then computed for each potential indicator term with respect to each type of architectural decision. The weight estimates how strongly an indicator term represents a specific architectural tactics/decisions. For example, a term such as \emph{pulse} is highly representative of the heartbeat tactic but occurs infrequently in the authentication. After learning the indicator terms, the classifier can compute the likelihood that any given source file implements a specific architectural decision. The classifier was evaluated through several different experiments including classical cross-validation over code snippets of 50 open source projects and on the entire source code of a large scale software system. Results showed that classifier can reliably recognize a wide range of architectural decisions. The technique introduced in this dissertation is used to develop the Archie tool suite. Archie is a plug-in for Eclipse and is designed to detect wide range of architectural design decisions in the code and to protect them from potential degradation during maintenance activities. It has several features for performing change impact analysis of architectural concerns at both the code and design level and proactively keep developers informed of underlying architectural decisions during maintenance activities. Archie is at the stage of technology transfer at the US Department of Homeland Security where it is purely used to detect and monitor security choices. Furthermore, this outcome is integrated into the Department of Homeland Security\u27s Software Assurance Market Place (SWAMP) to advance research and development of secure software systems

    Towards an Intelligent System for Software Traceability Datasets Generation

    Get PDF
    Software datasets and artifacts play a crucial role in advancing automated software traceability research. They can be used by researchers in different ways to develop or validate new automated approaches. Software artifacts, other than source code and issue tracking entities, can also provide a great deal of insight into a software system and facilitate knowledge sharing and information reuse. The diversity and quality of the datasets and artifacts within a research community have a significant impact on the accuracy, generalizability, and reproducibility of the results and consequently on the usefulness and practicality of the techniques under study. Collecting and assessing the quality of such datasets are not trivial tasks and have been reported as an obstacle by many researchers in the domain of software engineering. In this dissertation, we report our empirical work that aims to automatically generate and assess the quality of such datasets. Our goal is to introduce an intelligent system that can help researchers in the domain of software traceability in obtaining high-quality “training sets”, “testing sets” or appropriate “case studies” from open source repositories based on their needs. In the first project, we present a first-of-its-kind study to review and assess the datasets that have been used in software traceability research over the last fifteen years. It presents and articulates the current status of these datasets, their characteristics, and their threats to validity. Second, this dissertation introduces a Traceability-Dataset Quality Assessment (T-DQA) framework to categorize software traceability datasets and assist researchers to select appropriate datasets for their research based on different characteristics of the datasets and the context in which those datasets will be used. Third, we present the results of an empirical study with limited scope to generate datasets using three baseline approaches for the creation of training data. These approaches are (i) Expert-Based, (ii) Automated Web-Mining, which generates training sets by automatically mining tactic\u27s APIs from technical programming websites, and lastly, (iii) Automated Big-Data Analysis, which mines ultra-large-scale code repositories to generate training sets. We compare the trace-link creation accuracy achieved using each of these three baseline approaches and discuss the costs and benefits associated with them. Additionally, in a separate study, we investigate the impact of training set size on the accuracy of recovering trace links. Finally, we conduct a large-scale study to identify which types of software artifacts are produced by a wide variety of open-source projects at different levels of granularity. Then we propose an automated approach based on Machine Learning techniques to identify various types of software artifacts. Through a set of experiments, we report and compare the performance of these algorithms when applied to software artifacts. Finally, we conducted a study to understand how software traceability experts and practitioners evaluate the quality of their datasets. In addition, we aim at gathering experts’ opinions on all quality attributes and metrics proposed by T-DQA

    Recovering from a Decade: A Systematic Mapping of Information Retrieval Approaches to Software Traceability

    Get PDF
    Engineers in large-scale software development have to manage large amounts of information, spread across many artifacts. Several researchers have proposed expressing retrieval of trace links among artifacts, i.e. trace recovery, as an Information Retrieval (IR) problem. The objective of this study is to produce a map of work on IR-based trace recovery, with a particular focus on previous evaluations and strength of evidence. We conducted a systematic mapping of IR-based trace recovery. Of the 79 publications classified, a majority applied algebraic IR models. While a set of studies on students indicate that IR-based trace recovery tools support certain work tasks, most previous studies do not go beyond reporting precision and recall of candidate trace links from evaluations using datasets containing less than 500 artifacts. Our review identified a need of industrial case studies. Furthermore, we conclude that the overall quality of reporting should be improved regarding both context and tool details, measures reported, and use of IR terminology. Finally, based on our empirical findings, we present suggestions on how to advance research on IR-based trace recovery

    A document based traceability model for test management

    Get PDF
    Software testing has became more complicated in the emergence of distributed network, real-time environment, third party software enablers and the need to test system at multiple integration levels. These scenarios have created more concern over the quality of software testing. The quality of software has been deteriorating due to inefficient and ineffective testing activities. One of the main flaws is due to ineffective use of test management to manage software documentations. In documentations, it is difficult to detect and trace bugs in some related documents of which traceability is the major concern. Currently, various studies have been conducted on test management, however very few have focused on document traceability in particular to support the error propagation with respect to documentation. The objective of this thesis is to develop a new traceability model that integrates software engineering documents to support test management. The artefacts refer to requirements, design, source code, test description and test result. The proposed model managed to tackle software traceability in both forward and backward propagations by implementing multi-bidirectional pointer. This platform enabled the test manager to navigate and capture a set of related artefacts to support test management process. A new prototype was developed to facilitate observation of software traceability on all related artefacts across the entire documentation lifecycle. The proposed model was then applied to a case study of a finished software development project with a complete set of software documents called the On-Board Automobile (OBA). The proposed model was evaluated qualitatively and quantitatively using the feature analysis, precision and recall, and expert validation. The evaluation results proved that the proposed model and its prototype were justified and significant to support test management

    Un enfoque de recuperación de información para ayudar a los usuarios en los procesos de ingeniería de software

    Get PDF
    Documents written in natural language constitute a major part of the artifacts produced during the software engineering life cycle. There is a growing interest in creating tools that can assist users in all phases of the software life cycle. The assistance requires techniques that go beyond traditional static and dynamic analysis. An example of such a technique is the application of information retrieval (IR), which exploits information found in documents of a software engineering process. The increased availability of data created as part of the software development process allows managers to apply novel analysis techniques on the data and use the results to guide the project´s stakeholders. These data are then used to predict defects, gather insight into a project´s life-cycle, and other tasks. This work proposes an IR approach to assist users in software engineering processes according their profile. The approach consists in recommending them related documents to a retrieved one in order to users understand and follow the process in a correct way. Furthermore, the assistance concentrates on legacy systems in which engineers must acquire knowledge generated by others. Implementation of the approach and an overview of evaluation are also summarized.Os documentos escritos em linguagem natural constituem uma parte importante dos artefatos produzidos durante o ciclo de vida da engenharia de software. Há um interesse crescente em criar ferramentas que possam ajudar os usuários em todas as fases do ciclo de vida do software. A assistência requer técnicas que vão além da análise tradicional estática e dinâmica. Um exemplo de tal técnica é a aplicação de recuperação de informações (RI), que explora informações encontradas em documentos de um processo de engenharia de software. A maior disponibilidade de dados criados como parte do processo de desenvolvimento de software permite que os gerentes apliquem técnicas de análise inovadoras nos dados e usem os resultados para orientar as partes interessadas do projeto. Esses dados são usados para prever defeitos, coletar informações sobre o ciclo de vida de um projeto e outras tarefas. Este trabalho propõe uma abordagem de RI para auxiliar os usuários em processos de engenharia de software de acordo com o seu perfil. A abordagem consiste em lhes recomendar os documentos relacionados para que sejam recuperados, a fim de que os usuários compreendam e sigam o processo de maneira correta. Além disso, a assistência concentra-se em sistemas legados nos quais os engenheiros devem adquirir conhecimento gerado por outros. O desenvolvimento da abordagem e uma visão geral da avaliação são apresentados.Los documentos escritos en lenguaje natural constituyen una parte fundamental de los artefactos producidos durante el ciclo de ciclo de vida del desarrollo de software. Hay un creciente interés en las herramientas de creación que pueden ayudar a los usuarios en todas las fases del ciclo de vida del software. La asistencia requiere técnicas que van más allá del análisis estático y dinámico. Por ejemplo, la aplicación de recuperación de información (RI) explota información encontrada en los documentos del proceso de desarrollo de software. La creciente disponibilidad de datos creados a partir del proceso de desarrollo de software permite a los líderes de proyectos aplicar estrategias de análisis sobre los datos como parte del proceso y utilizar los resultados para guiar a los grupos de interesados. Estos datos se utilizan para evitar defectos, recolectar información sobre el ciclo de vida de un proyecto, y otras tareas. En este trabajo se propone un enfoque de RI para asistir a los usuarios en el proceso de desarrollo de software según su perfil. El enfoque consiste en recomendar los documentos relacionados a un usuario para que éstos entiendan y sigan el proceso en una dirección correcta. Además, la asistencia está pensada en contextos de sistemas heredados en los que los desarrolladores deben adquirir el conocimiento generado por terceros. Como conclusión, se muestran la implementación del enfoque y una descripción de la evaluación.Fil: Rodríguez, Guillermo Horacio. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tandil. Instituto Superior de Ingeniería del Software. Universidad Nacional del Centro de la Provincia de Buenos Aires. Instituto Superior de Ingeniería del Software; Argentin

    Using Large Language Models To Analyze Software Architecture Documentation

    Get PDF
    Begrenzte Trainingsdaten stellen eine Herausforderung für Traceability Link Recovery (TLR) und Inconsistency Detection (ID) dar. Große Sprachmodelle (LLMs) können dieses Problem lösen, da sie oft kein spezifisches Training benötigen. In dieser Arbeit erforschen wir verschiedene Techniken und Methoden für den Einsatz von GPT-4 für TLR und ID. Im Vergleich mit State-of-the-Art-Ansätzen erzielen unsere Ansätze beim Unmentioned-Model-Element-ID ähnliche Leistung. In der Disziplin der Missing-Model-Element ID konnten wir ihre Leistung jedoch nicht erreichen. Beim TLR erzielt Chain-of-Thought-Prompting die besten Ergebnisse, schlägt jedoch auch schlechter ab als State-of-the-Art. Die Ergebnisse sind jedoch vielversprechend und es ist anzunehmen, dass fortschrittlichere LLMs und Techniken zu Verbesserungen führen

    Exploring Maintainability Assurance Research for Service- and Microservice-Based Systems: Directions and Differences

    Get PDF
    To ensure sustainable software maintenance and evolution, a diverse set of activities and concepts like metrics, change impact analysis, or antipattern detection can be used. Special maintainability assurance techniques have been proposed for service- and microservice-based systems, but it is difficult to get a comprehensive overview of this publication landscape. We therefore conducted a systematic literature review (SLR) to collect and categorize maintainability assurance approaches for service-oriented architecture (SOA) and microservices. Our search strategy led to the selection of 223 primary studies from 2007 to 2018 which we categorized with a threefold taxonomy: a) architectural (SOA, microservices, both), b) methodical (method or contribution of the study), and c) thematic (maintainability assurance subfield). We discuss the distribution among these categories and present different research directions as well as exemplary studies per thematic category. The primary finding of our SLR is that, while very few approaches have been suggested for microservices so far (24 of 223, ?11%), we identified several thematic categories where existing SOA techniques could be adapted for the maintainability assurance of microservices

    From Bugs to Decision Support – Leveraging Historical Issue Reports in Software Evolution

    Get PDF
    Software developers in large projects work in complex information landscapes and staying on top of all relevant software artifacts is an acknowledged challenge. As software systems often evolve over many years, a large number of issue reports is typically managed during the lifetime of a system, representing the units of work needed for its improvement, e.g., defects to fix, requested features, or missing documentation. Efficient management of incoming issue reports requires the successful navigation of the information landscape of a project. In this thesis, we address two tasks involved in issue management: Issue Assignment (IA) and Change Impact Analysis (CIA). IA is the early task of allocating an issue report to a development team, and CIA is the subsequent activity of identifying how source code changes affect the existing software artifacts. While IA is fundamental in all large software projects, CIA is particularly important to safety-critical development. Our solution approach, grounded on surveys of industry practice as well as scientific literature, is to support navigation by combining information retrieval and machine learning into Recommendation Systems for Software Engineering (RSSE). While the sheer number of incoming issue reports might challenge the overview of a human developer, our techniques instead benefit from the availability of ever-growing training data. We leverage the volume of issue reports to develop accurate decision support for software evolution. We evaluate our proposals both by deploying an RSSE in two development teams, and by simulation scenarios, i.e., we assess the correctness of the RSSEs' output when replaying the historical inflow of issue reports. In total, more than 60,000 historical issue reports are involved in our studies, originating from the evolution of five proprietary systems for two companies. Our results show that RSSEs for both IA and CIA can help developers navigate large software projects, in terms of locating development teams and software artifacts. Finally, we discuss how to support the transfer of our results to industry, focusing on addressing the context dependency of our tool support by systematically tuning parameters to a specific operational setting
    corecore