
    Code Reviewer Recommendation Based on a Hypergraph with Multiplex Relationships

    Code review is an essential component of software development, playing a vital role in ensuring a comprehensive check of code changes. However, the continuous influx of pull requests and the limited pool of available reviewer candidates pose a significant challenge to the review process, making it increasingly difficult to assign suitable reviewers to each review request. To tackle this issue, we present MIRRec, a novel code reviewer recommendation method that leverages a hypergraph with multiplex relationships. MIRRec encodes high-order correlations that go beyond traditional pairwise connections, using degree-free hyperedges among pull requests and developers. In this way, it can capture high-order implicit connectivity and identify potential reviewers. To validate the effectiveness of MIRRec, we conducted experiments on a dataset comprising 48,374 pull requests from ten popular open-source software projects hosted on GitHub. The results demonstrate that MIRRec, especially the variant without the PR-Review Commenters relationship, outperforms existing state-of-the-art code reviewer recommendation methods in terms of ACC and MRR, highlighting its value in improving the code review process. Comment: The 31st IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER)
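
    The hyperedge idea is easy to picture in code. Below is a minimal Python sketch, not MIRRec's actual implementation, of a multiplex hypergraph in which each typed hyperedge links a pull request to an arbitrary-size set of developers; all names and relation labels are invented for illustration.

    ```python
    from collections import defaultdict

    class MultiplexHypergraph:
        def __init__(self):
            # relation type -> list of hyperedges; each hyperedge is a set of nodes
            self.hyperedges = defaultdict(list)

        def add_hyperedge(self, relation, pr, developers):
            # A hyperedge is "degree-free": it may span any number of nodes.
            self.hyperedges[relation].append({pr, *developers})

        def neighbors(self, node, relation):
            # All nodes that co-occur with `node` in a hyperedge of this relation.
            related = set()
            for edge in self.hyperedges[relation]:
                if node in edge:
                    related |= edge - {node}
            return related

    g = MultiplexHypergraph()
    g.add_hyperedge("pr_reviewer", "PR#1", {"alice", "bob"})
    g.add_hyperedge("pr_commenter", "PR#1", {"carol"})
    print(g.neighbors("PR#1", "pr_reviewer"))  # {'alice', 'bob'}
    ```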

    Code Reviewer Recommendation for Architecture Violations: An Exploratory Study

    Code review is a common practice in software development and is often conducted before code changes are merged into the code repository. A number of approaches have been proposed to automatically recommend appropriate reviewers for given code changes. However, such approaches are generic, i.e., they do not focus on specific types of issues during code reviews. In this paper, we propose an approach that focuses on architecture violations, one of the most critical types of issues identified during code review. Specifically, we aim to automate the recommendation of code reviewers who are potentially qualified to review architecture violations, based on reviews of code changes. To this end, we selected three common similarity detection methods to measure the file path similarity of code commits and the semantic similarity of review comments. We conducted a series of experiments on finding appropriate reviewers by evaluating and comparing these similarity detection methods, both separately and in combination, against the baseline reviewer recommendation approach RevFinder. The results show that the common similarity detection methods can produce acceptable performance scores and achieve better performance than RevFinder. The sampling techniques used in recommending code reviewers can impact the performance of reviewer recommendation approaches. We also discuss the potential implications of our findings for both researchers and practitioners. Comment: The 27th International Conference on Evaluation and Assessment in Software Engineering (EASE)
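
    As a rough illustration of the file-path side of such similarity measures (the paper's three concrete methods are not reproduced here), a RevFinder-style comparison splits paths into components and scores their longest common prefix:

    ```python
    # Illustrative sketch: path similarity as the fraction of leading
    # path components two files share.
    def path_similarity(path1: str, path2: str) -> float:
        a, b = path1.split("/"), path2.split("/")
        common = 0
        for x, y in zip(a, b):
            if x != y:
                break
            common += 1
        return common / max(len(a), len(b))

    print(path_similarity("src/ui/view.py", "src/ui/model.py"))  # 0.666...
    ```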

    Harnessing Customization in Web Annotation: A Software Product Line Approach

    Web annotation helps mediate the interplay of reading and writing by conveying information, adding commentary, and sparking conversations on web documents. It is used in areas such as the Social Sciences and Humanities, investigative journalism, the Life Sciences, and Education, to name a few. Annotation activities are heterogeneous: end users (students, journalists, data curators, researchers, etc.) have very different requirements for creating, modifying, and reusing annotations. This results in a large number of web annotation tools and diverse ways of representing and storing web annotations. To facilitate reuse and interoperability, several attempts have been made over the past decades to standardize web annotations (e.g., Annotea or Open Annotation), culminating in the W3C annotation recommendations published in 2017. The W3C recommendations provide a framework for annotation representation (data model and vocabulary) and transport (protocol). However, there is still a gap in how annotation clients (tools and user interfaces) are developed, which forces developers to re-implement common functionality (i.e., highlighting, commenting, storing, and so on) to create their own custom annotation tool. This thesis aims to provide a reuse platform for developing web annotation tools for review. To this end, we developed a software product line called WACline. WACline is a family of annotation products that lets developers create customized web annotation browser extensions, facilitating the reuse of core assets and their adaptation to a specific review context. It was built following a knowledge-accumulation process in which each annotation product learns from the annotation products created before it. The result is a family of annotation clients supporting three review practices: data extraction for systematic literature reviews (Highlight&Go), review of student assignments in higher education (Mark&Go), and peer review for conferences and journals (Review&Go). For each review context, an evaluation with real stakeholders was carried out to validate the efficiency and effectiveness gains that the customized annotation tools bring to their practice.
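
    For context, the W3C Web Annotation Data Model mentioned above represents an annotation as JSON-LD. A minimal sketch of the kind of payload a WACline-derived client might produce is shown below in Python; the body text, source URL, and selected quote are placeholders.

    ```python
    import json

    # A minimal annotation following the 2017 W3C Web Annotation Data Model.
    annotation = {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "body": {
            "type": "TextualBody",
            "value": "Relevant for the data-extraction phase",
            "purpose": "commenting",
        },
        "target": {
            "source": "https://example.org/article.html",
            "selector": {
                "type": "TextQuoteSelector",
                "exact": "highlighted passage",
            },
        },
    }
    print(json.dumps(annotation, indent=2))
    ```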

    A First Look at Fairness of Machine Learning Based Code Reviewer Recommendation

    The fairness of machine learning (ML) approaches is critical to the reliability of modern artificial intelligence systems. Despite extensive study of this topic, the fairness of ML models in the software engineering (SE) domain has not yet been well explored. As a result, many ML-powered software systems, particularly those used in the software engineering community, remain prone to fairness issues. Taking a typical SE task, code reviewer recommendation, as its subject, this paper conducts the first study investigating the fairness of ML applications in the SE domain. Our empirical study demonstrates that current state-of-the-art ML-based code reviewer recommendation techniques exhibit unfair and discriminatory behavior. Specifically, male reviewers receive on average 7.25% more recommendations than female reviewers, relative to their distribution in the reviewer set. This paper also discusses why the studied ML-based code reviewer recommendation systems are unfair and provides solutions to mitigate the unfairness. Our study further indicates that existing mitigation methods can enhance fairness by 100% in projects with a similar distribution of protected and privileged groups, but their effectiveness on imbalanced or skewed data is limited. Finally, we suggest a solution to overcome the drawbacks of existing mitigation techniques and tackle bias in datasets that are imbalanced or skewed.
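
    One simple way to quantify the kind of skew reported above (a hedged sketch, not the paper's exact metric) is to compare each group's share of recommendations with its share of the reviewer pool:

    ```python
    def recommendation_skew(recommendations, reviewer_pool, group_of):
        # group_of maps a reviewer name to a group label, e.g. "male"/"female"
        def share(names, group):
            return sum(group_of[n] == group for n in names) / len(names)

        groups = set(group_of.values())
        return {
            g: share(recommendations, g) - share(reviewer_pool, g)
            for g in groups
        }

    group_of = {"ann": "female", "bea": "female", "carl": "male", "dan": "male"}
    pool = ["ann", "bea", "carl", "dan"]
    recs = ["carl", "dan", "carl", "ann"]
    print(recommendation_skew(recs, pool, group_of))
    # {'male': 0.25, 'female': -0.25} -> males over-recommended by 25 points
    ```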

    OntoTouTra: tourist traceability ontology based on big data analytics

    Tourist traceability is the analysis of the set of actions, procedures, and technical measures that allows us to identify and record the space-time trajectory of a tourist's tour, from the beginning to the end of the tourist-product chain. Moreover, the traceability of tourists has implications for infrastructure, transport, products, marketing, the commercial viability of the industry, and the management of the destination's social, environmental, and cultural impact. To this end, a tourist traceability system requires a knowledge base for processing elements such as functions, objects, events, and the logical connectors among them. A knowledge base provides information on the preparation, planning, and implementation or operation stages. In this regard, unifying tourism terminology in a traceability system is a challenge because we need a central repository that promotes standards for tourists and suppliers and forms a formal body of knowledge representation. Some studies address the construction of ontologies in tourism, but none focus on tourist traceability systems. We therefore propose OntoTouTra, an ontology that uses formal specifications to represent knowledge about tourist traceability systems. This paper outlines the development of the OntoTouTra ontology and how we gathered and processed data from ubiquitous computing using Big Data analysis techniques. This research was financially supported by the Ministry of Science, Technology, and Innovation of Colombia (733-2015) and by the Universidad Santo Tomás Seccional Tunja.
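
    As a flavor of what such a formal specification looks like in practice, the following sketch uses Python's rdflib to declare two classes and a linking property; the namespace and term names are invented for the example and are not taken from OntoTouTra.

    ```python
    from rdflib import Graph, Namespace, RDF, RDFS, OWL, Literal

    TT = Namespace("http://example.org/ontotoutra#")  # placeholder namespace
    g = Graph()
    g.bind("tt", TT)

    g.add((TT.Tourist, RDF.type, OWL.Class))
    g.add((TT.TouristEvent, RDF.type, OWL.Class))
    g.add((TT.participatesIn, RDF.type, OWL.ObjectProperty))
    g.add((TT.participatesIn, RDFS.domain, TT.Tourist))
    g.add((TT.participatesIn, RDFS.range, TT.TouristEvent))
    g.add((TT.TouristEvent, RDFS.comment,
           Literal("A space-time point in a tourist's itinerary")))

    print(g.serialize(format="turtle"))
    ```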

    LLaMA-Reviewer: Advancing Code Review Automation with Large Language Models through Parameter-Efficient Fine-Tuning (Practical Experience Report)

    The automation of code review activities, a long-standing pursuit in software engineering, has primarily been addressed by numerous domain-specific pre-trained models. Despite their success, these models frequently demand extensive resources for pre-training from scratch. In contrast, Large Language Models (LLMs) provide an intriguing alternative, given their remarkable capabilities when supplemented with domain-specific knowledge. However, their potential for automating code review tasks remains largely unexplored. In response to this research gap, we present LLaMA-Reviewer, an innovative framework that leverages the capabilities of LLaMA, a popular LLM, in the realm of code review. Mindful of resource constraints, this framework employs parameter-efficient fine-tuning (PEFT) methods, delivering high performance while training less than 1% of the model's parameters. An extensive evaluation of LLaMA-Reviewer is conducted on two diverse, publicly available datasets. Notably, even with the smallest LLaMA base model, consisting of 6.7B parameters, and a limited number of tuning epochs, LLaMA-Reviewer matches the performance of existing code-review-focused models. Ablation experiments provide insights into the influence of various fine-tuning components, including input representation, instruction tuning, and different PEFT methods. To foster continuous progress in this field, the code and all PEFT weight plugins have been made open source. Comment: Accepted to the 34th IEEE International Symposium on Software Reliability Engineering (ISSRE 2023)
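
    For readers unfamiliar with PEFT, the following hedged sketch shows LoRA-style fine-tuning with the HuggingFace peft library; the checkpoint name and hyperparameters are placeholders, not the paper's actual configuration.

    ```python
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    # Placeholder checkpoint; any causal LM works the same way.
    model = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")

    config = LoraConfig(
        r=8,                 # low-rank update dimension
        lora_alpha=16,       # scaling factor for the LoRA updates
        lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],  # attention projections to adapt
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, config)

    # Typically well under 1% of the full model's parameters are trainable.
    model.print_trainable_parameters()
    ```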

    Continuous assessment of software traceability

    Traceability is a critical element of any rigorous software development process. It is required by numerous software lifecycle activities such as safety analysis, change impact analysis, coverage analysis, and compliance verification. Safety guidelines such as IEC 61508 and its domain-specific derivatives explicitly require the implementation of software traceability. Although the crucial importance of traceability is commonly acknowledged, software development projects rarely follow explicit traceability strategies. Traceability is rarely planned or systematically created; instead, it tends to be a desultory, ad-hoc effort. As a result, existing traces are potentially of dubious quality, yet they serve as the foundation for high-impact development decisions. To ensure that traceability is trustworthy, the fitness for purpose of a project's traceability implementation must be thoroughly ascertained, especially in the context of safety-critical software. Assessing fitness for purpose is an intricate problem for several reasons: depending on project-specific traceability goals, traceability is implemented in different ways across projects, and the development of safety-critical software is subject to different regulations with diverse provisions that must be taken into account. This thesis presents an approach to systematically assess the fitness for purpose of a project's traceability implementation, comprising two parts. The first part supports the planning of purposed traceability, which is a prerequisite for the traceability assessment. Based on the planning results, the second part supports the actual assessment. It defines an analytical traceability assessment model that provides a comprehensive classification of possible traceability problems and defines assessment criteria to systematically detect these problems. The results of a traceability expert survey suggest that the proposed traceability problem classification is complete and that it defines relevant assessment criteria. The proposed assessment approach was applied in two studies. The study results indicate that the assessment supports multiple purposes: it can be used to determine the feasibility of important software lifecycle activities and the cost effectiveness of a project's traceability implementation, it can support safety-critical software projects in constructing their safety argument, and it can determine the compliance of a project's traceability implementation with safety guidelines.
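
    To make the assessment idea concrete, here is a minimal, invented example (not the thesis's assessment model) of one elementary criterion: every requirement should trace to at least one verifying test case.

    ```python
    # Toy trace matrix: requirement -> linked test cases.
    trace_matrix = {
        "REQ-1": ["TEST-01", "TEST-04"],
        "REQ-2": [],            # untraced -> a traceability problem
        "REQ-3": ["TEST-02"],
    }

    untraced = [req for req, tests in trace_matrix.items() if not tests]
    coverage = 1 - len(untraced) / len(trace_matrix)

    print(f"requirements-to-test coverage: {coverage:.0%}")  # 67%
    if untraced:
        print("untraced requirements:", ", ".join(untraced))
    ```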

    A Fault-Based Model of Fault Localization Techniques

    Every day, ordinary people depend on software working properly. We take it for granted: from banking software, to railroad switching software, to flight control software, to software that controls medical devices such as pacemakers or even gas pumps, our lives are touched by software that we expect to work. Testing is the main technique used to ensure software quality; often it is the only quality assurance activity undertaken, which makes it that much more important. In a typical experiment studying fault localization techniques, a researcher intentionally seeds a fault (deliberately breaking the functionality of some source code) in the hope that the automated techniques under study can identify the fault's location in the source code. These faults are picked arbitrarily, which introduces potential for bias in fault selection. Previous researchers have established an ontology for understanding and expressing this bias, called fault size. This research captures the fault size ontology in the form of a probabilistic model. The results of applying this model to measure fault size suggest that many faults generated through program mutation (the systematic replacement of source code operators to create faults) are very large and easily found. Secondary measures generated in the assessment of the model suggest a new static analysis method, called testability, for predicting the likelihood that code will contain a fault in the future. While software testing researchers are not statisticians, they nonetheless make extensive use of statistics in their experiments to assess fault localization techniques. Researchers often select their statistical techniques without justification, a worrisome situation because it can lead to incorrect conclusions about the significance of research. This research introduces an algorithm, MeansTest, which helps automate the selection of appropriate statistical techniques. An evaluation suggests that MeansTest performs well relative to its peers. This research then surveys recent work in software testing, using MeansTest to evaluate the significance of researchers' work. The survey results indicate that software testing researchers are underreporting the significance of their work.
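
    As a hypothetical sketch of the kind of decision MeansTest automates (its actual algorithm is not reproduced here), one can pick a parametric or non-parametric two-sample test based on a normality pre-check:

    ```python
    from scipy import stats

    def compare_means(sample_a, sample_b, alpha=0.05):
        # Shapiro-Wilk normality pre-check on both samples.
        normal = (stats.shapiro(sample_a).pvalue > alpha
                  and stats.shapiro(sample_b).pvalue > alpha)
        if normal:
            # Welch's t-test: does not assume equal variances.
            result = stats.ttest_ind(sample_a, sample_b, equal_var=False)
            return "t-test", result.pvalue
        # Mann-Whitney U: rank-based, no normality assumption.
        result = stats.mannwhitneyu(sample_a, sample_b)
        return "mann-whitney", result.pvalue

    a = [0.61, 0.58, 0.65, 0.60, 0.63, 0.59, 0.62, 0.64]
    b = [0.52, 0.55, 0.50, 0.53, 0.51, 0.56, 0.54, 0.49]
    test, p = compare_means(a, b)
    print(f"{test}: p = {p:.4f}")
    ```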