
    Software Design Change Artifacts Generation through Software Architectural Change Detection and Categorisation

    Get PDF
    Unlike other engineering projects, which are mostly implemented by non-expert workers after engineers complete the design, software is designed, implemented, tested, and inspected entirely by experts. Researchers and practitioners have linked software bugs, security holes, problematic integration of changes, hard-to-understand codebases, unwarranted mental pressure, and similar problems in software development and maintenance to inconsistent and complex design, and to a lack of ways to easily understand what is happening and what to plan in a software system. The unavailability of the information and insights that development teams need to make good decisions makes these challenges worse. Extracting software design documents and other insightful information is therefore essential to reduce these anomalies. Moreover, extracting architectural design artifacts is required to build developer profiles for many crucial market scenarios. To that end, architectural change detection, categorization, and change description generation are crucial, because they are the primary artifacts from which other software artifacts can be traced. However, it is not feasible for humans to analyze all the changes in a single release to detect change and impact: doing so is time-consuming, laborious, costly, and inconsistent. In this thesis, we conduct six studies addressing these challenges to automate architectural change information extraction and document generation, which could assist development and maintenance teams. In particular, we (1) detect architectural changes using lightweight techniques that leverage textual and codebase properties, (2) categorize them from intelligent perspectives, and (3) generate design change documents by exploiting precise contexts of component relations and change purposes, which were previously unexplored. Our experiments, using 4000+ architectural change samples and 200+ design change documents, suggest that the proposed approaches are promising in accuracy and scalable enough to deploy frequently. Our change detection approach can detect up to 100% of architectural change instances and is very scalable. Our change classifier achieves an F1 score of 70%, which is promising given the challenges. Finally, our system can produce descriptive design change artifacts with 75% significance. Since most of our studies are foundational, our approaches and prepared datasets can serve as baselines for advancing research in design change information extraction and documentation.
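
    A minimal sketch of what a lightweight, commit-level detector of architectural change might look like, combining a textual cue (keywords in the commit message) with a codebase cue (how many packages a commit touches). The keyword list, the package-spread threshold, and the function names are illustrative assumptions, not the thesis's actual technique.

        import re

        # Textual cues that often accompany architectural change (assumed list).
        ARCH_KEYWORDS = {
            "refactor", "architecture", "module", "decouple", "interface",
            "dependency", "migrate", "restructure", "component", "layer",
        }

        def is_architectural_change(commit_message: str, changed_files: list[str]) -> bool:
            """Flag a commit as architectural if its message uses architectural
            vocabulary or its diff spans several packages at once."""
            words = set(re.findall(r"[a-z]+", commit_message.lower()))
            textual_hit = bool(words & ARCH_KEYWORDS)
            # Codebase property: touching many packages hints at structural change.
            packages = {f.rsplit("/", 1)[0] for f in changed_files}
            structural_hit = len(packages) >= 3
            return textual_hit or structural_hit

        msg = "Decouple persistence layer from service interface"
        files = ["core/db/dao.py", "core/service/api.py", "core/model/user.py"]
        print(is_architectural_change(msg, files))  # True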

    Chatbots for Modelling, Modelling of Chatbots

    Full text link
    Unpublished doctoral thesis, defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Ingeniería Informática. Defense date: 28-03-202

    CodePlan: Repository-level Coding using LLMs and Planning

    Full text link
    Software engineering activities such as package migration, fixing error reports from static analysis or testing, and adding type annotations or other specifications to a codebase involve pervasively editing the entire repository of code. We formulate these activities as repository-level coding tasks. Recent tools like GitHub Copilot, which are powered by Large Language Models (LLMs), have succeeded in offering high-quality solutions to localized coding problems. Repository-level coding tasks are more involved and cannot be solved directly with LLMs, since code within a repository is interdependent and the entire repository may be too large to fit into the prompt. We frame repository-level coding as a planning problem and present a task-agnostic framework, called CodePlan, to solve it. CodePlan synthesizes a multi-step chain of edits (a plan), where each step results in a call to an LLM on a code location, with context derived from the entire repository, previous code changes, and task-specific instructions. CodePlan is based on a novel combination of incremental dependency analysis, change may-impact analysis, and an adaptive planning algorithm. We evaluate the effectiveness of CodePlan on two repository-level tasks: package migration (C#) and temporal code edits (Python). Each task is evaluated on multiple code repositories, each of which requires interdependent changes to many files (between 2 and 97 files). Coding tasks of this level of complexity have not been automated using LLMs before. Our results show that CodePlan matches the ground truth better than the baselines. CodePlan gets 5 of 6 repositories to pass the validity checks (e.g., to build without errors and make correct code edits), whereas the baselines (without planning but with the same type of contextual information as CodePlan) cannot get any of the repositories to pass them.
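
    A toy sketch of the plan-as-worklist idea behind this kind of repository-level editing: edit a seed location, then use dependency information to enqueue locations the edit may impact. The DEPENDENTS table, the llm_edit stub, and the code_plan loop are invented stand-ins; CodePlan's real incremental dependency analysis and may-impact analysis are far more sophisticated.

        from collections import deque

        # Hypothetical dependency graph: file -> files that depend on it.
        DEPENDENTS = {
            "api.py": ["client.py", "tests.py"],
            "client.py": ["tests.py"],
            "tests.py": [],
        }

        def llm_edit(location: str, context: str) -> bool:
            """Stand-in for an LLM call; returns True if the file was changed."""
            print(f"editing {location} with context: {context}")
            return True

        def code_plan(seed: str) -> list[str]:
            """Adaptive plan: start from a seed edit, propagate along dependencies."""
            plan, done = [], set()
            queue = deque([seed])
            while queue:
                loc = queue.popleft()
                if loc in done:
                    continue
                done.add(loc)
                if llm_edit(loc, context=f"deps of {loc}"):
                    plan.append(loc)
                    # May-impact analysis (here: static dependents) extends the plan.
                    queue.extend(DEPENDENTS.get(loc, []))
            return plan

        print(code_plan("api.py"))  # ['api.py', 'client.py', 'tests.py']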

    A Survey of Learning-based Automated Program Repair

    Full text link
    Automated program repair (APR) aims to fix software bugs automatically and plays a crucial role in software development and maintenance. With recent advances in deep learning (DL), an increasing number of APR techniques have been proposed that leverage neural networks to learn bug-fixing patterns from massive open-source code repositories. Such learning-based techniques usually treat APR as a neural machine translation (NMT) task, in which buggy code snippets (the source language) are automatically translated into fixed code snippets (the target language). Benefiting from DL's powerful capability to learn hidden relationships from previous bug-fixing datasets, learning-based APR techniques have achieved remarkable performance. In this paper, we provide a systematic survey summarizing the current state-of-the-art research in the learning-based APR community. We illustrate the general workflow of learning-based APR techniques and detail the crucial components, including the fault localization, patch generation, patch ranking, patch validation, and patch correctness phases. We then discuss widely adopted datasets and evaluation metrics and outline existing empirical studies. We discuss several critical aspects of learning-based APR techniques, such as repair domains, industrial deployment, and the open science issue. We highlight several practical guidelines for applying DL techniques in future APR studies, such as exploring explainable patch generation and utilizing code features. Overall, our paper can help researchers gain a comprehensive understanding of the achievements of existing learning-based APR techniques and promote the practical application of these techniques. Our artifacts are publicly available at https://github.com/QuanjunZhang/AwesomeLearningAPR.
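
    A skeleton of the workflow the survey describes (fault localization, patch generation, patch ranking, patch validation), with every component stubbed out. The function names and stub behaviors are assumptions for illustration; a real system would plug in a spectrum-based localizer, an NMT or LLM patch generator, and a test harness.

        def localize(program: str, failing_tests: list[str]) -> list[int]:
            """Return suspicious line numbers (stub: blame every line)."""
            return list(range(len(program.splitlines())))

        def generate_patches(program: str, line: int, beam: int = 3) -> list[str]:
            """Stub for the NMT step: 'translate' a buggy line into candidates."""
            return [f"candidate_{i}_for_line_{line}" for i in range(beam)]

        def validate(patch: str, failing_tests: list[str]) -> bool:
            """Stub: run the test suite against the patched program."""
            return patch.startswith("candidate_0")  # pretend the top candidate passes

        def repair(program: str, failing_tests: list[str]):
            for line in localize(program, failing_tests):
                # Ranking is implicit here: candidates come out in beam order.
                for patch in generate_patches(program, line):
                    if validate(patch, failing_tests):
                        return patch
            return None

        print(repair("x = 1\ny = x + bug", ["test_add"]))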

    Detection of microservice smells through static analysis

    Get PDF
    The microservices architecture stands as a beacon of promise in the software landscape, drawing developers and companies towards its compelling principles. Its appeal lies in the potential for improved scalability, flexibility, and agility, aligning with the ever-evolving demands of the digital age. However, navigating the intricacies of microservices can be challenging, especially as this field continues to evolve. A key challenge arises from the inherent complexity of microservices, where their sheer number and interdependencies can introduce new layers of intricacy. Furthermore, the rapid expansion of microservices, coupled with the need to harness their advantages effectively, demands a deeper understanding of the potential pitfalls and issues that may emerge. To truly unlock the benefits of microservices, it is essential to address these challenges head-on and ensure a successful journey in microservices development and adoption. This document explores the area of microservice architecture smells, which play an important role in the technical debt associated with microservices. It embarks on a comprehensive research exploration, delving into the realm of microservice smells. This research serves as the cornerstone for enhancing a microservice smell catalogue. It draws data from two primary sources: a systematic mapping study and an industry survey. The latter involved 31 seasoned professionals with substantial experience in the field of microservices. Moreover, the development and enhancement of a tool specifically designed to identify and address issues related to microservices is described. This tool aims to improve developers' performance throughout the development and implementation of a microservices architecture. Finally, the document includes an evaluation of the tool's performance: a comparative analysis conducted before and after the tool's enhancements. The tool's effectiveness is assessed using the same microservice benchmark as previously employed, in addition to another benchmark to ensure a comprehensive evaluation.
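
    For a flavor of what a static microservice-smell check can look like, here is an illustrative detector for one well-known smell, shared persistence (two services pointing at the same database). The config format, service names, and smell choice are assumptions for this sketch; the thesis tool covers a whole catalogue of smells.

        from collections import defaultdict

        # Hypothetical per-service configs, as parsed from each service's repo.
        SERVICE_CONFIGS = {
            "orders":   {"db_url": "postgres://db1/orders"},
            "billing":  {"db_url": "postgres://db1/orders"},   # shares orders' DB
            "shipping": {"db_url": "postgres://db2/shipping"},
        }

        def shared_persistence_smell(configs: dict) -> list:
            """Group services by database URL; any group of 2+ is a smell."""
            by_db = defaultdict(list)
            for service, cfg in configs.items():
                by_db[cfg["db_url"]].append(service)
            return [tuple(s) for s in by_db.values() if len(s) > 1]

        print(shared_persistence_smell(SERVICE_CONFIGS))  # [('orders', 'billing')]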

    How do Microservices Evolve? An Empirical Analysis of Changes in Open-Source Microservice Repositories

    Get PDF
    Context. Microservice architectures are an emergent service-oriented paradigm widely used in industry to develop and deploy scalable software systems. The underlying idea is to design highly independent services that implement small units of functionality and can interact with each other through lightweight interfaces. Objective. Even though microservices are often used with success, their design and maintenance pose novel challenges to software engineers. In particular, it is questionable whether the intended independence of microservices can actually be achieved in practice. It is therefore important to understand how and why microservices evolve during a system's life-cycle, for instance, to scope refactorings and improvements of a system's architecture or to develop supporting tools. Method. To provide insights into how microservices evolve, we report a large-scale empirical study on the (co-)evolution of microservices in 11 open-source systems, involving quantitative and qualitative analyses of 7,319 commits. Findings. Our quantitative results show that there are recurring patterns of (co-)evolution across all systems, for instance, "shotgun surgery" commits and microservices that are largely independent, evolve in tuples, or are evolved in almost all changes. We refine our results by analyzing service-evolving commits qualitatively to explore the (in-)dependence of microservices and the causes of their specific evolution. Conclusion. The contributions in this article provide practitioners and researchers with an understanding of how and in what way microservices evolve, and how microservice-based systems may be improved.
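
    A sketch of the kind of co-change mining such a study rests on: count how often pairs of microservices are touched in the same commit. The commit sets below are hard-coded assumptions; a real analysis would walk the git history (e.g., with a repository-mining library) and map changed paths to services.

        from itertools import combinations
        from collections import Counter

        # Each commit is the set of services whose directories it touched (assumed).
        commits = [
            {"auth", "gateway"},
            {"auth"},
            {"auth", "gateway", "billing"},  # a "shotgun surgery" style commit
            {"billing"},
        ]

        co_changes = Counter()
        for services in commits:
            for pair in combinations(sorted(services), 2):
                co_changes[pair] += 1

        # Pairs that evolve together most often hint at coupled services.
        print(co_changes.most_common(2))  # [(('auth', 'gateway'), 2), ...]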

    A Reference Structure for Modular Model-based Analyses

    Get PDF
    Context: In this thesis, we investigated the evolvability, understandability, and reusability of model-based analyses. To this end, we examined the interrelations between models and analyses, in particular the structure and dependencies of artifacts and the decomposition and composition of model-based analyses.
    Challenges: Software developers use models of software systems to assess the evolvability and reusability of an architectural design. These models make it possible to analyze the software architecture before the first line of code is written. Due to evolutionary change, however, model-based analyses are themselves prone to degradation of evolvability, understandability, and reusability. These problems can be traced back to the co-evolution of the modeling language and the analysis. The purpose of an analysis is the systematic investigation of certain properties of a system under study. Suppose, for example, that software developers want to analyze new properties of a software system. In that case, they must adapt features of the modeling language and the corresponding model-based analyses before they can analyze the new properties. Features of a model-based analysis are, for instance, an analysis technique that analyzes such a quality property. Such changes increase the complexity of the model-based analyses and thus make them hard to maintain. This growing complexity reduces the understandability of the model-based analyses. As a result, development cycles grow longer, and software developers need more time to adapt the software system to changing requirements.
    State of the art: Current approaches enable the coupling of analyses on a single system or across distributed systems. These approaches provide the technical structure for coupling simulations, but not a structure for how components can be (de)composed. A further challenge in composing analyses is the behavioral aspect, i.e., how the analysis components influence each other. Because every participating simulation must be synchronized, modularizing simulations increases the communication overhead. Current approaches make it possible to reduce this overhead; however, they leave decomposition and composition to the user.
    Contributions: The goal of this thesis is to improve the evolvability, understandability, and reusability of model-based analyses. To this end, we take the reference architecture for domain-specific modeling languages as a basis and investigate whether its structure can be transferred to model-based analyses. The layered reference architecture captures the dependencies of analysis functions and analysis components by assigning them to specific layers. We developed three processes for applying the reference architecture: (i) refactoring an existing model-based analysis, (ii) designing a new model-based analysis, and (iii) extending an existing model-based analysis. In addition to the reference architecture for model-based analyses, we identified recurring structures that lead to problems with evolvability, understandability, and reusability; in the literature, such recurring structures are known as bad smells. We examined established model-based analyses and identified and specified thirteen bad smells. Besides specifying these bad smells, we provide a process for identifying them automatically and strategies for refactoring them, so that developers can avoid or fix them. In this thesis, we also developed a modeling language for specifying the structure and behavior of simulation components. Simulations are analyses used to investigate a system when experimenting with the real system would be too time-consuming, too expensive, too dangerous, or simply impossible because the system does not (yet) exist. Developers can use the specification to compare simulation components and thereby identify identical ones.
    Validation: We evaluated the reference architecture for model-based analyses by transferring four model-based analyses into it. We chose a scenario-based evaluation that derives historical change scenarios from the repositories of the model-based analyses. The evaluation shows that evolvability and understandability improve, as determined by measuring complexity, coupling, and cohesion. The metrics we use originate in information theory but have previously been used to evaluate the reference architecture for DSMLs. We evaluated the bad smells that arise from the co-dependency of model-based analyses and their corresponding DSMLs by searching four model-based analyses for occurrences of our bad smells and then fixing the smells we found. Here, too, we chose a scenario-based evaluation deriving historical change scenarios from the repositories of the model-based analyses. By determining complexity, coupling, and cohesion before and after refactoring, we can show that the bad smells negatively affect evolvability and understandability. We evaluated the approach for specifying and finding components of model-based analyses by specifying the components of two model-based analyses and using our search algorithm to find similar analysis components. The evaluation results show that we are able to find similar analysis components and that our approach enables searching for analysis components with similar structure and behavior, and thus the reuse of such components.
    Benefits: The contributions of this thesis support architects and developers in their daily work of building maintainable and reusable model-based analyses. To this end, we provide a reference architecture that aligns the model-based analysis with the domain-specific modeling language and thereby eases their co-evolution. In addition to the reference architecture, we offer refactoring operations that enable architects and developers to adapt an existing model-based analysis to the reference architecture. Beyond this technical aspect, we identified three processes that enable architects and developers to develop a new model-based analysis, modularize an existing one, and extend an existing one, in a way that keeps the results conformant to the reference architecture. Furthermore, our specification enables developers to compare existing simulation components and reuse them where appropriate, saving them from re-implementing components.
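
    A minimal sketch of the layering idea the reference architecture builds on: each analysis component is assigned to a layer, and dependencies may only point to the same or a lower layer. The layer names and components below are invented for illustration and are not taken from the thesis.

        LAYERS = {"metamodel": 0, "analysis_core": 1, "analysis_ui": 2}

        # component -> (layer, dependencies)
        COMPONENTS = {
            "metamodel_types":    ("metamodel", []),
            "performance_metric": ("analysis_core", ["metamodel_types"]),
            "result_view":        ("analysis_ui", ["performance_metric"]),
        }

        def check_layering(components: dict) -> list[str]:
            """Report dependencies that point upward, violating the layering."""
            violations = []
            for name, (layer, deps) in components.items():
                for dep in deps:
                    dep_layer = components[dep][0]
                    if LAYERS[dep_layer] > LAYERS[layer]:
                        violations.append(f"{name} ({layer}) -> {dep} ({dep_layer})")
            return violations

        print(check_layering(COMPONENTS))  # [] means the layering holds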

    An Improved Approach for Extracting Frequently Extracted Code Idioms

    Get PDF
    Source code refactoring is the process of restructuring existing code without changing its external behaviour. Developers refactor continuously to improve code quality, readability, and maintainability, and to address technical debt. There have been studies and tools to help developers refactor their source code effectively and to understand the motivations behind the refactorings they apply. We aim to find code idioms that developers tend to refactor more frequently and to investigate whether there are refactored code idioms unique to production code or to test code. We use the RefactoringMiner tool to detect and collect EXTRACT METHOD refactorings from the commit history of the projects, propose a technique to represent the code fragments as structure-preserving, context-free, independent graphs, and apply graph-similarity measures to find similar code idioms among 65,742 EXTRACT METHOD instances. We measure both exact matching and partial matching, with constraint checking against the metadata associated with the nodes and edges of the graphs. We divided our dataset into production code and test code and found a total of 489 code idiom patterns. We present in detail 22 of the most frequently refactored code idioms. There are patterns unique to production code, patterns unique to test code, and patterns shared between them. We limit our study to Java-based open-source projects and the EXTRACT METHOD refactoring, but we believe the approach can be applied to other object-oriented languages and refactorings. The findings can be useful for designing an effective refactoring recommender system, helping developers gain confidence in refactoring recommendation tools, and helping researchers understand refactoring motivations and API usage patterns.
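
    A sketch of the graph-matching step, using networkx for illustration: represent two extracted fragments as small labeled graphs and test for an exact match via isomorphism with a node-label constraint. The hand-made fragments and the "kind" attribute are assumptions; the paper's approach builds graphs from real code and also scores partial matches.

        import networkx as nx
        from networkx.algorithms.isomorphism import GraphMatcher

        def fragment_graph(edges, labels):
            """Build a small undirected graph with a 'kind' label per node."""
            g = nx.Graph()
            g.add_edges_from(edges)
            nx.set_node_attributes(g, labels, "kind")
            return g

        # Two EXTRACT METHOD fragments with the same shape and node kinds.
        g1 = fragment_graph([(0, 1), (1, 2)], {0: "if", 1: "call", 2: "return"})
        g2 = fragment_graph([(0, 1), (1, 2)], {0: "if", 1: "call", 2: "return"})

        matcher = GraphMatcher(
            g1, g2, node_match=lambda a, b: a["kind"] == b["kind"]  # constraint check
        )
        print(matcher.is_isomorphic())  # True -> the fragments share an idiom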

    An Empirical Study on the Impact of Deep Parameters on Mobile App Energy Usage

    Get PDF
    Improving software performance through configuration parameter tuning is a common activity during software maintenance. Beyond traditional performance metrics like latency, mobile app developers are interested in reducing app energy usage. Some mobile apps have centralized locations for parameter tuning, similar to databases and operating systems, but it is common for mobile apps to have hundreds of parameters scattered around the source code. The correlation between these deep parameters and app energy usage is unclear. Researchers have studied the energy effects of deep parameters in specific modules, but we lack a systematic understanding of the energy impact of mobile deep parameters. In this paper we empirically investigate this topic, combining a developer survey with systematic energy measurements. Our motivational survey of 25 Android developers suggests that developers do not understand, and largely ignore, the energy impact of deep parameters. To assess the potential implications of this practice, we propose a deep parameter energy profiling framework that can analyze the energy impact of deep parameters in an app. Our framework identifies deep parameters, mutates them based on our parameter value selection scheme, and performs reliable energy impact analysis. Applying the framework to 16 popular Android apps, we discovered that deep parameter-induced energy inefficiency is rare. We found only 2 out of 1644 deep parameters for which a different value would significantly improve its app's energy efficiency. A detailed analysis found that most deep parameters have either no energy impact, limited energy impact, or an energy impact only under extreme values. Our study suggests that it is generally safe for developers to ignore the energy impact when choosing deep parameter values in mobile apps.
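
    A sketch of the profiling loop such a framework implies: for each deep parameter, try values around its default and compare measured energy, averaging repeated runs to damp noise. The parameter table, the scaling factors, and measure_energy() are placeholders; real measurements require a power monitor or on-device battery-stats sampling.

        import random

        DEEP_PARAMS = {"cache_size": 64, "poll_interval_ms": 500}  # assumed defaults

        def measure_energy(params: dict) -> float:
            """Placeholder for running the app under test and sampling power draw."""
            return random.uniform(90, 110)  # joules, fake

        def profile(params: dict, factors=(0.5, 1.0, 2.0), runs: int = 3) -> dict:
            report = {}
            for name, default in params.items():
                for factor in factors:
                    trial = dict(params, **{name: int(default * factor)})
                    # Average repeated runs to damp measurement noise.
                    joules = sum(measure_energy(trial) for _ in range(runs)) / runs
                    report[(name, trial[name])] = joules
            return report

        for (name, value), joules in profile(DEEP_PARAMS).items():
            print(f"{name}={value}: {joules:.1f} J")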