1,110 research outputs found

    Understanding, Analysis, and Handling of Software Architecture Erosion

    Get PDF
    Architecture erosion occurs when a software system's implemented architecture diverges from the intended architecture over time. Studies show erosion impacts development, maintenance, and evolution since it accumulates imperceptibly. Identifying early symptoms like architectural smells enables managing erosion through refactoring. However, research lacks comprehensive understanding of erosion, unclear which symptoms are most common, and lacks detection methods. This thesis establishes an erosion landscape, investigates symptoms, and proposes identification approaches. A mapping study covers erosion definitions, symptoms, causes, and consequences. Key findings: 1) "Architecture erosion" is the most used term, with four perspectives on definitions and respective symptom types. 2) Technical and non-technical reasons contribute to erosion, negatively impacting quality attributes. Practitioners can advocate addressing erosion to prevent failures. 3) Detection and correction approaches are categorized, with consistency and evolution-based approaches commonly mentioned.An empirical study explores practitioner perspectives through communities, surveys, and interviews. Findings reveal associated practices like code review and tools identify symptoms, while collected measures address erosion during implementation. Studying code review comments analyzes erosion in practice. One study reveals architectural violations, duplicate functionality, and cyclic dependencies are most frequent. Symptoms decreased over time, indicating increased stability. Most were addressed after review. A second study explores violation symptoms in four projects, identifying 10 categories. Refactoring and removing code address most violations, while some are disregarded.Machine learning classifiers using pre-trained word embeddings identify violation symptoms from code reviews. Key findings: 1) SVM with word2vec achieved highest performance. 2) fastText embeddings worked well. 3) 200-dimensional embeddings outperformed 100/300-dimensional. 4) Ensemble classifier improved performance. 5) Practitioners found results valuable, confirming potential.An automated recommendation system identifies qualified reviewers for violations using similarity detection on file paths and comments. Experiments show common methods perform well, outperforming a baseline approach. Sampling techniques impact recommendation performance

    Towards Automatic Identification of Violation Symptoms of Architecture Erosion

    Full text link
    Architecture erosion has a detrimental effect on maintenance and evolution, as the implementation drifts away from the intended architecture. To prevent this, development teams need to understand early enough the symptoms of erosion, and particularly violations of the intended architecture. One way to achieve this, is through the automatic identification of architecture violations from textual artifacts, and particularly code reviews. In this paper, we developed 15 machine learning-based and 4 deep learning-based classifiers with three pre-trained word embeddings to identify violation symptoms of architecture erosion from developer discussions in code reviews. Specifically, we looked at code review comments from four large open-source projects from the OpenStack (Nova and Neutron) and Qt (Qt Base and Qt Creator) communities. We then conducted a survey to acquire feedback from the involved participants who discussed architecture violations in code reviews, to validate the usefulness of our trained classifiers. The results show that the SVM classifier based on word2vec pre-trained word embedding performs the best with an F1-score of 0.779. In most cases, classifiers with the fastText pre-trained word embedding model can achieve relatively good performance. Furthermore, 200-dimensional pre-trained word embedding models outperform classifiers that use 100 and 300-dimensional models. In addition, an ensemble classifier based on the majority voting strategy can further enhance the classifier and outperforms the individual classifiers. Finally, an online survey of the involved developers reveals that the violation symptoms identified by our approaches have practical value and can provide early warnings for impending architecture erosion.Comment: 20 pages, 4 images, 7 tables, Revision submitted to TSE (2023

    Identifying Design Problems in the Source Code:A Grounded Theory

    Get PDF
    The prevalence of design problems may cause re-engineering or even discontinuation of the system. Due to missing, informal or outdated design documentation, developers often have to rely on the source code to identify design problems. Therefore, developers have to analyze different symptoms that manifest in several code elements, which may quickly turn into a complex task. Although researchers have been investigating techniques to help developers in identifying design problems, there is little knowledge on how developers actually proceed to identify design problems. In order to tackle this problem, we conducted a multi-trial industrial experiment with professionals from 5 software companies to build a grounded theory. The resulting theory offers explanations on how developers identify design problems in practice. For instance, it reveals the characteristics of symptoms that developers consider helpful. Moreover, developers often combine different types of symptoms to identify a single design problem. This knowledge serves as a basis to further understand the phenomena and advance towards more effective identification techniques

    Architecture Smells vs. Concurrency Bugs: an Exploratory Study and Negative Results

    Full text link
    Technical debt occurs in many different forms across software artifacts. One such form is connected to software architectures where debt emerges in the form of structural anti-patterns across architecture elements, namely, architecture smells. As defined in the literature, ``Architecture smells are recurrent architectural decisions that negatively impact internal system quality", thus increasing technical debt. In this paper, we aim at exploring whether there exist manifestations of architectural technical debt beyond decreased code or architectural quality, namely, whether there is a relation between architecture smells (which primarily reflect structural characteristics) and the occurrence of concurrency bugs (which primarily manifest at runtime). We study 125 releases of 5 large data-intensive software systems to reveal that (1) several architecture smells may in fact indicate the presence of concurrency problems likely to manifest at runtime but (2) smells are not correlated with concurrency in general -- rather, for specific concurrency bugs they must be combined with an accompanying articulation of specific project characteristics such as project distribution. As an example, a cyclic dependency could be present in the code, but the specific execution-flow could be never executed at runtime

    Technology Readiness Levels for Machine Learning Systems

    Full text link
    The development and deployment of machine learning (ML) systems can be executed easily with modern tools, but the process is typically rushed and means-to-an-end. The lack of diligence can lead to technical debt, scope creep and misaligned objectives, model misuse and failures, and expensive consequences. Engineering systems, on the other hand, follow well-defined processes and testing standards to streamline development for high-quality, reliable results. The extreme is spacecraft systems, where mission critical measures and robustness are ingrained in the development process. Drawing on experience in both spacecraft engineering and ML (from research through product across domain areas), we have developed a proven systems engineering approach for machine learning development and deployment. Our "Machine Learning Technology Readiness Levels" (MLTRL) framework defines a principled process to ensure robust, reliable, and responsible systems while being streamlined for ML workflows, including key distinctions from traditional software engineering. Even more, MLTRL defines a lingua franca for people across teams and organizations to work collaboratively on artificial intelligence and machine learning technologies. Here we describe the framework and elucidate it with several real world use-cases of developing ML methods from basic research through productization and deployment, in areas such as medical diagnostics, consumer computer vision, satellite imagery, and particle physics

    Classes-Chave em sistemas orientados a objetos: detecção e uso

    Get PDF
    Several object-oriented systems, such as Lucene, Tomcat, Javac have their respective design documented using key-classes, defined as important/central classes to understand the object-oriented design. Considering this fact, and considering that, in general, software architecture is not formally documented to help developers understanding and assessing software design, Keecle is proposed as an approach based on dynamic and static analysis for detection of key classes in a semi-automatic way. The application of filtering mechanisms on the search space of the dynamic data is proposed in order to obtain a reduced set of key classes. The approach is evaluated with fourteen proprietary and open source systems in order to verify that the found classes correspond to the key classes of the ground-truth, which is defined from the documentation or defined by the developers. The results were analyzed in terms of precision and recall, and have shown to be superior to the state-of-the-art approach. The role of key classes in assessing design has also been investigated. The organization of the key classes in a dependency graph, which highlights explicit dependency relations in the source code, was evaluated to be adequate for design comprehension and assessment. Key classes were evaluated whether they are more prone to bad smells, and whether specific types of bad smells are associated with different levels of cohesion and coupling metrics. In addition, the ownership of key classes was shown to be more concentrated in a reduced set of developers. Finally, we conducted an experimental study with students and a survey with developers to evaluate documentation based on key classes. The results indicate that the documentation based on key classes are a feasible alternative for use as complementary documentation to the existing one, or for use as main documentation in environments where documentation is not available.FAPEG - Fundação de Amparo à Pesquisa do Estado de GoiásTese (Doutorado)Vários sistemas orientados a objetos, tais como Lucene, Tomcat, Javac tem seus respectivos projetos (designs) documentados usando classes-chave, definidas como sendo classes importantes/centrais para compreender o projeto de sistemas orientados a objetos. Considerando este fato, e considerando que geralmente a arquitetura não é formalmente documentada para auxiliar os desenvolvedores a entenderem e avaliarem o projeto do software, é proposta Keecle, uma abordagem baseada em análise dinâmica e estática para detecção de classes-chave de maneira semi-automática. É proposta a aplicação de mecanismos de filtragem no espaço de busca dos dados dinâmicos, para obter um conjunto reduzido de classes-chave. A abordagem é avaliada com quatorze sistemas de código aberto e proprietários, a fim de verificar se as classes encontradas correspondem às classes-chave definidas na documentação ou definidas pelos desenvolvedores. Os resultados foram analisados em termos de precisão e recall e são superiores às abordagens da literatura. O papel das classes-chave para avaliar o projeto também foi investigado. Foi avaliado se a organização das classes-chave em um grafo de dependências, o qual destaca relações de dependência explícitas no código fonte, é um mecanismo adequado para avaliar o design. Foi analisado estatisticamente, se classes-chave são mais propensas a bad smells, e se tipos específicos de bad smells estão associados a diferentes níveis de métricas de coesão e acoplamento. Além disso, a propriedade (ownership) das classes-chave foi analisada, indicando concentração em um conjunto reduzido de desenvolvedores. Por fim, foram conduzidos um estudo experimental com estudantes e um survey com desenvolvedores para avaliar a documentação baseada em classes-chave. Os resultados demonstram que a documentação baseada em classes-chave apresenta resultados que indicam a viabilidade de uso como documentação complementar à existente ou como documentação principal em ambientes onde a documentação não está disponível

    Software restructuring: understanding longitudinal architectural changes and refactoring

    Get PDF
    The complexity of software systems increases as the systems evolve. As the degradation of the system's structure accumulates, maintenance effort and defect-proneness tend to increase. In addition, developers often opt to employ sub-optimal solutions in order to achieve short-time goals, in a phenomenon that has been recently called technical debt. In this context, software restructuring serves as a way to alleviate and/or prevent structural degradation. Restructuring of software is usually performed in either higher or lower levels of granularity, where the first indicates broader changes in the system's structural architecture and the latter indicates refactorings performed to fewer and localised code elements. Although tools to assist architectural changes and refactoring are available, there is still no evidence these approaches are widely adopted by practitioners. Hence, an understanding of how developers perform architectural changes and refactoring in their daily basis and in the context of the software development processes they adopt is necessary. Current software development is iterative and incremental with short cycles of development and release. Thus, tools and processes that enable this development model, such as continuous integration and code review, are widespread among software engineering practitioners. Hence, this thesis investigates how developers perform longitudinal and incremental architectural changes and refactoring during code review through a wide range of empirical studies that consider different moments of the development lifecycle, different approaches, different automated tools and different analysis mechanisms. Finally, the observations and conclusions drawn from these empirical investigations extend the existing knowledge on how developers restructure software systems, in a way that future studies can leverage this knowledge to propose new tools and approaches that better fit developers' working routines and development processes
    corecore