    Automatic Identification of Assumptions from the Hibernate Developer Mailing List

    During the software development life cycle, assumptions are an important type of software development knowledge that can be extracted from textual artifacts. Analyzing assumptions can help, for example, to comprehend software design and to facilitate software maintenance. Manual identification of assumptions by stakeholders is time-consuming, especially when analyzing a large dataset of textual artifacts. One promising way to address this problem is to use automatic techniques for assumption identification. In this study, we conducted an experiment to evaluate the performance of existing machine learning classification algorithms for automatic assumption identification, using a dataset extracted from the Hibernate developer mailing list. The dataset is composed of 400 'Assumption' sentences and 400 'Non-Assumption' sentences. Seven classifiers using different machine learning algorithms were selected and evaluated. The experiment results show that the SVM algorithm achieved the best performance (with a precision of 0.829, a recall of 0.812, and an F1-score of 0.819). Additionally, according to the ROC curves and related AUC values, the SVM-based classifier performed better than the other classifiers for the binary classification of assumptions.
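For readers unfamiliar with the metrics reported in this abstract, the following minimal sketch shows how precision, recall, and F1-score are derived from the confusion counts of a binary classifier. The function name and the counts are illustrative assumptions; they are not taken from the study.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Return (precision, recall, F1) for the positive class.

    Precision = TP / (TP + FP), recall = TP / (TP + FN),
    and F1 is the harmonic mean of the two.
    """
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for an 'Assumption' vs. 'Non-Assumption' classifier.
precision, recall, f1 = precision_recall_f1(true_pos=80, false_pos=20, false_neg=20)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Scores reported over several evaluation folds are commonly averaged per fold, so a reported F1-score need not equal the harmonic mean of the reported precision and recall exactly.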

    Architecture Information Communication in Two OSS Projects: the Why, Who, When, and What

    Architecture information is vital for Open Source Software (OSS) development, and mailing lists are one of the channels most widely used by developers to share and communicate architecture information. This work investigates the nature of architecture information communication (i.e., why, who, when, and what) by OSS developers via developer mailing lists. We employed a multiple case study approach to extract and analyze the architecture information communicated in the developer mailing lists of two OSS projects, ArgoUML and Hibernate, over a development life cycle of more than 18 years. Our main findings are: (a) architecture negotiation and interpretation are the two main reasons (i.e., why) for architecture communication; (b) the amount of architecture information communicated in developer mailing lists decreases after the first stable release (i.e., when); (c) architecture communication is centered around a few core developers (i.e., who); and (d) the most frequently communicated architecture elements (i.e., what) are Architecture Rationale and Architecture Model. There are a few similarities in architecture communication between the two OSS projects. Such similarities point to how OSS developers naturally gravitate towards the four aspects of architecture communication in OSS development. Comment: Preprint accepted for publication in Journal of Systems and Software, 202

    End-to-End Rationale Reconstruction

    The logic behind design decisions, called design rationale, is very valuable. Researchers have previously tried to automatically extract and exploit this information, but prior techniques are only applicable to specific contexts, and there has been insufficient progress on an end-to-end rationale information extraction pipeline. Here we outline a path towards such a pipeline that leverages several Machine Learning (ML) and Natural Language Processing (NLP) techniques. Our proposed context-independent approach, called Kantara, produces a knowledge graph representation of decisions and their rationales, which considers their historical evolution and traceability. We also propose validation mechanisms to ensure the correctness of the extracted information and the coherence of the development process. We conducted a preliminary evaluation of our proposed approach on a small example sourced from the Linux Kernel, which shows promising results.

    Software Design Change Artifacts Generation through Software Architectural Change Detection and Categorisation

    Unlike other engineering projects, which are mostly implemented by workers (non-experts) after engineers complete the design, software is designed, implemented, tested, and inspected solely by experts. Researchers and practitioners have linked software bugs, security holes, problematic integration of changes, hard-to-understand codebases, unwarranted mental pressure, and similar problems in software development and maintenance to inconsistent and complex design and to the lack of easy ways to understand what is going on in a software system and what to plan for it. The unavailability of the information and insights that development teams need to make good decisions makes these challenges worse. Therefore, extracting software design documents and other insightful information is essential to reduce the above-mentioned anomalies. Moreover, architectural design artifact extraction is required to create developer profiles that can be made available to the market for many crucial scenarios. To that end, architectural change detection, categorization, and change description generation are crucial because they yield the primary artifacts used to trace other software artifacts. However, it is not feasible for humans to analyze all the changes in a single release to detect changes and their impact, because doing so is time-consuming, laborious, costly, and inconsistent. In this thesis, we conduct six studies that address these challenges by automating architectural change information extraction and document generation, which could potentially assist development and maintenance teams. In particular, we (1) detect architectural changes using lightweight techniques that leverage textual and codebase properties, (2) categorize them considering intelligent perspectives, and (3) generate design change documents by exploiting the precise contexts of components’ relations and change purposes, which were previously unexplored.
Our experiments using 4000+ architectural change samples and 200+ design change documents suggest that our proposed approaches are promising in accuracy and scalable enough to deploy frequently. Our proposed change detection approach can detect up to 100% of the architectural change instances (and is very scalable). Our proposed change classifier’s F1 score is 70%, which is promising given the challenges. Finally, our proposed system can produce descriptive design change artifacts with 75% significance. Since most of our studies are foundational, our approaches and prepared datasets can be used as baselines for advancing research in design change information extraction and documentation.

    Towards using fluctuations in internal quality metrics to find design intents

    Version control is the backbone of the modern software development workflow. While building more and more complex systems, developers have to understand unfamiliar subsystems of source code. Understanding the logic of unfamiliar code is relatively straightforward. However, understanding its design and its genesis is harder: it is often only possible through commit messages and project documentation, which are scattered and unreliable when they exist at all. Thus, developers need a reliable and relevant baseline to understand the history of software projects. In this thesis, we take the first steps towards understanding change patterns in commit histories. We study the changes in software metrics through the evolution of projects. Through multiple exploratory studies, we conduct quantitative and qualitative experiments on several datasets extracted from a pool of 13 projects. We mine the changes in software metrics for each commit of the respective projects and manually build oracles to represent ground truth. We identified several categories by analyzing these changes. One pattern in particular, dubbed "tradeoffs", where some metrics may improve at the expense of others, proved to be a promising indicator of design-related changes; in some cases, it also hints at a conscious design intent from the authors of the changes. To demonstrate the findings of our exploratory studies, we build a general model to identify the application of a well-known set of design principles in new projects.
Our overall results suggest that metric fluctuations have the potential to be relevant indicators for valuable macroscopic insights about the design evolution in a project's development history.
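The "tradeoffs" pattern described in this abstract can be illustrated with a minimal sketch: given the per-commit change in each metric, a commit counts as a tradeoff when at least one metric improves while at least one other worsens. The function name, the metric names, the other category labels, and the direction table are all assumptions made for illustration; the thesis's actual categories and model are not given in the abstract.

```python
def classify_fluctuation(deltas, higher_is_better):
    """Label a commit by the direction of its metric changes.

    deltas: metric name -> (value after commit) - (value before commit).
    higher_is_better: metric name -> True if an increase is an improvement;
    metrics missing from this table are assumed to be "lower is better".
    """
    improved = 0
    worsened = 0
    for metric, delta in deltas.items():
        if delta == 0:
            continue  # unchanged metrics do not count either way
        gain = delta if higher_is_better.get(metric, False) else -delta
        if gain > 0:
            improved += 1
        else:
            worsened += 1
    if improved and worsened:
        return "tradeoff"
    if improved:
        return "improvement"
    if worsened:
        return "degradation"
    return "neutral"


# Hypothetical commit: coupling drops (better) but cohesion also drops (worse).
commit_deltas = {"coupling": -2, "cohesion": -0.1, "complexity": 0}
directions = {"cohesion": True}
print(classify_fluctuation(commit_deltas, directions))
```

A filter of this kind would only flag candidate commits; deciding whether a flagged commit is actually design-related still requires a manually built oracle such as the one the abstract describes.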

    Where and What do Software Architects blog?: An Exploratory Study on Architectural Knowledge in Blogs, and their Relevance to Design Steps

    Software engineers share their architectural knowledge (AK) in different places on the Web. Recent studies show that architectural blogs contain the most relevant AK, which can help software engineers to make design steps. Nevertheless, we know little about blogs, and specifically architectural blogs, where software engineers share their AK. In this paper, we conduct an exploratory study on architectural blogs to explore their types, topics, and their AK. Moreover, we determine the relevance of architectural blogs to making design steps. Our results help researchers and practitioners to find and re-use AK from blogs.

    Towards a trustworthiness model for Open Source software.

    Trustworthiness is one of the main aspects that contribute to the adoption/rejection of a software product. This is true for any product in general, but it is especially true for Open Source Software (OSS), whose trustworthiness is sometimes still regarded as not as guaranteed as that of closed source products. Only recently have several industrial software organizations started investigating the potential of OSS products as users or even producers. As they are now getting more and more involved in the OSS world, these software organizations are clearly interested in ways to assess the trustworthiness of OSS products, so as to choose OSS products that are adequate for their goals and needs. Trustworthiness is a major issue when people and organizations are faced with the selection and the adoption of new software. Although some ad-hoc methods have been proposed, there is not yet general agreement about which software characteristics contribute to trustworthiness. Such methods (like the OpenBQR [30] and other similar approaches [58][59]) assess the trustworthiness of a software product by means of a weighted sum of specific quality evaluations. None of the existing methods based on weighted sums has been widely adopted. In fact, these methods are limited in that they typically leave the user with two hard problems, which are common to models built by means of weighted sums: identifying the factors that should be taken into account, and assigning to each of these factors the “correct” weight to adequately quantify its relative importance. Therefore, this work focuses on defining an adequate notion of trustworthiness of Open Source products and artifacts and on identifying a number of factors that influence it, to help and guide both developers and users when deciding whether a given program (or library or other piece of software) is “good enough” and can be trusted in order to be used in an industrial or professional context.
The result of this work is a set of estimation models for the perceived trustworthiness of OSS. This work has been carried out in the context of the IST project QualiPSo (http://www.qualipso.eu/), funded by the EU in the 6th FP (IST-034763). The first step focuses on defining an adequate notion of trustworthiness of software products and artifacts and identifying a number of factors that influence it. The definition of the trustworthiness factors is driven by specific business goals for each organization. So, we carried out a survey to elicit these goals and factors directly from industrial players, trying to derive the factors from the real user needs instead of deriving them from our own personal beliefs and/or only by reading the available literature. The questions in the questionnaire were mainly classified in three different categories: 1) Organization, project, and role. 2) Actual problems, actual trustworthiness evaluation processes, and factors. 3) Wishes. These questions are needed to understand what information should be available but is not, and what indicators should be provided for an OSS product to help its adoption. To test the applicability of the trustworthiness factors identified by means of the questionnaires, we selected a set of OSS projects, widely adopted and generally considered trustable, to be used as references. Afterwards, a first quick analysis was carried out, to check which factors were readily available on each project’s web site. The idea was to emulate the search for information carried out by a potential user, who browses the project’s web sites, but is not willing to spend too much effort and time in carrying out a complete analysis. By analyzing the results of this investigation, we discovered that most of the trustworthiness factors are not generally available with information that is enough to make an objective assessment, although some factors have been ranked as very important by the respondents of our survey.
To fill this gap, we defined a set of different proxy-measures to use whenever a factor cannot be directly assessed on the basis of readily available information. Moreover, some factors are not measurable if developers do not explicitly provide essential information. For instance, this happens for all factors that refer to countable data (e.g., the number of downloads cannot be evaluated in a reliable way if the development community does not publish it). Then, by taking into account the trustworthiness factors and the experience gained through the project analysis, we defined a Goal/Question/Metric (GQM [29]) model for trustworthiness, to identify the qualities and metrics that determine the perception of trustworthiness by users. In order to measure the metrics identified in the GQM model, we identified a set of tools. When possible, tools were obtained by adapting, extending, and integrating existing tools. Considering that most of the metrics were not available via the selected tools, we developed MacXim, a static code analysis tool. The selected tools integrate a number of OSS tools that support the creation of a measurement plan, starting from the main actors’ and stakeholders’ objectives and goals (developer community, user community, business needs, specific users, etc.), down to the specific static and dynamic metrics that will need to be collected to fulfill the goals. To validate the GQM model and build quantitative models of perceived trustworthiness and reliability, we collected both subjective evaluations and objective measures on a sample of 22 Java and 22 C/C++ OSS products. Objective measures were collected by means of MacXim and the other identified tools, while subjective evaluations were collected by means of more than 500 questionnaires.
Specifically, the subjective evaluations concerned how users evaluate the trustworthiness, reliability and other qualities of OSS; objective measures concerned software attributes like size, complexity, modularity, and cohesion. Finally, we correlated the objective code measures to users’ and developers’ evaluations of OSS products. The result is a set of quantitative models that account for the dependence of the perceivable qualities of OSS on objectively observable qualities of the code. Unlike the models based on weighted sums usually available in the literature, we have obtained estimation models [87], so the relevant factors and their specific weights are identified via statistical analysis, and not in a somewhat more subjective way, as usually happens. Qualitatively, our results may not be totally surprising. For instance, it may be generally expected that bigger and more complex products are less trustworthy than smaller and simpler products; likewise, it is expected that well modularized products are more reliable. In particular, our analyses indicate that OSS products are most likely to be trustworthy if (a) their size is not greater than 100,000 effective LOC and (b) the number of Java packages is lower than 228. The models derived in our work can be used by end-users and developers who would like to evaluate the level of trustworthiness and reliability of existing OSS products and components they would like to use or reuse, based on measurable OSS code characteristics. These models can also be used by the developers of OSS products themselves, when setting code quality targets based on the level of trustworthiness and reliability they want to achieve. The information obtained via our models can thus serve as an additional input when making informed decisions.
Thus, unlike several discussions that are based on (sometimes interested) opinions about the quality of OSS, this study aims at deriving statistically significant models that are based on repeatable measures and user evaluations provided by a reasonably large sample of OSS users. The detailed results are reported in the next sections as follows:
• Chapter 1 reports the introduction to this work
• Chapter 2 reports the related literature review
• Chapter 3 reports the identified trustworthiness factors
• Chapter 4 describes how we built the trustworthiness model
• Chapter 5 shows the tools we developed for this activity
• Chapter 6 reports on the experimentation phase
• Chapter 7 shows the results of the experimentation
• Chapter 8 draws conclusions and highlights future works
• Chapter 9 lists the publication made during the Ph

    Assisting Software Developers With License Compliance

    Open source licensing determines how open source systems are reused, distributed, and modified from a legal perspective. While it facilitates rapid development, it can be difficult for developers to understand because of the legal language of these licenses. Because of misunderstandings, systems can incorporate licensed code in a way that violates the terms of the license. Such incompatibilities between licenses can result in the inability to reuse a particular library without either relicensing the system or redesigning the architecture of the system. Prior efforts have predominantly focused on license identification or on understanding the underlying phenomena without reasoning about compatibility at a broad scale. The work in this dissertation first investigates the rationale of developers and identifies the areas in which developers struggle with respect to free/open source software licensing. We begin by investigating the diffusion of licenses and the prevalence of license changes in a large-scale empirical study of 16,221 Java systems. We observed a clear lack of traceability and a lack of standardized licensing that led to difficulties and confusion for developers trying to reuse source code. We further investigated the difficulty by surveying the developers of the systems with license changes to understand why they first adopted a license and then changed licenses. Additionally, we performed an analysis on issue trackers and legal mailing lists to extract licensing bugs. From these works, we identified key areas in which developers struggled and needed support. While developers need support to identify license incompatibilities and understand both the cause and implications of the incompatibilities, we observed that state-of-the-art license identification tools did not identify license exceptions.
Since these exceptions directly modify the license terms (either the permissions granted by the license or the restrictions imposed by the license), we proposed an approach to complement current license identification techniques in order to classify license exceptions. The approach relies on supervised machine learners to classify the licensing text to identify the particular license exceptions or the lack of a license exception. Subsequently, we built an infrastructure to assist developers with evaluating license compliance warnings for their system. The infrastructure evaluates compliance across the dependency tree of a system to ensure it is compliant with all of the licenses of the dependencies. When an incompatibility is present, it notes the specific library/libraries and the conflicting license(s) so that the developers can investigate these compliance warnings in their system, since such incompatibilities would prevent distribution of their software. We conduct a study on 121,094 open source projects spanning 6 programming languages, and we demonstrate that the infrastructure is able to identify license incompatibilities between these projects and their dependencies.
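The dependency-tree compliance check described in this abstract can be sketched as follows. This is a minimal illustration, not the dissertation's infrastructure: the function name, the dictionary layout of a dependency, and the compatibility table are all assumptions. The table in particular is a deliberately simplified stand-in for real license compatibility rules and ignores license exceptions.

```python
# Assumed, simplified table: project license -> dependency licenses treated
# as compatible with it. Illustrative only; real rules are more nuanced.
ASSUMED_COMPATIBLE = {
    "MIT": {"MIT", "Apache-2.0"},
    "Apache-2.0": {"MIT", "Apache-2.0"},
    "GPL-3.0": {"MIT", "Apache-2.0", "GPL-3.0"},
}


def find_incompatibilities(project_license, dependencies, compatible=ASSUMED_COMPATIBLE):
    """Walk the whole dependency tree and collect (library, license) conflicts.

    Each dependency is a dict with "name", "license", and an optional "deps"
    list holding its own (transitive) dependencies in the same shape.
    """
    allowed = compatible.get(project_license, set())
    warnings = []
    seen = set()
    stack = list(dependencies)
    while stack:
        dep = stack.pop()
        if dep["name"] in seen:
            continue  # a library reachable via several paths is reported once
        seen.add(dep["name"])
        if dep["license"] not in allowed:
            warnings.append((dep["name"], dep["license"]))
        stack.extend(dep.get("deps", []))
    return sorted(warnings)


# Hypothetical system: the conflict sits in a transitive dependency.
deps = [
    {"name": "libA", "license": "MIT",
     "deps": [{"name": "libB", "license": "GPL-3.0"}]},
    {"name": "libC", "license": "Apache-2.0"},
]
print(find_incompatibilities("Apache-2.0", deps))
```

Reporting the library name together with the conflicting license mirrors the abstract's description of noting "the specific library/libraries and the conflicting license(s)" so that developers can investigate each warning.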

    Social aspects of collaboration in online software communities

