65 research outputs found

    Understanding and Tooling Framework API Evolution

    Frameworks are widely used in modern software development and are accessed through their Application Programming Interfaces (APIs), which specify sets of functionalities that client programs can use to accomplish their tasks. Frameworks keep evolving during their lifespan to meet new requirements, patch security vulnerabilities, and so on. Framework evolution may lead to API changes to which client programs must adapt. Upgrading to new releases of frameworks is time-consuming and can even interrupt services. Helping developers upgrade frameworks is of great interest to both academic and industrial researchers. In this dissertation, we first present an exploratory study of the reality of API changes and usages in the Maven central repository and in two framework ecosystems: Apache and Eclipse. We find that API changes in about 10% of frameworks affect about 50% of client programs. Missing classes and missing methods occur more often in frameworks and affect client programs more often than other API changes. About 80% of API usages in client programs can be reduced by refactoring. Based on these findings, we conduct an empirical study to verify the usefulness of API change rules automatically built by previous approaches, which recommend replacements for APIs that go missing during framework evolution. We show that API change rules do help developers find the replacements of missing APIs more accurately, especially for frameworks that are difficult to understand. We describe another empirical study that evaluates the effectiveness of the features used to build API change rules and of different ways of combining multiple features. We argue and show that approaches based on multi-objective optimization can detect more correct change rules and are easier to extend with new features than previous approaches.
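To make the notion of missing classes and missing methods concrete, the following is a minimal sketch, not the dissertation's tooling. It assumes each API release has already been summarized as a mapping from class name to a set of method names; the function name `missing_members` and the sample class names are hypothetical.

```python
def missing_members(old_api, new_api):
    """Return (missing_classes, missing_methods) between two API snapshots.

    Each snapshot is assumed to map a class name to a set of method names.
    A class is "missing" if it exists in old_api but not in new_api; a method
    is "missing" if its class survives but the method does not.
    """
    missing_classes = sorted(set(old_api) - set(new_api))
    missing_methods = {}
    for cls, methods in old_api.items():
        if cls not in new_api:
            continue  # already reported as a missing class
        gone = set(methods) - set(new_api[cls])
        if gone:
            missing_methods[cls] = sorted(gone)
    return missing_classes, missing_methods


# Hypothetical snapshots of two releases of the same framework.
old_api = {"Parser": {"parse", "reset"}, "Lexer": {"next_token"}}
new_api = {"Parser": {"parse"}}

classes, methods = missing_members(old_api, new_api)
print(classes)  # ['Lexer']
print(methods)  # {'Parser': ['reset']}
```

An API change rule, in the sense used above, would then pair each missing member with a recommended replacement; this sketch only finds the members that need one.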

    Classification of changes in API evolution

    Applications typically communicate with each other, accessing and exposing data and features through Application Programming Interfaces (APIs). Even though API consumers expect APIs to be stable and well established, APIs are prone to continuous change and pass through different evolutionary phases during their lifecycle. These changes are of different types, are caused by different needs, and affect consumers in different ways. In this paper, we identify and classify the changes that often happen to APIs, and investigate how these changes are reflected in the documentation, release notes, issue tracker and API usage logs. Analyzing each step of a change, from its implementation to its impact on API consumers, helps us obtain a broader picture of API evolution. We therefore review the current state of the art in API evolution and, as a result, define a classification framework that considers both the changes that may occur to APIs and the reasons behind them. In addition, we exemplify the framework using a software platform offering a Web API, called District Health Information System (DHIS2), used collaboratively by several departments of the World Health Organization (WHO). Peer reviewed. Postprint (author's final draft).

    Software Development Analytics in Practice: A Systematic Literature Review

    Context: Software Development Analytics is a research area concerned with providing insights to improve product deliveries and processes. Many types of studies, data sources and mining methods have been used for that purpose. Objective: This systematic literature review aims to provide an aggregate view of the relevant studies on Software Development Analytics in the past decade (2010-2019), with an emphasis on its application in practical settings. Method: We defined and executed a search string on several digital libraries, then applied quality assessment criteria to identify the most relevant papers. From those papers, we extracted a set of characteristics (study type, data source, study perspective, development life-cycle activities covered, stakeholders, mining methods, and analytics scope) and classified their impact against a taxonomy. Results: Source code repositories, experimental case studies, and developers are the most common data sources, study types, and stakeholders, respectively. Product and project managers are also often present, but less than expected. Mining methods are evolving rapidly, which is reflected in the long list identified. Descriptive statistics is the most common method, followed by correlation analysis. Since software development is an important process in every organization, it was unexpected to find that process mining was present in only one study. Most contributions to the software development life cycle were made in the quality dimension. Time management and cost control were only lightly discussed. The analysis of security aspects suggests it is a topic of increasing concern for practitioners. Risk management contributions are scarce. Conclusions: There is a wide margin for improvement for software development analytics in practice. For instance, mining and analyzing the activities performed by software developers in their actual workbench, the IDE.

    Dependency Management 2.0 – A Semantic Web Enabled Approach

    Software development and evolution are highly distributed processes that involve a multitude of supporting tools and resources. Application programming interfaces are commonly used by software developers to reduce development cost and complexity by reusing code developed by third parties or published by the open source community. However, these application programming interfaces have also introduced new challenges to the Software Engineering community (e.g., software vulnerabilities, API incompatibilities, and software license violations) that not only extend beyond the traditional boundaries of individual projects but also involve different software artifacts. As a result, there is a need for a technology-independent representation of software dependency semantics and the ability to seamlessly integrate this representation with knowledge from other software artifacts. The Semantic Web and its supporting technology stack have been widely promoted to model, integrate, and support interoperability among heterogeneous data sources. This dissertation takes advantage of the Semantic Web and its enabling technology stack for knowledge modeling and integration. The thesis makes five major contributions: (1) We present a formal Software Build System Ontology (SBSON), which captures concepts and properties for software build and dependency management systems. This formal knowledge representation allows us to take advantage of Semantic Web inference services, forming the basis for a more flexible API dependency analysis than traditional proprietary analysis approaches. (2) We conduct a user survey of 53 open source developers to gain insights into how developers actually manage API breaking changes. (3) We introduce a novel approach that integrates our SBSON model with knowledge about source code usage and changes within the Maven ecosystem to support API consumers and producers in managing (assessing and minimizing) the impacts of breaking changes. (4) We introduce a Security Vulnerability Analysis Framework (SV-AF), which integrates build system, source code, versioning system, and vulnerability ontologies to trace and assess the impact of security vulnerabilities across project boundaries. (5) Finally, we introduce an Ontological Trustworthiness Assessment Model (OntTAM). OntTAM integrates our build, source code, vulnerability and license ontologies to support a holistic analysis and assessment of quality attributes related to the trustworthiness of libraries and APIs in open source systems. Several case studies illustrate the applicability and flexibility of our modeling approach, demonstrating that it can seamlessly integrate and reuse knowledge extracted from existing build and dependency management systems with other heterogeneous data sources found in the software engineering domain. As part of our case studies, we also demonstrate how this unified knowledge model can enable new types of project dependency analysis.

    Analyzing 2.3 Million Maven Dependencies to Reveal an Essential Core in APIs

    This paper addresses the following question: does a small, essential core set of API members emerge from the actual usage of the API by client applications? To investigate this question, we study the 99 most popular libraries available in Maven Central and the 865,560 client programs that declare dependencies on them, for a total of 2.3M dependencies. Our key findings are as follows: 43.5% of the dependencies declared by the clients are not used in the bytecode; all APIs contain a large share of rarely used types and a few frequently used types, and the ratio varies according to the nature of the API, its size and its design; we can systematically extract from APIs a reuse-core that is sufficient for most clients, and the median size of this subset is 17% of the API, which can serve 83% of the clients. This study is novel both in its scale and in its findings about unused dependencies and the reuse-core of APIs. Our results provide concrete insights to improve Maven's build process with a mechanism to detect unused dependencies. They also support the need to reduce the size of APIs to facilitate API learning and maintenance. Comment: 15 pages, 13 figures, 3 tables, 2 listings
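The two measurements in this abstract, unused declared dependencies and a reuse-core, can be illustrated with a minimal sketch. It is not the paper's method: it assumes usage data has already been extracted into plain dictionaries, and the greedy popularity-based selection in `reuse_core` is a stand-in heuristic chosen for illustration. All function names and sample data are hypothetical.

```python
from collections import Counter


def unused_dependencies(declared, used):
    """For each client, list declared dependencies that never appear in its usage.

    declared: client -> iterable of declared library names
    used:     client -> iterable of library names actually referenced
    """
    return {
        client: sorted(set(deps) - set(used.get(client, ())))
        for client, deps in declared.items()
    }


def reuse_core(usage, coverage=0.8):
    """Greedy heuristic (an assumption, not the paper's algorithm).

    usage: client -> set of API types the client uses.
    Adds types in order of popularity until at least `coverage` of the
    clients use only types inside the core, then returns that core.
    """
    if not usage:
        return set()
    counts = Counter(t for types in usage.values() for t in set(types))
    core = set()
    for api_type, _ in counts.most_common():
        core.add(api_type)
        served = sum(1 for types in usage.values() if set(types) <= core)
        if served / len(usage) >= coverage:
            break
    return core


# Hypothetical data for illustration only.
declared = {"app": ["guava", "junit"]}
used = {"app": ["guava"]}
print(unused_dependencies(declared, used))  # {'app': ['junit']}

usage = {
    "a": {"List"},
    "b": {"List"},
    "c": {"List", "Map"},
    "d": {"List", "Graph"},
}
print(reuse_core(usage, coverage=0.5))  # {'List'}
```

In the sample data, a single type already serves half of the clients, which mirrors the abstract's observation that a small subset of an API can be sufficient for most clients.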

    API Failures in Openstack Cloud Environments

    Stories about service outages in cloud environments have been making the headlines recently. In many cases, the reliability of cloud infrastructure Application Programming Interfaces (APIs) was at fault. Hence, understanding the factors affecting the reliability of these APIs is important for improving the availability of cloud services. In this thesis, we investigate API failures in OpenStack, the most popular open source cloud platform to date. We mine the bugs of 25 modules within the 5 most important OpenStack APIs to understand API failures and their characteristics. Our results show that in OpenStack, one third of all API-related changes are made to fix failures, with 7% of all fixes even changing the API interface, potentially breaking clients. Through a qualitative analysis of 230 sampled API failures and 71 API failures that impacted third-party applications, we observed that the majority of API-related failures are due to small programming faults. We also observed that small programming faults and configuration faults are the most frequent causes of failures that propagate to third-party applications. We conducted a survey of 38 OpenStack and third-party developers, in which participants were asked about the causes of API failures that propagate to third-party applications. These developers reported that small programming faults, configuration faults, low test coverage, infrequent code reviews, and a rapid release frequency are the main reasons behind the appearance and propagation of API failures. We explored the possibility of using code style checkers to detect small programming and configuration faults early on, but found that in the majority of cases these tools cannot localize such faults. Fortunately, the subject, message and stack trace, as well as the reply lag between comments, in the failures' bug reports provide a good indication of the cause of the failure.