82 research outputs found

    An Event-based Analysis Framework for Open Source Software Development Projects

    Get PDF
    The increasing popularity and success of Open Source Software (OSS) development projects has drawn significant attention of academics and open source participants over the last two decades. As one of the key areas in OSS research, assessing and predicting OSS performance is of great value to both OSS communities and organizations who are interested in investing in OSS projects. Most existing research, however, has considered OSS project performance as the outcome of static cross-sectional factors such as number of developers, project activity level, and license choice. While variance studies can identify some predictors of project outcomes, they tend to neglect the actual process of development. Without a closer examination of how events occur, an understanding of OSS projects is incomplete. This dissertation aims to combine both process and variance strategy, to investigate how OSS projects change over time through their development processes; and to explore how these changes affect project performance. I design, instantiate, and evaluate a framework and an artifact, EventMiner, to analyze OSS projects’ evolution through development activities. This framework integrates concepts from various theories such as distributed cognition (DCog) and complexity theory, applying data mining techniques such as decision trees, motif analysis, and hidden Markov modeling to automatically analyze and interpret the trace data of 103 OSS projects from an open source repository. The results support the construction of process theories on OSS development. The study contributes to literature in DCog, design routines, OSS development, and OSS performance. The resulting framework allows OSS researchers who are interested in OSS development processes to share and reuse data and data analysis processes in an open-source manner

    Software Supply Chain Development and Application

    Get PDF
    Motivation: Free Libre Open Source Software (FLOSS) has become a critical componentin numerous devices and applications. Despite its importance, it is not clear why FLOSS ecosystem works so well or if it may cease to function. Majority of existing research is focusedon studying a specific software project or a portion of an ecosystem, but FLOSS has not been investigated in its entirety. Such view is necessary because of the deep and complex technical and social dependencies that go beyond the core of an individual ecosystem and tight inter-dependencies among ecosystems within FLOSS.Aim: We, therefore, aim to discover underlying relations within and across FLOSS projects and developers in open source community, mitigate potential risks induced by the lack of such knowledge and enable systematic analysis over entire open source community through the lens of supply chain (SC).Method: We utilize concepts from an area of supply chains to model risks of FLOSS ecosystem. FLOSS, due to the distributed decision making of software developers, technical dependencies, and copying of the code, has similarities to traditional supply chain. Unlike in traditional supply chain, where data is proprietary and distributed among players, we aim to measure open-source software supply chain (OSSC) by operationalizing supply chain concept in software domain using traces reconstructed from version control data.Results: We create a very large and frequently updated collection of version control data in the entire FLOSS ecosystems named World of Code (WoC), that can completely cross-reference authors, projects, commits, blobs, dependencies, and history of the FLOSS ecosystems, and provide capabilities to efficiently correct, augment, query, and analyze that data. Various researches and applications (e.g., software technology adoption investigation) have been successfully implemented by leveraging the combination of WoC and OSSC.Implications: With a SC perspective in FLOSS development and the increased visibility and transparency in OSSC, our work provides potential opportunities for researchers to conduct wider and deeper studies on OSS over entire FLOSS community, for developers to build more robust software and for students to learn technologies more efficiently and improve programming skills

    Open source software ecosystems quality analysis from data sources

    Get PDF
    Background: Open source software (OSS) and software ecosystems (SECOs) are two consolidated research areas in software engineering. The adoption of OSS by firms, governments, researchers and practitioners has been increasing rapidly in the last decades, and in consequence, they find themselves in a new kind of ecosystem composed by software communities,foundations, developers and partners, namely Open Source Software Ecosystem (OSSECO). In order to perform a systematic quality evaluation of a SECO, it is necessary to define certain types of concrete elements. This means that measures and evaluations should be described (e.g., through thresholds or expert judgment). The quality evaluation of an OSSECO may serve several purposes, for example: adopters of the products of the OSSECO may want to know about the liveliness of the OSSECO (e.g., recent updates); software developers may want to know about the activeness (e.g., how many collaborators are involved and how active they are); and the OSSECO community itself to know about the OSSECO health (e.g., evolving in the right direction). However, the current approaches for evaluating software quality (even those specific for open source software) do not cover all the aspects relevant in an OSSECO from an ecosystem perspective. Goal: The main goal of this PhD thesis is to support the OSSECO quality evaluation by designing a framework that supports the quality evaluation of OSSECOs. Methods: To accomplish this goal, we have used and approach based on design science methodology by Wieringa [1] and the characterization of software engineering proposed by M. Shaw [2], in order to produce a set of artefacts to contribute in thequality evaluation of OSSECOs and to learn about the effects of using these artefacts in practice. Results: We have conducted a systematic mapping to characterize OSSECOs and designed the QuESo framework (a framework to evaluate the OSSECO quality) composed by three artifacts: (i) QuESo-model, a quality model for OSSECOs; (ii) QuESoprocess, a process for conducting OSSECO quality evaluations using the QuESo-model; and (iii) QuESo-tool, a software component to support semi-automatic quality evaluation of OSSECOs. Furthermore, this framework has been validated with a case study on Eclipse. Conclusions: This thesis has contributed to increase the knowledge and understanding of OSSECOs, and to support the qualityevaluation of OSSECOs. [ntecedentes: el software de código abierto (OSS, por sus siglas en inglés) y los ecosistemas de software (SECOs, por sus siglas en inglés) son dos áreas de investigación consolidadas en ingeniería de software. La adopción de OSS por parte de empresas, gobiernos, investigadores y profesionales se ha incrementado rápidamente en las últimas décadas, y, en consecuencia, todos ellos hacen parte de un nuevo tipo de ecosistema formado por comunidades de software, fundaciones, desarrolladores y socios denominado ecosistema de software de código abierto. (OSSECO, por sus siglas en inglés)). Para realizar una evaluación sistemática de la calidad de un SECO, es necesario definir ciertos tipos de elementos concretos. Esto significa que tanto las métricas como las evaluaciones deben ser descritos (por ejemplo, a través de datos históricos o el conocimiento de expertos). La evaluación de la calidad de un OSSECO puede ser de utilidad desde diferentes perspectivas, por ejemplo: los que adoptan los productos del OSSECO pueden querer conocer la vitalidad del OSSECO (por ejemplo, el número de actualizaciones recientes); los desarrolladores de software pueden querer saber sobre la actividad del OSSECO (por ejemplo, cuántos colaboradores están involucrados y qué tan activos son); incluso la propia comunidad del OSSECO para conocer el estado de salud del OSSECO (por ejemplo, si está evolucionando en la dirección correcta). Sin embargo, los enfoques actuales para evaluar la calidad del software (incluso aquellos específicos para el software de código abierto) no cubren todos los aspectos relevantes en un OSSECO desde una perspectiva ecosistémica. Objetivo: El objetivo principal de esta tesis doctoral es apoyar la evaluación de la calidad de OSSECO mediante el diseño de un marco de trabajo que ayude a la evaluación de la calidad de un OSSECO. Métodos: Para lograr este objetivo, hemos utilizado un enfoque basado en la metodología design science propuesta por Wieringa [1]. Adicionalmente, nos hemos basado en la caracterización de la ingeniería de software propuesta por M. Shaw [2], con el fin de construir un conjunto de artefactos que contribuyan en la evaluación de la calidad de un OSSECO y para conocer los efectos del uso de estos artefactos en la práctica. Resultados: Hemos realizado un mapeo sistemático para caracterizar los OSSECOs y hemos diseñado el marco de trabajo denominado QuESo (es un marco de trabajo para evaluar la calidad de los OSSECOs). QuESo a su vez está compuesto por tres artefactos: (i) QuESo-model, un modelo de calidad para OSSECOs; (ii) QuESo-process, un proceso para llevar a cabo las evaluaciones de calidad de OSSECOs utilizando el modelo QuESo; y (iii) QuESo-tool, un conjunto de componentes de software que apoyan la evaluación de calidad de los OSSECOs de manera semiautomática. QuESo ha sido validado con un estudio de caso sobre Eclipse. Conclusiones: esta tesis ha contribuido a aumentar el conocimiento y la comprensión de los OSSECOs, y tambien ha apoyado la evaluación de la calidad de los OSSECOsPostprint (published version

    Open source software ecosystems quality analysis from data sources

    Get PDF
    Background: Open source software (OSS) and software ecosystems (SECOs) are two consolidated research areas in software engineering. The adoption of OSS by firms, governments, researchers and practitioners has been increasing rapidly in the last decades, and in consequence, they find themselves in a new kind of ecosystem composed by software communities,foundations, developers and partners, namely Open Source Software Ecosystem (OSSECO). In order to perform a systematic quality evaluation of a SECO, it is necessary to define certain types of concrete elements. This means that measures and evaluations should be described (e.g., through thresholds or expert judgment). The quality evaluation of an OSSECO may serve several purposes, for example: adopters of the products of the OSSECO may want to know about the liveliness of the OSSECO (e.g., recent updates); software developers may want to know about the activeness (e.g., how many collaborators are involved and how active they are); and the OSSECO community itself to know about the OSSECO health (e.g., evolving in the right direction). However, the current approaches for evaluating software quality (even those specific for open source software) do not cover all the aspects relevant in an OSSECO from an ecosystem perspective. Goal: The main goal of this PhD thesis is to support the OSSECO quality evaluation by designing a framework that supports the quality evaluation of OSSECOs. Methods: To accomplish this goal, we have used and approach based on design science methodology by Wieringa [1] and the characterization of software engineering proposed by M. Shaw [2], in order to produce a set of artefacts to contribute in thequality evaluation of OSSECOs and to learn about the effects of using these artefacts in practice. Results: We have conducted a systematic mapping to characterize OSSECOs and designed the QuESo framework (a framework to evaluate the OSSECO quality) composed by three artifacts: (i) QuESo-model, a quality model for OSSECOs; (ii) QuESoprocess, a process for conducting OSSECO quality evaluations using the QuESo-model; and (iii) QuESo-tool, a software component to support semi-automatic quality evaluation of OSSECOs. Furthermore, this framework has been validated with a case study on Eclipse. Conclusions: This thesis has contributed to increase the knowledge and understanding of OSSECOs, and to support the qualityevaluation of OSSECOs. [ntecedentes: el software de código abierto (OSS, por sus siglas en inglés) y los ecosistemas de software (SECOs, por sus siglas en inglés) son dos áreas de investigación consolidadas en ingeniería de software. La adopción de OSS por parte de empresas, gobiernos, investigadores y profesionales se ha incrementado rápidamente en las últimas décadas, y, en consecuencia, todos ellos hacen parte de un nuevo tipo de ecosistema formado por comunidades de software, fundaciones, desarrolladores y socios denominado ecosistema de software de código abierto. (OSSECO, por sus siglas en inglés)). Para realizar una evaluación sistemática de la calidad de un SECO, es necesario definir ciertos tipos de elementos concretos. Esto significa que tanto las métricas como las evaluaciones deben ser descritos (por ejemplo, a través de datos históricos o el conocimiento de expertos). La evaluación de la calidad de un OSSECO puede ser de utilidad desde diferentes perspectivas, por ejemplo: los que adoptan los productos del OSSECO pueden querer conocer la vitalidad del OSSECO (por ejemplo, el número de actualizaciones recientes); los desarrolladores de software pueden querer saber sobre la actividad del OSSECO (por ejemplo, cuántos colaboradores están involucrados y qué tan activos son); incluso la propia comunidad del OSSECO para conocer el estado de salud del OSSECO (por ejemplo, si está evolucionando en la dirección correcta). Sin embargo, los enfoques actuales para evaluar la calidad del software (incluso aquellos específicos para el software de código abierto) no cubren todos los aspectos relevantes en un OSSECO desde una perspectiva ecosistémica. Objetivo: El objetivo principal de esta tesis doctoral es apoyar la evaluación de la calidad de OSSECO mediante el diseño de un marco de trabajo que ayude a la evaluación de la calidad de un OSSECO. Métodos: Para lograr este objetivo, hemos utilizado un enfoque basado en la metodología design science propuesta por Wieringa [1]. Adicionalmente, nos hemos basado en la caracterización de la ingeniería de software propuesta por M. Shaw [2], con el fin de construir un conjunto de artefactos que contribuyan en la evaluación de la calidad de un OSSECO y para conocer los efectos del uso de estos artefactos en la práctica. Resultados: Hemos realizado un mapeo sistemático para caracterizar los OSSECOs y hemos diseñado el marco de trabajo denominado QuESo (es un marco de trabajo para evaluar la calidad de los OSSECOs). QuESo a su vez está compuesto por tres artefactos: (i) QuESo-model, un modelo de calidad para OSSECOs; (ii) QuESo-process, un proceso para llevar a cabo las evaluaciones de calidad de OSSECOs utilizando el modelo QuESo; y (iii) QuESo-tool, un conjunto de componentes de software que apoyan la evaluación de calidad de los OSSECOs de manera semiautomática. QuESo ha sido validado con un estudio de caso sobre Eclipse. Conclusiones: esta tesis ha contribuido a aumentar el conocimiento y la comprensión de los OSSECOs, y tambien ha apoyado la evaluación de la calidad de los OSSECO

    Using Personality Detection Tools for Software Engineering Research: How Far Can We Go?

    Get PDF
    Assessing the personality of software engineers may help to match individual traits with the characteristics of development activities such as code review and testing, as well as support managers in team composition. However, self-assessment questionnaires are not a practical solution for collecting multiple observations on a large scale. Instead, automatic personality detection, while overcoming these limitations, is based on off-the-shelf solutions trained on non-technical corpora, which might not be readily applicable to technical domains like software engineering. In this paper, we first assess the performance of general-purpose personality detection tools when applied to a technical corpus of developers’ emails retrieved from the public archives of the Apache Software Foundation. We observe a general low accuracy of predictions and an overall disagreement among the tools. Second, we replicate two previous research studies in software engineering by replacing the personality detection tool used to infer developers’ personalities from pull-request discussions and emails. We observe that the original results are not confirmed, i.e., changing the tool used in the original study leads to diverging conclusions. Our results suggest a need for personality detection tools specially targeted for the software engineering domain

    An Introduction to Software Ecosystems

    Full text link
    This chapter defines and presents different kinds of software ecosystems. The focus is on the development, tooling and analytics aspects of software ecosystems, i.e., communities of software developers and the interconnected software components (e.g., projects, libraries, packages, repositories, plug-ins, apps) they are developing and maintaining. The technical and social dependencies between these developers and software components form a socio-technical dependency network, and the dynamics of this network change over time. We classify and provide several examples of such ecosystems. The chapter also introduces and clarifies the relevant terms needed to understand and analyse these ecosystems, as well as the techniques and research methods that can be used to analyse different aspects of these ecosystems.Comment: Preprint of chapter "An Introduction to Software Ecosystems" by Tom Mens and Coen De Roover, published in the book "Software Ecosystems: Tooling and Analytics" (eds. T. Mens, C. De Roover, A. Cleve), 2023, ISBN 978-3-031-36059-6, reproduced with permission of Springer. The final authenticated version of the book and this chapter is available online at: https://doi.org/10.1007/978-3-031-36060-
    corecore