18 research outputs found

    Identification-method research for open-source software ecosystems

    In recent years, open-source software (OSS) development has grown, with many developers around the world working on different OSS projects. A variety of open-source software ecosystems have emerged, for instance, GitHub, StackOverflow, and SourceForge. One of the most typical social-programming and code-hosting sites, GitHub, has amassed numerous open-source-software projects and developers in the same virtual collaboration platform. Since GitHub itself is a large open-source community, it hosts a collection of software projects that are developed together and coevolve. The great challenge here is how to identify the relationship between these projects, i.e., project relevance. Software-ecosystem identification is the basis of other studies in the ecosystem. Therefore, how to extract useful information in GitHub and identify software ecosystems is particularly important, and it is also a research area in symmetry. In this paper, a Topic-based Project Knowledge Metrics Framework (TPKMF) is proposed. By collecting the multisource dataset of an open-source ecosystem, project-relevance analysis of the open-source software is carried out on the basis of software-ecosystem identification. Then, we used our Spectral Clustering algorithm based on Core Project (CP-SC) to identify software-ecosystem projects and further identify software ecosystems. We verified that most software ecosystems usually contain a core software project, and most other projects are associated with it. Furthermore, we analyzed the characteristics of the ecosystem, and we also found that interactive information has greater impact on project relevance. Finally, we summarize the Topic-based Project Knowledge Metrics Framework

    Analysing Big Data Projects Using Github and JavaScript Repositories

    GitHub open source software developers remain in short supply. Successful GitHub projects offer multiple pathways for developers to contribute into their repositories. This study's GitHub JavaScript big data is path modelled to provide understanding of the different significant developer contribution pathways towards raising the project's activity level. Its significant pathways offer the project's creator benchmark decision making capabilities that can be used to trigger faster project software development through to its next completion point. This approach has behavioural consumptive value connotations that may provide a future pathway towards tapping big data sources and to also delivering real business values

    Co-membership, Networks Ties, and OSS Success: An Investigation Controlling for Alternative Mechanisms for Knowledge Flow

    Co-membership has been considered a major mechanism for constructing social networks, but it has met many criticisms over time for failing to control for alternative mechanisms for knowledge flow. Although social networks constructed in online environments can reduce such possibilities, they are not without limitations. One possible mechanism for learning and knowledge flow is direct watching and observation. This study investigates the impact of co-membership, taking into account the alternative mechanism of watching, in the setting of OSS development at GitHub. It finds that both co-membership and watching contribute positively to OSS success, and thus shows the co-existence of both experiential learning and vicarious learning for OSS development. Moreover, it finds the impact of co-membership is much stronger than that of watching. While the impact of co-membership may be biased in prior literature, this study confirms that co-membership is indeed an effective mechanism for constructing online social networks for knowledge flow

    Seguimiento de proyectos de programación. Una aplicación de GitHub en la educación [Tracking programming projects: an application of GitHub in education]

    Abstract: This research develops and proposes a method for carrying out and tracking end-of-course projects in programming-language courses within Electronic Engineering degree programs, supported technologically by GitHub. The objective is to obtain a method that improves the quality of the projects. To validate the new method, two groups of students are compared: the first group uses the traditional method of project execution and the second uses the proposed method. The results show that, although applying the new method demands additional effort from the instructor, it clearly improves the tracking of end-of-course projects: it facilitates collaborative work, allows objective evaluation, makes students' study habits visible, makes the activities inherent in carrying out software projects transparent and automated, and strengthens the tracking and guidance of students. Keywords: Methodological proposal, Project tracking, End-of-course project, Method [MESEPP], version control system [Github], programming languages

    Law Smells - Defining and Detecting Problematic Patterns in Legal Drafting


    Open source software GitHub ecosystem: a SEM approach

    Open source software (OSS) is a collaborative effort. Affordable, high-quality software with a lower probability of errors or failures is not far away. Thousands of open-source projects (termed repos) are alternatives to proprietary software development. More than two-thirds of companies are contributing to open source. Open source technologies like OpenStack, Docker and KVM are being used to build the next generation of digital infrastructure. An iconic example of OSS is 'GitHub' - a successful social site. GitHub is a hosting platform that hosts repositories (repos) based on the Git version control system. GitHub is a knowledge-based workspace. It has several features that facilitate user communication and work integration. Through this thesis I employ data extracted from GitHub, and seek to better understand the OSS ecosystem, and to what extent each of its deployed elements affects the successful development of the OSS ecosystem. In addition, I investigate a repo's growth over different time periods to test the changing behavior of the repo. From our observations, developers do not follow one development methodology when developing and growing their project, and such developers tend to cherry-pick from differing available software methodologies. The GitHub API remains the main OSS location engaged to extract the metadata for this thesis's research. This extraction process is time-consuming - due to restrictive access limitations (even with authentication). I apply Structural Equation Modelling (termed SEM) to investigate the relative path relationships between the GitHub-deployed OSS elements, and I determine the path strength contributions of each element to determine the OSS repo's activity level. SEM is a multivariate statistical analysis technique used to analyze structural relationships. This technique is the combination of factor analysis and multiple regression analysis. It is used to analyze the structural relationship between measured variables and/or latent constructs. This thesis bridges the research gap around longitudinal OSS studies. It engages large sample-size OSS repo metadata sets, data-quality control, and multiple programming language comparisons. Querying GitHub is not direct (nor simple), yet querying for all valid repos remains important - as sometimes illegal, or unrepresentative outlier repos (which may even be quite popular) do arise, and these then need to be removed from each initial OSS's language-specific metadata set. Eight top GitHub programming languages (selected as the most forked repos) are separately engaged in this thesis's research. This thesis observes these eight metadata sets of GitHub repos. Over time, it measures the different repo contributions of the deployed elements of each metadata set. The number of stars provided to the repo delivers a weaker contribution to its software development processes. Sometimes forks work against the repo's progress by generating very minor negative total effects into its commit (activity) level, and by sometimes diluting the focus of the repo's software development strategies. Here, a fork may generate new ideas, create a new repo, and then draw some original repo developers off into this new software development direction, thus retarding the original repo's commit (activity) level progression. Multiple intermittent and minor version releases exert lesser GitHub JavaScript repo commit (or activity) changes because they often involve only slight OSS improvements, and because they only require minimal commit contributions. More commits also bring more changes to documentation, and again the GitHub OSS repo's commit (activity) level rises. There are both direct and indirect drivers of the repo's OSS activity. Pulls and commits are the strongest drivers. This suggests creating higher levels of pull requests is likely a preferred prime target consideration for the repo creator's core team of developers. This study offers a big data direction for future work. It allows for the deployment of more sophisticated statistical comparison techniques. It offers further indications around the internal and broad relationships that likely exist between GitHub's OSS big data. Its data extraction ideas suggest a link through to business/consumer consumption, and possibly how these may be connected using improved repo search algorithms that release individual business value components