1,454 research outputs found

    Gendered behavior as a disadvantage in open source software development

    Women are severely marginalized in software development, especially in open source. In this article we argue that this disadvantage is due more to gendered behavior than to categorical discrimination: women are at a disadvantage because of what they do, rather than because of who they are. Using data on entire careers of users from GitHub.com, we develop a measure to capture the gendered pattern of behavior: we use a random forest prediction of being female (as opposed to being male) by behavioral choices in the level of activity, specialization in programming languages, and choice of partners. We test differences in success and survival along both categorical gender and the gendered pattern of behavior. We find that 84.5% of women's disadvantage (compared to men) in success and 34.8% of their disadvantage in survival are due to the female pattern of their behavior. Men are also disadvantaged along their interquartile range of the female pattern of their behavior, and users who don't reveal their gender suffer an even more drastic disadvantage in survival probability. Moreover, we do not see evidence for any reduction of these inequalities over time. Our findings are robust to noise in gender recognition, and to taking into account particular programming languages, or decision tree classes of gendered behavior. Our results suggest that fighting categorical gender discrimination will have a limited impact on gender inequalities in open source software development, and that gender hiding is not a viable strategy for women.

    Nip it in the Bud: Moderation Strategies in Open Source Software Projects and the Role of Bots

    Much of our modern digital infrastructure relies critically upon open source software. The communities responsible for building this cyberinfrastructure require maintenance and moderation, which is often supported by volunteer efforts. Moderation, as a non-technical form of labor, is a necessary but often overlooked task that maintainers undertake to sustain the community around an OSS project. This study examines the various structures and norms that support community moderation, describes the strategies moderators use to mitigate conflicts, and assesses how bots can play a role in assisting these processes. We interviewed 14 practitioners to uncover existing moderation practices and ways that automation can provide assistance. Our main contributions include a characterization of moderated content in OSS projects, moderation techniques, as well as perceptions of and recommendations for improving the automation of moderation tasks. We hope that these findings will inform the implementation of more effective moderation practices in open source communities.

    Looking before leaping: Creating a software registry

    What lessons can be learned from examining numerous efforts to create a repository or directory of scientist-written software for a discipline? Astronomy has seen a number of efforts to build such a resource, one of which is the Astrophysics Source Code Library (ASCL). The ASCL (ascl.net) was founded in 1999, had a period of dormancy, and was restarted in 2010. When taking over responsibility for the ASCL in 2010, the new editor sought to answer the opening question, hoping this would better inform the work to be done. We also provide specific steps the ASCL is taking to try to improve code sharing and discovery in astronomy and share recent improvements to the resource. Comment: 11 pages; submission for WSSSPE2. Revised after review for publication in the Journal of Open Research Software.

    How diverse is your team? Investigating gender and nationality diversity in GitHub teams

    Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. Background: Building an effective team of developers is a complex task faced by both software companies and open source communities. The problem of forming a “dream” team involves many variables, including consideration of human factors, and it is not a dilemma solvable in a mathematical way. Empirical studies might provide interesting insights to explain which factors need to be taken into account in building a team of developers and which levers act to optimise productivity among developers. Aim: In this paper, we present the results of an empirical study aimed at investigating the link between team diversity (i.e., gender, nationality) and productivity (issue fixing time). Method: We consider issues solved from the GHTorrent dataset, inferring gender and nationality of each team’s members. We also evaluate the politeness of all comments involved in issue resolution. Results: Results show that higher gender diversity is linked with a lower team average issue fixing time (higher productivity), that nationality diversity is linked with lower team politeness, and that gender diversity is linked with higher sentiment. Peer reviewed. Final published version.

    The Impact of Corporate Engagement in Open-Source Enterprise Systems Community on Release Performance

    With the rise of corporate-sponsored open-source software (OSS) projects in the software industry, open-source enterprise systems (OS-ES) have become essential alternatives for small businesses to adopt and use advanced business software packages. With a longitudinal study of a mature, collectively developed open source software project, we examine how corporate-communal engagement affects OS-ES performance through the theoretical perspective of group faultlines. Further, we propose that various release types can moderate the relationship between corporate-communal engagement and OS-ES release performance. Using ordinary least squares (OLS) regression with a final data set consisting of 124 data points (i.e., release periods), we find that the relationship between corporate-communal engagement and OS-ES release performance is best characterized as a curvilinear (U-shaped) relationship. That is, the evenness of corporate-communal engagement results in reduced OS-ES release performance, and the unevenness of corporate-communal engagement can increase OS-ES release performance in the forms of improved quality and innovativeness. Moreover, this curvilinear relationship is likely to be weaker in consolidating releases than in expanding releases. We find that our propositions are supported by the data. This dissertation provides various theoretical and practical contributions. Theoretically, we advance a theoretical framework to understand the effects and outcomes of corporate-communal engagement and release type contingencies by applying group faultlines theory to explain our research model. Further, we propose an alternative perspective on understanding software releases by distinguishing OS-ES releases into consolidating and expanding releases. Practically, this study provides suggestions and insights for corporate managers, open-source leaders, and small businesses to better engage in OS-ES development and adopt proper OS-ES products.
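A U-shaped relationship of the kind this abstract reports is commonly tested by adding a squared term to an OLS model. The following is a minimal sketch of that general technique, not the dissertation's actual model: the single predictor `corporate_share`, the toy data, and both function names are assumptions made for illustration. A positive coefficient on the squared term indicates a U shape, with its minimum at -b1 / (2 * b2).

```python
from typing import List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix so row operations update both sides.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_quadratic_ols(
    xs: Sequence[float], ys: Sequence[float]
) -> Tuple[float, float, float]:
    """OLS fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    design = [[1.0, x, x * x] for x in xs]
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(3)]
    b0, b1, b2 = solve_linear_system(xtx, xty)
    return b0, b1, b2


# Hypothetical data: corporate share of engagement (0..1) per release period
# and a made-up performance score that is lowest when engagement is even (0.5).
corporate_share = [0.0, 0.1, 0.25, 0.4, 0.5, 0.6, 0.75, 0.9, 1.0]
performance = [2.0 - 4.0 * x + 4.0 * x * x for x in corporate_share]

b0, b1, b2 = fit_quadratic_ols(corporate_share, performance)
turning_point = -b1 / (2.0 * b2)  # predictor value where performance is lowest
is_u_shaped = b2 > 0              # positive squared term means a U shape
```

In practice a statistics package would also report standard errors for the squared term; the sketch only shows how the sign of `b2` and the turning point are read off the fit.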

    From Static and Dynamic Websites to Static Site Generators

    Dynamic content management systems like WordPress are used on almost half of the world's active websites. As many of these sites are content-driven, like blogs, news sites, and personal, company and organisation websites, rendering them dynamically does not offer any value compared to serving them as static pages. Paradigmatic differences between static and dynamic websites are analysed and the benefits of each described. It is found that for static-by-nature websites, the static approach has core benefits such as security and end-user performance, whereas the benefits of dynamic platforms come mainly from their more mature toolset. The features and usability of three popular static site generators (Jekyll, Hexo and Hugo) are analysed. It is found that Jekyll and Hugo are more suitable for universal websites, whereas Hexo is oriented towards blogging. Hugo should be preferred for a large website, as its site generation speed is significantly faster than Jekyll's. However, extending Hugo is more complicated. Additional tools in the growing static website ecosystem are pointed out. Some ways of combining these to create a complete toolset are given, and ideas for future development are proposed.

    Recovery of Software Architecture from Code Repositories

    The goal of this work is to create an approach and tool that will a) extract architecturally significant information from code repositories, namely from resources such as Dockerfiles and Terraform configurations; b) use the extracted information to synthesise architectural models that will be kept in sync with the code repositories automatically; c) support the mechanisms that will allow a team to supply any additional details to the architectural model that can't be inferred directly from the repositories. This approach is expected to reduce information redundancy between code and textual documentation, and still allow an integrated and machine-readable view of the overall software architecture of a system. Architecture can be the result of multiple intangibly connected parts spread across source code and other development artifacts. This makes it difficult to describe the architecture without resorting to auxiliary documentation that puts this information together. Most of the time, this documentation is created manually, rendering it a costly process; over time it starts to be disregarded, and the documentation becomes out of date and sometimes obsolete. Automating the recovery of the architecture using artifacts that are already present in the source code could potentially improve the way documentation is updated and used.
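Step a) of this abstract, extracting architecture-relevant facts from repository resources, can be illustrated with a small Dockerfile scanner. This is a minimal sketch under stated assumptions, not the tool the authors built: the `ContainerFacts` model and the `extract_container_facts` function are hypothetical names, only the `FROM` and `EXPOSE` instructions are read, and line continuations are not handled.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContainerFacts:
    """Architecture-relevant facts read from one Dockerfile (hypothetical model)."""
    base_images: List[str] = field(default_factory=list)
    exposed_ports: List[str] = field(default_factory=list)


# Match one of the two instructions of interest at the start of a line.
# Dockerfile instruction keywords are case-insensitive.
_INSTRUCTION = re.compile(r"^\s*(FROM|EXPOSE)\s+(.*)$", re.IGNORECASE)


def extract_container_facts(dockerfile_text: str) -> ContainerFacts:
    """Collect base images and exposed ports from Dockerfile text."""
    facts = ContainerFacts()
    for line in dockerfile_text.splitlines():
        match = _INSTRUCTION.match(line)
        if not match:
            continue  # comments, blank lines, and all other instructions
        keyword = match.group(1).upper()
        args = match.group(2).split()
        if keyword == "FROM":
            # Skip option flags such as --platform=...; the image follows them.
            image = next((a for a in args if not a.startswith("--")), None)
            if image:
                facts.base_images.append(image)
        else:  # EXPOSE may list several ports on one line
            facts.exposed_ports.extend(args)
    return facts


# Made-up multi-stage Dockerfile used only to exercise the sketch.
example = """\
# build stage
FROM golang:1.22 AS build
RUN go build -o /app ./...
FROM alpine:3.20
EXPOSE 8080 9090/tcp
"""
facts = extract_container_facts(example)
```

Facts gathered this way could feed the model-synthesis step b); details that cannot be inferred from the files, as in step c), would still have to be supplied by the team.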

    Networks and trust: systems for understanding and supporting internet security

    Includes bibliographical references. 2022 Fall. This dissertation takes a systems-level view of the multitude of existing trust management systems to make sense of when, where and how (or, in some cases, if) each is best utilized. Trust is a belief by one person that by transacting with another person (or organization) within a specific context, a positive outcome will result. Trust serves as a heuristic that enables us to simplify the dozens of decisions we make each day about whom we will transact with. In today's hyperconnected world, in which for many people the bulk of their daily transactions related to business, entertainment, news, and even critical services like healthcare take place online, we tend to rely even more on heuristics like trust to help us simplify complex decisions. Thus, trust plays a critical role in online transactions. For this reason, over the past several decades researchers have developed a plethora of trust metrics and trust management systems for use in online systems. These systems have been most frequently applied to improve recommender systems and reputation systems. They have been designed for and applied to varied online systems including peer-to-peer (P2P) filesharing networks, e-commerce platforms, online social networks, messaging and communication networks, sensor networks, distributed computing networks, and others. However, comparatively little research has examined the effects on individuals, organizations or society of the presence or absence of trust in online sociotechnical systems. Using these existing trust metrics and trust management systems, we design a set of experiments to benchmark the performance of these existing systems, which rely heavily on network analysis methods. Drawing on the experiments' results, we propose a heuristic decision-making framework for selecting a trust management system for use in online systems.
In this dissertation we also investigate several related but distinct aspects of trust in online sociotechnical systems. Using network/graph analysis methods, we examine how trust (or lack of trust) affects the performance of online networks in terms of security and quality of service. We explore the structure and behavior of online networks including Twitter, GitHub, and Reddit through the lens of trust. We find that higher levels of trust within a network are associated with more spread of misinformation (a form of cybersecurity threat, according to the US CISA) on Twitter. We also find that higher levels of trust in open source developer networks on GitHub are associated with more frequent incidences of cybersecurity vulnerabilities. Using our experimental and empirical findings previously described, we apply the Systems Engineering Process to design and prototype a trust management tool for use on Reddit, which we dub Coni the Trust Moderating Bot. Coni is, to the best of our knowledge, the first trust management tool designed specifically for use on the Reddit platform. Through our work with Coni, we develop and present a blueprint for constructing a Reddit trust tool which not only measures trust levels, but can use these trust levels to take actions on Reddit to improve the quality of submissions within the community (a subreddit)