Gendered behavior as a disadvantage in open source software development
Women are severely marginalized in software development, especially in open
source. In this article we argue that disadvantage is more due to gendered
behavior than to categorical discrimination: women are at a disadvantage
because of what they do, rather than because of who they are. Using data on
entire careers of users from GitHub.com, we develop a measure to capture the
gendered pattern of behavior: We use a random forest prediction of being female
(as opposed to being male) by behavioral choices in the level of activity,
specialization in programming languages, and choice of partners. We test
differences in success and survival along both categorical gender and the
gendered pattern of behavior. We find that 84.5% of women's disadvantage
(compared to men) in success and 34.8% of their disadvantage in survival are
due to the female pattern of their behavior. Men are also disadvantaged along
their interquartile range of the female pattern of their behavior, and users
who don't reveal their gender suffer an even more drastic disadvantage in
survival probability. Moreover, we do not see evidence for any reduction of
these inequalities in time. Our findings are robust to noise in gender
recognition, and to taking into account particular programming languages, or
decision tree classes of gendered behavior. Our results suggest that fighting
categorical gender discrimination will have a limited impact on gender
inequalities in open source software development, and that gender hiding is not
a viable strategy for women.
Nip it in the Bud: Moderation Strategies in Open Source Software Projects and the Role of Bots
Much of our modern digital infrastructure relies critically upon open source
software. The communities responsible for building this cyberinfrastructure
require maintenance and moderation, which is often supported by volunteer
efforts. Moderation, as a non-technical form of labor, is a necessary but often
overlooked task that maintainers undertake to sustain the community around an
OSS project. This study examines the various structures and norms that support
community moderation, describes the strategies moderators use to mitigate
conflicts, and assesses how bots can play a role in assisting these processes.
We interviewed 14 practitioners to uncover existing moderation practices and
ways that automation can provide assistance. Our main contributions include a
characterization of moderated content in OSS projects, moderation techniques,
as well as perceptions of and recommendations for improving the automation of
moderation tasks. We hope that these findings will inform the implementation of
more effective moderation practices in open source communities.
Looking before leaping: Creating a software registry
What lessons can be learned from examining numerous efforts to create a
repository or directory of scientist-written software for a discipline?
Astronomy has seen a number of efforts to build such a resource, one of which
is the Astrophysics Source Code Library (ASCL). The ASCL (ascl.net) was founded
in 1999, had a period of dormancy, and was restarted in 2010. When taking over
responsibility for the ASCL in 2010, the new editor sought to answer the
opening question, hoping this would better inform the work to be done. We also
provide specific steps the ASCL is taking to try to improve code sharing and
discovery in astronomy and share recent improvements to the resource.
Comment: 11 pages; submission for WSSSPE2. Revised after review for
publication in the Journal of Open Research Software.
How diverse is your team? Investigating gender and nationality diversity in GitHub teams
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Background: Building an effective team of developers is a complex task faced by both software companies and open source communities. The problem of forming a "dream" team involves many variables, including consideration of human factors, and it is not a dilemma solvable in a mathematical way. Empirical studies might provide interesting insights to explain which factors need to be taken into account in building a team of developers and which levers act to optimise productivity among developers.
Aim: In this paper, we present the results of an empirical study aimed at investigating the link between team diversity (i.e., gender, nationality) and productivity (issue fixing time).
Method: We consider issues solved from the GHTorrent dataset, inferring the gender and nationality of each team's members. We also evaluate the politeness of all comments involved in issue resolution.
Results: Results show that higher gender diversity is linked with a lower team average issue fixing time (higher productivity), that nationality diversity is linked with lower team politeness, and that gender diversity is linked with higher sentiment.
Peer reviewed. Final published version.
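The abstract above does not say how team diversity was quantified. As a minimal sketch only, one widely used measure for categorical attributes such as gender or nationality is the Blau index, 1 minus the sum of squared category shares. The function name `blau_index` and the sample labels are illustrative assumptions, not the authors' method.

```python
from collections import Counter


def blau_index(labels):
    """Blau index of a list of category labels: 1 - sum(p_i ** 2).

    Returns 0.0 for a homogeneous team and approaches 1 - 1/k when
    members are spread evenly over k categories. This is an assumed
    diversity measure; the abstract does not name the one it used.
    """
    if not labels:
        raise ValueError("labels must not be empty")
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical teams, for illustration only.
mixed_team = ["female", "male"]
uniform_team = ["male", "male", "male"]
print(blau_index(mixed_team), blau_index(uniform_team))
```

A per-team score like this could then be compared against issue fixing time, which is the kind of link the study reports.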
The Impact of Corporate Engagement in Open-Source Enterprise Systems Community on Release Performance
With the rise of corporate-sponsored open-source software (OSS) projects in the software industry, open-source enterprise systems (OS-ES) have become essential alternatives for small businesses that want to adopt and use advanced business software packages. With a longitudinal study of a mature, collectively developed open-source software project, we examine how corporate-communal engagement affects OS-ES performance through the theoretical perspective of group faultlines. Further, we propose that release type can moderate the relationship between corporate-communal engagement and OS-ES release performance. Using ordinary least squares (OLS) regression with a final data set consisting of 124 data points (i.e., release periods), we find that the relationship between corporate-communal engagement and OS-ES release performance is best characterized as curvilinear (U-shaped). That is, evenness of corporate-communal engagement results in reduced OS-ES release performance, and unevenness of corporate-communal engagement can increase OS-ES release performance in the form of improved quality and innovativeness. Moreover, this curvilinear relationship is likely to be weaker in consolidating releases than in expanding releases. We find that our propositions are supported by the data. This dissertation provides various theoretical and practical contributions. Theoretically, we advance a framework for understanding the effects and outcomes of corporate-communal engagement and release type contingencies by applying group faultlines theory to explain our research model. Further, we propose an alternative perspective on understanding software releases by distinguishing OS-ES releases into consolidating and expanding releases. Practically, this study provides suggestions and insights for corporate managers, open-source leaders, and small businesses to better engage in OS-ES development and adopt proper OS-ES products.
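A U-shaped relationship of the kind described above is commonly checked by adding a squared term to an OLS model and looking at the sign of its coefficient. The sketch below fits y = b0 + b1*x + b2*x^2 by solving the normal equations in plain Python. The abstract does not give the dissertation's model specification, so the function name `fit_quadratic`, the variables, and the sample data are all illustrative assumptions.

```python
def fit_quadratic(xs, ys):
    """OLS fit of y = b0 + b1*x + b2*x**2; returns [b0, b1, b2].

    Solves the 3x3 normal equations (X^T X) b = X^T y by Gaussian
    elimination with partial pivoting. Needs at least three distinct
    x values, otherwise the system is singular.
    """
    rows = [[1.0, x, x * x] for x in xs]
    # Augmented matrix [X^T X | X^T y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(3)
    ]
    for col in range(3):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    b = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(a[i][j] * b[j] for j in range(i + 1, 3))
        b[i] = (a[i][3] - tail) / a[i][i]
    return b


# Hypothetical data: x stands for engagement evenness, y for performance.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 - 3.0 * x + x * x for x in xs]
b0, b1, b2 = fit_quadratic(xs, ys)
turning_point = -b1 / (2.0 * b2)  # x value where the fitted curve bottoms out
print(b2 > 0, turning_point)
```

A positive `b2` with a turning point inside the observed range of x is the usual informal reading of a U shape; a full analysis would also test the coefficient's significance and, for the moderation claim, interact the terms with release type.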
From Static and Dynamic Websites to Static Site Generators
Dynamic content management systems such as WordPress are used on almost half of the world's active websites. Many of these sites are content-driven (blogs, news sites, and personal, company and organisation websites), so rendering them dynamically on every request adds no value compared with serving pre-generated static pages.
Paradigmatic differences between static and dynamic websites are analysed and the benefits of each described. It is found that for static-by-nature websites, the static approach has core benefits such as security and end-user performance, whereas the benefits of dynamic platforms come mainly from their more mature toolset.
The features and usability of three popular static site generators, Jekyll, Hexo and Hugo, are analysed. It is found that Jekyll and Hugo are more suitable for universal websites, whereas Hexo is oriented towards blogging. Hugo should be preferred for a large website, as its site generation speed is significantly faster than Jekyll's. However, extending Hugo is more complicated.
Additional tools in the growing static website ecosystem are pointed out, including hosting options and graphical user interfaces. Some ways of combining these to create a complete toolset are given, and ideas for future development are proposed.
Recovery of Software Architecture from Code Repositories
The goal of this work is to create an approach and tool that will a) extract architecturally significant information from code repositories, namely from resources such as Dockerfiles and Terraform configurations; b) use the extracted information to synthesise architectural models that are automatically kept in sync with the code repositories; c) support mechanisms that allow a team to supply any additional details to the architectural model that can't be inferred directly from the repositories.
This approach is expected to reduce information redundancy between code and textual documentation while still allowing an integrated and machine-readable view of the overall software architecture of a system.
Architecture can be the result of multiple intangibly connected parts spread across source code and other development artifacts. This makes it difficult to describe the architecture without resorting to auxiliary documentation that puts this information together. Most of the time, this documentation is created manually, rendering it a costly process that over time starts to be disregarded, so the documentation becomes out of date and sometimes obsolete.
Automating the recovery of the architecture using artifacts that are already present in the source code could potentially improve the way documentation is updated and used.
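Step a) above, extracting architecture-relevant facts from a repository, can be illustrated with a minimal sketch that reads a Dockerfile and collects its base images and exposed ports. This is not the tool described in the abstract: the names `ContainerInfo` and `extract_container_info` are hypothetical, and the parser deliberately ignores line continuations, build arguments, and every instruction other than `FROM` and `EXPOSE`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContainerInfo:
    """Architecture-relevant facts recovered from one Dockerfile."""
    base_images: List[str] = field(default_factory=list)
    exposed_ports: List[str] = field(default_factory=list)


def extract_container_info(dockerfile_text: str) -> ContainerInfo:
    """Collect FROM images and EXPOSE ports from Dockerfile text.

    Simplified on purpose: one instruction per line, no handling of
    backslash continuations or variable substitution.
    """
    info = ContainerInfo()
    for raw_line in dockerfile_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        instruction, args = parts[0].upper(), parts[1:]
        if instruction == "FROM":
            # The image is the first argument that is not a flag
            # such as --platform=linux/amd64.
            image = next((a for a in args if not a.startswith("--")), None)
            if image is not None:
                info.base_images.append(image)
        elif instruction == "EXPOSE":
            # Drop an optional protocol suffix, e.g. "8080/tcp".
            info.exposed_ports.extend(a.split("/")[0] for a in args)
    return info


# Hypothetical Dockerfile content, for illustration only.
sample = """\
# build stage
FROM --platform=linux/amd64 python:3.12-slim AS build
EXPOSE 8080/tcp 9090
"""
print(extract_container_info(sample))
```

Records like these could feed step b): each Dockerfile becomes a candidate component in the model, with its ports hinting at connections to other components.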
Networks and trust: systems for understanding and supporting internet security
Includes bibliographical references. 2022 Fall.
This dissertation takes a systems-level view of the multitude of existing trust management systems to make sense of when, where and how (or, in some cases, if) each is best utilized. Trust is a belief by one person that by transacting with another person (or organization) within a specific context, a positive outcome will result. Trust serves as a heuristic that enables us to simplify the dozens of decisions we make each day about whom we will transact with. In today's hyperconnected world, in which for many people the bulk of their daily transactions related to business, entertainment, news, and even critical services like healthcare take place online, we tend to rely even more on heuristics like trust to help us simplify complex decisions. Thus, trust plays a critical role in online transactions. For this reason, over the past several decades researchers have developed a plethora of trust metrics and trust management systems for use in online systems. These systems have been most frequently applied to improve recommender systems and reputation systems. They have been designed for and applied to varied online systems including peer-to-peer (P2P) filesharing networks, e-commerce platforms, online social networks, messaging and communication networks, sensor networks, distributed computing networks, and others. However, comparatively little research has examined the effects on individuals, organizations or society of the presence or absence of trust in online sociotechnical systems. Using these existing trust metrics and trust management systems, which rely heavily on network analysis methods, we design a set of experiments to benchmark their performance. Drawing on the experiments' results, we propose a heuristic decision-making framework for selecting a trust management system for use in online systems.
In this dissertation we also investigate several related but distinct aspects of trust in online sociotechnical systems. Using network/graph analysis methods, we examine how trust (or lack of trust) affects the performance of online networks in terms of security and quality of service. We explore the structure and behavior of online networks including Twitter, GitHub, and Reddit through the lens of trust. We find that higher levels of trust within a network are associated with more spread of misinformation (a form of cybersecurity threat, according to the US CISA) on Twitter. We also find that higher levels of trust in open source developer networks on GitHub are associated with more frequent incidences of cybersecurity vulnerabilities. Using these experimental and empirical findings, we apply the Systems Engineering Process to design and prototype a trust management tool for use on Reddit, which we dub Coni the Trust Moderating Bot. Coni is, to the best of our knowledge, the first trust management tool designed specifically for use on the Reddit platform. Through our work with Coni, we develop and present a blueprint for constructing a Reddit trust tool which not only measures trust levels, but can use these trust levels to take actions on Reddit to improve the quality of submissions within the community (a subreddit).