12 research outputs found

    git2net - Mining Time-Stamped Co-Editing Networks from Large git Repositories

    Data from software repositories have become an important foundation for the empirical study of software engineering processes. A recurring theme in the repository mining literature is the inference of developer networks capturing, e.g., collaboration, coordination, or communication from the commit history of projects. Most of the studied networks are based on the co-authorship of software artefacts defined at the level of files, modules, or packages. While this approach has led to insights into the social aspects of software development, it neglects detailed information on code changes and code ownership that is contained in the commit log of software projects, e.g. which exact lines of code have been authored by which developers. Addressing this issue, we introduce git2net, a scalable Python software that facilitates the extraction of fine-grained co-editing networks in large git repositories. It uses text mining techniques to analyse the detailed history of textual modifications within files. This information allows us to construct directed, weighted, and time-stamped networks, where a link signifies that one developer has edited a block of source code originally written by another developer. Our tool is applied in case studies of an Open Source and a commercial software project. We argue that it opens up a massive new source of high-resolution data on human collaboration patterns. Comment: MSR 2019, 12 pages, 10 figures

    Community-Based Production of Open Source Software: What Do We Know About the Developers Who Participate?

    This paper seeks to close an empirical gap regarding the motivations, personal attributes and behavioral patterns among free/libre and open source (FLOSS) developers, especially those involved in community-based production, and considers the bearing of its findings on the existing literature and on future directions for research. Respondents to an extensive web survey (FLOSS-US 2003) are classified, on the basis of their answers to questions about their reasons for working on FLOSS, according to their distinct “motivational profiles” by hierarchical cluster analysis. Over half of them are also matched to projects of known membership sizes, revealing that although some members from each of the clusters are present in the small, medium and large ranges of the distribution of project sizes, the mixing fractions for the large and the very small project ranges are statistically different. Among developers who changed projects, there is a discernible flow from the bottom of the size distribution, the very small projects, toward the large projects, some of which is motivated by individuals seeking to improve their programming skills. It is found that the profile of early motivation, along with other individual attributes, significantly affects individual developers’ selections of projects from different regions of the size range. Keywords: open source software, FLOSS project, community-based peer production, population heterogeneity, micro-motives, motivational profiles, web-cast surveys, hierarchical cluster analysis

    The Impact of Anonymous Peripheral Contributions on Open Source Software Development

    Online peer production communities such as open source software (OSS) projects attract both identified and anonymous peripheral contributions (e.g., defect reports, feature requests, or forum posts). While identified peripheral contributions (IPC) can be attributed to specific individuals and OSS projects need them to succeed, anonymous peripheral contributions (APC) cannot be traced back to their authors, and they can have both positive and negative ramifications for project development. Open platforms and managers face a challenging design choice in deciding whether to allow APC and for which tasks or what type of projects. We examine the impact that the ratio between APC and IPC has on OSS project performance. Our results suggest that OSS projects perform best when they contain a uniform anonymity level (i.e., they contain predominantly APC or predominantly IPC). However, our results also suggest that OSS projects have lower performance when the ratio between APC and IPC nears one (i.e., they contain close to the same number of APC and IPC). Furthermore, our results suggest that these effects differ depending on the type of application that a project develops. Our study contributes to the ongoing debate about the implications of anonymity for online communities and informs managers about the effect that anonymous contributions have on their projects.

    Identifying Coordination Problems in Software Development: Finding Mismatches between Software and Project Team Structures

    Today's dynamic and iterative development environment brings significant challenges for software project management. In distributed project settings, "management by walking around" is no longer an option and project managers may miss out on key project insights. The TESNA (TEchnical Social Network Analysis) method and tool aim to give project managers a means of gaining insights and taking corrective action. TESNA achieves this by analysing a project's evolving social and technical network structures using data from multiple sources, including CVS, email and chat repositories. Using pattern theory, TESNA helps to identify areas where the current state of the project's social and technical networks conflicts with what patterns suggest. We refer to such a conflict as a Socio-Technical Structure Clash (STSC). In this paper we report on our experience of using TESNA to identify STSCs in a corporate environment through the mining of software repositories. We find multiple instances of three STSCs (Conway's Law, Code Ownership and Project Coordination) in many of the on-going development projects, thereby validating the method and tool that we have developed.

    Validity Issues in the Use of Social Network Analysis with Digital Trace Data

    There is an exciting natural match between social network analysis methods and the growth of data sources produced by social interactions via information technologies, from online communities to corporate information systems. Information Systems researchers have not been slow to embrace this combination of method and data. Such systems increasingly provide “digital trace data” that open up new research opportunities. Yet digital trace data are substantively different from the survey and interview data for which network analysis measures and interpretations were originally developed. This paper examines 10 validity issues associated with the combination of digital trace data and social network analysis methods, with examples from the IS literature, to provide recommendations for improving the validity of future research.

    Three essays on problem-solving in collaborative open productions

    The term “open production” is frequently used to describe production systems that rely on volunteer participants who are willing to participate, produce, and bear private costs in order to provide a public good. Examples of open production are becoming increasingly common in many industries. What makes these productions possible? How may they be sustained in a world of organizations in which the evolutionary products of economic selection are elaborate hierarchical forms of organization? One way to address these questions is to look at how open productions solve problems that are common to all production organizations, such as problems in the division of labor, allocation of tasks, collaboration, coordination, and maintaining balance between inducement and contributions. Under the conditions of extreme decentralization that are the defining feature of open productions, this approach implies a detailed observation of individual problem-solving practices. This is the approach I develop in my dissertation. Unlike much of the prior literature on open productions, I deemphasize motivational elements, status-seeking motives, and allocation of property rights issues. I focus instead on actual work practices as revealed by the day-by-day problem-solving activities that qualify open production projects as production organizations despite the absence of formal contractual arrangements to regulate principal-agent relations. What my work adds to the extensive, informative, and well-developed discipline-based explanations that are currently available is a focus on the emergence of micro-organizational mechanisms through which problem assignment (Chapter 2), problem resolution (Chapter 3), and sustained participation (Chapter 4) are obtained in open productions.
In my essays, I draw from organizational sociology and the behavioral theory of the firm to specify models that relate individual problem-solving activities to structured patterns of action through emergent work practices. In the models that I specify and test, I emphasize processes of attention allocation (Chapter 2), repeated collaboration and group diversity (Chapter 3) and identity construction (Chapter 4) as central to our understanding of the dynamics of problem-solving in organizations. One element of novelty in my study is that my research design makes these work practices directly observable at a level of detail, completeness, and precision that was inaccessible in the past. To illustrate the empirical value of the view that I develop, I examine problem-solving activities (i.e., bug fixing and code production) within two Free/Open Source Software (F/OSS) projects during their entire life span. Readers of my work will know more about how organizational micro-mechanisms emerge in open productions.

    narratives@war

    Over the course of the last three decades the number of volunteers in civic projects has steadily increased. While organization and participation in groups dealing with social issues are by no means new, the soaring numbers of volunteers from all walks of life and the elevated levels of political self-awareness are a more recent phenomenon. Academia came to term these groups New Social Movements. Despite a plethora of research being done on these groups, comparatively little thought is given to their political philosophies in a comparative manner. This thesis sets out to draw a first sketch of a previously uncharted area: the political ideologies New Social Movements are founded upon. French philosopher Jean-François Lyotard proclaimed the end of the Grand Stories that had framed every political discourse since Antiquity. According to his analysis, these meta-narratives were replaced by a multitude of little stories or narratives. This assertion serves as a base camp from which the journey to the ideologies of the New Social Movements is conducted. As proxies for all movements, two remarkably different ones were analyzed: the feminist movement and free software communities. These were chosen deliberately to cover diverse fields and thus provide results that are generalizable to a greater population of movements.
By means of critical analysis of scholarly literature and experience reports of activists, three main narratives are identified that are at work in both movements. A libertarian, an anarchist and a communitarian narrative shape not only the intra-movement discourse but also the way people who subscribe to any one of them organize. To systematically describe the workings of the narratives, central aspects of movement organizing are identified and examined for traces of the narratives. There are surprising similarities between both movements and across narratives, showing the anarchist narrative to be most similar across the movements. Most differences occur in the libertarian narrative, due to its weak integrating force.