
    Sensemaking Practices in the Everyday Work of AI/ML Software Engineering

    This paper considers sensemaking as it relates to everyday software engineering (SE) work practices and draws on a multi-year ethnographic study of SE projects at a large, global technology company building digital services infused with artificial intelligence (AI) and machine learning (ML) capabilities. Our findings highlight the breadth of sensemaking practices in AI/ML projects, noting developers' efforts to make sense of AI/ML environments (e.g., algorithms/methods and libraries), of AI/ML model ecosystems (e.g., pre-trained models and "upstream" models), and of business-AI relations (e.g., how the AI/ML service relates to the domain context and business problem at hand). This paper builds on recent scholarship drawing attention to the integral role of sensemaking in everyday SE practices by empirically investigating how and in what ways AI/ML projects present software teams with emergent sensemaking requirements and opportunities.

    Coopetition in an open-source way : lessons from mobile and cloud computing infrastructures

    An increasing amount of technology is no longer developed in-house. Instead, we are in a new age where technology is developed by a networked community of individuals and organizations, who base their relations to each other on mutual interest. Advances arising from research in platforms, ecosystems, and infrastructures can provide valuable knowledge for better understanding and explaining technology development among a network of firms. More surprisingly, recent research suggests that technology can be jointly developed by rival competing firms in an open-source way. For instance, it is known that the mobile device makers Apple and Samsung continued collaborating in open-source projects while running expensive patent wars in the courts. Building on multidisciplinary theory in open-source software, cooperation among competitors (aka coopetition) and digital infrastructures, I (and my coauthors) explored how rival firms cooperate in the joint development of open-source infrastructures. While assimilating a wide variety of paradigms and analytical approaches, this doctoral research combined the qualitative analysis of naturally occurring data (QA) with the mining of software repositories (MSR) and social network analysis (SNA) within a set of case studies. By turning to the mobile and cloud computing industries in general, and the WebKit and OpenStack open-source infrastructures in particular, we found that qualitative ethnographic materials, combined with social network visualizations, provide a rich medium that enables a better understanding of competitive and cooperative issues that are simultaneously present and interconnected in open-source infrastructures.
Our research contributes back to the managerial literature on coopetition strategy, but more importantly to Information Systems, by addressing both cooperation and competition within the development of highly networked open-source infrastructures.

    Lessons Learned from Applying Social Network Analysis on an Industrial Free/Libre/Open Source Software Ecosystem

    Many software projects are no longer done in-house by a single organization. Instead, we are in a new age where software is developed by a networked community of individuals and organizations, which base their relations to each other on mutual interest. Paradoxically, recent research suggests that software can actually be jointly developed by rival firms. For instance, it is known that the mobile-device makers Apple and Samsung kept collaborating in open source projects while running expensive patent wars in the courts. Taking a case study approach, we explore how rival firms collaborate in the open source arena by employing a multi-method approach that combines qualitative analysis of archival data (QA) with mining software repositories (MSR) and Social Network Analysis (SNA). While exploring collaborative processes within the OpenStack ecosystem, our research contributes to Software Engineering research by exploring the role of groups, sub-communities and business models within a highly networked open source ecosystem. Surprising results indicate that competition for the same revenue model (i.e., operating conflicting business models) does not necessarily affect collaboration within the ecosystem. Moreover, while detecting the different sub-communities of the OpenStack community, we found that the expected social tendency of developers to work with developers from the same firm (i.e., homophily) did not hold within the OpenStack ecosystem. Furthermore, while addressing a novel, complex and unexplored open source case, this research also contributes to the management literature on coopetition strategy and high-tech entrepreneurship with a rich description of how heterogeneous actors within a highly networked ecosystem (involving individuals, startups, established firms and public organizations) jointly develop a complex infrastructure for big data in the open source arena. Comment: As accepted by the Journal of Internet Services and Applications (JISA)

    Empirical research on the evaluation model and method of sustainability of the open source ecosystem

    The development of open source brings new thinking and production modes to software engineering and computer science, and establishes a software development method and ecological environment in which groups participate. Investors, developers, participants, and managers alike are most concerned with whether the Open Source Ecosystem can be sustained, so that the ecosystem they choose will serve users for a long time. Moreover, the most important quality of the software ecosystem is sustainability, and it is also a research area in Symmetry. Therefore, it is significant to assess the sustainability of the Open Source Ecosystem. However, the current measurement of the sustainability of the Open Source Ecosystem lacks universal measurement indicators, as well as a method and a model. Therefore, this paper constructs an Evaluation Indicators System, which consists of three levels (the target level, the guideline level and the evaluation level) and takes openness, stability, activity, and extensibility as measurement indicators. On this basis, a weight calculation method based on information contribution values and a Sustainability Assessment Model are proposed. The models and methods are used to analyze the factors affecting the sustainability of the Stack Overflow (SO) ecosystem. Through the analysis, we find that each indicator in the SO ecosystem follows a different development trend. The development trend of a single indicator does not represent the sustainable development trend of the whole ecosystem. It is necessary to consider all of the indicators to judge that ecosystem's sustainability. The research on the sustainability of the Open Source Ecosystem is helpful for judging software health, measuring development efficiency and adjusting organizational structure. It also provides a reference for researchers who study the sustainability of software engineering.

    Identification-method research for open-source software ecosystems

    In recent years, open-source software (OSS) development has grown, with many developers around the world working on different OSS projects. A variety of open-source software ecosystems have emerged, for instance, GitHub, StackOverflow, and SourceForge. One of the most typical social-programming and code-hosting sites, GitHub, has amassed numerous open-source software projects and developers on the same virtual collaboration platform. Since GitHub itself is a large open-source community, it hosts a collection of software projects that are developed together and coevolve. The great challenge here is how to identify the relationship between these projects, i.e., project relevance. Software-ecosystem identification is the basis of other studies in the ecosystem. Therefore, how to extract useful information from GitHub and identify software ecosystems is particularly important, and it is also a research area in symmetry. In this paper, a Topic-based Project Knowledge Metrics Framework (TPKMF) is proposed. By collecting the multisource dataset of an open-source ecosystem, project-relevance analysis of the open-source software is carried out on the basis of software-ecosystem identification. Then, we use our Spectral Clustering algorithm based on Core Project (CP-SC) to identify software-ecosystem projects and further identify software ecosystems. We verified that most software ecosystems contain a core software project, and most other projects are associated with it. Furthermore, we analyzed the characteristics of the ecosystem, and we also found that interactive information has a greater impact on project relevance. Finally, we summarize the Topic-based Project Knowledge Metrics Framework.

    Summary of the First Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE1)

    Challenges related to development, deployment, and maintenance of reusable software for science are becoming a growing concern. Many scientists' research increasingly depends on the quality and availability of software upon which their works are built. To highlight some of these issues and share experiences, the First Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE1) was held in November 2013 in conjunction with the SC13 Conference. The workshop featured keynote presentations and a large number (54) of solicited extended abstracts that were grouped into three themes and presented via panels. A set of collaborative notes of the presentations and discussion was taken during the workshop. Unique perspectives were captured about issues such as comprehensive documentation, development and deployment practices, software licenses and career paths for developers. Attribution systems that account for evidence of software contribution and impact were also discussed. These include mechanisms such as Digital Object Identifiers, publication of “software papers”, and the use of online systems, for example source code repositories like GitHub. This paper summarizes the issues and shared experiences that were discussed, including cross-cutting issues and use cases. It joins a nascent literature seeking to understand what drives software work in science, and how it is impacted by the reward systems of science. These incentives can determine the extent to which developers are motivated to build software for the long term and for the use of others, and whether they work collaboratively or separately. It also explores community building, leadership, and dynamics in relation to successful scientific software.

    Entering an ecosystem: The hybrid OSS landscape from a developer perspective

    Hybrid Open Source Software projects are virtual organizations that express characteristics of both static and dynamic behavior. They are choreographed through complex organizational structures that mix centralized governance with distributed community drivenness. While many communities use standard software tools to support their development processes, each community has its own ways of working and invisible power structures that influence how contributions are submitted, how they are verified and how decisions about the long-term direction of the software product are made. Navigating this environment is especially challenging for new developers who need to prove their abilities to gain rights to make contributions. This paper provides a viewpoint on the factors that influence a new developer's perception of the hybrid OSS developer community landscape. We apply an established developmental theory to build an initial model for the developer's context and discuss the model's validation, along with its practical and theoretical implications for building and managing online developer communities. Peer reviewed

    Open source software GitHub ecosystem: a SEM approach

    Open source software (OSS) is a collaborative effort. Affordable, high-quality software with a lower probability of errors or failures is within reach. Thousands of open-source projects (termed repos) are alternatives to proprietary software development. More than two-thirds of companies are contributing to open source. Open source technologies like OpenStack, Docker and KVM are being used to build the next generation of digital infrastructure. An iconic example of OSS is 'GitHub', a successful social site. GitHub is a hosting platform that hosts repositories (repos) based on the Git version control system. GitHub is a knowledge-based workspace. It has several features that facilitate user communication and work integration. In this thesis I employ data extracted from GitHub, and seek to better understand the OSS ecosystem, and to what extent each of its deployed elements affects the successful development of the OSS ecosystem. In addition, I investigate a repo's growth over different time periods to test the changing behavior of the repo. From our observations, developers do not follow one development methodology when developing and growing their project; they tend to cherry-pick from the differing software methodologies available. The GitHub API remains the main OSS source used to extract the metadata for this thesis's research. This extraction process is time-consuming due to restrictive access limitations (even with authentication). I apply Structural Equation Modelling (termed SEM) to investigate the relative path relationships between the GitHub-deployed OSS elements, and I determine the path strength contributions of each element to determine the OSS repo's activity level. SEM is a multivariate statistical analysis technique used to analyze structural relationships. This technique is the combination of factor analysis and multiple regression analysis.
It is used to analyze the structural relationship between measured variables and/or latent constructs. This thesis bridges the research gap around longitudinal OSS studies. It engages large sample-size OSS repo metadata sets, data-quality control, and multiple programming language comparisons. Querying GitHub is neither direct nor simple, yet querying for all valid repos remains important, as illegal or unrepresentative outlier repos (which may even be quite popular) sometimes arise, and these then need to be removed from each initial language-specific OSS metadata set. Eight top GitHub programming languages (selected as those with the most forked repos) are separately engaged in this thesis's research. This thesis observes these eight metadata sets of GitHub repos. Over time, it measures the different repo contributions of the deployed elements of each metadata set. The number of stars given to the repo delivers a weaker contribution to its software development processes. Sometimes forks work against the repo's progress by generating very minor negative total effects on its commit (activity) level, and by sometimes diluting the focus of the repo's software development strategies. Here, a fork may generate new ideas, create a new repo, and then draw some original repo developers off into this new software development direction, thus retarding the original repo's commit (activity) level progression. Multiple intermittent and minor version releases exert lesser commit (activity) changes in GitHub JavaScript repos because they often involve only slight OSS improvements, and because they require only minimal commit contributions. More commits also bring more changes to documentation, and again the GitHub OSS repo's commit (activity) level rises. There are both direct and indirect drivers of the repo's OSS activity. Pulls and commits are the strongest drivers.
This suggests that creating higher levels of pull requests is likely a preferred prime target for the repo creator's core team of developers. This study offers a big data direction for future work. It allows for the deployment of more sophisticated statistical comparison techniques. It offers further indications around the internal and broad relationships that likely exist within GitHub's OSS big data. Its data extraction ideas suggest a link through to business/consumer consumption, and possibly how these may be connected using improved repo search algorithms that release individual business value components.