99 research outputs found

    Towards Transparent, Reusable, and Customizable Data Science in Computational Notebooks

    Data science workflows are human-centered processes involving on-demand programming and analysis. While programmable and interactive interfaces such as widgets embedded within computational notebooks are suitable for these workflows, they lack robust state management capabilities and do not support user-defined customization of the interactive components. The absence of such capabilities hinders workflow reusability and transparency while limiting the scope of exploration available to end users. In response, we developed MAGNETON, a framework for authoring interactive widgets within computational notebooks that enables transparent, reusable, and customizable data science workflows. The framework enhances existing widgets to support fine-grained interaction history management, reusable states, and user-defined customizations. We conducted three case studies in a real-world knowledge graph construction and serving platform to evaluate the effectiveness of these widgets. Based on the observations, we discuss future implications of employing MAGNETON widgets for general-purpose data science workflows.
    Comment: To appear in the Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems

    Restoring Execution Environments of Jupyter Notebooks

    More than ninety percent of published Jupyter notebooks do not state dependencies on external packages. This makes them non-executable and thus hinders reproducibility of scientific results. We present SnifferDog, an approach that 1) collects the APIs of Python packages and versions, creating a database of APIs; 2) analyzes notebooks to determine candidates for required packages and versions; and 3) checks which packages are required to make the notebook executable (and ideally, reproduce its stored results). In our evaluation, we show that SnifferDog precisely restores execution environments for the large majority of notebooks, making them immediately executable for end users.
    Comment: To be published in the 43rd ACM/IEEE International Conference on Software Engineering (ICSE 2021)

    Paths Explored, Paths Omitted, Paths Obscured: Decision Points & Selective Reporting in End-to-End Data Analysis

    Drawing reliable inferences from data involves many, sometimes arbitrary, decisions across phases of data collection, wrangling, and modeling. As different choices can lead to diverging conclusions, understanding how researchers make analytic decisions is important for supporting robust and replicable analysis. In this study, we pore over nine published research studies and conduct semi-structured interviews with their authors. We observe that researchers often base their decisions on methodological or theoretical concerns, but subject to constraints arising from the data, expertise, or perceived interpretability. We confirm that researchers may experiment with choices in search of desirable results, but also identify other reasons why researchers explore alternatives yet omit findings. In concert with our interviews, we also contribute visualizations for communicating decision processes throughout an analysis. Based on our results, we identify design opportunities for strengthening end-to-end analysis, for instance via tracking and meta-analysis of multiple decision paths.

    Research Data Management Practices And Impacts on Long-term Data Sustainability: An Institutional Exploration

    With the 'data deluge' leading to an institutionalized research environment for data management, U.S. academic faculty have increasingly faced pressure to deposit research data into open online data repositories, which, in turn, is engendering a new set of practices to adapt formal mandates to local circumstances. When these practices involve reorganizing workflows to align the goals of local and institutional stakeholders, we might call them 'data articulations.' This dissertation uses interviews to establish a grounded understanding of the data articulations behind deposit in three studies: (1) a phenomenological study of genomics faculty data management practices; (2) a grounded theory study developing a theory of data deposit as articulation work in genomics; and (3) a comparative case study of genomics and social science researchers to identify factors associated with the institutionalization of research data management (RDM). The findings of this research offer an in-depth understanding of the data management and deposit practices of academic research faculty, and surface institutional factors associated with data deposit. Additionally, the studies led to a theoretical framework of data deposit to open research data repositories. The empirical insights into the impacts of institutionalization of RDM and data deposit on long-term data sustainability update our knowledge of the impacts of increasing guidelines for RDM. The work also contributes to the body of data management literature through the development of the data articulation framework, which can be applied and further validated by future work. In terms of practice, the studies offer recommendations for data policymakers, data repositories, and researchers on defining strategies and initiatives to leverage data reuse and employ computational approaches to support data management and deposit.

    Distributed cognition and businesses as 'mental institutions'

    This thesis explores distributed cognition within the context of business and argues that businesses can be considered ‘mental institutions’. It therefore defends a liberal view of cognition, recognising the integration of stakeholders within a larger business structure that contains multiple cognitive schemas that conduct, constrain, and amplify one’s thoughts and affectivity in relation to the organisation. The aim of this thesis is therefore to broaden the scope of investigation regarding the socially extended mind and demonstrate the real-world applicability of these discussions to business consultancy. Following a revision of how the ‘mental institution’ should be considered and a deconstruction of the concept of ‘business’, the thesis picks out six institutional artefacts and structures that are common features of business organisations. These are logos, products, shops, offices, hierarchies, and narratives. Mental business institutions are designed with cognition in mind, and thus these institutional features can become integral parts of thought for both employees within business organisations and external consumers. Chapters individually explore the various ways we can become coupled to these artefacts and structures as internal or external stakeholders, and thus integrated within the cognitive niche of the business institution. Finally, an empirical study of a large UK-based utility company provides an example of how one can investigate the collaborative efforts of employees within an organisation through the lens of distributed cognition. Ultimately, applying distributed cognition and mental institutions to business in this text yields additional conceptual resources for management and marketing studies.

    Small acts of self: practices of personal vignette games

    This thesis focuses on personal vignette games as a medium for sharing and exploring the self. It investigates the central thesis question, how vignette games are used to explore personal experiences, examining the creative potential of vignette games as a form of human connection, personal practice and poetic expression. The making of personal vignette games, and the circumstances which lead to their creation, are not often considered within the realm of game studies. However, the exploration of videogames as a personal, everyday and approachable art form is of vital importance to extending the conversation around the creative and social role of digital games. Taking a mixed methods approach, the thesis breaks the central question into three sub-questions: how does the personal vignette game portray an author’s personal experiences, what are the authors’ connections to the personal vignette game, and what is the significance of personal creative process. The studies undertaken focus on three key elements that shape the way the medium is used to share personal experiences: the creators making them, the creative methods used, and the games themselves. Through a combination of close reading games, interviewing creators and creating personal vignette games of my own, I uncover dynamics of personal vignette games within the various contexts which shape them. Throughout this thesis, I situate the personal vignette game within its private, playful and social contexts. I argue that the approachable, transgressive and often abstract form of the vignette game has helped this mode of digital self-expression to flourish outside the boundaries of what might be traditionally considered “game design”. I explore the personal vignette game as a collection of shared perspectives through play, as well as a social connection and a free-form exploration of the self. I also present an ethos for approaching the personal vignette game, whether creating, playing or contemplating the works from a critical standpoint.