58 research outputs found

    KGTorrent: A dataset of Python Jupyter notebooks from Kaggle

    Computational notebooks have become the tool of choice for many data scientists and practitioners for performing analyses and disseminating results. Despite their increasing popularity, the research community cannot yet count on a large, curated dataset of computational notebooks. In this paper, we fill this gap by introducing KGTorrent, a dataset of Python Jupyter notebooks with rich metadata retrieved from Kaggle, a platform hosting data science competitions for learners and practitioners at any level of expertise. We describe how we built KGTorrent and provide instructions on how to use it and refresh the collection to keep it up to date. Our vision is that the research community will use KGTorrent to study how data scientists, especially practitioners, use Jupyter Notebook in the wild and identify potential shortcomings to inform the design of its future extensions.

    Towards productizing AI/ML Models: An industry perspective from data scientists

    The transition from AI/ML models to production-ready AI-based systems is a challenge for both data scientists and software engineers. In this paper, we report the results of a workshop conducted in a consulting company to understand how this transition is perceived by practitioners. Starting from the need to make AI experiments reproducible, the main themes that emerged relate to the use of the Jupyter Notebook as the primary prototyping tool and to its lack of support for software engineering best practices as well as data-science-specific functionalities.

    Benefitting from the Grey Literature in Software Engineering Research

    Researchers generally place the most trust in peer-reviewed, published information, such as journals and conference papers. By contrast, software engineering (SE) practitioners typically do not have the time, access or expertise to review and benefit from such publications. As a result, practitioners are more likely to turn to other sources of information that they trust, e.g., trade magazines, online blog posts, survey results or technical reports, collectively referred to as Grey Literature (GL). Furthermore, practitioners also share their ideas and experiences as GL, which can serve as a valuable data source for research. While GL itself is not a new topic in SE, using and synthesizing knowledge from GL is a contemporary topic in empirical SE research, and researchers are increasingly benefitting from the knowledge available within GL. The goal of this chapter is to provide an overview of GL in SE, together with insights on how SE researchers can effectively use and benefit from the knowledge and evidence available in the vast amount of GL.

    Evaluation of the performance of irradiated silicon strip sensors for the forward detector of the ATLAS Inner Tracker Upgrade

    The upgrade to the High-Luminosity LHC foreseen in about ten years represents a great challenge for the ATLAS inner tracker and the silicon strip sensors in the forward region. Several strip sensor designs were developed by the ATLAS collaboration and fabricated by Hamamatsu to maintain sufficient performance in terms of charge collection efficiency and its uniformity throughout the active region. Of particular interest, in the case of a stereo-strip sensor, is the area near the sensor edge where shorter strips were ganged to the complete ones. In this work, the electrical and charge collection test results on irradiated miniature sensors with forward geometry are presented. Results from charge collection efficiency measurements show that, at the maximum expected fluence, the collected charge is roughly halved relative to that obtained prior to irradiation. Laser measurements show good signal uniformity over the sensor. Ganged strips have an efficiency similar to that of standard strips.

    Computer-mediated communication to support distributed requirements elicitations and negotiations tasks

    Requirements engineering is one of the most communication-intensive activities in software development, greatly affected by project stakeholder geographical distribution. Despite advances in collaboration technologies, global software teams continue to experience significant challenges in the elicitation and negotiation of requirements. Deciding which communication technologies to deploy to achieve effective communication in distributed requirements engineering activities is not a trivial task. Is face-to-face or text-based communication more appropriate for requirements elicitations and negotiations? In teams that do not have access to face-to-face communication, is text-based communication more useful in requirements elicitations than in requirements negotiations? Here, we report an empirical study that analyzes the effectiveness of synchronous computer-mediated communication in requirements elicitations and negotiations. Our investigation is guided by a theoretical framework that we developed from theories of computer-mediated communication, common ground, and media selection for group tasks; the framework considers the effectiveness of a communication medium in relation to the information richness needs of requirements elicitation and negotiation tasks. Our findings bring forward empirical evidence about the perceived as well as objective fit between synchronous communication technology and requirements tasks. First, face-to-face is not always the most preferred medium for requirements tasks, and we reveal a number of conditions in which, in contrast to common belief, text-based communication is preferred for requirements communication. Second, we find that in evaluating outcomes of requirements elicitations and negotiations objectively, group performance is not affected by the communication medium. Third, when groups interact only via text-based communication, common ground in requirements negotiations takes longer to achieve than in requirements elicitations, indicating that distributed requirements elicitation is the task where computer-mediated communication tools have the most opportunity for successful application.

    Building Trust through Social Awareness: The SocialCDE Project

    Trust is paramount in distributed software development to prevent geographically distributed sites from feeling and acting like distinct, distant teams. Nevertheless, how to build trust among developers with few or no chances to meet is an open issue. To overcome this challenge, we hypothesize that increased social awareness may foster trust building in global software teams. Here, we first present SocialCDE, a tool that aims at augmenting Application Lifecycle Management (ALM) platforms with social awareness to facilitate the establishment of interpersonal connections by disclosing developers’ personal interests and contextual information. Then, we present two different empirical studies, specifically designed to test our hypothesis.

    SocialCDE: A Social Awareness Tool for Global Software Teams

    We present SocialCDE, a tool that aims at augmenting Application Lifecycle Management (ALM) platforms with social awareness to facilitate the establishment of interpersonal connections and increase the likelihood of successful interactions by disclosing developers’ personal interests and contextual information.

    Augmenting Social Awareness in a Collaborative Development Environment

    Social awareness, that is, information that a person maintains about others in a social or conversational context, can help counteract the lack of teamness in global software development and strengthen trust among remote developers. We hypothesize that information shared on social media can work for distributed software teams as a surrogate for the social awareness gained during informal face-to-face chats. As a preliminary step, we have developed a tool that extends a collaborative development environment by aggregating content from social networks and microblogs into the developer’s workspace.