
    Multimodal learning and teaching corpora exchange: Lessons learned in five years by the Mulce project

    To make replication of interaction analysis in online learning possible, the French Mulce project (2007-2010) worked on the requirements for research data to be shareable. We defined a learning and teaching corpus (LETEC) as a package containing the data from an online course together with the contextual information and metadata necessary to make these data visible, shareable and reusable. These human, technical and ethical requirements are presented in this paper. We briefly present the structure of a corpus and the repository we developed to share these corpora. Related works are also described, and we show how conditions evolved between 2006 and 2011. This leads us to report on four particular challenges the Mulce project faced and to suggest acceptable solutions for computer scientists and researchers in the humanities, both of whom are concerned with data sharing in the Technology Enhanced Learning community.

    Can AI Moderate Online Communities?

    The task of cultivating healthy communication in online communities becomes increasingly urgent as gaming and social media experiences become progressively more immersive and life-like. We approach the challenge of moderating online communities by training student models using a large language model (LLM). We use zero-shot learning models to distill and expand datasets, followed by a few-shot learning and a fine-tuning approach, leveraging open-access generative pre-trained transformer (GPT) models from OpenAI. Our preliminary findings suggest that, when properly trained, LLMs can excel in identifying actor intentions, moderating toxic comments, and rewarding positive contributions. The student models perform above expectation in non-contextual assignments such as identifying classically toxic behavior and perform sufficiently on contextual assignments such as identifying positive contributions to online discourse. Further, using open-access models like OpenAI's GPT, we experience a step-change in the development process for what has historically been a complex modeling task. We contribute to the information systems (IS) discourse with a rapid development framework for applying generative AI to online content moderation and the management of culture in decentralized, pseudonymous communities, providing a sample suite of industrial-ready generative AI models based on open-access LLMs.

    Analysis and Detection of Information Types of Open Source Software Issue Discussions

    Most modern Issue Tracking Systems (ITSs) for open source software (OSS) projects allow users to add comments to issues. Over time, these comments accumulate into discussion threads embedded with rich information about the software project, which can potentially satisfy the diverse needs of OSS stakeholders. However, discovering and retrieving relevant information from the discussion threads is a challenging task, especially when the discussions are lengthy and the number of issues in ITSs is vast. In this paper, we address this challenge by identifying the information types presented in OSS issue discussions. Through qualitative content analysis of 15 complex issue threads across three projects hosted on GitHub, we uncovered 16 information types and created a labeled corpus containing 4656 sentences. Our investigation of supervised, automated classification techniques indicated that, when prior knowledge about the issue is available, Random Forest can effectively detect most sentence types using conversational features such as the sentence length and its position. When classifying sentences from new issues, Logistic Regression can yield satisfactory performance using textual features for certain information types, while falling short on others. Our work represents a nontrivial first step towards tools and techniques for identifying and obtaining the rich information recorded in the ITSs to support various software engineering activities and to satisfy the diverse needs of OSS stakeholders. Comment: 41st ACM/IEEE International Conference on Software Engineering (ICSE2019)

    Survey on Evaluation Methods for Dialogue Systems

    In this paper we survey the methods and concepts developed for the evaluation of dialogue systems. Evaluation is a crucial part of the development process. Often, dialogue systems are evaluated by means of human evaluations and questionnaires. However, this tends to be very cost- and time-intensive. Thus, much work has been put into finding methods that reduce the involvement of human labour. In this survey, we present the main concepts and methods. For this, we differentiate between the various classes of dialogue systems (task-oriented dialogue systems, conversational dialogue systems, and question-answering dialogue systems). We cover each class by introducing the main technologies developed for the dialogue systems and then by presenting the evaluation methods for this class.

    Building Emotional Support Chatbots in the Era of LLMs

    The integration of emotional support into various conversational scenarios, such as social interactions, mental health counseling, and customer service, presents profound societal benefits. However, there are unsolved challenges that hinder real-world applications in this field, including limited data availability and the absence of well-accepted model training paradigms. This work endeavors to navigate these challenges by harnessing the capabilities of Large Language Models (LLMs). We introduce an innovative methodology that synthesizes human insights with the computational prowess of LLMs to curate an extensive emotional support dialogue dataset. Our approach starts from a meticulously designed set of dialogues spanning diverse scenarios, which serve as generative seeds. By utilizing the in-context learning potential of ChatGPT, we recursively generate an ExTensible Emotional Support dialogue dataset, named ExTES. Following this, we deploy advanced tuning techniques on the LLaMA model, examining the impact of diverse training strategies, ultimately yielding an LLM meticulously optimized for emotional support interactions. An exhaustive assessment of the resultant model showcases its proficiency in offering emotional support, marking a pivotal step in the realm of emotional support bots and paving the way for subsequent research and implementations.

    Managing access to the internet in public libraries in the UK: the findings of the MAIPLE project

    One of the key purposes of the public library is to provide access to information (UNESCO, 1994). In the UK, information is provided in printed formats and, for the last decade, via public access Internet workstations installed as part of the People’s Network initiative. Recent figures reveal that UK public libraries provide approximately 40,000 computer terminals offering users around 80,000 hours across more than 4,000 service points (CIPFA, 2012). In addition, increasing numbers of public libraries allow users to connect devices such as tablets or smart phones to the Internet via a wireless network access point (Wi-Fi). How do public library staff manage this? What about users viewing harmful or illegal content? And what are the implications for a profession committed to freedom of access to information and opposition to censorship? MAIPLE, a two-year project funded by the Arts and Humanities Research Council, has been investigating this issue, as little was known about how UK public libraries manage Internet content control, including illegal material. MAIPLE has drawn on an extensive review of the literature, an online survey in which all UK public library services were invited to participate (39 per cent response rate) and case studies with five services (two in England, one in Scotland, one in Wales and one in Northern Ireland) to examine the ways these issues are managed and their implications for staff. This presentation will explore the prevalence of tools such as filtering software, Acceptable Use Policies, user authentication, booking software and visual monitoring by staff, and consider their efficacy and desirability in the provision of public Internet access. It will consider the professional dilemmas inherent in managing content and access. Finally, it will highlight some of the more important themes emerging from the findings and their implications for practitioners and policy makers.