
    Conversational Entity Linking: Problem Definition and Datasets

    Machine understanding of user utterances in conversational systems is of utmost importance for enabling engaging and meaningful conversations with users. Entity Linking (EL) is one of the means of text understanding, with proven efficacy for various downstream tasks in information retrieval. In this paper, we study entity linking for conversational systems. To develop a better understanding of what EL in a conversational setting entails, we analyze a large number of dialogues from existing conversational datasets and annotate references to concepts, named entities, and personal entities using crowdsourcing. Based on the annotated dialogues, we identify the main characteristics of conversational entity linking. Further, we report on the performance of traditional EL systems on our Conversational Entity Linking dataset, ConEL, and present an extension to these methods to better fit the conversational setting. The resources released with this paper include annotated datasets, detailed descriptions of crowdsourcing setups, as well as the annotations produced by various EL systems. These new resources allow for an investigation of how the role of entities in conversations is different from that in documents or isolated short text utterances like queries and tweets, and complement existing conversational datasets.

    Quality Assessment Methods for Textual Conversational Interfaces: A Multivocal Literature Review

    The evaluation and assessment of conversational interfaces is a complex task, since such software products are challenging to validate through traditional testing approaches. We conducted a systematic Multivocal Literature Review (MLR) on five different literature sources to provide an overview of the quality attributes, evaluation frameworks, and evaluation datasets proposed to aid researchers and practitioners in the field. We obtained a final pool of 118 contributions, including grey (35) and white literature (83). We categorized 123 different quality attributes and metrics under ten different categories and four macro-categories: Relational, Conversational, User-Centered and Quantitative attributes. While Relational and Conversational attributes are most commonly explored by the scientific literature, we observed a predominance of User-Centered attributes in industrial literature. We also identified five different academic frameworks/tools to automatically compute sets of metrics, and 28 datasets (subdivided into seven different categories based on the type of data contained) that can produce conversations for the evaluation of conversational interfaces. Our analysis highlights that a large number of qualitative and quantitative attributes are available in the literature to evaluate the performance of conversational interfaces. Our categorization can serve as a valid entry point for researchers and practitioners to select the proper functional and non-functional aspects to be evaluated for their products.

    The effects of user assistance systems on user perception and behavior

    The rapid development of information technology (IT) is changing how people approach and interact with IT systems (Maedche et al. 2016). IT systems can increasingly support people in performing ever more complex tasks (Vtyurina and Fourney 2018). However, people's cognitive abilities have not evolved as quickly as technology (Maedche et al. 2016). Thus, different external factors (e.g., complexity or uncertainty) and internal conditions (e.g., cognitive load or stress) reduce decision quality (Acciarini et al. 2021; Caputo 2013; Hilbert 2012). User-assistance systems (UASs) can help to compensate for human weaknesses and cope with new challenges. UASs aim to improve the user's cognition and capabilities, benefiting individuals, organizations, and society. To achieve this goal, UASs collect, prepare, aggregate, analyze information, and communicate results according to user preferences (Maedche et al. 2019). This support can relieve users and improve the quality of decision-making. Using UASs offers many benefits but requires successful interaction between the user and the UAS. However, this interaction introduces social and technical challenges, such as loss of control or reduced explainability, which can affect user trust and willingness to use the UAS (Maedche et al. 2019). To realize the benefits, UASs must be developed based on an understanding and incorporation of users' needs. Users and UASs are part of a socio-technical system to complete a specific task (Maedche et al. 2019). To create a benefit from the interaction, it is necessary to understand the interaction within the socio-technical system, i.e., the interaction between the user, UAS, and task, and to align the different components. For this reason, this dissertation aims to extend the existing knowledge on UAS design by better understanding the effects and mechanisms during the interaction between UASs and users in different application contexts. 
Therefore, theory and findings from different disciplines are combined and new theoretical knowledge is derived. In addition, data is collected and analyzed to validate the new theoretical knowledge empirically. The findings can be used to reduce adaptation barriers and realize a positive outcome. Overall, this dissertation addresses the four classes of UASs presented by Maedche et al. (2016): basic UASs, interactive UASs, intelligent UASs, and anticipating UASs. First, this dissertation contributes to understanding how users interact with basic UASs. Basic UASs do not process contextual information and interact little with the user (Maedche et al. 2016). This behavior makes basic UASs suitable for application contexts, such as social media, where little interaction is desired. Social media is primarily used for entertainment and focuses on content consumption (Moravec et al. 2018). As a result, social media has become an essential source of news but also a target for fake news, with negative consequences for individuals and society (Clarke et al. 2021; Laato et al. 2020). Thus, this thesis presents two approaches to how basic UASs can be used to reduce the negative influence of fake news. Firstly, basic UASs can provide interventions by warning users of questionable content and providing verified information, but the order in which the intervention elements are displayed influences the perception of fake news. The intervention elements should be displayed after the fake news story to achieve an efficient intervention. Secondly, basic UASs can provide social norms to motivate users to report fake news and thereby stop its spread. However, social norms should be used carefully, as they can backfire and reduce the willingness to report fake news. Second, this dissertation contributes to understanding how users interact with interactive UASs.
Interactive UASs incorporate limited information from the application context but focus on close interaction with the user to achieve a specific goal or behavior (Maedche et al. 2016). Typical goals include more physical activity, a healthier diet, and less tobacco and alcohol consumption to prevent disease and premature death (World Health Organization 2020). To increase goal achievement, previous researchers often utilize digital human representations (DHRs) such as avatars and embodied agents to form a socio-technical relationship between the user and the interactive UAS (Kim and Sundar 2012a; Pfeuffer et al. 2019). However, understanding how the design features of an interactive UAS affect the interaction with the user is crucial, as each design feature has a distinct impact on the user's perception. Based on existing knowledge, this thesis highlights the most widely used design features and analyzes their effects on behavior. The findings reveal important implications for future interactive UAS design. Third, this dissertation contributes to understanding how users interact with intelligent UASs. Intelligent UASs prioritize processing user and contextual information to adapt to the user's needs rather than focusing on an intensive interaction with the user (Maedche et al. 2016). Thus, intelligent UASs with emotional intelligence can provide people with task-oriented and emotional support, making them ideal for situations where interpersonal relationships are neglected, such as crowd working. Crowd workers frequently work independently without any significant interactions with other people (Jäger et al. 2019). In crowd work environments, traditional leader-employee relationships are usually not established, which can have a negative impact on employee motivation and performance (Cavazotte et al. 2012). Thus, this thesis examines the impact of an intelligent UAS with leadership and emotional capabilities on employee performance and enjoyment. 
The leadership capabilities of the intelligent UAS lead to an increase in enjoyment but a decrease in performance. The emotional capabilities of the intelligent UAS reduce the stimulating effect of leadership characteristics. Fourth, this dissertation contributes to understanding how users interact with anticipating UASs. Anticipating UASs are intelligent and interactive, providing users with task-related and emotional stimuli (Maedche et al. 2016). They also have advanced communication interfaces and can adapt to current situations and predict future events (Knote et al. 2018). Because of these advanced capabilities, anticipating UASs enable collaborative work settings and often use anthropomorphic design cues to make the interaction more intuitive and comfortable (André et al. 2019). However, these anthropomorphic design cues can also raise expectations too high, leading to disappointment and rejection if they are not met (Bartneck et al. 2009; Mori 1970). To create a successful collaborative relationship between anticipating UASs and users, it is important to understand the impact of anthropomorphic design cues on the interaction and decision-making processes. This dissertation presents a theoretical model that explains the interaction between anthropomorphic anticipating UASs and users and an experimental procedure for empirical evaluation. The experiment design lays the groundwork for empirically testing the theoretical model in future research. To sum up, this dissertation contributes to information systems knowledge by improving understanding of the interaction between UASs and users in different application contexts. It develops new theoretical knowledge based on previous research and empirically evaluates user behavior to explain and predict it. In addition, this dissertation generates new knowledge by prototypically developing UASs and provides new insights for different classes of UASs.
These insights can be used by researchers and practitioners to design more user-centric UASs and realize their potential benefits.

    "Mango Mango, How to Let The Lettuce Dry Without A Spinner?'': Exploring User Perceptions of Using An LLM-Based Conversational Assistant Toward Cooking Partner

    The rapid advancement of Large Language Models (LLMs) has created many opportunities for integration with conversational assistants (CAs) that help people in their daily tasks, particularly due to their extensive flexibility. However, users' real-world experiences interacting with these assistants remain unexplored. In this research, we chose cooking, a complex daily task, as a scenario to investigate people's successful and unsatisfactory experiences while receiving assistance from an LLM-based CA, Mango Mango. We discovered that participants value the system's ability to provide extensive information beyond the recipe, offer customized instructions based on context, and assist them in dynamically planning the task. However, they expect the system to be more adaptive to oral conversation and provide more suggestive responses to keep users actively involved. Recognizing that users began treating our LLM-CA as a personal assistant or even a partner rather than just a recipe-reading tool, we propose several design considerations for future development. Comment: Under submission to CHI202

    iSchool Student Research Journal, Vol.11, Iss.1

    LLM-Powered Conversational Voice Assistants: Interaction Patterns, Opportunities, Challenges, and Design Guidelines

    Conventional Voice Assistants (VAs) rely on traditional language models to discern user intent and respond to queries, leading to interactions that often lack a broader contextual understanding, an area in which Large Language Models (LLMs) excel. However, current LLMs are largely designed for text-based interactions, making it unclear how user interactions will evolve if the modality is changed to voice. In this work, we investigate whether LLMs can enrich VA interactions via an exploratory study with participants (N=20) using a ChatGPT-powered VA for three scenarios (medical self-diagnosis, creative planning, and debate) with varied constraints, stakes, and objectivity. We observe that an LLM-powered VA elicits richer interaction patterns that vary across tasks, showing its versatility. Notably, LLMs absorb the majority of VA intent recognition failures. We additionally discuss the potential of harnessing LLMs for more resilient and fluid user-VA interactions and provide design guidelines for tailoring LLMs for voice assistance.

    A Survey of GPT-3 Family Large Language Models Including ChatGPT and GPT-4

    Large language models (LLMs) are a special class of pretrained language models obtained by scaling model size, pretraining corpus and computation. Because of their large size and pretraining on large volumes of text data, LLMs exhibit special abilities that allow them to achieve remarkable performance on many natural language processing tasks without any task-specific training. The era of LLMs started with OpenAI's GPT-3 model, and the popularity of LLMs has increased exponentially since the introduction of models like ChatGPT and GPT4. We refer to GPT-3 and its successor OpenAI models, including ChatGPT and GPT4, as GPT-3 family large language models (GLLMs). With the ever-rising popularity of GLLMs, especially in the research community, there is a strong need for a comprehensive survey that summarizes recent research progress in multiple dimensions and can guide the research community with insightful future research directions. We start the survey paper with foundation concepts like transformers, transfer learning, self-supervised learning, pretrained language models and large language models. We then present a brief overview of GLLMs and discuss the performance of GLLMs in various downstream tasks, specific domains and multiple languages. We also discuss the data labelling and data augmentation abilities of GLLMs, the robustness of GLLMs, and the effectiveness of GLLMs as evaluators, and finally conclude with multiple insightful future research directions. To summarize, this comprehensive survey paper will serve as a good resource for both academic and industry people to stay updated with the latest research related to GPT-3 family large language models. Comment: Preprint under review, 58 pages