143 research outputs found

    Human-human multi-threaded spoken dialogs in the presence of driving

    The problem addressed in this research is that engineers looking for interface designs do not have enough data about the interaction between multi-threaded dialogs and manual-visual tasks. Our goal was to investigate this interaction. We proposed to analyze how humans handle multi-threaded dialogs while engaged in a manual-visual task. More specifically, we looked at the interaction between performance on two spoken tasks and driving. The novelty of this dissertation is its focus on the intersection between a manual-visual task and multi-threaded speech communication between two humans. We proposed an experiment setup suitable for investigating multi-threaded spoken dialogs while subjects are involved in a manual-visual task. In our experiments, one participant drove a simulated vehicle while talking with another participant located in a different room. The participants communicated using headphones and microphones. Both participants performed an ongoing task, which was interrupted by an interrupting task; both tasks were done using speech. We collected corpora of annotated data from our experiments and analyzed the data to verify the suitability of the proposed experiment setup. We found that, as expected, driving and our spoken tasks influenced each other. We also found that the timing of interruption influenced the spoken tasks. Unexpectedly, the data indicate that the ongoing task was more influenced by driving than the interrupting task was; conversely, the interrupting task influenced driving more than the ongoing task did. This suggests that the multiple resource model [1] does not capture the complexity of the interactions between manual-visual and spoken tasks. We propose that perceived urgency or perceived task difficulty plays a role in how the tasks influence each other.

    Proactive behavior in voice assistants: A systematic review and conceptual model

    Voice assistants (VAs) are increasingly integrated into everyday activities and tasks, raising novel challenges for users and researchers. One emergent research direction concerns proactive VAs, which can initiate interaction without direct user input, offering unique benefits including efficiency and natural interaction. Yet there is a lack of review studies synthesizing the current knowledge on how proactive behavior has been implemented in VAs and under what conditions proactivity has been found more or less suitable. To this end, we conducted a systematic review following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist. We searched for articles in the ACM Digital Library, IEEE Xplore, and PubMed, and included primary research studies reporting user evaluations of proactive VAs, resulting in 21 studies included for analysis. First, to characterize proactive behavior in VAs, we developed a novel conceptual model encompassing context, initiation, and action components: activity/status emerged as the primary contextual element, direct initiation was more common than indirect initiation, and suggestions were the primary action observed. Second, proactive behavior in VAs was predominantly explored in domestic and in-vehicle contexts, with only safety-critical and emergency situations demonstrating clear benefits for proactivity, compared to mixed findings for other scenarios. The paper concludes with a summary of the prevailing knowledge gaps and potential research avenues.

    Temporal entrainment in overlapping speech

    Wlodarczak M. Temporal entrainment in overlapping speech. Bielefeld: Bielefeld University; 2014.

    Human-AI interactions through a Gricean lens

    Grice’s Cooperative Principle (1975), which describes the implicit maxims that guide effective conversation, has long been applied to conversations between humans. However, as humans begin to interact with non-human dialogue systems more frequently and in a broader scope, an important question emerges: what principles govern those interactions? The present study addresses this question by categorizing human-AI interactions using Grice’s four maxims. In doing so, it demonstrates the advantages and shortcomings of such an approach, ultimately showing that humans do, indeed, apply these maxims to interactions with AI, even making explicit references to the AI’s performance through a Gricean lens. Twenty-three participants interacted with an American English-speaking Alexa, then rated and discussed their experience with an in-lab researcher. Researchers then reviewed each exchange, identifying those that might relate to Grice’s maxims: Quantity, Quality, Manner, and Relevance. Many instances of explicit user frustration stemmed from violations of Grice’s maxims. Quantity violations were noted for too little but not too much information, while Quality violations were rare, indicating high trust in Alexa’s responses. Manner violations focused on speed and humanness. Relevance violations were the most frequent of all violations, and they appear to be the most frustrating. While the maxims help describe many of the issues participants encountered with Alexa’s responses, other issues do not fit neatly into Grice’s framework. For example, participants were particularly averse to Alexa initiating exchanges or making unsolicited suggestions. To address this gap, we propose an additional maxim, human Priority, to describe human-AI interaction. Humans and AIs are not (yet?) conversational equals, and human initiative takes priority. Moreover, we find that Relevance is of particular importance in human-AI interactions, and we suggest that applying Grice’s Cooperative Principle to human-AI interactions is beneficial both from an AI development perspective and as a tool for describing an emerging form of interaction.

    MULTIMODALITY IN COMPUTER MEDIATED COMMUNICATION

    2002/2003, XVI Ciclo, 1974. Digitized version of the printed doctoral thesis.

    Turn-Taking in Human Communicative Interaction

    The core use of language is in face-to-face conversation, which is characterized by rapid turn-taking. This turn-taking poses a number of central puzzles for the psychology of language. Consider, for example, that in large corpora the gap between turns is on the order of 100 to 300 ms, while the latencies involved in language production require at minimum 600 ms (for a single word) to 1500 ms (for a simple sentence). This implies that participants in conversation are predicting the ends of incoming turns and preparing their responses in advance. But how is this done? What aspects of this prediction are done when? What happens when the prediction is wrong? What stops participants from coming in too early? If the system is running on prediction, why is there consistently a mode of 100 to 300 ms in response time? The timing puzzle raises further puzzles: it seems that comprehension must run in parallel with preparation for production, but it has been presumed that there are strict cognitive limitations on more than one central process running at a time. How is this bottleneck overcome? Far from being 'easy', as some psychologists have suggested, conversation may be one of the most demanding cognitive tasks in our everyday lives. Further questions naturally arise: how do children learn to master this demanding task, and what is the developmental trajectory in this domain? Research shows that aspects of turn-taking, such as its timing, are remarkably stable across languages and cultures, but the word order of languages varies enormously. How, then, does prediction of the incoming turn work when the verb (often the informational nugget in a clause) is at the end? Conversely, how can production work fast enough in languages that have the verb at the beginning, thereby requiring early planning of the whole clause? What happens when one changes modality, as in sign languages -- with the loss of channel constraints, is turn-taking much freer?
    And what about face-to-face communication amongst hearing individuals -- do gestures, gaze, and other body behaviors facilitate turn-taking? One can also ask the phylogenetic question: how did such a system evolve? There seem to be parallels (analogies) in duetting bird species and in a variety of monkey species, but there is little evidence of anything like this among the great apes. All this constitutes a neglected set of problems at the heart of the psychology of language and of the language sciences. This research topic welcomes contributions from right across the board, for example from psycholinguists, developmental psychologists, students of dialogue and conversation analysis, linguists interested in the use of language, phoneticians, corpus analysts, and comparative ethologists or psychologists. We welcome contributions of all sorts, for example original research papers, opinion pieces, and reviews of work in subfields that may not be fully understood in other subfields.
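    The timing puzzle above rests on simple arithmetic: if responses follow after a gap of only 100 to 300 ms, but production takes 600 to 1500 ms, planning must already be underway well before the incoming turn ends. A minimal sketch of that calculation (the millisecond figures are the ones quoted in the abstract; the code is purely illustrative):

```python
# Illustrative only: how far in advance must a listener start planning,
# given the inter-turn gap and production-latency figures quoted above?

gap_ms = (100, 300)  # typical inter-turn gap observed in large corpora
production_latency_ms = {"single word": 600, "simple sentence": 1500}

# lead_ms[(utterance, gap)] = how many ms before the incoming turn ends
# production planning must already have begun
lead_ms = {
    (utterance, gap): latency - gap
    for utterance, latency in production_latency_ms.items()
    for gap in gap_ms
}

for (utterance, gap), lead in sorted(lead_ms.items()):
    print(f"{utterance}, {gap} ms gap: planning starts {lead} ms before turn end")
```

    Even in the most forgiving case (a one-word reply after a 300 ms gap), planning must begin 300 ms before the turn ends; for a simple sentence after a 100 ms gap, the lead grows to 1400 ms.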

    COMPUTATIONAL ANALYSIS OF THE CONVERSATIONAL DYNAMICS OF THE UNITED STATES SUPREME COURT

    The decisions of the United States Supreme Court have far-reaching implications in American life. Using transcripts of Supreme Court oral arguments, this work examines the conversational dynamics of Supreme Court justices and links their conversational interaction with the decisions of the Court and of individual justices. While several studies have looked at the relationship between oral arguments and case variables, to our knowledge none have looked at the relationship between conversational dynamics and case outcomes. Working from this view, we show that the conversation of Supreme Court justices is both predictable and predictive. We aim to show that conversation during Supreme Court cases is patterned, that this patterned conversation is associated with case outcomes, and that this association can be used to make predictions about case outcomes. We present three sets of experiments to accomplish this. The first examines the order of speakers during oral arguments as a patterned sequence, showing that cohesive elements in the discourse, along with references to individuals, provide significant improvements over our "bag-of-words" baseline in identifying speakers in sequence within a transcript. The second graphically examines the association between speaker turn-taking and case outcomes. The results of this experiment point to interesting and complex relationships between conversational interaction and case variables, such as justices' votes. The third experiment shows that this relationship can be used to predict case outcomes with accuracy ranging from 62.5% to 76.8% under varying conditions. Finally, we offer recommendations for improved tools for legal researchers interested in the relationship between conversation during oral arguments and case outcomes, and suggestions for how these tools may be applied to more general problems.
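    As an entirely hypothetical illustration of how turn-taking statistics might feed an outcome prediction of this kind, the toy rule below predicts against the side whose counsel draws more turns from the justices. The feature names and the decision rule are assumptions made for illustration, not the dissertation's actual model:

```python
# Hypothetical sketch (not the dissertation's actual method): predict a
# case outcome from how many justice turns each side's counsel receives
# during oral argument. The rule and thresholds are illustrative only.

def predict_outcome(turns_at_petitioner: int, turns_at_respondent: int) -> str:
    """Toy rule: the side drawing more justice turns is predicted to lose."""
    if turns_at_petitioner > turns_at_respondent:
        return "respondent wins"
    if turns_at_respondent > turns_at_petitioner:
        return "petitioner wins"
    return "toss-up"

# Hypothetical turn counts for one argument session
print(predict_outcome(57, 41))  # → respondent wins
```

    A real model along the lines the abstract describes would learn such associations from annotated transcripts rather than hard-code them, but the sketch shows the shape of the mapping from conversational features to outcomes.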