The timing bottleneck: Why timing and overlap are mission-critical for conversational user interfaces, speech recognition and dialogue systems
Speech recognition systems are a key intermediary in voice-driven
human-computer interaction. Although speech recognition works well for pristine
monologic audio, real-life use cases in open-ended interactive settings still
present many challenges. We argue that timing is mission-critical for dialogue
systems, and evaluate 5 major commercial ASR systems for their conversational
and multilingual support. We find that word error rates for natural
conversational data in 6 languages remain abysmal, and that overlap remains a
key challenge (study 1). This impacts especially the recognition of
conversational words (study 2), and in turn has dire consequences for
downstream intent recognition (study 3). Our findings help to evaluate the
current state of conversational ASR, contribute towards multidimensional error
analysis and evaluation, and identify phenomena that need most attention on the
way to build robust interactive speech technologies.
Synthesized size-sound sound symbolism
Studies of sound symbolism have shown that people can
associate sound and meaning in consistent ways when
presented with maximally contrastive stimulus pairs of
nonwords such as bouba/kiki (rounded/sharp) or mil/mal
(small/big). Recent work has shown the effect extends to
antonymic words from natural languages and has proposed a
role for shared cross-modal correspondences in biasing form-
to-meaning associations. An important open question is how
the associations work, and particularly what the role is of
sound-symbolic matches versus mismatches. We report on a
learning task designed to distinguish between three existing
theories by using a spectrum of sound-symbolically matching,
mismatching, and neutral (neither matching nor mismatching)
stimuli. Synthesized stimuli allow us to control for prosody,
and the inclusion of a neutral condition allows a direct test of
competing accounts. We find evidence for a sound-symbolic
match boost, but not for a mismatch difficulty compared to
the neutral condition.
A systematic investigation of gesture kinematics in evolving manual languages in the lab
Silent gestures consist of complex multi-articulatory movements but are now primarily studied through categorical coding of the referential gesture content. The relation of categorical linguistic content with continuous kinematics is therefore poorly understood. Here, we reanalyzed the video data from a gestural evolution experiment (Motamedi, Schouwstra, Smith, Culbertson, & Kirby, 2019), which showed increases in the systematicity of gesture content over time. We applied computer vision techniques to quantify the kinematics of the original data. Our kinematic analyses demonstrated that gestures become more efficient and less complex in their kinematics over generations of learners. We further detect the systematicity of gesture form on the level of the gesture kinematic interrelations, which directly scales with the systematicity obtained on semantic coding of the gestures. Thus, from continuous kinematics alone, we can tap into linguistic aspects that were previously only approachable through categorical coding of meaning. Finally, going beyond issues of systematicity, we show how unique gesture kinematic dialects emerged over generations as isolated chains of participants gradually diverged over iterations from other chains. We, thereby, conclude that gestures can come to embody the linguistic system at the level of interrelationships between communicative tokens, which should calibrate our theories about form and linguistic content.
A Coding Scheme for Other-initiated Repair Across Languages
We provide an annotated coding scheme for other-initiated repair, along with guidelines for building collections and aggregating cases based on interactionally relevant similarities and differences. The questions and categories of the scheme are grounded in inductive observations of conversational data and connected to a rich body of work on other-initiated repair in conversation analysis. The scheme is developed and tested in a 12-language comparative project and can serve as a stepping stone for future work on other-initiated repair and the systematic comparative study of conversational structures.
Computational mechanisms for resolving misunderstandings
Imagine discussing yesterday’s dinner with a friend: It wasn’t particularly tasty. Your friend concurs, it was very salty! Thinking you were talking about the appetizer (which wasn’t salty at all), you’re forced to reconsider which course your friend was talking about. Was the appetizer salty to her? Was she talking about the main course? People encounter misunderstandings in everyday conversation, yet quickly and seamlessly resolve them. How people do this is an explanatory challenge: the thing being talked about (i.e., the referent) is often not physically present during the conversation. Hence, there’s no easy way for interlocutors to establish common ground via ostensive signaling (e.g., by pointing at the dish). We develop a model of speakers that use pragmatic reasoning to infer the referent inferred by listeners. We explore the performance of this model using agent-based simulated conversations. The results imply necessary and sufficient conditions for successful updating.
Arbitrariness, iconicity, and systematicity in language
The notion that the form of a word bears an arbitrary relation to its meaning accounts only partly for the attested relations between form and meaning in the languages of the world. Recent research suggests a more textured view of vocabulary structure, in which arbitrariness is complemented by iconicity (aspects of form resemble aspects of meaning) and systematicity (statistical regularities in forms predict function). Experimental evidence suggests these form-to-meaning correspondences serve different functions in language processing, development, and communication: systematicity facilitates category learning by means of phonological cues, iconicity facilitates word learning and communication by means of perceptuomotor analogies, and arbitrariness facilitates meaning individuation through distinctive forms. Processes of cultural evolution help to explain how these competing motivations shape vocabulary structure.
Conversation analysis (CA)
Conversation analysis (CA) is an approach to the study of language and social interaction that puts at center stage its sequential development. The chain of initiating and responding actions that characterizes any interaction is a source of internal evidence for the meaning of social behavior as it exposes the understandings that participants themselves give of what one another is doing. Such an analysis requires the close and repeated inspection of audio and video recordings of naturally occurring interaction, supported by transcripts and other forms of annotation. Distributional regularities are complemented by a demonstration of participants' orientation to deviant behavior. CA has long maintained a constructive dialogue and reciprocal influence with linguistic anthropology. This includes a recent convergence on the cross-linguistic and cross-cultural study of social interaction.
Getting others to do things: A pragmatic typology of recruitments
Getting others to do things is a central part of social interaction in any human society. Language is our main tool for this purpose. In this book, we show that sequences of interaction in which one person’s behaviour solicits or occasions another’s assistance or collaboration share common structural properties that provide a basis for the systematic comparison of this domain across languages. The goal of this comparison is to uncover similarities and differences in how language and other conduct are used in carrying out social action around the world, including different kinds of requests, orders, suggestions, and other actions brought together under the rubric of recruitment.