9,557 research outputs found

    Analyzing Input and Output Representations for Speech-Driven Gesture Generation

    This paper presents a novel framework for automatic speech-driven gesture generation, applicable to human-agent interaction including both virtual agents and robots. Specifically, we extend recent deep-learning-based, data-driven methods for speech-driven gesture generation by incorporating representation learning. Our model takes speech as input and produces gestures as output, in the form of a sequence of 3D coordinates. Our approach consists of two steps. First, we learn a lower-dimensional representation of human motion using a denoising autoencoder neural network, consisting of a motion encoder MotionE and a motion decoder MotionD. The learned representation preserves the most important aspects of human pose variation while removing less relevant variation. Second, we train a novel encoder network SpeechE to map from speech to a corresponding motion representation with reduced dimensionality. At test time, the speech encoder and the motion decoder networks are combined: SpeechE predicts motion representations based on a given speech signal, and MotionD then decodes these representations to produce motion sequences. We evaluate different representation sizes to find the most effective dimensionality. We also evaluate the effects of using different speech features as input to the model, and find that mel-frequency cepstral coefficients (MFCCs), alone or combined with prosodic features, perform best. The results of a subsequent user study confirm the benefits of the representation learning.
    Comment: Accepted at IVA '19. Shorter version published at AAMAS '19. The code is available at https://github.com/GestureGeneration/Speech_driven_gesture_generation_with_autoencode
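    As a rough illustration of the two-step pipeline this abstract describes, the sketch below wires a denoising-autoencoder-style motion model (MotionE/MotionD) to a speech encoder (SpeechE), assuming PyTorch; all layer sizes and architectures here are invented stand-ins, not the authors' released implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn

class MotionAutoencoder(nn.Module):
    """Step 1: autoencode pose frames into a lower-dimensional representation."""
    def __init__(self, pose_dim=45, repr_dim=32):  # sizes are illustrative only
        super().__init__()
        self.encode = nn.Sequential(  # MotionE
            nn.Linear(pose_dim, 128), nn.ReLU(), nn.Linear(128, repr_dim))
        self.decode = nn.Sequential(  # MotionD
            nn.Linear(repr_dim, 128), nn.ReLU(), nn.Linear(128, pose_dim))

    def forward(self, noisy_pose):
        # Denoising objective: reconstruct a clean pose from a noisy input.
        return self.decode(self.encode(noisy_pose))

class SpeechEncoder(nn.Module):
    """Step 2: map speech features (e.g. MFCCs) to motion representations."""
    def __init__(self, speech_dim=26, repr_dim=32):
        super().__init__()
        self.net = nn.Sequential(  # SpeechE
            nn.Linear(speech_dim, 128), nn.ReLU(), nn.Linear(128, repr_dim))

    def forward(self, speech_feats):
        return self.net(speech_feats)

# At test time the two are chained: SpeechE predicts representations from
# speech, and MotionD decodes them into a sequence of 3D joint coordinates.
ae, speech_enc = MotionAutoencoder(), SpeechEncoder()
mfccs = torch.randn(100, 26)             # 100 frames of 26-dim speech features
gestures = ae.decode(speech_enc(mfccs))  # -> (100, 45) pose sequence
```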

    Early Turn-taking Prediction with Spiking Neural Networks for Human Robot Collaboration

    Turn-taking is essential to the structure of human teamwork. Humans are typically aware of team members' intention to keep or relinquish their turn before a turn switch, when responsibility for working on a shared task is shifted. Future co-robots are expected to provide the same competence. To that end, this paper proposes the Cognitive Turn-taking Model (CTTM), which leverages cognitive models (i.e., spiking neural networks) to achieve early turn-taking prediction. The CTTM framework can process multimodal human communication cues (both implicit and explicit) and predict human turn-taking intentions at an early stage. The proposed framework is tested on a simulated surgical procedure, in which a robotic scrub nurse predicts the surgeon's turn-taking intention. The CTTM framework was found to outperform state-of-the-art turn-taking prediction algorithms by a large margin. It also outperforms humans when presented with partial observations of communication cues (i.e., less than 40% of the full action). This early prediction capability enables robots to initiate turn-taking actions at an early stage, which facilitates collaboration and increases overall efficiency.
    Comment: Submitted to IEEE International Conference on Robotics and Automation (ICRA) 201
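    The abstract does not give the CTTM's internals, but the core idea of early prediction from partial observations can be sketched with a toy leaky evidence accumulator (loosely reminiscent of a leaky integrate-and-fire neuron). The cue features, weights, and threshold below are all invented for illustration; this is not the paper's model.

```python
import numpy as np

def early_turn_prediction(cue_frames, weights, threshold=3.0, decay=0.9):
    """Integrate weighted multimodal cues over time and commit to a
    turn-switch prediction as soon as accumulated evidence crosses a
    threshold, i.e. before the full action has been observed.

    cue_frames: (T, D) array, one row of D cue features per timestep.
    weights:    (D,) relative cue importance (learned in a real system).
    Returns (predicted_turn_switch, fraction_of_action_observed).
    """
    evidence = 0.0
    for t, frame in enumerate(cue_frames):
        evidence = decay * evidence + weights @ frame  # leaky integration
        if evidence > threshold:                       # "fire": commit early
            return True, (t + 1) / len(cue_frames)
    return False, 1.0

# Synthetic example: 50 frames of 4 cues (e.g. gaze, pause, gesture, energy).
rng = np.random.default_rng(0)
cues = rng.random((50, 4))
pred, seen = early_turn_prediction(cues, np.array([0.5, 1.0, 0.8, 0.3]))
print(pred, f"decision after observing {seen:.0%} of the action")
```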

    P-model Alternative to the T-model

    Standard linguistic analysis of syntax uses the T-model. This model requires the ordering D-structure >> S-structure >> LF, where D-structure is the deep structure, S-structure is the surface structure, and LF is logical form. Between each of these representations there is movement, which alters the order of the constituent words; movement is achieved using the principles and parameters of syntactic theory. Psychological analysis of sentence production is usually either serial or connectionist. Serial psychological models do not immediately accommodate the T-model, so a new model, called the P-model, is introduced here. The P-model differs from previous linguistic and psychological models. It is argued that the LF representation should be replaced by a variant of Frege's three qualities (sense, reference, and force), called the Frege representation or F-representation. In the F-representation the order of elements is not necessarily the same as that in LF, and it is suggested that the correct ordering is F-representation >> D-structure >> S-structure. This ordering appears to lead to a more natural view of sentence production and processing. Within this framework, movement originates as the outcome of emphasis applied to the sentence. The requirement that the F-representation precede the D-structure calls for a specification of the particular principles and parameters that govern movement of words between representations. In general this would imply that there is a preferred or optimal ordering of the symbolic string in the F-representation. The standard ordering is retained because the general way of producing such an optimal ordering is unclear. In this case it is possible to produce an analysis of movement between LF and D-structure similar to the usual analysis of movement between S-structure and LF. It is suggested that a maximal amount of information about a language's grammar and lexicon is stored, because of the necessity of analyzing corrupted data.

    Modifications and Frequency Occurrence of Gestures in NS - NS and NNS - NS Dyads

    In this study, I investigate cross-linguistic differences and similarities in speech-associated gesture in NS (Native Speaker) - NS and NNS (Nonnative Speaker) - NS dyads during narrative telling. Gesture production by Indonesian native speakers communicating in Indonesian (L1) and in English (L2) was coded and assessed based on McNeill's model of overall gesture units. The Indonesian speakers' gesture modification when interacting in English was measured by the size of their gestures. The results indicate that Indonesian native speakers gesture more when they communicate in English and modify their gestures by making them bigger and therefore more noticeable to their interlocutors. They use gestures as a communication strategy to help interlocutors comprehend their ideas.

    The role of gesture delay in coda /r/ weakening: an articulatory, auditory and acoustic study

    The cross-linguistic tendency of coda consonants to weaken, vocalize, or be deleted is shown to have a phonetic basis, resulting from gesture reduction or from variation in gesture timing. This study investigates the effects of the timing of the anterior tongue gesture for coda /r/ on acoustics and perceived strength of rhoticity, making use of two sociolects of Central Scotland (working- and middle-class) where coda /r/ is weakening and strengthening, respectively. Previous articulatory analysis revealed a strong tendency for these sociolects to use different coda /r/ tongue configurations: working- and middle-class speakers tend to use tip/front raised and bunched variants, respectively; however, this finding does not explain working-class /r/ weakening. A correlational analysis in the current study showed a robust relationship between anterior lingual gesture timing, F3, and percept of rhoticity. A linear mixed effects regression analysis showed that both speaker social class and linguistic factors (word structure and the checked/unchecked status of the prerhotic vowel) had significant effects on tongue gesture timing and formant values. This study provides further evidence that gesture delay can be a phonetic mechanism for coda rhotic weakening and apparent loss, but social class emerges as the dominant factor driving lingual gesture timing variation.
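    For readers who want to reproduce this kind of analysis, the sketch below fits a linear mixed effects regression of gesture timing on social class and the two linguistic factors, with a per-speaker random intercept, assuming pandas and statsmodels; the file and column names (coda_r_tokens.csv, gesture_delay, f3, and so on) are hypothetical, not the study's actual dataset.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical per-token articulatory and acoustic measurements.
df = pd.read_csv("coda_r_tokens.csv")

# Correlational step: gesture timing vs. F3 vs. perceived rhoticity.
print(df[["gesture_delay", "f3", "rhoticity_rating"]].corr())

# Mixed model: fixed effects for social class, word structure, and the
# checked/unchecked status of the prerhotic vowel; random intercept per
# speaker, since tokens are nested within speakers.
timing_model = smf.mixedlm(
    "gesture_delay ~ social_class + word_structure + vowel_status",
    data=df,
    groups=df["speaker"],
).fit()
print(timing_model.summary())

# A parallel model for F3 tests whether the same factors shape the acoustics.
f3_model = smf.mixedlm(
    "f3 ~ social_class + word_structure + vowel_status",
    data=df,
    groups=df["speaker"],
).fit()
print(f3_model.summary())
```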

    Responding to gratitude in elicited oral interaction. A taxonomy of communicative options

    This study explores responses to gratitude as expressed in elicited oral interaction (mimetic-pretending open role-plays) produced by native speakers of American English. It first overviews the literature on this topic. It then presents a taxonomy of the head acts and supporting moves of the responses to gratitude instantiated in the corpus under examination, considering their strategies and formulations. Finally, it reports on their frequency of occurrence and combinatorial options across communicative situations that differ in the social distance and power relationships between the interactants. The findings partly confirm what is reported in the literature, but also reveal the flexibility and adaptability of these reacting speech acts to the variable contexts in which they may be instantiated. On the one hand, the responses to gratitude identified tend to be encoded as simple utterances, and occasionally as complex combinations of head acts and/or supporting moves; also, their head acts show a preference for a small set of strategies and formulation types, while their supporting moves are much more varied in content and form, and thus situation-specific. On the other hand, the frequency of occurrence of the responses to gratitude, their dispersion across situations, and the range of their attested strategies and formulations are not in line with those reported in previous studies. I argue that these partly divergent findings are attributable to the different data collection and categorization procedures adopted, and the different communicative situations considered, across studies. Overall, the study suggests that responses to gratitude are a set of communicative events with fuzzy boundaries, containing core (i.e., more prototypical) and peripheral (i.e., less prototypical) exemplars; that although routinized in function, they are not completely conventionalized in their strategic or surface realizations; that alternative research approaches may provide complementary insights into these reacting speech acts; and that a higher degree of comparability across studies may be ensured if explicit pragmatic and semantic parameters are adopted in the classification of their shared object of study.