215 research outputs found
Predicting Head Pose in Dyadic Conversation
Natural movement plays a significant role in realistic speech animation. Numerous studies have demonstrated the contribution visual cues make to the degree we, as human observers, find an animation acceptable. Rigid head motion is one visual mode that universally co-occurs with speech, and so it is a reasonable strategy to seek features from the speech mode to predict the head pose. Several previous authors have shown that prediction is possible, but experiments are typically confined to rigidly produced dialogue. Expressive, emotive and prosodic speech exhibit motion patterns that are far more difficult to predict, with considerable variation in expected head pose. People involved in dyadic conversation adapt speech and head motion in response to the other's speech and head motion. Using Deep Bi-Directional Long Short-Term Memory (BLSTM) neural networks, we demonstrate that it is possible to predict not just the head motion of the speaker, but also the head motion of the listener from the speech signal.
The Natural Statistics of Audiovisual Speech
Humans, like other animals, are exposed to a continuous stream of signals, which are dynamic, multimodal, extended, and time varying in nature. This complex input space must be transduced and sampled by our sensory systems and transmitted to the brain where it can guide the selection of appropriate actions. To simplify this process, it has been suggested that the brain exploits statistical regularities in the stimulus space. Tests of this idea have largely been confined to unimodal signals and natural scenes. One important class of multisensory signals for which a quantitative input space characterization is unavailable is human speech. We do not understand what signals our brain has to actively piece together from an audiovisual speech stream to arrive at a percept versus what is already embedded in the signal structure of the stream itself. In essence, we do not have a clear understanding of the natural statistics of audiovisual speech. In the present study, we identified the following major statistical features of audiovisual speech. First, we observed robust correlations and close temporal correspondence between the area of the mouth opening and the acoustic envelope. Second, we found the strongest correlation between the area of the mouth opening and vocal tract resonances. Third, we observed that both area of the mouth opening and the voice envelope are temporally modulated in the 2–7 Hz frequency range. Finally, we show that the timing of mouth movements relative to the onset of the voice is consistently between 100 and 300 ms. We interpret these data in the context of recent neural theories of speech which suggest that speech communication is a reciprocally coupled, multisensory event, whereby the outputs of the signaler are matched to the neural processes of the receiver.
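The kind of measurement this abstract describes, a correlation between mouth-opening area and the acoustic envelope together with a timing offset between them, can be illustrated with a lagged Pearson correlation. The following is only a minimal sketch on synthetic signals, not the study's data or its analysis method: the sampling rate, the 4 Hz modulation rate (chosen to fall inside the reported 2–7 Hz range), the 150 ms offset (chosen to fall inside the reported 100–300 ms range), and the function names `pearson` and `best_lag` are all assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def best_lag(lead, follow, max_lag):
    """Return (lag, r): the delay in samples (0..max_lag) of `follow`
    relative to `lead` that gives the highest correlation."""
    best = (0, -2.0)
    for lag in range(max_lag + 1):
        a = lead[: len(lead) - lag]   # earlier part of the leading signal
        b = follow[lag:]              # later part of the following signal
        r = pearson(a, b)
        if r > best[1]:
            best = (lag, r)
    return best

# Synthetic stand-ins (assumed values, not measurements from the study).
FS = 100          # sampling rate in Hz (assumption)
RATE_HZ = 4       # modulation rate inside the 2-7 Hz range (assumption)
DELAY_MS = 150    # mouth leads voice by this much (assumption)
N = 300           # three seconds of signal

delay = DELAY_MS * FS // 1000  # delay in samples
mouth_area = [1 + math.sin(2 * math.pi * RATE_HZ * t / FS) for t in range(N)]
envelope = [1 + math.sin(2 * math.pi * RATE_HZ * (t - delay) / FS) for t in range(N)]

# Search lags shorter than one modulation period (25 samples at 4 Hz),
# so the correlation peak is unambiguous.
lag, r = best_lag(mouth_area, envelope, max_lag=20)
lag_ms = lag * 1000 // FS
print(lag_ms, round(r, 3))
```

Because the synthetic envelope is an exact delayed copy of the mouth signal, the search recovers the delay that was built in; real recordings would give a weaker, noisier peak and would need the envelope and mouth area to be extracted first.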
Is the qualitative research interview an acceptable medium for research with palliative care patients and carers?
Background: Contradictory evidence exists about the emotional burden of participating in qualitative research for palliative care patients and carers, and this raises questions about whether this type of research is ethically justified in a vulnerable population. This study aimed to investigate palliative care patients' and carers' perceptions of the benefits and problems associated with open interviews and to understand what causes distress and what is helpful about participation in a research interview. Methods: A descriptive qualitative study. The data were collected in the context of two studies exploring the experiences of care of palliative care patients and carers. The interviews ended with questions about patients' and carers' thoughts on participating in the studies and whether this had been a distressing or helpful event. We used a qualitative descriptive analysis strategy generated from the interviews and the observational and interactional data obtained in the course of the study. Results: The interviews were considered helpful: sharing problems was therapeutic and being able to contribute to research was empowering. However, thinking about the future was reported to be the most challenging. Consent forms were sometimes read with apprehension and being physically unable to sign was experienced as upsetting. Interviewing patients and carers separately was sometimes difficult and not always possible. Conclusion: The open interview enables the perspectives of patients and carers to be heard, unfettered by the structure of closed questions. It also enables patients or carers who would be unable to participate in other study designs to take part. The context is at least as important as the format of the research interview, taking into account the relational circumstances with carers and appropriate ways of obtaining informed consent. Retrospective consent could be a solution to enhancing participants' control over the interview.
On the Tail of the Scottish Vowel Length Rule in Glasgow
One of the most famous sound features of Scottish English is the short/long timing alternation of /i u ai/ vowels, which depends on the morpho-phonemic environment and is known as the Scottish Vowel Length Rule (SVLR). These alternations make the status of vowel quantity in Scottish English (quasi-)phonemic but are also susceptible to change, particularly in situations of intense sustained dialect contact with Anglo-English. Does the SVLR change in Glasgow, where dialect contact at the community level is comparatively low? The present study sets out to tackle this question, and tests two hypotheses involving (1) external influences due to dialect contact and (2) internal, prosodically induced factors of sound change. Durational analyses of /i u a/ were conducted on a corpus of spontaneous Glaswegian speech from the 1970s and 2000s, and four speaker groups were compared, two of middle-aged men and two of adolescent boys. Our hypothesis that the development of the SVLR over time may be internally constrained and interact with prosody was largely confirmed. We observed weakening effects in its implementation which were localised in phrase-medial unaccented positions in all speaker groups, and in phrase-final positions in the speakers born after the Second World War. But unlike some other varieties of Scottish or Northern English which show weakening of the Rule under prolonged contact with Anglo-English, dialect contact seems to be having less impact on the durational patterns in Glaswegian vernacular, probably because of the overall reduced potential for regular, everyday contact in the West given the different demographies.