28,362 research outputs found

Towards the improvement of augmentative and alternative communication through the modelling of conversation

Author: Alm Norman
Arnott John L.
Publication venue: Elsevier BV
Publication date: 01/09/2013

University of Dundee Online Publications

Recommended from our members

Unsupervised intralingual and cross-lingual speaker adaptation for HMM-based speech synthesis using two-pass decision tree construction

Author: Byrne William
Gibson Matthew
Publication venue: IEEE Transactions on Audio, Speech, and Language Processing
Publication date: 01/01/2010

Hidden Markov model (HMM)-based speech synthesis systems possess several advantages over concatenative synthesis systems. One such advantage is the relative ease with which HMM-based systems are adapted to speakers not present in the training dataset. Speaker adaptation methods used in the field of HMM-based automatic speech recognition (ASR) are adopted for this task. In the case of unsupervised speaker adaptation, previous work has used a supplementary set of acoustic models to estimate the transcription of the adaptation data. This paper firstly presents an approach to the unsupervised speaker adaptation task for HMM-based speech synthesis models which avoids the need for such supplementary acoustic models. This is achieved by defining a mapping between HMM-based synthesis models and ASR-style models, via a two-pass decision tree construction process. Secondly, it is shown that this mapping also enables unsupervised adaptation of HMM-based speech synthesis models without the need to perform linguistic analysis of the estimated transcription of the adaptation data. Thirdly, this paper demonstrates how this technique lends itself to the task of unsupervised cross-lingual adaptation of HMM-based speech synthesis models, and explains the advantages of such an approach. Finally, listener evaluations reveal that the proposed unsupervised adaptation methods deliver performance approaching that of supervised adaptation.

Apollo (Cambridge)

Text-based Editing of Talking-head Video

Author: Agrawala M.
Finkelstein A.
Fried O.
Genova K.
Goldman D.
Jin Z.
Shechtman E.
Tewari A.
Theobalt C.
Zollhöfer M.
Publication date: 01/01/2019

Editing talking-head video to change the speech content or to remove filler words is challenging. We propose a novel method to edit talking-head video based on its transcript to produce a realistic output video in which the dialogue of the speaker has been modified, while maintaining a seamless audio-visual flow (i.e., no jump cuts). Our method automatically annotates an input talking-head video with phonemes, visemes, 3D face pose and geometry, reflectance, expression, and scene illumination per frame. To edit a video, the user only has to edit the transcript, and an optimization strategy then chooses segments of the input corpus as base material. The annotated parameters corresponding to the selected segments are seamlessly stitched together and used to produce an intermediate video representation in which the lower half of the face is rendered with a parametric face model. Finally, a recurrent video generation network transforms this representation to a photorealistic video that matches the edited transcript. We demonstrate a large variety of edits, such as the addition, removal, and alteration of words, as well as convincing language translation and full sentence synthesis.

MPG.PuRe

A systematic comparison of affective robot expression modalities

Author: Frederiksen Morten Roed
Støy Kasper
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 12/12/2019

Crossref

The IT University of Copenhagen's Repository

Methodologies for the Automatic Location of Academic and Educational Texts on the Internet

Author: Evans A.
Oxnard L.
Publication venue: School of Geography
Publication date: 01/01/2003

Traditionally, online databases of web resources have been compiled by a human editor, or through the submissions of authors or interested parties. Considerable resources are needed to maintain a constant level of input and relevance in the face of increasing material quantity and quality, and much of what is in databases is of an ephemeral nature. These pressures dictate that many databases stagnate after an initial period of enthusiastic data entry. The solution to this problem would seem to be the automatic harvesting of resources. However, this process necessitates the automatic classification of resources as ‘appropriate’ to a given database, a problem only solved by complex text content analysis. This paper outlines the component methodologies necessary to construct such an automated harvesting system, including a number of novel approaches. In particular, this paper looks at the specific problems of automatically identifying academic research work and Higher Education pedagogic materials. Where appropriate, experimental data is presented from searches in the field of Geography as well as the Earth and Environmental Sciences. In addition, appropriate software is reviewed where it exists, and future directions are outlined.

CiteSeerX

White Rose Research Online

Methodologies for the Automatic Location of Academic and Educational Texts on the Internet

Author: Oxnard L.
Evans A.
Publication venue: School of Geography
Publication date: 01/01/2003

MIT Libraries Dome

White Rose Research Online

miMic: The microphone as a pencil

Author: Davide Andrea Mauro
Davide Rocchesso
Stefano Delle Monache
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2016

miMic, a sonic analogue of paper and pencil, is proposed: an augmented microphone for vocal and gestural sonic sketching. Vocalizations are classified and interpreted as instances of sound models, which the user can play with by vocal and gestural control. The physical device is based on a modified microphone, with embedded inertial sensors and buttons. Sound models can be selected by vocal imitations that are automatically classified, and each model is mapped to vocal and gestural features for real-time control. With miMic, the sound designer can explore a vast sonic space and quickly produce expressive sonic sketches, which may be turned into sound prototypes by further adjustment of model parameters.

Archivio istituzionale della ricerca - Università IUAV di Venezia

Archivio istituzionale della ricerca - Università di Palermo