Emotion resonance and divergence: a semiotic analysis of music and sound in 'The Lost Thing', an animated short film, and 'Elizabeth', a film trailer
Music and sound contributions of interpersonal meaning to film narratives may differ from or resemble meanings made by language and image, and dynamic interactions between several modalities may generate new story messages. Such interpretive potentials of music and voice sound in motion pictures are rarely considered in social semiotic investigations of intermodality. This paper therefore shares two semiotic studies of distinct and combined music, English speech and image systems in an animated short film and a promotional film trailer. The paper considers the impact of music and voice sound on interpretations of film narrative meanings. A music system relevant to the analysis of filmic emotion is proposed. Examples show how music and intonation contribute meaning to lexical, visual and gestural elements of the cinematic spaces. Also described are relations of divergence and resonance between emotion types in various couplings of music, intonation, words and images across story phases. The research is relevant to educational knowledge about sound and to semiotic studies of multimodality.
17 ways to say yes: Toward nuanced tone of voice in AAC and speech technology
People with complex communication needs who use speech-generating devices have very little expressive control over their tone of voice. Despite its importance in human interaction, however, the issue of tone of voice remains all but absent from AAC research and development. In this paper, we describe three interdisciplinary projects, past, present and future: the critical design collection Six Speaking Chairs, which has provoked deeper discussion and inspired a social model of tone of voice; the speculative concept Speech Hedge, which illustrates challenges and opportunities in designing more expressive user interfaces; and the pilot project Tonetable, which could enable participatory research and seed a research network around tone of voice. We speculate that more radical interactions might expand the frontiers of AAC and disrupt speech technology as a whole.
Intonation in unaccompanied singing: Accuracy, drift, and a model of reference pitch memory
Copyright 2014 Acoustical Society of America. This article may be downloaded for personal use only. Any other use requires prior permission of the author and the Acoustical Society of America.
The following article appeared in J. Acoust. Soc. Am. 136, 401 (2014) and may be found at http://dx.doi.org/10.1121/1.4881915
A High Quality Text-To-Speech System Composed of Multiple Neural Networks
While neural networks have been employed to handle several different text-to-speech tasks, ours is the first system to use neural networks throughout, for both linguistic and acoustic processing. We divide the text-to-speech task into three subtasks: a linguistic module mapping from text to a linguistic representation, an acoustic module mapping from the linguistic representation to speech, and a video module mapping from the linguistic representation to animated images. The linguistic module employs a letter-to-sound neural network and a postlexical neural network. The acoustic module employs a duration neural network and a phonetic neural network. The visual neural network is employed in parallel to the acoustic module to drive a talking head. The use of neural networks that can be retrained on the characteristics of different voices and languages affords our system a degree of adaptability and naturalness heretofore unavailable. Comment: Source link (9812006.tar.gz) contains: 1 PostScript file (4 pages) and 3 WAV audio files. If your system does not support Windows WAV files, try a tool like "sox" to translate the audio into a format of your choice.
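The module composition described in this abstract can be sketched in code. The following is a minimal, runnable illustration of how the three subtasks chain together; every function here is a hypothetical stand-in (the actual system uses trained neural networks at each stage, which are not reproduced here):

```python
def letter_to_sound(text):
    """Linguistic module, stage 1: map letters to a toy 'phoneme' sequence.
    Stand-in for the paper's letter-to-sound neural network."""
    return [c for c in text.lower() if c.isalpha()]

def postlexical(phonemes):
    """Linguistic module, stage 2: context-dependent adjustments.
    Stand-in for the postlexical neural network (no-op here)."""
    return phonemes

def duration_net(phonemes):
    """Acoustic module, stage 1: assign a duration (ms) to each phoneme.
    Stand-in for the duration neural network (fixed 80 ms here)."""
    return [80 for _ in phonemes]

def phonetic_net(phonemes, durations):
    """Acoustic module, stage 2: emit one 'acoustic frame' per phoneme."""
    return [{"phoneme": p, "ms": d} for p, d in zip(phonemes, durations)]

def visual_net(phonemes, durations):
    """Video module, run in parallel with the acoustic module,
    sharing the linguistic representation and timing."""
    return [{"mouth_shape": p, "ms": d} for p, d in zip(phonemes, durations)]

def synthesize(text):
    """Compose the pipeline: text -> linguistic -> acoustic and video."""
    linguistic = postlexical(letter_to_sound(text))
    durations = duration_net(linguistic)
    return {
        "audio": phonetic_net(linguistic, durations),
        "video": visual_net(linguistic, durations),
    }

result = synthesize("Hi there")
```

The point of the sketch is the architecture, not the stubs: because the audio and video branches consume the same linguistic representation and durations, the talking head stays synchronized with the speech, and any stage can be retrained independently for a new voice or language.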
Deconstructing Speech: new tools for speech manipulation
My research at the London College of Communication is concerned with archives of recorded speech, what new tools need to be devised for their manipulation, and how to go about this process of invention. Research into available forms of speech analysis is discussed below with regard to two specific areas: feature vectors from linear predictive coding (LPC) analysis, and hidden-Markov-model-based automatic speech recognition (ASR) systems. These are discussed in order to demonstrate that, whilst aspects of each may be useful in devising a system of speech-archive manipulation for artistic use, their drawbacks and deficiencies for use in art, consequent on the reasons for their invention, necessitate the creation of tools with artistic rather than engineering agendas in mind. It is through the initial process of devising conceptual tools for understanding speech as sound objects that I have been confronted with issues of the semiotics and semantics of the voice, of the relationship between sound and meaning in speech, and of the role of analysis in mediating existing methods of communication. This is discussed with reference to Jean-Jacques Nattiez's Music and Discourse: Towards a Semiology of Music (Nattiez 1987). The 'trace', a neutral level of semiotic analysis proposed by Nattiez, far from being hypothetical as suggested by Hatten (1992: 88–98) and others, is present by analogy in many forms of mediation in modern spoken communication and in the reproduction of music, and it is precisely this neutrality with regard to meaning that tools for the manipulation of speech must possess, since the relationships between the sound of speech and its meaning are 'intense' (after Deleuze 1968).