    Towards a vividness in synthesized speech for audiobooks

    The goal of this study was to determine which acoustic parameters are significant in differentiating the speaking styles of a narrator and of male and female characters as voiced by a reader of audiobooks. The study was initiated by a need to improve the expressivity and differentiation of speaking styles in fiction books read out by synthesized voices. The corpus used as research material was created from an audio novel, as read by a professional male voice artist. To determine whether it is possible to identify these speaking styles from the voice of the reader, a web-based perception test consisting of 48 sentences was conducted. The results showed that the listeners identified all three styles. For acoustic analysis, the openSMILE toolkit was used and 88 eGeMAPS-defined parameters were extracted for every sentence in the corpus. The three styles were differentiated by 38 statistically significant parameters. To improve vividness, synthesizers aimed at reading fiction books could be trained to perform all three styles.
    Summary. Hille Pajupuu, Rene Altrov and Jaan Pajupuu: Towards a vividness in synthesized speech for audiobooks. The aim of the study was to find out which acoustic parameters are most important in distinguishing, in the voice of an audiobook reader, the narrator's speech from the direct speech of male and female characters. The study was prompted by the need to improve the expressiveness of fiction books read by a synthesized voice and the distinguishability of their speaking styles. The research material was a corpus built from the audio novel „Tõde ja õigus I“, read by a professional male voice. To find out whether a listener can distinguish different speaking styles (the narrator's speech and the direct speech of male and female characters) from the voice of the audiobook reader, a perception test of 48 sentences was compiled. The test results showed that the listeners recognized all three speaking styles. The entire corpus was used for the acoustic analysis. With the openSMILE toolkit, 88 parameters defined in eGeMAPS were extracted from the speech for each sentence. The speaking styles were differentiated at a statistically significant level by 38 parameters, of which 18 were related to voice quality and timbre, 11 to loudness, 8 to pitch and 1 to tempo. Since both the perception test and the analysis of acoustic parameters showed that the narrator's speech, the direct speech of female characters and the direct speech of male characters were all distinct in the audiobook, it can be considered worthwhile to train synthesizers that read fiction aloud to perform all three speaking styles. Keywords: audiobooks, speaking style, direct speech, character speech, GeMAPS, speech analysis, expressive speech synthesis

    Mosaic narrative a poetics of cinematic new media narrative

    This thesis proposes the Poetics of Mosaic Narrative as a tool for theorising the creation and telling of cinematic stories in a digital environment. As such, the Poetics of Mosaic Narrative is designed to assist creators of new media narrative in designing dramatically compelling screen-based stories by drawing on established theories of cinema and emerging theories of new media. In doing so, it validates the crucial element of cinematic storytelling in the digital medium, which, due to its fragmentary, variable and re-combinatory nature, affords the opportunity for audience interaction. The Poetics of Mosaic Narrative re-asserts the dramatic and cinematic nature of narrative in new media by drawing upon the dramatic theory of Aristotle’s Poetics, the cinematic theories of the 1920s Russian Film Theorists and contemporary Neo-Formalists, the narrative theories of the 1960s French Structuralists, and the scriptwriting theories of contemporary cinema. In particular, it focuses on the theory and practice of the prominent new media theorist Lev Manovich as a means of investigating and creating a practical poetics. The key element of the Poetics of Mosaic Narrative is the expansion of the previously forgotten and undeveloped Russian Formalist concept of cinematurgy, which is vital to the successful development of new media storytelling theory and practice. This concept, as originally proposed but not elaborated by Kazansky, encompasses the notion of the creation of cinematic new media narrative as a mosaic, integrally driven by the narrative systems of plot as well as the cinematic systems of visual style created by the techniques of cinema: montage, cinematography and mise-en-scene.

    Expressive movement generation with machine learning

    Movement is an essential aspect of our lives. Not only do we move to interact with our physical environment, but we also express ourselves and communicate with others through our movements. In an increasingly computerized world where various technologies and devices surround us, our movements are essential parts of our interaction with and consumption of computational devices and artifacts. In this context, incorporating an understanding of our movements within the design of the technologies surrounding us can significantly improve our daily experiences. This need has given rise to the field of movement computing – developing computational models of movement that can perceive, manipulate, and generate movements. In this thesis, we contribute to the field of movement computing by building machine-learning-based solutions for automatic movement generation. In particular, we focus on using machine learning techniques and motion capture data to create controllable, generative movement models. We also contribute to the field by creating datasets, tools, and libraries that we have developed during our research. We start our research by reviewing the works on building automatic movement generation systems using machine learning techniques and motion capture data. Our review covers background topics such as high-level movement characterization, training data, features representation, machine learning models, and evaluation methods. Building on our literature review, we present WalkNet, an interactive agent walking movement controller based on neural networks. The expressivity of virtual, animated agents plays an essential role in their believability. Therefore, WalkNet integrates controlling the expressive qualities of movement with the goal-oriented behaviour of an animated virtual agent. It allows us to control the generation based on the valence and arousal levels of affect, the movement’s walking direction, and the mover’s movement signature in real-time. 
Following WalkNet, we look at controlling movement generation using more complex stimuli such as music represented by audio signals (i.e., non-symbolic music). Music-driven dance generation involves a highly non-linear mapping between temporally dense stimuli (i.e., the audio signal) and movements, which makes for a more challenging movement modelling problem. To this end, we present GrooveNet, a real-time machine learning model for music-driven dance generation.

    3D Composer: A Software for Micro-composition

    The aim of this compositional research project is to find new paradigms of expression and representation of musical information, supported by technology. This may further our understanding of how artistic intention materialises during the production of a musical work. A further aim is to create a software device, which will allow the user to generate, analyse and manipulate abstract musical information within a multi-dimensional environment. The main intent of this software and composition portfolio is to examine the process involved during the development of a compositional tool to verify how transformations applied to the conceptualisation of musical abstraction will affect musical outcome, and demonstrate how this transformational process would be useful in a creative context. This thesis suggests a reflection upon various technological and conceptual aspects within a dynamic multimedia framework. The discussion situates the artistic work of a composer within the technological sphere, and investigates the role of technology and its influences during the creative process. Notions of space are relocated in the scope of a personal compositional direction in order to develop a new framework for musical creation. The author establishes theoretical ramifications and suggests a definition for micro-composition. The main aspect focuses on the ability to establish a direct conceptual link between visual elements and their correlated musical output, ultimately leading to the design of a software called 3D-Composer, a tool for the visualisation of musical information as a means to assist composers to create works within a new methodological and conceptual realm. Of particular importance is the ability to transform musical structures in three-dimensional space, based on the geometric properties of micro-composition. The compositions Six Electroacoustic Studies and Dada 2009 display the use of the software. 
The formalisation process was derived from a transposition of influences from the early twentieth-century avant-garde period to a contemporary digital studio environment utilising new media and computer technologies for musical expression.

    Acquiring skills in music technology

    This chapter explores how individuals acquire music technology skills in various settings. We consider this acquisition with reference to the psychological theories of behaviourism, constructivism and metacognition/metalearning. We also discuss what it means to learn, be creative and pursue a musical career within a fast-moving, technology-driven world. What do professional musicians, sound engineers and educators regard as key skills and competencies in music technology, how have priorities changed over time and what attributes are considered as essential for the future? We illustrate our key findings using a wide range of examples drawn from varied cultures, musical and educational settings

    Україна – Канада: сучасні наукові студії (Ukraine – Canada: Contemporary Scholarly Studies)

    This international collective monograph presents the latest Ukrainian-Canadian socio-political, historical, philological, cultural, educational and pedagogical research in modern Canadian Studies. It includes studies by several scholars from Ukraine and Canada (Edmonton, Lutsk, Kyiv, Lviv, and Sumy). It is the first publication of its kind in Ukraine. It is intended for scholars, postgraduate and doctoral students, undergraduates and lecturers in faculties of international relations, foreign philology, history, political science, philology and journalism, education and social work; for Canadian centres in Ukraine and centres of Ukrainian Studies in Canada; and for anyone interested in research on Ukrainian-Canadian relations.

    Perceptual fail: Female power, mobile technologies and images of self

    Like a biological species, images of self have descended and been modified throughout their journey down the ages, interweaving and recharging their viability with the necessary interjections from culture, tools and technology. Part of this journey has seen images of self also become an intrinsic function within the narratives about female power; consider Helen of Troy, “a face that launched a thousand ships” (Marlowe, 1604), or Kim Kardashian (KUWTK), who ushered in the mass-mediated ‘selfie’ as a social practice. The interweaving process itself sees the image oscillate between naturalized ‘icon’ and idealized ‘symbol’ of what the person looked like and/or aspired to become. These public images can confirm or constitute beauty ideals as well as influence (via imitation) behaviour and mannerisms, and as such the viewer’s belief in the veracity of the representative image also becomes intrinsically political, manipulating the associated narratives and fostering prejudice (Dobson 2015, Korsmeyer 2004, Pollock 2003). The selfie is arguably sui generis: whilst it is a mediated photographic image of self, it contains its own codes of communication and decorum that fostered the formation of numerous new digital communities and influenced new media aesthetics. For example, the selfie is both of nature (it is still a time-based piece of documentation) and known to be perceptually untrue (filtered, modified and full of artifice). The paper will seek to demonstrate how selfie culture is infused by considerable levels of perceptual failings that are now central to contemporary celebrity culture and its notion of glamour, which in turn is intrinsically linked to (but not solely defined by) the province of feminine desire for reinvention, transformation or “self-sexualisation” (Hall, West and McIntyre, 2012). The subject, like the Kardashians or selfies, is divisive.
In conclusion, this paper will explore the paradox of the perceptual failings at play within selfie culture more broadly: like ‘Reality TV’, selfies are infamously fake yet seem to provide Debord’s (1967) illusory cultural opiate whilst fulfilling a cultural longing. Questions then emerge when considering the narrative impact of these trends on engendered power structures and the traditional status of illusion and narrative fiction.