
    Paralinguistic vocal control of interactive media: how untapped elements of voice might enhance the role of non-speech voice input in the user's experience of multimedia.

    Much interactive media development, especially commercial development, implies the dominance of the visual modality, with sound as a limited supporting channel. The development of multimedia technologies such as augmented reality and virtual reality has further revealed a distinct partiality to visual media. Sound, however, and particularly voice, have many aspects which have yet to be adequately investigated. Exploration of these aspects may show that sound can, in some respects, be superior to graphics in creating immersive and expressive interactive experiences. With this in mind, this thesis investigates the use of non-speech voice characteristics as a complementary input mechanism in controlling multimedia applications. It presents a number of projects that employ the paralinguistic elements of voice as input to interactive media, including both screen-based and physical systems. These projects are used as a means of exploring the factors that seem likely to affect users' preferences and interaction patterns during non-speech voice control. This exploration forms the basis for an examination of potential roles for paralinguistic voice input. The research includes the conceptual and practical development of the projects and a set of evaluative studies. The work submitted for Ph.D. comprises practical projects (50 percent) and a written dissertation (50 percent). The thesis aims to advance understanding of how voice can be used both on its own and in combination with other input mechanisms in controlling multimedia applications. It offers a step forward in attempts to integrate the paralinguistic components of voice as a complementary input mode to speech input applications, in order to create a synergistic combination that might let the strengths of each mode overcome the weaknesses of the other.
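    Paralinguistic voice control of the kind this abstract describes rests on extracting continuous features such as pitch and loudness from the audio signal and mapping them to control parameters. The following is a minimal illustrative sketch of that idea, not the thesis's actual implementation; the feature extractors (frame RMS for loudness, zero-crossing rate for a very rough pitch estimate) and the mapping ranges are assumptions chosen for simplicity:

    ```python
    import math

    def rms_loudness(frame):
        """Root-mean-square amplitude of one audio frame (samples in [-1, 1])."""
        return math.sqrt(sum(s * s for s in frame) / len(frame))

    def zero_crossing_pitch(frame, sample_rate):
        """Very rough pitch estimate (Hz) from the zero-crossing rate."""
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        # One period of a periodic waveform contributes two zero crossings.
        return crossings * sample_rate / (2 * len(frame))

    def to_control(value, lo, hi):
        """Clamp and normalise a feature into a 0..1 control signal."""
        return min(1.0, max(0.0, (value - lo) / (hi - lo)))

    # One second of a 100 Hz sine at 8 kHz stands in for a voiced input frame.
    sr = 8000
    frame = [math.sin(2 * math.pi * 100 * t / sr) for t in range(sr)]
    pitch = zero_crossing_pitch(frame, sr)              # close to 100 Hz
    level = to_control(rms_loudness(frame), 0.0, 1.0)   # sine RMS is 1/sqrt(2)
    print(round(pitch), round(level, 2))
    ```

    In a working system, per-frame values like these would drive some property of the interface, for example the position or speed of an on-screen object, which is the kind of mapping the projects described above explore.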

    “Don’t, never no”: Negotiating meaning in ESL among hearing/speaking-impaired netizens

    Negotiating meaning can be difficult for deaf-mute people in the hearing and speaking world. Social media offers a platform where the deaf and the mute can engage in meaningful conversations among themselves and with people who can hear and speak. This paper determined the paralinguistic signals that deaf-mute students employed in their Facebook posts. Using a descriptive-qualitative research design, the study analyzed the lexico-semantic features of their language and how both paralinguistic and linguistic aspects contribute to the negotiation of conceptual meaning. The results revealed that paralinguistic signals appear as emojis, punctuation mark repeats, onomatopoeic spelling, accent stylization, intensification, hashtags, and combinations of these. These signals function to give emphasis or to intensify intonation. The emoji is the predominant paralinguistic signal, used to compensate for the lack of words to express feelings. In addition, distinct lexico-semantic features observed in the data include the incorrect position of words, incorrect lexical choice, redundancy, and the insertion of prepositions or the lack thereof. These features do not carry a specific function in negotiating meaning, because understanding the semantic content of a message is possible either with or without comprehension of the syntax. Semantic comprehension is not expected to help in the acquisition of the syntactic system, because it may be accomplished through the recognition of isolated lexical items and the interpretation of non-linguistic cues. Finally, paralinguistic signals and computer-mediated communication for the deaf-mute across generations and races can be considered as future directions of the study, and appropriate technological tools may be designed to correct errors found in the social media posts of the deaf-mute.

    Real-time generation and adaptation of social companion robot behaviors

    Social robots will be part of our future homes. They will assist us in everyday tasks, entertain us, and provide helpful advice. However, the technology still faces challenges that must be overcome to equip the machine with social competencies and make it a socially intelligent and accepted housemate. An essential skill of every social robot is verbal and non-verbal communication. In contrast to voice assistants, smartphones, and smart home technology, which are already part of many people's lives today, social robots have an embodiment that raises expectations towards the machine. Their anthropomorphic or zoomorphic appearance suggests they can communicate naturally with speech, gestures, or facial expressions and understand corresponding human behaviors. In addition, robots also need to consider individual users' preferences: everybody is shaped by their culture, social norms, and life experiences, resulting in different expectations towards communication with a robot. However, robots do not have human intuition - they must be equipped with the corresponding algorithmic solutions to these problems. This thesis investigates the use of reinforcement learning to adapt the robot's verbal and non-verbal communication to the user's needs and preferences. Such non-functional adaptation of the robot's behaviors primarily aims to improve the user experience and the robot's perceived social intelligence. The literature has not yet provided a holistic view of the overall challenge: real-time adaptation requires control over the robot's multimodal behavior generation, an understanding of human feedback, and an algorithmic basis for machine learning. Thus, this thesis develops a conceptual framework for designing real-time non-functional social robot behavior adaptation with reinforcement learning. It provides a higher-level view from the system designer's perspective and guidance from the start to the end. 
It illustrates the process of modeling, simulating, and evaluating such adaptation processes. Specifically, it guides the integration of human feedback and social signals to equip the machine with social awareness. The conceptual framework is put into practice for several use cases, resulting in technical proofs of concept and research prototypes. They are evaluated in the lab and in in-situ studies. These approaches address typical activities in domestic environments, focusing on the robot's expression of personality, persona, politeness, and humor. Within this scope, the robot adapts its spoken utterances, prosody, and animations based on explicit or implicit human feedback.
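The adaptation loop described here, selecting a behavior, observing explicit or implicit human feedback, and updating a preference estimate, can be illustrated with the simplest reinforcement-learning formulation, a multi-armed bandit. The sketch below is a generic epsilon-greedy learner paired with a simulated user; it is not the framework developed in the thesis, and the style names and reward probabilities are invented for illustration:

```python
import random

class BehaviorAdapter:
    """Epsilon-greedy bandit: learn which behavior style a user rewards most."""

    def __init__(self, styles, epsilon=0.1, seed=0):
        self.styles = list(styles)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {s: 0 for s in self.styles}
        self.values = {s: 0.0 for s in self.styles}  # running mean reward

    def select(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.styles)       # explore
        return max(self.styles, key=self.values.get)  # exploit current best

    def update(self, style, reward):
        self.counts[style] += 1
        n = self.counts[style]
        self.values[style] += (reward - self.values[style]) / n

# Simulated user who (implicitly) rewards a humorous persona most often.
adapter = BehaviorAdapter(["neutral", "polite", "humorous"], seed=1)
preference = {"neutral": 0.2, "polite": 0.5, "humorous": 0.9}
user = random.Random(2)
for _ in range(500):
    style = adapter.select()
    reward = 1.0 if user.random() < preference[style] else 0.0
    adapter.update(style, reward)

# After enough interactions, the learner settles on the preferred style.
print(max(adapter.values, key=adapter.values.get))
```

A full system would replace the discrete arms with multimodal behavior parameters (utterances, prosody, animations) and derive the reward from observed social signals rather than a scripted preference, which is exactly the gap the conceptual framework above addresses.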

    Exploring language contact and use among globally mobile populations: a qualitative study of English-speaking short-stay academic sojourners in the Republic of Korea

    This study explores the language contact and use of English-speaking sojourners in the Republic of Korea who had no knowledge of Korean language or culture prior to arriving in the country. The study focuses on mobile-technology-assisted language use. Study participants responded to an online survey about their experiences using the Korean language when interacting with Korean speakers, their free-time activities, and the types of digital and mobile technologies they used. The survey responses informed questions for later discussion groups, in which participants discussed challenges and solutions when encountering new linguistic and social scenarios with Korean speakers. Semi-structured interviews were employed to examine the linguistic, social and technological dimensions of the study participants' brief sojourn in Korea in more depth. The interviews revealed a link between language contact, language use and a mobile instant messaging application. In the second phase of the study, online surveys focused on the language and technology link discovered in the first phase. Throughout Phase Two, the researcher observed the study participants in a series of social contexts, such as informal English practice and university events. Phase Two concluded with semi-structured interviews that demonstrated language contact and use within mobile instant messaging chat rooms on participants' handheld smart devices. The two phases revealed three key factors influencing the language contact and use between the study participants and Korean speakers. Firstly, a mutual perspicacity for mobile technologies and digital communication supported their mediated, screen-to-screen and blended direct and mediated face-to-screen interactions. Secondly, Korea's advanced digital environment of handheld smart devices, smart device applications and ubiquitous, high-speed Wi-Fi shifted the study participants from dependence on their Korean-speaking hosts towards self-reliance. Thirdly, language use between the study participants and Korean speakers incorporated a range of sociolinguistic resources, including the exchange of symbols, small expressive images, photographs, video and audio recordings, along with or in place of typed text. Using these resources also helped the study participants learn and take part in social and cultural practices, such as gifting digitally, within mobile instant messaging chat rooms. The findings of the study are drawn together in a new conceptual model, called sociolinguistic digital acuity, which highlights the optimal conditions for language contact and use during a brief sojourn in a country with an unfamiliar language and culture.

    The Use of Communication Strategies by Learners of English and Learners of Chinese in Text-based and Video-based Synchronous Computer-mediated Communication (SCMC)

    The use of communication strategies (CSs) has been of interest in research on second language acquisition (SLA), since it can help learners to attain mutual comprehension effectively and deepens understanding of interaction in SLA research. This study investigates and clarifies a wide range of CSs that learners of English and learners of Chinese use to solve language problems as well as to facilitate problem-free discourse in both text-based and video-based SCMC environments. Seven Chinese-speaking learners of English and seven English-speaking learners of Chinese were paired up as tandem (reciprocal) learning dyads. Each dyad participated in four interactions: text-based SCMC in English, text-based SCMC in Chinese, video-based SCMC in English and video-based SCMC in Chinese. The interaction data were analysed along with an after-task questionnaire and stimulated reflection to explore systematically and comprehensively the differences between text-based and video-based SCMC and between learners of English and learners of Chinese. The results showed that learners used CSs differently in text-based and video-based SCMC, indicating the different learning opportunities afforded by the two modes. Although the effect of language was less salient than the effect of medium, learners of English and learners of Chinese tended to have their own preferences for particular CSs. Where these preferences reflect a communicative style appropriate to one particular culture, learners may need to raise their awareness of some strategies during intercultural communication to avoid possible misunderstanding or offence. Some possible advantages of tandem learning interaction were also identified, such as the potential to develop sociocultural and intercultural competence through the opportunity to practise culturally appropriate language use with native speakers in a social context.

    Subtitling for the Deaf and the Hard-of-hearing: A Reception Study in the Turkish Context

    This study aims to contribute to a better understanding of subtitling for people with hearing impairments and to improve the accessibility of audiovisual material for hearing-impaired viewers in Turkey. It starts by providing a detailed general overview of the current state of accessibility, including a detailed discussion of existing legislation, an outline of the limited practice of subtitling for the deaf and the hard-of-hearing (SDH) in Turkish, and a profile of the assumed target audience. The ultimate goal of this research is to create a set of guidelines that can be used in the production of quality SDH in Turkey. In order to achieve these aims, the study adopts a product-oriented descriptive approach and first investigates the guidelines applied in countries where SDH has long been established as a professional practice, in an attempt to reveal some of the shared values of good practice as well as potential divergences. Following this descriptive analysis, some of the key conflicting practices in the guidelines – speaker identification, reading speed, indication of sound and paralinguistic information – are tested on an audience of 37 Turkish hearing-impaired viewers so as to unveil their needs and preferences within the framework of Audience Reception Theory. Quantitative data on the preferences of Turkish viewers was collected by means of questionnaires filled in by the participants after they had watched different sets of subtitles, each testing a different feature. Further qualitative data was obtained through interviews conducted with four participants who took part in the experiment, so as to generate more in-depth information regarding their preferences. The results yielded by the statistical analysis of the quantitative data and the interpretive phenomenological analysis of the qualitative data culminated in the drafting of a set of guidelines that can be used in the production of SDH in Turkey.