6 research outputs found
Let's Talk Games: An Expert Exploration of Speech Interaction with NPCs
Recent years have witnessed significant advances in speech recognition and language processing technologies, enabling natural language conversations with computers. Concurrently, the gaming industry, one of the leading entertainment mediums, seeks to heighten immersion. This work investigates the potential and challenges of using speech interaction in single-player video games, particularly for interactions with NPCs. We conducted an online survey with video game experts alongside in-depth interviews with researchers specializing in conversational user interfaces and game user research. Our findings emphasize experts' recognition of the considerable potential of speech interaction in games, fostering increased immersion, engagement, and entertainment. Experts also raise pertinent concerns such as privacy issues and play-environment limitations. Drawing from our findings, we provide practical recommendations for integrating speech interaction in single-player games, covering potential benefits, challenges, accessibility, and social implications. We further address potential regulatory requirements and offer implementation tips to enhance player experience.
Speech Recognition in Norwegian Companies: A Case Study of the Effects and Opportunities Speech Recognition Offers Norwegian Companies
This master's thesis studies the effects and opportunities that speech recognition offers Norwegian companies. The thesis consists of a case study and a document study, in which we used a qualitative approach to gain valuable insight into the research question across several companies. We also conducted a document study of one company's internal documents to analyze the effect of implementing speech recognition in call centers.
The thesis primarily contributes to the literature on the effects and opportunities of speech recognition for Norwegian companies, as the literature on this topic for the Norwegian language is limited. In addition, our study examines efficiency in terms of the profitability of speech recognition in call centers.
The thesis shows that the effects speech recognition offers Norwegian companies can include large cost savings, and that Norwegian companies experience improved efficiency. Findings from the document study indicate a return on investment in speech recognition of 60.62% for the company. In addition, the thesis shows that implementing speech recognition in Norwegian companies can also yield other, non-cost-related benefits, such as ergonomic benefits, better quality, and improved information flow. The thesis further shows that the effects and opportunities speech recognition offers depend on the development of the technology and the models, and on the importance of training the models on data appropriate to their intended use. Our findings also indicate that many future opportunities for Norwegian companies using speech recognition depend on how the technology is used. Finally, our findings indicate that the biggest challenges for the quality of Norwegian-language speech recognition relate to the diversity of dialects.
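The 60.62% return on investment reported in the document study follows the standard ROI definition, ROI = (net benefit − investment cost) / investment cost. A minimal illustration with hypothetical figures (the thesis's actual cost and savings data are not reproduced here):

```python
# ROI = (net benefit - investment cost) / investment cost.
# The figures below are hypothetical; only the 60.62% result mirrors the thesis.

investment_cost = 1_000_000   # e.g. licenses, integration, training (NOK)
net_benefit = 1_606_200       # e.g. cumulative call-center savings (NOK)

roi = (net_benefit - investment_cost) / investment_cost
print(f"ROI: {roi:.2%}")  # ROI: 60.62%
```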
Automatic speech recognition : an approach for designing inclusive games
Computer games are now a part of our modern culture. However, certain categories of people are excluded from this form of entertainment and social interaction because they are unable to use the interface of the games. The reason for this can be deficits in motor control, vision, or hearing. By using automatic speech recognition (ASR) systems, voice-driven commands can be used to control the game, which can open up the possibility for people with motor-system difficulties to be included in game communities. This paper aims at finding a standard way of using voice commands in games, backed by a speech recognition system, that can be universally applied for designing inclusive games. Present speech recognition systems, however, do not support emotions, attitudes, tones, etc. This is a drawback because such expressions can be vital for gaming. Taking multiple existing genres of games into account and analyzing their voice-command requirements, a general ASR module is proposed that can work as a common platform for designing inclusive games. A fuzzy logic controller is then proposed to enhance the system. The standard voice-driven module can be based on an algorithm or a fuzzy controller, which can be used to design software plug-ins or be embedded in a microchip. It can then be integrated with game engines, creating the possibility of voice-driven universal access for controlling games.
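The dispatch layer such a module implies can be sketched as follows. This is an illustrative design, not the paper's implementation: the class and method names are hypothetical, and a crisp confidence threshold stands in for the fuzzy logic controller the paper proposes.

```python
# Hypothetical sketch of a voice-command dispatch layer for an inclusive game.
# A fixed confidence threshold approximates the paper's fuzzy controller.
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

@dataclass
class VoiceCommandModule:
    """Maps recognized phrases to game actions, gated by ASR confidence."""
    threshold: float = 0.6  # minimum ASR confidence to act on a command
    _commands: Dict[str, Callable[[], str]] = field(default_factory=dict)

    def register(self, phrase: str, action: Callable[[], str]) -> None:
        self._commands[phrase.lower()] = action

    def dispatch(self, phrase: str, confidence: float) -> Optional[str]:
        """Run the action if the phrase is known and confidence is high enough."""
        action = self._commands.get(phrase.lower())
        if action is None or confidence < self.threshold:
            return None  # ignore unknown or low-confidence commands
        return action()

module = VoiceCommandModule()
module.register("jump", lambda: "player jumps")
module.register("open inventory", lambda: "inventory opened")

print(module.dispatch("jump", 0.9))         # player jumps
print(module.dispatch("jump", 0.3))         # None: confidence too low
print(module.dispatch("cast spell", 0.95))  # None: unknown command
```

A fuzzy controller would replace the hard threshold with graded membership over confidence (and possibly game context), so borderline recognitions could, for example, trigger a confirmation prompt instead of being dropped.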
Apraxia World: Deploying a Mobile Game and Automatic Speech Recognition for Independent Child Speech Therapy
Children with speech sound disorders typically improve pronunciation quality by undergoing speech therapy, which must be delivered frequently and with high intensity to be effective. As such, clinic sessions are supplemented with home practice, often under caregiver supervision. However, traditional home practice can grow boring for children due to monotony. Furthermore, practice frequency is limited by caregiver availability, making it difficult for some children to reach therapy dosage. To address these issues, this dissertation presents a novel speech therapy game to increase engagement, and explores automatic pronunciation evaluation techniques to afford children independent practice.
The therapy game, called Apraxia World, delivers customizable, repetition-based speech therapy while children play through platformer-style levels using typical on-screen tablet controls; children complete in-game speech exercises to collect assets required to progress through the levels. Additionally, Apraxia World provides pronunciation feedback according to an automated pronunciation evaluation system running locally on the tablet. Apraxia World offers two advantages over current commercial and research speech therapy games: first, the game provides extended gameplay to support long therapy treatments; second, it affords some practice independence via automatic pronunciation evaluation, allowing caregivers to lightly supervise instead of directly administer the practice. Pilot testing indicated that children enjoyed the game-based therapy much more than traditional practice and that the exercises did not interfere with gameplay. During a longitudinal study, children made clinically significant pronunciation improvements while playing Apraxia World at home. Furthermore, children remained engaged in the game-based therapy over the two-month testing period, and some even wanted to continue playing post-study.
The second part of the dissertation explores word- and phoneme-level pronunciation verification for child speech therapy applications. Word-level pronunciation verification is accomplished using a child-specific template-matching framework, where an utterance is compared against correctly and incorrectly pronounced examples of the word. This framework identified mispronounced words better than both a standard automated baseline and co-located caregivers. Phoneme-level mispronunciation detection is investigated using a technique from the second-language learning literature: training phoneme-specific classifiers with phonetic posterior features. This method also outperformed the standard baseline and, more significantly, identified mispronunciations better than student clinicians.
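The word-level template-matching idea can be sketched with dynamic time warping (DTW), a common way to compare variable-length speech feature sequences. This is an illustrative stand-in, not the dissertation's actual framework: real systems compare multidimensional acoustic features, whereas here 1-D "feature" sequences keep the example short.

```python
# Illustrative word-level verification by template matching with DTW.
# 1-D sequences stand in for per-frame acoustic feature vectors.
from typing import List, Sequence

def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW with absolute-difference local cost."""
    INF = float("inf")
    n, m = len(a), len(b)
    d = [[INF] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

def verify_word(utterance: Sequence[float],
                correct_templates: List[Sequence[float]],
                incorrect_templates: List[Sequence[float]]) -> bool:
    """Accept if the utterance is closer to a correct template than to any incorrect one."""
    best_correct = min(dtw_distance(utterance, t) for t in correct_templates)
    best_incorrect = min(dtw_distance(utterance, t) for t in incorrect_templates)
    return best_correct <= best_incorrect

correct = [[1.0, 2.0, 3.0, 2.0], [1.0, 2.5, 3.0, 2.0]]
incorrect = [[3.0, 3.0, 1.0, 0.0]]
print(verify_word([1.0, 2.0, 3.1, 2.0], correct, incorrect))  # True: near a correct template
```

Collecting both correct and incorrect pronunciation examples per child is what makes the framework child-specific: the decision boundary adapts to each child's own error patterns rather than a population model.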