
1,612 research outputs found

Voice as a design material : sociophonetic inspired design strategies in Human-Computer Interaction

Author: Bell Allan
Clark Leigh
Cowan Benjamin R.
Dahlbäck Nils
Dixon John A.
Juul Søndergaard Marie Louise
King Simon
Labov William
Lippi-Green Rosina
Matusitz Jonathan
Milroy James
Moore Roger K.
Munteanu Cosmin
Orelus Pierre W.
Rowe Debbie A.
Stuart-Smith Jane
Tompkinson James
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2019

While there is a renewed interest in voice user interfaces (VUI) in HCI, little attention has been paid to the design of VUI voice output beyond intelligibility and naturalness. We draw on the field of sociophonetics - the study of the social factors that influence the production and perception of speech - to highlight how current VUIs are based on a limited and homogenised set of voice outputs. We argue that current systems do not adequately consider the diversity of people’s speech, how that diversity represents sociocultural identities, and how voices have the potential to shape user perceptions and experiences. Ultimately, as other technological developments have influenced the ideologies of language, the voice outputs of VUIs will influence the ideologies of speech. Based on our argument, we propose three design strategies for VUI voice output design - individualisation, context awareness, and diversification - to motivate new ways of conceptualising and designing these technologies.

Northumbria Research Link

Crossref

White Rose Research Online

Stereotypical nationality representations in HRI: perspectives from international young adults

Author: Agnes Axelsson
Olov Engwall
Ronald Cumbal
Shivam Mehta
Publication venue: Frontiers Media S.A.
Publication date: 01/11/2023

People often form immediate expectations about other people, or groups of people, based on visual appearance and characteristics of their voice and speech. These stereotypes, often inaccurate or overgeneralized, may translate to robots that carry human-like qualities. This study aims to explore whether nationality-based preconceptions regarding appearance and accents can be found in people’s perception of a virtual and a physical social robot. In an online survey with 80 subjects evaluating different first-language-influenced accents of English and nationality-influenced human-like faces for a virtual robot, we find that accents, in particular, lead to preconceptions about perceived competence and likeability that correspond to previous findings in social science research. In a physical interaction study with 74 participants, we then studied whether the perception of competence and likeability is similar after interacting with a robot portraying one of four different nationality representations from the online survey. We find that preconceptions based on national stereotypes that appeared in the online survey vanish or are overshadowed by factors related to general interaction quality. We do, however, find some effects of the robot’s stereotypical alignment with the subject group, with Swedish subjects (the majority group in this study) rating the Swedish-accented robot as less competent than the international group did, but, on the other hand, recalling more facts from the Swedish robot’s presentation than the international group did. In an extension in which the physical robot was replaced by a virtual robot interacting in the same scenario online, we found the same result, namely that preconceptions are of less importance after actual interactions, demonstrating that the differences in the ratings of the robot between the online survey and the interaction are not due to the interaction medium.
We hence conclude that attitudes towards stereotypical national representations in HRI have a weak effect, at least for the user group included in this study (primarily educated young students in an international setting).

Directory of Open Access Journals

I'M Information Market Issue No. 65 November 1990-January 1991

Publication date: 01/01/1991

Archive of European Integration

English voices in ‘Text-to-speech tools’: representation of English users and their varieties from a World Englishes perspective

Publication venue: Australian International Academic Centre

Crossref

Language variation, automatic speech recognition and algorithmic bias

Author: Markl Nina
Publication venue: The University of Edinburgh
Publication date: 12/12/2023

In this thesis, I situate the impacts of automatic speech recognition systems in relation to sociolinguistic theory (in particular drawing on concepts of language variation, language ideology and language policy) and contemporary debates in AI ethics (especially regarding algorithmic bias and fairness). In recent years, automatic speech recognition systems, alongside other language technologies, have been adopted by a growing number of users and have been embedded in an increasing number of algorithmic systems. This expansion into new application domains and language varieties can be understood as an expansion into new sociolinguistic contexts. In this thesis, I am interested in how automatic speech recognition tools interact with this sociolinguistic context, and how they affect speakers, speech communities and their language varieties. Focussing on commercial automatic speech recognition systems for British Englishes, I first explore the extent and consequences of performance differences of these systems for different user groups depending on their linguistic background. When situating this predictive bias within the wider sociolinguistic context, it becomes apparent that these systems reproduce and potentially entrench existing linguistic discrimination and could therefore cause direct and indirect harms to already marginalised speaker groups. To understand the benefits and potentials of automatic transcription tools, I highlight two case studies: transcribing sociolinguistic data in English and transcribing personal voice messages in isiXhosa. The central role of the sociolinguistic context in developing these tools is emphasised in this comparison. Design choices, such as the choice of training data, are particularly consequential because they interact with existing processes of language standardisation. 
To better understand the impacts of these choices, and the role of the developers making them, I draw on theory from language policy research and critical data studies. These conceptual frameworks are intended to help practitioners and researchers in anticipating and mitigating predictive bias and other potential harms of speech technologies. Beyond looking at individual choices, I also investigate the discourses about language variation and linguistic diversity deployed in the context of language technologies. These discourses put forward by researchers, developers and commercial providers not only have a direct effect on the wider sociolinguistic context, but they also highlight how this context (e.g., existing beliefs about language(s)) affects technology development. Finally, I explore ways of building better automatic speech recognition tools, focussing in particular on well-documented, naturalistic and diverse benchmark datasets. However, inclusive datasets are not necessarily a panacea, as they still raise important questions about the nature of linguistic data and language variation (especially in relation to identity), and may not mitigate or prevent all potential harms of automatic speech recognition systems as embedded in larger algorithmic systems and sociolinguistic contexts.

Edinburgh Research Archive


The Challenge of Spoken Language Systems: Research Directions for the Nineties

Author: Atlas Les
Beckman Mary
Biermann Alan
Bush Marcia
Clements Mark
Cohen Jordan
Cole Ron
Garcia Oscar
Hanson Brian
Hermansky Hynek
Hirschman Lynette
Levinson Steve
McKeown Kathleen
Morgan Nelson
Novick David G.
Ostendorf Mari
Oviatt Sharon
Price Patti
Silverman Harvey
Spitz Judy
Waibel Alex
Weinstein Clifford
Zahorian Steve
Zue Victor
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/1995

A spoken language system combines speech recognition, natural language processing and human interface technology. It functions by recognizing the person's words, interpreting the sequence of words to obtain a meaning in terms of the application, and providing an appropriate response back to the user. Potential applications of spoken language systems range from simple tasks, such as retrieving information from an existing database (traffic reports, airline schedules), to interactive problem solving tasks involving complex planning and reasoning (travel planning, traffic routing), to support for multilingual interactions. We examine eight key areas in which basic research is needed to produce spoken language systems: (1) robust speech recognition; (2) automatic training and adaptation; (3) spontaneous speech; (4) dialogue models; (5) natural language response generation; (6) speech synthesis and speech generation; (7) multilingual systems; and (8) interactive multimodal systems. In each area, we identify key research challenges, the infrastructure needed to support research, and the expected benefits. We conclude by reviewing the need for multidisciplinary research, for development of shared corpora and related resources, for computational support and for rapid communication among researchers. The successful development of this technology will increase accessibility of computers to a wide range of users, will facilitate multinational communication and trade, and will create new research specialties and jobs in this rapidly expanding area.

Columbia University Academic Commons


The Challenge of Spoken Language Systems: Research Directions for the Nineties

Author: McKeown Kathleen
Cole Ron
Hirschman Lynette
Atlas Les
Beckman Mary
Biermann Alan
Bush Marcia
Clements Mark
Cohen Jordan
Garcia Oscar
Hanson Brian
Hermansky Hynek
Levinson Steve
Morgan Nelson
Novick David G.
Ostendorf Mari
Oviatt Sharon
Price Patti
Silverman Harvey
Spitz Judy
Waibel Alex
Weinstein Clifford
Zahorian Steve
Zue Victor
Publication date: 01/01/1995

Columbia University Academic Commons

TamPub Julkaisuarkisto - TamPub Institutional Repository

Trepo - Institutional Repository of Tampere University

User adaptations to system implementation in a mining company in Laos - A case study of organisational change

Author: Singthilath Aliyakone
Publication date: 01/01/2015

The purpose of this case study was to assess post-project implementation acceptance by users of new IS/IT systems in a mining company in Laos. The report investigated how the new system changed organisational working cultures and what avoidance or acceptance factors appeared. It also looked at how the newly implemented systems contributed to the changes in business processes and working procedures within Lane Xang Mineral Limited Company (LXML), which is a Lao subsidiary of a mining company from Australia. The change implementation was a strategic business integration of MMG, a Chinese-owned global mining company headquartered in Melbourne that operated several mining subsidiaries in Australia, Africa, Latin America, and Laos. In 2013, LXML went through a major change implementation in terms of IS/IT systems, consisting of upgraded computing facilities, I.T. services outsourcing, communication systems, and the introduction of a new Enterprise Resource Planning (ERP) system. Those changes inevitably brought about change in the company’s business processes and working procedures. As a result, LXML’s way of working shifted from the conventional paper-based system to a more systematic and electronic approach. Following the change, the organisation as well as its staff were faced with cultural issues and mismatched business processes. To gain an understanding of the factors that impacted on the IS/IT implementation within Lane Xang Mineral Limited, this paper applied two analytical frameworks to the study of user acceptance and organisational cultural differences. Data gathering was conducted by an online survey and semi-structured online interviews with staff at different levels from within the organisation. The findings were then divided into enablers of and barriers to users’ adaptation to the new systems implementation at the individual and organisational levels.
The findings were also compared deductively with the analytical frameworks to verify their influencing categories. This paper is organised in three main sections. The first section introduces the case background and describes the issues from the case study. The second section is a justification of the significance of the issues identified, and of the selected conceptual frames that were applied in the study. The third section is the analysis section, which explains the data collection methodologies and the analytical details. Findings of the study will also be found within this section. At the end of the paper, the study concludes by giving recommendations as a guide to I.T. Managers at the MMG headquarters in Australia and the LXML office in Laos on transnational I.T. implementation within MMG. The recommendations could also serve as a guide for any other organisation (not limited to the mining industry) planning an effective I.T. implementation within their firms in the future.

ResearchArchive at Victoria University of Wellington