611 research outputs found

    Spartan Daily, December 4, 1970

    Volume 58, Issue 47 https://scholarworks.sjsu.edu/spartandaily/5337/thumbnail.jp

    About Voice: A Longitudinal Study of Speaker Recognition Dataset Dynamics

    Like face recognition, speaker recognition is widely used for voice-based biometric identification in a broad range of industries, including banking, education, recruitment, immigration, law enforcement, healthcare, and well-being. However, while dataset evaluations and audits have improved data practices in computer vision and face recognition, the data practices in speaker recognition have gone largely unquestioned. Our research aims to address this gap by exploring how dataset usage has evolved over time and what implications this has on bias and fairness in speaker recognition systems. Previous studies have demonstrated the presence of historical, representation, and measurement biases in popular speaker recognition benchmarks. In this paper, we present a longitudinal study of speaker recognition datasets used for training and evaluation from 2012 to 2021. We survey close to 700 papers to investigate community adoption of datasets and changes in usage over a crucial time period where speaker recognition approaches transitioned to the widespread adoption of deep neural networks. Our study identifies the most commonly used datasets in the field, examines their usage patterns, and assesses their attributes that affect bias, fairness, and other ethical concerns. Our findings suggest areas for further research on the ethics and fairness of speaker recognition technology. Comment: 14 pages (23 with References and Appendix)

    Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization

    Automatic speech recognition (ASR) has recently become an important challenge when using deep learning (DL). It requires large-scale training datasets and high computational and storage resources. Moreover, DL techniques, and machine learning (ML) approaches in general, hypothesize that training and testing data come from the same domain, with the same input feature space and data distribution characteristics. This assumption, however, is not applicable in some real-world artificial intelligence (AI) applications. Moreover, there are situations where real data is challenging or expensive to gather, or occurs only rarely, and so cannot meet the data requirements of DL models. Deep transfer learning (DTL) has been introduced to overcome these issues; it helps develop high-performing models using real datasets that are small or slightly different but related to the training data. This paper presents a comprehensive survey of DTL-based ASR frameworks to shed light on the latest developments and help academics and professionals understand current challenges. Specifically, after presenting the DTL background, a well-designed taxonomy is adopted to inform the state-of-the-art. A critical analysis is then conducted to identify the limitations and advantages of each framework. Moving on, a comparative study is introduced to highlight the current challenges before deriving opportunities for future research

    The effect of digital apps on Vietnamese EFL learners’ receptive vocabulary acquisition : a case study of quizlet and paper flashcards

    The thesis aims to investigate the efficacy of a digital vocabulary learning application called Quizlet compared with that of a more traditional method, such as paper flashcards, among English as a Foreign Language (EFL) learners in Vietnam where the teaching and learning of English has been an object of concern for the government. Reports so far have recorded slow progress and official policies attempt to encourage improvement in the area including the use of digital media in teaching and learning. So it is legitimate to ask whether reliance on digital media in EFL education may be justified. This is the practical motivation of this project, which compares a digital tool and a more traditional tool used for the same purpose: the learning of the L2 lexicon. The theoretical framework of the study is the Cognitive-Affective Theory of Learning with Media (CATLM) (Moreno & Mayer, 2007), and the evaluation framework used follows Miyamoto (2001) according to whom multimodal second language learning activities should be evaluated from three different perspectives: (1) the linguistic development in the learner, (2) the linguistic environment provided by the learning tool and (3) the learner’s perception of the learning tool. Consequently, this study examines two vocabulary learning tools, Quizlet and paper flashcards, in terms of (a) actual learning outcomes; (b) input, output, interaction and feedback and (c) learners’ attitude. This study follows a design including pre-test, training (two one-hour reading and vocabulary learning sessions per week for four weeks) and immediate post-test as well as delayed post-test. Participants in the study were an intact class of 39 high school students in Vietnam. They were divided into two groups. Approximately twenty new words selected from a reading passage were introduced to the students each week. As for the vocabulary learning tools, group A used Quizlet while group B used paper flashcards for the first two weeks.
Then, group A switched to paper flashcards, and group B to Quizlet, in the following two weeks. This method was used to counterbalance the order effect of using two different tools. Data analysis included screen captures (Quizlet) and video recordings (paper flashcards) of six randomly selected participants’ learning activities during training sessions; improvements from vocabulary pre-tests to post-tests; and participants’ responses to a questionnaire. Results suggest that both of the tools have a positive influence on vocabulary learning. However, Quizlet appears to be more effective than paper flashcards in fostering vocabulary development. Additionally, Quizlet has various advantages over paper flashcards in terms of the linguistic environment provided for learning and meets students’ preference. However, paper flashcards do have some specific merits such as encouraging students to practise pronouncing words, which was not observed on Quizlet. The research proposes that there is some justification to the belief that digital apps may elicit better results overall than some of the more traditional methods for L2 vocabulary learning in English as a second language because they provide a greater variety of linguistic environments and because they can help meet the need for exposure to native English in the Vietnamese school system

    Graphonomics and your Brain on Art, Creativity and Innovation : Proceedings of the 19th International Graphonomics Conference (IGS 2019 – Your Brain on Art)

    “Graphonomics and your brain on art, creativity and innovation”. A single track, international forum for discussion on recent advances at the intersection of the creative arts, neuroscience, engineering, media, technology, industry, education, design, forensics, and medicine. The contributions reviewed the state of the art, identified challenges and opportunities and created a roadmap for the field of graphonomics and your brain on art.
The topics addressed include: integrative strategies for understanding neural, affective and cognitive systems in realistic, complex environments; neural and behavioral individuality and variation; neuroaesthetics (the use of neuroscience to explain and understand the aesthetic experiences at the neurological level); creativity and innovation; neuroengineering and brain-inspired art, creative concepts and wearable mobile brain-body imaging (MoBI) designs; creative art therapy; informal learning; education; forensics

    The development of self-identification in Chinese-Vietnamese children in Australia : the influence of family language practices and changing social environments

    This thesis investigates the development of children’s self-identification in minority bi-ethnic migrant families in relation to their multilingual and multicultural practices, within the context of exogamous families in Australia. While these bi-ethnic partnerships implicitly or explicitly implement policies and strategies to encourage the use of home languages, there is scant understanding of the dynamic interrelation between the development of identity in multi-ethnic children and their language development in changing social environments. Bi- and multilingual children’s language acquisition, family language policy and identity issues have been extensively studied internationally. However, these studies do not systematically investigate the connections between identity development in multilingual children, their respective family’s linguistic and cultural input, and their social environments. This thesis examines family language practices and socio-environmental factors impacting young children’s identity construction, to complement previous research on Australian bilingual children. It seeks to contribute to the current debate between essentialist (psychological) versus non-essentialist (socio-linguistic) identity issues by examining children’s expression of self in response to the three languages in their environment, including their families’ referential practices. It also observes the effects of different social contexts and changing circumstances on children’s self-identification. The design of this research is longitudinal, as it aims to gather data from two Australian Cantonese-Vietnamese families over three years. The key finding of this study is that children construct their identity in a dynamic and context-bound way. 
Results identify three major influencing factors as playing a role in the children’s self-identification: 1) family language input and practices; 2) family ideologies, cultural practices, and family networks, as well as the migrant community and 3) peers and the childcare/school environments. This thesis contributes new empirical data to existing research on family language policy and adds new language pairs to the field of heritage language maintenance and child identity in the Australian context. The data suggests that self-identification develops in a context-bound way parallel to the context-bound language development proposed in Qi and Di Biase (2020). It reveals that children’s self-identification grows not merely under the influence of their family’s linguistic and cultural practices, but also adjusts to changing circumstances and pressures from peers and adult role models in the dominant environment. These findings may play a role in the preservation of heritage languages and family wellbeing

    Robust text independent closed set speaker identification systems and their evaluation

    PhD Thesis. This thesis focuses upon text independent closed set speaker identification. The contributions relate to evaluation studies in the presence of various types of noise and handset effects. Extensive evaluations are performed on four databases. The first contribution is in the context of the use of the Gaussian Mixture Model-Universal Background Model (GMM-UBM) with original speech recordings from only the TIMIT database. Four main simulations for Speaker Identification Accuracy (SIA) are presented including different fusion strategies: Late fusion (score based), early fusion (feature based) and early-late fusion (combination of feature and score based), late fusion using concatenated static and dynamic features (features with temporal derivatives such as first order derivative delta and second order derivative delta-delta features, namely acceleration features), and finally fusion of statistically independent normalized scores. The second contribution is again based on the GMM-UBM approach. Comprehensive evaluations of the effect of Additive White Gaussian Noise (AWGN), and Non-Stationary Noise (NSN) (with and without a G.712 type handset) upon identification performance are undertaken. In particular, three NSN types with varying Signal to Noise Ratios (SNRs) were tested corresponding to: street traffic, a bus interior and a crowded talking environment. The performance evaluation also considered the effect of late fusion techniques based on score fusion, namely mean, maximum, and linear weighted sum fusion. The databases employed were: TIMIT, SITW, and NIST 2008; and 120 speakers were selected from each database to yield 3,600 speech utterances. The third contribution is based on the use of the I-vector, four combinations of I-vectors with 100 and 200 dimensions were employed. Then, various fusion techniques using maximum, mean, weighted sum and cumulative fusion with the same I-vector dimension were used to improve the SIA.
Similarly, both interleaving and concatenated I-vector fusion were exploited to produce 200 and 400 I-vector dimensions. The system was evaluated with four different databases using 120 speakers from each database. TIMIT, SITW and NIST 2008 databases were evaluated for various types of NSN namely, street-traffic NSN, bus-interior NSN and crowd talking NSN; and the G.712 type handset at 16 kHz was also applied. As recommendations from the study in terms of the GMM-UBM approach, mean fusion is found to yield overall best performance in terms of the SIA with noisy speech, whereas linear weighted sum fusion is overall best for original database recordings. However, in the I-vector approach the best SIA was obtained from the weighted sum and the concatenated fusion. Ministry of Higher Education and Scientific Research (MoHESR), and the Iraqi Cultural Attaché, Al-Mustansiriya University, Al-Mustansiriya University College of Engineering in Iraq for supporting my PhD scholarship
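
The score-level (late) fusion rules named in this abstract, mean, maximum, and linear weighted sum, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the min-max normalization step, and the example weights are assumptions made for illustration, and each inner list stands for one subsystem's scores over the same closed set of speakers.

```python
def min_max_normalize(scores):
    """Rescale one subsystem's scores to [0, 1] (assumed normalization)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores equal: the subsystem carries no ranking information.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(system_scores, method="mean", weights=None):
    """Fuse per-speaker scores from several subsystems.

    system_scores: list of lists, one inner list per subsystem,
                   each holding one score per enrolled speaker.
    method: "mean", "max", or "weighted_sum".
    weights: one weight per subsystem (only for "weighted_sum").
    """
    normalized = [min_max_normalize(s) for s in system_scores]
    # Transpose so that each tuple holds all subsystem scores for one speaker.
    per_speaker = list(zip(*normalized))
    if method == "mean":
        return [sum(col) / len(col) for col in per_speaker]
    if method == "max":
        return [max(col) for col in per_speaker]
    if method == "weighted_sum":
        if weights is None or len(weights) != len(system_scores):
            raise ValueError("weighted_sum needs one weight per subsystem")
        return [sum(w * v for w, v in zip(weights, col)) for col in per_speaker]
    raise ValueError(f"unknown fusion method: {method}")


def identify(system_scores, method="mean", weights=None):
    """Closed-set decision: index of the speaker with the highest fused score."""
    fused = fuse_scores(system_scores, method, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical scores from two subsystems for three enrolled speakers.
example = [[0.0, 5.0, 10.0], [0.0, 4.0, 1.0]]
best_by_mean = identify(example, "mean")
best_by_weighted = identify(example, "weighted_sum", weights=[0.9, 0.1])
```

In this toy example the two rules disagree, because the weighted sum leans heavily on the first subsystem. That sensitivity to the chosen weights is one reason the rules are worth comparing under different noise conditions.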

    He's Dark, Dark; Colorism Among African American Men

    This study expands the literature on colorism beyond its monolithic emphasis on the experiences of women by investigating black men’s experience with skin tone discrimination. The investigator seeks to interrogate how black males experience colorism by exploring how family, peer associations, and media shape black males’ understanding of their skin tone; what messages, if any, enforcing colorism ideals they receive; and the frequency of and adherence to such messages. The investigator utilized focus groups to gather data. The sample was limited to 10 self-identifying African-American black men aged 18 and older. Focus group data is analyzed through an intersectional perspective, and thematic coding is utilized for analysis. Findings suggest light skinned and dark skinned men experience colorism differently. Light skinned men noted blatant colorism and often felt they had to authenticate their blackness. Darker skinned men reported more indirect colorism and negative stereotypes as prominent challenges with colorism