4 research outputs found
Accurate Reader Identification for the Arabic Holy Quran Recitations Based on an Enhanced VQ Algorithm
The Speaker identification process is not a new trend; however, for the Arabic Holy Quran recitation, there are still quite improvements that can make this process more accurate and reliable. This paper collected the input data from 14 native Arabic reciters, consisting of “Surah Al-Kawthar” speech signals from the Holy Quran. Moreover, this paper discusses the accuracy rates for 8 and 16 features. Indeed, a modified Vector Quantization (VQ) technique will be presented, in addition to realistically matching the centroids of the various codebooks and measuring systems’ effectiveness. Note that the VQ technique will be utilized to generate the codebooks by clustering these features into a finite number of centroids. The proposed system’s software was built and executed using MATLAB®. The proposed system’s total accuracy rate was 97.92% and 98.51% for 8 and 16 centroids codebooks, respectively. However, this study discussed two validation tactics to ensure that the outcomes are reliable and can be reproduced. Hence, the K-mean clustering algorithm has been used to validate the obtained results and discuss the outcomes of this study. Finally, it has been found that the improved VQ method gives a better result than the K-means method
Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization
Automatic speech recognition (ASR) has recently become an important challenge
when using deep learning (DL). It requires large-scale training datasets and
high computational and storage resources. Moreover, DL techniques and machine
learning (ML) approaches in general, hypothesize that training and testing data
come from the same domain, with the same input feature space and data
distribution characteristics. This assumption, however, is not applicable in
some real-world artificial intelligence (AI) applications. Moreover, there are
situations where gathering real data is challenging, expensive, or rarely
occurring, which can not meet the data requirements of DL models. deep transfer
learning (DTL) has been introduced to overcome these issues, which helps
develop high-performing models using real datasets that are small or slightly
different but related to the training data. This paper presents a comprehensive
survey of DTL-based ASR frameworks to shed light on the latest developments and
helps academics and professionals understand current challenges. Specifically,
after presenting the DTL background, a well-designed taxonomy is adopted to
inform the state-of-the-art. A critical analysis is then conducted to identify
the limitations and advantages of each framework. Moving on, a comparative
study is introduced to highlight the current challenges before deriving
opportunities for future research
Conversational Arabic Automatic Speech Recognition
Colloquial Arabic (CA) is the set of spoken variants of modern Arabic that exist in the form of regional dialects and are considered generally to be mother-tongues in those regions. CA has limited textual resource because it exists only as a spoken language and without a standardised written form. Normally the modern standard Arabic (MSA) writing convention is employed that has limitations in phonetically representing CA. Without phonetic dictionaries the pronunciation of CA words is ambiguous, and can only be obtained through word and/or sentence context. Moreover, CA inherits the MSA complex word structure where words can be created from attaching affixes to a word.
In automatic speech recognition (ASR), commonly used approaches to model acoustic, pronunciation and word variability are language independent. However, one can observe significant differences in performance between English and CA, with the latter yielding up to three times higher error rates.
This thesis investigates the main issues for the under-performance of CA ASR systems. The work focuses on two directions: first, the impact of limited lexical coverage, and insufficient training data for written CA on language modelling is investigated; second, obtaining better models for the acoustics and pronunciations by learning to transfer between written and spoken forms. Several original contributions result from each direction. Using data-driven classes from decomposed text are shown to reduce out-of-vocabulary rate. A novel colloquialisation system to import additional data is introduced; automatic diacritisation to restore the missing short vowels was found to yield good performance; and a new acoustic set for describing CA was defined. Using the proposed methods improved the ASR performance in terms of word error rate in a CA conversational telephone speech ASR task
Women in Artificial intelligence (AI)
This Special Issue, entitled "Women in Artificial Intelligence" includes 17 papers from leading women scientists. The papers cover a broad scope of research areas within Artificial Intelligence, including machine learning, perception, reasoning or planning, among others. The papers have applications to relevant fields, such as human health, finance, or education. It is worth noting that the Issue includes three papers that deal with different aspects of gender bias in Artificial Intelligence. All the papers have a woman as the first author. We can proudly say that these women are from countries worldwide, such as France, Czech Republic, United Kingdom, Australia, Bangladesh, Yemen, Romania, India, Cuba, Bangladesh and Spain. In conclusion, apart from its intrinsic scientific value as a Special Issue, combining interesting research works, this Special Issue intends to increase the invisibility of women in AI, showing where they are, what they do, and how they contribute to developments in Artificial Intelligence from their different places, positions, research branches and application fields. We planned to issue this book on the on Ada Lovelace Day (11/10/2022), a date internationally dedicated to the first computer programmer, a woman who had to fight the gender difficulties of her times, in the XIX century. We also thank the publisher for making this possible, thus allowing for this book to become a part of the international activities dedicated to celebrating the value of women in ICT all over the world. With this book, we want to pay homage to all the women that contributed over the years to the field of AI