41 research outputs found

Speaker recognition utilizing distributed DCT-II based Mel frequency cepstral coefficients and fuzzy vector quantization

Author: Gregory M
Hossan M
Publication venue: Springer (United States)
Publication date: 01/01/2013

In this paper, a novel Automatic Speaker Recognition (ASR) system is presented. The new ASR system includes novel feature extraction and vector classification steps utilizing distributed Discrete Cosine Transform (DCT-II) based Mel Frequency Cepstral Coefficients (MFCC) and Fuzzy Vector Quantization (FVQ). The ASR algorithm utilizes an approach based on MFCC to identify dynamic features that are used for Speaker Recognition (SR).
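
The abstract does not describe how the distributed DCT-II is arranged, so the following is only a sketch of the conventional last step of MFCC extraction: taking the DCT-II of log Mel filterbank energies and keeping the first few coefficients. The function names and the input energies are illustrative assumptions, not the authors' implementation.

```python
import math


def dct2(values):
    """Unnormalised DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_points = len(values)
    return [
        sum(
            v * math.cos(math.pi / n_points * (n + 0.5) * k)
            for n, v in enumerate(values)
        )
        for k in range(n_points)
    ]


def cepstral_coefficients(mel_energies, n_coeffs):
    """Log-compress (hypothetical) Mel filterbank energies, then apply the DCT-II.

    Only the first n_coeffs coefficients are kept, as is usual for MFCCs.
    """
    log_energies = [math.log(e) for e in mel_energies]
    return dct2(log_energies)[:n_coeffs]
```

A flat spectrum is a useful sanity check: with identical filterbank energies, all of the signal ends up in coefficient 0 and the higher coefficients are numerically zero.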

RMIT Research Repository

Modeling and Simulation of SI Engines for Fault Detection

Author: Bhatti Aamer Iqbal
Khan Mansoor
Raza Qarab
Rizvi Mudassar Abbas
Zaidi Sajjad
Publication venue: 'IntechOpen'
Publication date: 14/11/2012

IntechOpen

Microporous materials fabricated from discrete molecular cages

Author: Moneypenny II, Timothy Patrick
Publication date: 01/08/2018

In the context of materials science, an interesting relationship exists between the properties of solid materials and the existence of void spaces within them. In fact, whether the presence of voids is desired or not tends to depend on one’s perception of the effects that voids induce. In densified materials, for example, the presence of voids can be detrimental to structural integrity. Thus, such materials that contain voids are considered defective. On the other hand, when voids are desirable, their presence in certain materials is essential to material behavior. In zeolites, for example, the size, shape, and connectivity of void spaces regulate catalytic activity. In reality, however, and at some finite length scale, all real materials contain intrinsic void space; a consequence of the imperfect packing arrangements of atoms. Thus, it is necessary not only to elucidate what effects voids have on the properties of materials, but also to investigate methods that provide control of void features within solid materials. While all materials possess intrinsic voids, the ability to introduce intentional voids in solids presents multiple difficulties. The statement “nature abhors a vacuum” is a familiar quip that reflects this challenge of designing open pore spaces in solid materials, as porous frameworks with open void spaces are often higher in energy relative to their more dense structural counterparts. Nonetheless, during the last few decades, technology has advanced such that scientists have significant control over the size, shape, and position of voids within solids. Materials such as zeolites, metal-organic frameworks (MOFs), and covalent-organic frameworks (COFs) all demonstrate the profound ability to position pores of various shapes and sizes with molecular precision in a solid framework. This control over pore design has led to significant materials applications for porous materials including adsorption, catalysis, and molecular separation. 
Despite the successes of porous networks such as zeolites, MOFs, and COFs, there remains a need for greater molecular diversity and tunable microenvironments that are precise in molecular design. Moreover, there is a need for fundamental understanding of the relationship between characteristics of voids derived from molecular species and the behavior these entities exhibit within solid materials. Herein, we test the hypothesis that discrete molecular cages with non-collapsible pores are building blocks for porous solids by preparing molecular cages via alkyne metathesis. We demonstrate that molecular pores can be rationally synthesized from tritopic organic precursors in a single step and assembled in the solid state to afford permanently porous materials. Featuring organic synthesis and modular packing, our methodology provides molecular control for the fabrication of functional porous materials with precise microenvironments. First, a non-intuitive precursor design principle for synthesizing molecular cages via alkyne metathesis is described. By subjecting a series of precursors with varying bite angles to alkyne metathesis (AM), it is experimentally demonstrated that the product distribution and convergence towards product formation are strongly dependent on precursor bite angle. Furthermore, it was discovered that precursors with the ideal tetrahedron bite angle (60°) do not afford the most efficient pathway to the product. These results lend credence to the underlying systemic issues facing the synthesis of 3D architectures via dynamic covalent chemistry, where variations in precursor geometry lead to significant deviation of product distributions away from discrete products. Next, a systematic study of the effects of molecular shape-persistence on the porosity of molecular solids is discussed. Three molecular cages synthesized via alkyne metathesis and post-synthetic modifications were designed to provide controlled, stepwise adjustments in molecular shape-persistence.
Experimental measurements of nitrogen adsorption taken from rapidly and slowly crystallized solids of each cage demonstrated a trend in porosity that correlated with shape-persistence. Molecular dynamics simulations that modeled cage motion corroborated the trend seen in the experimental data and emphasized that shape-persistence governs the microporosity of these materials. Our integrated synthetic and computational approach demonstrates that the microporosity of this class of molecular solids can be controlled through fine-tuning at both the atomic and microscales. Lastly, the fabrication and characterization of a novel solid-state lithium electrolyte nanocomposite derived from a porous molecular cage is discussed. A solid-liquid electrolyte nanocomposite (SLEN) fabricated from an electrolyte system and a porous organic cage exhibits ionic conductivity on the order of 1 × 10⁻³ S cm⁻¹. With an experimentally measured activation barrier of 0.16 eV, this composite is characterized as a superionic conductor. Furthermore, the SLEN displays excellent oxidative stability up to 4.7 V vs. Li/Li⁺. This simple three-component system enables the rational design of electrolytes from tunable, discrete molecular architectures that possess intrinsic void space.

Illinois Digital Environment for Access to Learning and Scholarship Repository

Towards Building a Speech Recognition System for Quranic Recitations: A Pilot Study Involving Female Reciters

Author: Al-Ayyoub Mahmoud
Al-Issa Suhad
Al-Khaleel Osama
Elmitwally Nouh
Publication venue: 'ScopeMed Publishing'
Publication date: 12/11/2022

This paper is the first step in an effort toward building an automatic speech recognition (ASR) system for Quranic recitations that caters specifically to female reciters. To function properly, ASR systems require a huge amount of data for training. Surprisingly, the data readily available for Quranic recitations suffer from major limitations. Specifically, the currently available audio recordings of Quran recitations have massive volume, but they are mostly done by male reciters (who have dedicated most of their lives to perfecting their recitation skills) using professional and expensive equipment. Such proficiency in the training data (along with the fact that the reciters come from a specific demographic group: adult males) will most likely lead to some bias in the resulting model and limit its ability to process input from other groups, such as non-/semi-professionals, females or children. This work aims at empirically exploring this shortcoming. To do so, we create a first-of-its-kind (to the best of our knowledge) benchmark dataset called the Quran recitations by females and males (QRFAM) dataset. QRFAM is a relatively big dataset of audio recordings made by male and female reciters from different age groups and proficiency levels. After creating the dataset, we experiment on it by building ASR systems based on one of the most popular open-source ASR models, which is the celebrated DeepSpeech model from Mozilla. The speaker-independent end-to-end models that we produce are evaluated using word error rate (WER). Despite DeepSpeech’s known flexibility and prowess (which is shown when trained and tested on recitations from the same group), the models trained on the recitations of one group could not recognize most of the recitations done by the other groups in the testing phase.
This shows that there is still a long way to go in order to produce an ASR system that can be used by anyone, and the first step is to build and expand the resources needed for this, such as QRFAM. Hopefully, our work will be the first step in this direction and it will inspire the community to take more interest in this problem.
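
For readers unfamiliar with the metric named in this abstract, word error rate is commonly defined as the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below follows that common definition; it is not taken from the paper, and the function name and whitespace tokenization are assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, WER can exceed 1.0 when the hypothesis contains many inserted words, so it is an error ratio and not a bounded percentage.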

Birmingham City University Open Access Repository

BCU Open Access

Extensions to the Latent Dirichlet Allocation Topic Model Using Flexible Priors

Author: Ihou Koffi Eddy
Publication date: 23/11/2020

Intrinsically, topic models always have their likelihood functions fixed to multinomial distributions as they operate on count data instead of Gaussian data. As a result, their performances ultimately depend on the flexibility of the chosen prior distributions when following the Bayesian paradigm compared to classical approaches such as PLSA (probabilistic latent semantic analysis), unigrams and mixture of unigrams that do not use prior information. The standard LDA (latent Dirichlet allocation) topic model operates with a symmetric Dirichlet distribution (as a conjugate prior), which has been found to carry some limitations due to its independent structure that tends to hinder performance, for instance in topic correlation including positively correlated data processing. Compared to classical ML estimators, the use of priors ultimately presents another unique advantage of smoothing out the multinomials while enhancing predictive topic models. In this thesis, we propose a series of flexible priors such as generalized Dirichlet (GD) and Beta-Liouville (BL) for our topic models within the collapsed representation, leading to much improved CVB (collapsed variational Bayes) update equations compared to ones from the standard LDA. This is because the flexibility of these priors significantly improves the lower bounds in the corresponding CVB algorithms. We also show the robustness of our proposed CVB inferences when using simultaneously the BL and GD in hybrid generative-discriminative models where the generative stage produces good and heterogeneous topic features that are used in the discriminative stage by powerful classifiers such as SVMs (support vector machines) as we propose efficient probabilistic kernels to facilitate processing (classification) of documents based on topic signatures. In doing so, we implicitly cast topic modeling, which is an unsupervised learning method, into a supervised learning technique.
Furthermore, due to the complexity of the CVB algorithm (as it requires second order Taylor expansions) in general, despite its flexibility, we propose a much simpler and tractable update equation using a MAP (maximum a posteriori) framework with the standard EM (expectation-maximization) algorithm. As most Bayesian posteriors are not tractable for complex models, we ultimately propose the MAP-LBLA (latent BL allocation) where we characterize the contributions of asymmetric BL priors over the symmetric Dirichlet (Dir). The proposed MAP technique importantly offers a point estimate (mode) with a much more tractable solution. In the MAP, we show that a point estimate can be easier to implement than a full Bayesian analysis that integrates over the entire parameter space. The MAP implicitly exhibits some equivalent relationship with the CVB, especially the zero order approximation CVB0 and its stochastic version SCVB0. The proposed method enhances performances in information retrieval in text document analysis. We show that parametric topic models (as they are finite dimensional methods) have a much smaller hypothesis space and they generally suffer from model selection. We therefore propose a Bayesian nonparametric (BNP) technique that uses the Hierarchical Dirichlet process (HDP) as conjugate prior to the document multinomial distributions where the asymmetric BL serves as a diffuse (probability) base measure that provides the global atoms (topics) that are shared among documents. The heterogeneity in the topic structure helps in providing an alternative to model selection because the nonparametric topic model (which is infinite dimensional with a much bigger hypothesis space) could now prune out irrelevant topics based on the associated probability masses to only retain the most relevant ones.
We also show that for large scale applications, stochastic optimizations using natural gradients of the objective functions have demonstrated significant performances when we learn rapidly both data and parameters in online fashion (streaming). We use both predictive likelihood and perplexity as evaluation methods to assess the robustness of our proposed topic models as we ultimately refer to probability as a way to quantify uncertainty in our Bayesian framework. We improve object categorization in terms of inferences through the flexibility of our prior distributions in the collapsed space. We also improve information retrieval techniques with the MAP and the HDP-LBLA topic models while extending the standard LDA. These two applications present the ultimate capability of enhancing a search engine based on topic models.
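
Perplexity, one of the evaluation measures named in this abstract, is conventionally the exponential of the negative average per-token log-likelihood on held-out text, so lower values indicate a better predictive model. The sketch below follows that conventional definition, not the thesis itself; the function name is an assumption, and the per-token log-probabilities are assumed to come from some already-fitted topic model.

```python
import math


def perplexity(token_log_probs):
    """exp(-(1/N) * sum of natural-log token probabilities) over held-out tokens."""
    if not token_log_probs:
        raise ValueError("at least one token log-probability is required")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))
```

One way to read the number: a model that assigns every held-out token a probability of 1/V has perplexity V, so perplexity can be thought of as an effective vocabulary size the model is still choosing among.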

Concordia University Research Repository

Improving speaker recognition by biometric voice deconstruction

Author: Gómez Vilda Pedro
Mazaira Fernández Luis Miguel
Álvarez Marquina Agustín
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2015

Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g., YouTube) to broadcast its message. In this new scenario, classical identification methods (such as fingerprints or face recognition) have had to be replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. The present study benefits from the advances achieved in recent years in understanding and modeling voice production. The paper hypothesizes that a gender-dependent characterization of speakers combined with the use of a set of features derived from the components resulting from the deconstruction of the voice into its glottal source and vocal tract estimates will enhance recognition rates when compared to classical approaches. A general description of the main hypothesis and the methodology followed to extract the gender-dependent extended biometric parameters is given. Experimental validation is carried out both on a highly controlled acoustic condition database, and on a mobile phone network recorded under non-controlled acoustic conditions.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Directory of Open Access Journals

Frontiers - Publisher Connector

PubMed Central

Archivo Digital UPM

Enhancing Software Project Outcomes: Using Machine Learning and Open Source Data to Employ Software Project Performance Determinants

Author: Jagtiani Lalit N.
Publication date: 22/03/2019

Many factors can influence the ongoing management and execution of technology projects. Some of these elements are known a priori during the project planning phase. Others require real-time data gathering and analysis throughout the lifetime of a project. These real-time project data elements are often neglected, misclassified, or otherwise misinterpreted during the project execution phase, resulting in increased risk of delays, quality issues, and missed business opportunities. The overarching motivation for this research endeavor is to offer reliable improvements in software technology management and delivery. The primary purpose is to discover and analyze the impact, role, and level of influence of various project-related data on the ongoing management of technology projects. The study leverages open source data regarding software performance attributes. The goal is to temper the subjectivity currently used by project managers (PMs) with quantifiable measures when assessing project execution progress. Modern-day PMs who manage software development projects are charged with an arduous task. Often, they obtain their inputs from technical leads who tend to be significantly more technical. When assessing software projects, PMs perform their role subject to the limitations of their capabilities and competencies. PMs are required to contend with the stresses of the business environment, the policies and procedures dictated by their organizations, and resource constraints. The second purpose of this research study is to propose methods by which conventional project assessment processes can be enhanced using quantitative methods that utilize real-time project execution data. Transferability of academic research to industry application is specifically addressed vis-à-vis a delivery framework to provide meaningful data to industry practitioners.

UB ScholarWorks