Speaker recognition utilizing distributed DCT-II based Mel frequency cepstral coefficients and fuzzy vector quantization
In this paper, a novel Automatic Speaker Recognition (ASR) system is presented. The new ASR system includes novel feature extraction and vector classification steps utilizing distributed Discrete Cosine Transform (DCT-II) based Mel Frequency Cepstral Coefficients (MFCC) and Fuzzy Vector Quantization (FVQ). The ASR algorithm utilizes an approach based on MFCC to identify dynamic features that are used for Speaker Recognition (SR).
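As a rough illustration of where the DCT-II enters an MFCC pipeline, the sketch below computes conventional (non-distributed) MFCCs for a single frame with NumPy. It is not the paper's distributed DCT-II scheme or its FVQ classifier; the function names, the Hamming window, the 26-filter bank, and the 13 retained coefficients are all assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale (illustrative layout)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):          # rising edge
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):         # falling edge
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi * (n + 0.5) * k / N)."""
    n = x.shape[-1]
    k = np.arange(n)[:, None]
    basis = np.cos(np.pi * (np.arange(n)[None, :] + 0.5) * k / n)
    return x @ basis.T

def mfcc_frame(frame, sample_rate, n_filters=26, n_coeffs=13):
    """MFCCs of one frame: window -> power spectrum -> mel energies -> log -> DCT-II."""
    n_fft = len(frame)
    windowed = frame * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(windowed)) ** 2 / n_fft
    energies = mel_filterbank(n_filters, n_fft, sample_rate) @ power
    log_energies = np.log(energies + 1e-10)    # small floor avoids log(0)
    return dct2(log_energies)[:n_coeffs]
```

The DCT-II decorrelates the log mel energies, which is why only the first few coefficients are usually kept as features.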
Microporous materials fabricated from discrete molecular cages
In the context of materials science, an interesting relationship exists between the properties of solid materials and the existence of void spaces within them. In fact, whether the presence of voids is desired or not tends to depend on one’s perception of the effects that voids induce. In densified materials, for example, the presence of voids can be detrimental to structural integrity. Thus, such materials that contain voids are considered defective. On the other hand, when voids are desirable, their presence in certain materials is essential to material behavior. In zeolites, for example, the size, shape, and connectivity of void spaces regulate catalytic activity. In reality, however, and at some finite length scale, all real materials contain intrinsic void space, a consequence of the imperfect packing arrangements of atoms. Thus, it is necessary not only to elucidate what effects voids have on the properties of materials, but also to investigate methods that provide control of void features within solid materials.
While all materials possess intrinsic voids, the ability to introduce intentional voids in solids presents multiple difficulties. The statement “nature abhors a vacuum” is a familiar quip that reflects this challenge of designing open pore spaces in solid materials, as porous frameworks with open void spaces are often higher in energy relative to their more dense structural counterparts. Nonetheless, during the last few decades, technology has advanced such that scientists have significant control over the size, shape, and position of voids within solids. Materials such as zeolites, metal-organic frameworks (MOFs), and covalent-organic frameworks (COFs) all demonstrate the profound ability to position pores of various shapes and sizes with molecular precision in a solid framework. This control over pore design has led to significant materials applications for porous materials including adsorption, catalysis, and molecular separation.
Despite the successes of porous networks such as zeolites, MOFs, and COFs, there remains a need for greater molecular diversity and tunable microenvironments that are precise in molecular design. Moreover, there is a need for fundamental understanding of the relationship between characteristics of voids derived from molecular species and the behavior these entities exhibit within solid materials. Herein, we test the hypothesis that discrete molecular cages with non-collapsible pores are building blocks for porous solids by preparing molecular cages via alkyne metathesis. We demonstrate that molecular pores can be rationally synthesized from tritopic organic precursors in a single step and assembled in the solid state to afford permanently porous materials. Featuring organic synthesis and modular packing, our methodology provides molecular control for the fabrication of functional porous materials with precise microenvironments.
First, a non-intuitive precursor design principle for synthesizing molecular cages via alkyne metathesis (AM) is described. By subjecting a series of precursors with varying bite angles to AM, it is experimentally demonstrated that the product distribution and the convergence towards product formation are strongly dependent on precursor bite angle. Furthermore, it was discovered that precursors with the ideal tetrahedron bite angle (60°) do not afford the most efficient pathway to the product. These results lend credence to the underlying systemic issues facing the synthesis of 3D architectures via dynamic covalent chemistry, where variations in precursor geometry lead to significant deviation of product distributions away from discrete products.
Next, a systematic study of the effects of molecular shape-persistence on the porosity of molecular solids is discussed. Three molecular cages synthesized via alkyne metathesis and post-synthetic modifications were designed to provide controlled, stepwise adjustments in molecular shape-persistence. Experimental measurements of nitrogen adsorption taken from rapidly and slowly crystallized solids of each cage demonstrated a trend in porosity that correlated with shape-persistence. Molecular dynamics simulations that modeled cage motion corroborated the trend seen in the experimental data and emphasized that shape-persistence governs the microporosity of these materials. Our integrated synthetic and computational approach demonstrates that the microporosity of this class of molecular solids can be controlled through fine-tuning at both the atomic and microscales.
Lastly, the fabrication and characterization of a novel solid-state lithium electrolyte nanocomposite derived from a porous molecular cage is discussed. A solid-liquid electrolyte nanocomposite (SLEN) fabricated from an electrolyte system and a porous organic cage exhibits ionic conductivity on the order of 1 × 10⁻³ S cm⁻¹. With an experimentally measured activation barrier of 0.16 eV, this composite is characterized as a superionic conductor. Furthermore, the SLEN displays excellent oxidative stability up to 4.7 V vs. Li/Li⁺. This simple three-component system enables the rational design of electrolytes from tunable, discrete molecular architectures that possess intrinsic void space.
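To give a feel for what a 0.16 eV activation barrier implies, the sketch below assumes simple Arrhenius behavior, σ(T) = σ₀ exp(−Eₐ / k_B T). The abstract does not state which conductivity model was fitted, so this form, the function names, and any temperatures passed in are assumptions for illustration only.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

def arrhenius_ratio(ea_ev, t1_k, t2_k):
    """Return sigma(T2) / sigma(T1) under the assumed Arrhenius form."""
    return math.exp(-ea_ev / K_B_EV * (1.0 / t2_k - 1.0 / t1_k))

def activation_energy(sigma1, t1_k, sigma2, t2_k):
    """Recover Ea (eV) from two conductivity measurements at two temperatures."""
    return -K_B_EV * math.log(sigma2 / sigma1) / (1.0 / t2_k - 1.0 / t1_k)

# Hypothetical usage: how much conductivity would rise between two temperatures.
ratio = arrhenius_ratio(0.16, 298.15, 333.15)
```

A low barrier means conductivity changes only modestly with temperature, which is one reason such materials are attractive as electrolytes.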
Towards Building a Speech Recognition System for Quranic Recitations: A Pilot Study Involving Female Reciters
This paper is the first step in an effort toward building an automatic speech recognition (ASR) system for Quranic recitations that caters specifically to female reciters. To function properly, ASR systems require a huge amount of training data. Surprisingly, the data readily available for Quranic recitations suffer from major limitations. Specifically, the currently available audio recordings of Quran recitations are massive in volume, but they are mostly made by male reciters (who have dedicated most of their lives to perfecting their recitation skills) using professional and expensive equipment. Such proficiency in the training data (along with the fact that the reciters come from a specific demographic group, adult males) will most likely lead to some bias in the resulting models and limit their ability to process input from other groups, such as non-/semi-professionals, females, or children. This work aims at empirically exploring this shortcoming. To do so, we create a first-of-its-kind (to the best of our knowledge) benchmark dataset called the Quran recitations by females and males (QRFAM) dataset. QRFAM is a relatively large dataset of audio recordings made by male and female reciters from different age groups and proficiency levels. After creating the dataset, we experiment on it by building ASR systems based on one of the most popular open-source ASR models, the DeepSpeech model from Mozilla. The speaker-independent end-to-end models that we produce are evaluated using word error rate (WER). Despite DeepSpeech’s known flexibility and prowess (which is shown when trained and tested on recitations from the same group), the models trained on the recitations of one group could not recognize most of the recitations done by the other groups in the testing phase.
This shows that there is still a long way to go in order to produce an ASR system that can be used by anyone, and the first step is to build and expand the resources needed for this, such as QRFAM. Hopefully, our work will be the first step in this direction and will inspire the community to take more interest in this problem.
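Word error rate, the metric named above, is conventionally the word-level edit distance between the reference and the recognized transcript, divided by the number of reference words. The sketch below is a minimal version that assumes whitespace tokenization; the function name is hypothetical and this is not the evaluation code used in the study.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(prev[j] + 1,         # deletion of a reference word
                          curr[j - 1] + 1,     # insertion of a hypothesis word
                          prev[j - 1] + cost)  # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.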
Extensions to the Latent Dirichlet Allocation Topic Model Using Flexible Priors
Intrinsically, topic models have always had their likelihood functions fixed to multinomial distributions, as they operate on count data instead of Gaussian data. As a result, their performance ultimately depends on the flexibility of the chosen prior distributions when following the Bayesian paradigm, in contrast to classical approaches such as PLSA (probabilistic latent semantic analysis), unigrams, and mixtures of unigrams, which do not use prior information. The standard LDA (latent Dirichlet allocation) topic model operates with a symmetric Dirichlet distribution (as a conjugate prior), which has been found to carry some limitations due to its independent structure, which tends to hinder performance, for instance in modeling topic correlation, including the processing of positively correlated data. Compared to classical ML estimators, the use of priors ultimately presents another unique advantage: smoothing out the multinomials while enhancing predictive topic models.
In this thesis, we propose a series of flexible priors such as the generalized Dirichlet (GD) and Beta-Liouville (BL) for our topic models within the collapsed representation, leading to much improved CVB (collapsed variational Bayes) update equations compared to those from the standard LDA. This is because the flexibility of these priors significantly improves the lower bounds in the corresponding CVB algorithms. We also show the robustness of our proposed CVB inferences when using the BL and GD simultaneously in hybrid generative-discriminative models, where the generative stage produces good and heterogeneous topic features that are used in the discriminative stage by powerful classifiers such as SVMs (support vector machines), as we propose efficient probabilistic kernels to facilitate processing (classification) of documents based on topic signatures. In doing so, we implicitly cast topic modeling, which is an unsupervised learning method, into a supervised learning technique.
Furthermore, due to the complexity of the CVB algorithm in general (as it requires second-order Taylor expansions), despite its flexibility, we propose a much simpler and more tractable update equation using a MAP (maximum a posteriori) framework with the standard EM (expectation-maximization) algorithm. As most Bayesian posteriors are not tractable for complex models, we ultimately propose the MAP-LBLA (latent BL allocation), where we characterize the contributions of asymmetric BL priors over the symmetric Dirichlet (Dir). Importantly, the proposed MAP technique offers a point estimate (mode) with a much more tractable solution. In the MAP, we show that a point estimate can be easier to implement than a full Bayesian analysis that integrates over the entire parameter space. The MAP implicitly exhibits some equivalence with the CVB, especially the zero-order approximation CVB0 and its stochastic version SCVB0. The proposed method enhances performance in information retrieval in text document analysis.
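For orientation, the zero-order CVB0 update mentioned above can be sketched for the standard LDA with symmetric Dirichlet priors. This is not the thesis's BL or GD variant: the update rule below is the widely used CVB0 form for plain LDA, and the function name, hyperparameter values, and data layout (each document as a list of integer word ids) are assumptions for illustration.

```python
import numpy as np

def cvb0_lda(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iters=50, seed=0):
    """CVB0 for standard LDA; docs are non-empty lists of word ids in [0, vocab_size)."""
    rng = np.random.default_rng(seed)
    docs = [np.asarray(doc, dtype=int) for doc in docs]
    # gamma[d][i, k]: variational probability that token i of document d has topic k.
    gamma = [rng.dirichlet(np.ones(n_topics), size=len(doc)) for doc in docs]
    n_dk = np.array([g.sum(axis=0) for g in gamma])      # expected doc-topic counts
    n_wk = np.zeros((vocab_size, n_topics))               # expected word-topic counts
    for doc, g in zip(docs, gamma):
        np.add.at(n_wk, doc, g)
    n_k = n_wk.sum(axis=0)                                 # expected topic totals
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                old = gamma[d][i].copy()
                # Remove this token's contribution before updating it.
                n_dk[d] -= old
                n_wk[w] -= old
                n_k -= old
                new = (n_wk[w] + beta) / (n_k + vocab_size * beta) * (n_dk[d] + alpha)
                new /= new.sum()
                gamma[d][i] = new
                n_dk[d] += new
                n_wk[w] += new
                n_k += new
    return n_dk, n_wk
```

The update uses only expected counts, with no second-order correction terms, which is what makes the zero-order approximation simpler than full CVB.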
We show that parametric topic models (as they are finite-dimensional methods) have a much smaller hypothesis space and generally suffer from the model selection problem. We therefore propose a Bayesian nonparametric (BNP) technique that uses the hierarchical Dirichlet process (HDP) as a conjugate prior to the document multinomial distributions, where the asymmetric BL serves as a diffuse (probability) base measure that provides the global atoms (topics) that are shared among documents. The heterogeneity in the topic structure helps in providing an alternative to model selection because the nonparametric topic model (which is infinite-dimensional with a much bigger hypothesis space) can now prune out irrelevant topics based on the associated probability masses to retain only the most relevant ones.
We also show that for large-scale applications, stochastic optimization using natural gradients of the objective functions demonstrates significant performance when we rapidly learn both data and parameters in an online (streaming) fashion. We use both predictive likelihood and perplexity as evaluation methods to assess the robustness of our proposed topic models, as we ultimately refer to probability as a way to quantify uncertainty in our Bayesian framework. We improve object categorization in terms of inference through the flexibility of our prior distributions in the collapsed space. We also improve information retrieval techniques with the MAP and the HDP-LBLA topic models while extending the standard LDA. These two applications present the ultimate capability of enhancing a search engine based on topic models.
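Perplexity, one of the evaluation measures named above, is conventionally the exponential of the negative per-token log-likelihood on held-out documents. The sketch below assumes the per-document log-likelihoods have already been computed by some topic model; the function name and input layout are hypothetical.

```python
import math

def perplexity(log_likelihoods, token_counts):
    """exp(-sum_d log p(w_d) / sum_d N_d) over held-out documents.

    log_likelihoods: natural-log likelihood of each document under the model.
    token_counts: number of tokens N_d in each document.
    """
    total_tokens = sum(token_counts)
    if total_tokens == 0:
        raise ValueError("held-out set must contain at least one token")
    return math.exp(-sum(log_likelihoods) / total_tokens)
```

Lower values are better: a model that spreads probability uniformly over a vocabulary of V words has perplexity V, so any useful model should score well below the vocabulary size.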
Improving speaker recognition by biometric voice deconstruction
Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g., YouTube) to broadcast its message. In this new scenario, classical identification methods (such as fingerprints or face recognition) have necessarily been replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. The present study benefits from the advances achieved in recent years in understanding and modeling voice production. The paper hypothesizes that a gender-dependent characterization of speakers, combined with a set of features derived from the components resulting from the deconstruction of the voice into its glottal source and vocal tract estimates, will enhance recognition rates when compared to classical approaches. A general description of the main hypothesis and of the methodology followed to extract the gender-dependent extended biometric parameters is given. Experimental validation is carried out both on a database recorded under highly controlled acoustic conditions and on a mobile phone network database recorded under non-controlled acoustic conditions.
Enhancing Software Project Outcomes: Using Machine Learning and Open Source Data to Employ Software Project Performance Determinants
Many factors can influence the ongoing management and execution of technology projects. Some of these elements are known a priori during the project planning phase. Others require real-time data gathering and analysis throughout the lifetime of a project. These real-time project data elements are often neglected, misclassified, or otherwise misinterpreted during the project execution phase, resulting in increased risk of delays, quality issues, and missed business opportunities. The overarching motivation for this research endeavor is to offer reliable improvements in software technology management and delivery. The primary purpose is to discover and analyze the impact, role, and level of influence of various project-related data on the ongoing management of technology projects. The study leverages open source data regarding software performance attributes. The goal is to temper the subjectivity currently used by project managers (PMs) with quantifiable measures when assessing project execution progress. Modern-day PMs who manage software development projects are charged with an arduous task. Often, they obtain their inputs from technical leads who tend to be significantly more technical. When assessing software projects, PMs perform their role subject to the limitations of their capabilities and competencies. PMs are required to contend with the stresses of the business environment, the policies and procedures dictated by their organizations, and resource constraints. The second purpose of this research study is to propose methods by which conventional project assessment processes can be enhanced using quantitative methods that utilize real-time project execution data. Transferability of academic research to industry application is specifically addressed vis-à-vis a delivery framework to provide meaningful data to industry practitioners.