AUTOMATIC TEXT-INDEPENDENT SPEAKER TRACKING SYSTEM USING FEED-FORWARD NEURAL NETWORKS (FFNN)
ABSTRACT Speaker tracking is the process of following who is speaking in a given speech signal. In this paper, we propose a new set of robust source features for an automatic text-independent speaker tracking system using feed-forward neural networks (FFNN). Linear Prediction (LP) analysis is used to extract the source information from the speech signal. This source information is speaker-specific. In this approach, instead of capturing the distribution of feature vectors corresponding to the vocal tract system of the speakers, the time-varying speaker-specific source characteristics are captured using the LP residual signal of the given speech signal. MFCC features are extracted from the source speech signal, which contains prosody and speaker-specific information. The extracted source features are shown to be robust and insensitive to channel characteristics and noise. Finally, the paper shows that a speaker tracking system using source features with an FFNN outperforms other existing methods. Keywords: LPC, MFCC, Source feature, Speaker tracking. INTRODUCTION Speech is produced from a time varying vocal tract system excited by a time varying excitation source
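The LP residual mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the frame length, LP order, or the later MFCC and FFNN stages, so the autocorrelation method with a Levinson-Durbin recursion, the function names `lp_coefficients` and `lp_residual`, and the order used in the usage note are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def lp_coefficients(frame, order):
    """Estimate LP coefficients with the autocorrelation method.

    Returns a = [1, a1, ..., ap] for the inverse filter
    A(z) = 1 + a1*z^-1 + ... + ap*z^-p (Levinson-Durbin recursion).
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation lags r[0..order]
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: stop the recursion early
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a


def lp_residual(frame, order=10):
    """Inverse-filter the frame with A(z) to obtain the LP residual."""
    frame = np.asarray(frame, dtype=float)
    a = lp_coefficients(frame, order)
    # FIR filtering with A(z); keep the first len(frame) output samples
    return np.convolve(frame, a)[:len(frame)]
```

Calling `lp_residual(frame, order=10)` on a short speech frame returns a signal of the same length in which most of the vocal-tract (spectral envelope) contribution has been removed, leaving an approximation of the excitation source.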
Growth and characterization of SiC epitaxial layers on Si- and C-face 4H SiC substrates by chemical-vapor deposition
High-quality Schottky junctions have been fabricated on n-type 4H SiC epitaxial layers grown by chemical-vapor deposition on C- and Si-face substrates in order to understand the effect of growth direction on the growth mechanism and formation of defects. Atomic force microscopy analysis showed dramatic differences between the surfaces of SiC epilayers grown on C and Si faces. There was a significant step bunching in the SiC grown on Si-face substrates. Current-voltage, capacitance-voltage, and deep-level transient spectroscopy (DLTS) measurements were carried out on the Schottky junctions to analyze the junction characteristics. The Schottky junctions on C-face SiC showed larger barrier heights than those on Si-face SiC, showing that each face has a different surface energy. The barrier heights of Ni Schottky junctions were found to be 1.97 and 1.54 eV for C-face and Si-face materials, respectively. However, the deep-level spectra obtained by DLTS were similar, regardless of the increased surface roughness of the Si-face 4H SiC
Electrical and Optical Properties of Fluorine Doped Tin Oxide Thin Films Prepared by Magnetron Sputtering
magnetron sputtering technique in an Ar/O2 atmosphere using blends of tin oxide and tin fluoride powder formed into targets. FTO coatings were deposited with a thickness of 400 nm on glass substrates. No post-deposition annealing treatments were carried out. The effects of the chemical composition on the structural (phase, grain size), optical (transmission, optical band-gap) and electrical (resistivity, charge carrier, mobility) properties of the thin films were investigated. Depositing FTO by magnetron sputtering is an environmentally friendly technique and the use of loosely packed blended powder targets gives an efficient means of screening candidate compositions, which also provides a low cost operation. The best film characteristics were achieved using a mass ratio of 12% SnF2 to 88% SnO2 in the target. The thin film produced was polycrystalline with a tetragonal crystal structure. The optimized conditions resulted in a thin film with average visible transmittance of 83% and optical band-gap of 3.80 eV, resistivity of 6.71 × 10⁻³ Ω·cm, a carrier concentration (Nd) of 1.46 × 10²⁰ cm⁻³ and a mobility of 15 cm²/Vs
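As a small worked example, the reported optical band-gap can be converted into the corresponding absorption-edge wavelength using the photon energy relation λ = hc/E. The helper name `bandgap_to_wavelength_nm` is hypothetical, and the conversion assumes a sharp absorption edge exactly at the band-gap energy, which is a simplification.

```python
# h*c expressed in eV·nm, so that wavelength_nm = H_C_EV_NM / energy_eV
H_C_EV_NM = 1239.841984


def bandgap_to_wavelength_nm(e_gap_ev):
    """Return the photon wavelength (nm) whose energy equals the band-gap (eV)."""
    if e_gap_ev <= 0:
        raise ValueError("band-gap energy must be positive")
    return H_C_EV_NM / e_gap_ev


# Band-gap value taken from the abstract (3.80 eV)
edge_nm = bandgap_to_wavelength_nm(3.80)
print(f"absorption edge near {edge_nm:.1f} nm")
```

For 3.80 eV the edge falls at roughly 326 nm, in the ultraviolet, which fits with the film remaining largely transparent across the visible range.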
On the wave-like character of periodic precipitates
This article does not have an abstract
Speaker diarization system using HXLPS and deep neural network
In general, speaker diarization is defined as the process of segmenting the input speech signal and grouping the homogeneous regions with regard to speaker identity. The main idea behind this system is that it discriminates between speaker signals by assigning a label to each speaker signal. Due to the rapid growth of broadcasting and meeting recordings, speaker diarization has become a burdensome task in enhancing the readability of speech transcription. To address this issue, Holoentropy with the eXtended Linear Prediction using autocorrelation Snapshot (HXLPS) and a deep neural network (DNN) are proposed for the speaker diarization system. The HXLPS extraction method is newly developed by incorporating Holoentropy into XLPS. Once the features are obtained, the speech and non-speech signals are detected by the Voice Activity Detection (VAD) method. Then, an i-vector representation of every segmented signal is obtained using a Universal Background Model (UBM). The DNN is then used to assign a label to each speaker signal, and the signals are clustered according to the speaker label. The performance is analysed using evaluation metrics such as tracking distance, false alarm rate and diarization error rate (DER). The proposed method achieves better diarization performance, with a lower DER of 1.36% based on the lambda value and a DER of 2.23% based on the frame length. Keywords: Speaker diarization, HXLPS feature extraction, Voice activity detection, Deep neural network, Speaker clustering, Diarization Error Rate (DER)
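The diarization error rate named in the abstract is commonly defined as the sum of false-alarm, missed-speech and speaker-confusion time divided by the total reference speech time. The sketch below computes it at frame level under several simplifying assumptions that are not taken from the paper: one label per frame, `None` marking non-speech, no overlapping speech, no forgiveness collar, and hypothesis labels already mapped onto the reference speaker names. The function name `frame_level_der` is hypothetical.

```python
def frame_level_der(reference, hypothesis):
    """Frame-level DER = (false alarm + missed + confusion) / reference speech frames.

    Each sequence holds one speaker label per frame, with None for non-speech.
    Hypothesis labels are assumed to be already mapped to reference speakers.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")
    false_alarm = missed = confusion = speech = 0
    for ref, hyp in zip(reference, hypothesis):
        if ref is not None:
            speech += 1
            if hyp is None:
                missed += 1       # speech frame labelled as non-speech
            elif hyp != ref:
                confusion += 1    # speech frame assigned to the wrong speaker
        elif hyp is not None:
            false_alarm += 1      # non-speech frame labelled as speech
    if speech == 0:
        raise ValueError("reference contains no speech frames")
    return (false_alarm + missed + confusion) / speech
```

For example, with eight reference speech frames and one frame each of false alarm, missed speech and speaker confusion, the function returns 3/8 = 0.375.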
Chemistry of Terminalia species—IX: The structure of methyl anhydro tomentosate
Methyl triacetyltomentosate by loss of water (1 mole) with POCl3-pyridine yields methyl anhydrotomentosate which is not identical with methyl dehydroarjunolate. The anhydro compound yields a molecule of formaldehyde in OsO4-Pb(OAc)4 oxidation, establishing an exocyclic double bond, probably situated at C-20 through a methyl shift. It suffers facile catalytic reduction to give methyl dihydroanhydrotomentosate which resembles methyl asiatate in m.p., m.m.p., and lactonization but a comparison of their IR spectra reveals that probably inversion at C-19 has taken place during the Wagner-Meerwein methyl rearrangement
VSC-Based DSTATCOM for PQ Improvement: A Deep-Learning Approach
With the rapid advancement of technology, the deep-learning-supported voltage source converter (VSC)-based distributed static compensator (DSTATCOM) for power quality (PQ) improvement has attracted significant interest due to its high accuracy. In this paper, six subnets are structured for the proposed deep learning approach (DL-Approach) algorithm using its own mathematical equations. Three subnets for the active weight components and three for the reactive weight components are used to extract the fundamental component of the load current. These updated weights are used to generate the reference source currents for the VSC. A hysteresis current controller (HCC) is employed in each phase, generating the switching signal patterns from the predicted reference source current and the actual source current. As a result, the proposed technique achieves better dynamic performance, a lower computational burden and better estimation speed. Results were obtained for different loading conditions using MATLAB/Simulink software. Finally, the approach proved feasible against the benchmark of IEEE guidelines with respect to harmonics curtailment, power factor (p.f.) improvement, load balancing and voltage regulation
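The per-phase hysteresis current control step can be sketched as a simple comparator with memory. This is a generic illustration, not the paper's MATLAB/Simulink model: the function names, the band width, the toy plant with a fixed current slope, and the convention that a `True` state drives the current upward are all assumptions (the actual switch polarity depends on the converter topology and control scheme).

```python
import math


def hysteresis_switch(i_ref, i_actual, band, prev_state):
    """Return the new switching state for one phase.

    True  = command that raises the controlled current (assumed convention)
    False = command that lowers it
    Inside the hysteresis band the previous state is held.
    """
    error = i_ref - i_actual
    if error > band:
        return True
    if error < -band:
        return False
    return prev_state


def simulate_tracking(n_steps=2000, dt=1e-5, band=0.2, slope=5000.0):
    """Track a 50 Hz, 5 A reference with a toy plant; return the worst error.

    The plant simply ramps the current at +slope or -slope (A/s), a stand-in
    for the interfacing inductor dynamics, which the abstract does not specify.
    """
    current = 0.0
    state = False
    worst_error = 0.0
    for n in range(n_steps):
        i_ref = 5.0 * math.sin(2.0 * math.pi * 50.0 * n * dt)
        state = hysteresis_switch(i_ref, current, band, state)
        current += (slope if state else -slope) * dt
        worst_error = max(worst_error, abs(i_ref - current))
    return worst_error
```

In this toy setting the tracking error stays close to the chosen band, exceeding it only by about one integration step of the current ramp; a narrower band tightens tracking at the cost of more frequent switching.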