Speaker recognition with hybrid features from a deep belief network
Learning representations from audio data has shown advantages over handcrafted features such as mel-frequency cepstral coefficients (MFCCs) in many audio applications. In most representation learning approaches, connectionist systems have been used to learn and extract latent features from fixed-length data. In this paper, we propose an approach that combines the learned features and the MFCC features for the speaker recognition task and that can be applied to audio scripts of different lengths. In particular, we study the use of features from different levels of a deep belief network (DBN) for quantizing the audio data into vectors of audio word counts. These vectors represent audio scripts of different lengths in a form that makes it easier to train a classifier. We show experimentally that the audio word count vectors generated from a mixture of DBN features at different layers give better performance than the MFCC features. We can also achieve further improvement by combining the audio word count vectors and the MFCC features.
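The quantization step described in this abstract can be illustrated with a minimal sketch. It assumes a codebook of "audio words" is already available (for example from clustering frame-level features) and that each frame is assigned to its nearest codeword by squared Euclidean distance. The function names, the toy codebook, and the two-dimensional frames are hypothetical and are not taken from the paper, which derives its features from DBN layers.

```python
from collections import Counter


def nearest_codeword(frame, codebook):
    """Return the index of the codeword closest to `frame` (squared Euclidean distance)."""
    best_index, best_dist = 0, float("inf")
    for index, codeword in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(frame, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def audio_word_counts(frames, codebook):
    """Map a variable-length sequence of frames to a fixed-length count vector."""
    counts = Counter(nearest_codeword(frame, codebook) for frame in frames)
    return [counts.get(k, 0) for k in range(len(codebook))]


# Hypothetical codebook with three audio words (toy 2-D features).
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

# Two clips of different lengths (2 frames and 5 frames).
short_clip = [(0.1, 0.0), (0.9, 1.1)]
long_clip = [(0.1, 0.1), (0.0, 0.9), (1.0, 0.8), (0.2, 1.0), (0.9, 0.9)]

short_vec = audio_word_counts(short_clip, codebook)
long_vec = audio_word_counts(long_clip, codebook)
print(short_vec, long_vec)
```

Both clips map to vectors whose length equals the codebook size, regardless of how many frames each clip contains, which is what allows a standard classifier to be trained on them.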
Supervised classification for object identification in urban areas using satellite imagery
This paper presents a method for classification in satellite
imagery. The approach is based on a pixel-level study employing various features
such as correlation, homogeneity, energy and contrast. In this study, gray-scale
images are used for training the classification model. For supervised
classification, two techniques are employed, namely the Support
Vector Machine (SVM) and Naive Bayes. With textural features used for
gray-scale images, Naive Bayes performs better with an overall accuracy of 76%
compared to 68% achieved by SVM. The computational time is evaluated while
performing the experiment with two different window sizes i.e., 50x50 and
70x70. The required computational time on a single image is found to be 27
seconds for a window size of 70x70 and 45 seconds for a window size of 50x50.
Comment: 2018 International Conference on Computing, Mathematics and
Engineering Technologies (iCoMET
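The four textural features named in the abstract above (correlation, homogeneity, energy and contrast) are commonly computed from a gray-level co-occurrence matrix (GLCM). The abstract does not state how the authors computed them, so the following is only a sketch under that assumption. It uses a single horizontal neighbour offset and defines energy as the sum of squared GLCM entries (some libraries report the square root instead). The function names and the toy 3x3 window are hypothetical; in the paper's setting the window would be 50x50 or 70x70 pixels.

```python
import math


def glcm(image, levels, d_row=0, d_col=1):
    """Normalized gray-level co-occurrence matrix for one pixel offset."""
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        raise ValueError("window too small for the chosen offset")
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Contrast, homogeneity, energy and correlation of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    contrast = sum(v * (i - j) ** 2 for i, j, v in cells)
    homogeneity = sum(v / (1 + (i - j) ** 2) for i, j, v in cells)
    energy = sum(v * v for _, _, v in cells)  # assumption: sum of squares, no square root
    mean_i = sum(i * v for i, _, v in cells)
    mean_j = sum(j * v for _, j, v in cells)
    std_i = math.sqrt(sum(v * (i - mean_i) ** 2 for i, _, v in cells))
    std_j = math.sqrt(sum(v * (j - mean_j) ** 2 for _, j, v in cells))
    if std_i * std_j == 0:
        correlation = 1.0  # assumed convention for a constant window
    else:
        cov = sum(v * (i - mean_i) * (j - mean_j) for i, j, v in cells)
        correlation = cov / (std_i * std_j)
    return {
        "contrast": contrast,
        "homogeneity": homogeneity,
        "energy": energy,
        "correlation": correlation,
    }


# Toy 3x3 window with two gray levels (0 and 1).
window = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
]
features = texture_features(glcm(window, levels=2))
print(features)
```

The resulting four-element feature vector per window is the kind of input that could then be passed to a classifier such as an SVM or Naive Bayes.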
Deep learning methods in speaker recognition: a review
This paper summarizes applied deep learning practices in the field of
speaker recognition, covering both verification and identification. Speaker
recognition has been a widely studied topic in speech technology. Much research
has been carried out, yet little progress has been achieved in the past 5-6
years. However, as deep learning (DL) techniques advance in most machine
learning fields, they are replacing the former state-of-the-art methods in
speaker recognition too. DL now appears to be the state-of-the-art
solution for both speaker verification and identification. Standard
x-vectors, in addition to i-vectors, are used as baselines in most of the novel
works. The increasing amount of gathered data opens up the territory to DL
methods, which are most effective in that setting.