Search CORE

200 research outputs found

Design of reservoir computing systems for the recognition of noise corrupted speech and handwriting

Author: Jalalvand Azarakhsh
Publication venue: Ghent University. Faculty of Engineering and Architecture
Publication date: 01/01/2015
Field of study

Modularity and Neural Integration in Large-Vocabulary Continuous Speech Recognition

Author: Kilgour Kevin
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2015
Field of study

This Thesis tackles the problems of modularity in Large-Vocabulary Continuous Speech Recognition with use of Neural Network

The Performance Evaluation of Speech Recognition by Comparative Approach

Author: Rajeev R.
Raviteja R.
Venkateswarlu R.L.K.
Publication venue: 'IntechOpen'
Publication date: 12/09/2012
Field of study

Automatically generated summaries of sports videos based on semantic content

Author: Miguel André Almeida Tomás Ferreira de Barros
Publication venue
Publication date: 18/07/2019
Field of study

The sport has been a part of our lives since the beginning of times, whether we are spectators or participants. The diffusion and increase of multimedia platforms made the consumption of these contents available to everyone. Sports videos appeal to a large population all around the world and have become an important form of multimedia content that is streamed over the Internet and television networks. Moreover, sport content creators want to provide the users with relevant information such as live commentary, summarization of the games in form of text or video using automatic tools.As a result, MOG-Technologies wants to create a tool capable of summarizing football matches based on semantic content, and this problem was explored in the scope of this Dissertation. The main objective is to convert the television football commentator's speech into text taking advantage of Google's Speech-to-Text tool. Several machine learning models were then tested to classify sentences into important events. For the model training, a dataset was created, combining 43 games transcription from different television channels also from 72 games provided by Google Search timeline commentary, the combined dataset contains 3260 sentences. To validate the proposed solution the accuracy and f1 score were extracted for each machine learning model.The results show that the developed tool is capable of predicting events in live events, with low error rate. Also, combining multiple sources, not only the sport commentator speech, will help to increase the performance of the tool. It is important to notice that the dataset created during this Dissertation will allow MOG-Technologies to expand and perfect the concept discussed in this project

Wavelet-based techniques for speech recognition

Author: Omar Farooq (7204418)
Publication venue
Publication date: 01/01/2002
Field of study

In this thesis, new wavelet-based techniques have been developed for the extraction of features from speech signals for the purpose of automatic speech recognition (ASR). One of the advantages of the wavelet transform over the short time Fourier transform (STFT) is its capability to process non-stationary signals. Since speech signals are not strictly stationary the wavelet transform is a better choice for time-frequency transformation of these signals. In addition it has compactly supported basis functions, thereby reducing the amount of computation as opposed to STFT where an overlapping window is needed. [Continues.