99 research outputs found

Improving large vocabulary continuous speech recognition by combining GMM-based and reservoir-based acoustic modeling

Author: Demuynck Kris
Martens Jean-Pierre
Triefenbach Fabian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

In earlier work we have shown that good phoneme recognition is possible with a so-called reservoir, a special type of recurrent neural network. In this paper, different architectures based on Reservoir Computing (RC) for large vocabulary continuous speech recognition are investigated. Besides experiments with HMM hybrids, it is shown that an RC-HMM tandem can achieve the same recognition accuracy as a classical HMM, which is a promising result for such a fairly new paradigm. It is also demonstrated that a state-level combination of the scores of the tandem and the baseline HMM leads to a significant improvement over the baseline. A word error rate reduction of the order of 20% relative is possible.

Crossref

Ghent University Academic Bibliography

Social Bots for Online Public Health Interventions

Author: Allem Jon-Patrick
Deb Ashok
Ferrara Emilio
Majmundar Anuja
Matsui Akira
Seo Sungyong
Tandon Rajat
Yan Shen
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 21/04/2018

According to the Centers for Disease Control and Prevention, in the United States hundreds of thousands initiate smoking each year, and millions live with smoking-related diseases. Many tobacco users discuss their habits and preferences on social media. This work conceptualizes a framework for targeted health interventions to inform tobacco users about the consequences of tobacco use. We designed a Twitter bot named Notobot (short for No-Tobacco Bot) that leverages machine learning to identify users posting pro-tobacco tweets and select individualized interventions to address their interest in tobacco use. We searched the Twitter feed for tobacco-related keywords and phrases, and trained a convolutional neural network using over 4,000 tweets manually labeled as either pro-tobacco or not pro-tobacco. This model achieves a 90% recall rate on the training set and 74% on test data. Users posting pro-tobacco tweets are matched with former smokers with similar interests who posted anti-tobacco tweets. Algorithmic matching, based on the power of peer influence, allows for the systematic delivery of personalized interventions based on real anti-tobacco tweets from former smokers. Experimental evaluation suggests that our system would perform well if deployed. This research offers opportunities for public health researchers to increase health awareness at scale. Future work entails deploying the fully operational Notobot system in a controlled experiment within a public health campaign.

arXiv.org e-Print Archive

Crossref

Modeling of Filled Pauses and Onomatopoeas for Spontaneous Speech Recognition

Author: Andrej Zgank
Mirjam Sepesy Maucec
Publication venue: 'IntechOpen'
Publication date: 16/08/2010

IntechOpen

Letter-based speech synthesis

Author: King Simon
Watts Oliver
Yamagishi Junichi
Publication date: 01/09/2010

Initial attempts at performing text-to-speech conversion based on standard orthographic units are presented, forming part of a larger scheme of training TTS systems on features that can be trivially extracted from text. We evaluate the possibility of using the technique of decision-tree-based context clustering conventionally used in HMM-based systems for parameter tying to handle letter-to-sound conversion. We present the application of a method of compound-feature discovery to corpus-based speech synthesis. Finally, an evaluation of intelligibility of letter-based systems and more conventional phoneme-based systems is presented.

CiteSeerX

Edinburgh Research Archive

Edinburgh Research Explorer

Speech Recognition System of Slovenian Broadcast News

Author: Sepesy Maučec Mirjam
Žgank Andrej
Publication venue: 'IntechOpen'
Publication date: 13/06/2011

IntechOpen

Digital library of University of Maribor

Mobile Information Access with Spoken Query Answering

Author: Brøndsted Tom
Larsen Henrik Legind
Larsen Lars Bo
Lindberg Børge
Ortiz-Arroyo Daniel
Tan Zheng-Hua
Xu Haitian
Publication venue: Denmark.
Publication date: 01/01/2006

VBN

Adaptively growing hierarchical mixtures of experts

Author: Finke Michael
Fritsch Jürgen
Waibel Alex
Publication date: 02/08/2007

KITopen

Multidialectal acoustic modeling: a comparative study

Author: Caballero Galeote Mónica
Moreno Bilbao M. Asunción
Nogueiras Rodríguez Albino
Publication date: 01/01/2006

In this paper, multidialectal acoustic modeling based on sharing data across dialects is addressed. A comparative study of different methods of combining data based on decision tree clustering algorithms is presented. The approaches differ in how they evaluate the similarity of sounds between dialects and in the decision tree structure applied. The proposed systems are tested with Spanish dialects across Spain and Latin America. All proposed multidialectal systems improve on monodialectal performance by using data from another dialect, but it is shown that the way the data is shared is critical. The best combination of similarity measure and tree structure achieves an improvement of 7% over the results obtained with monodialectal systems.

UPCommons. Portal del coneixement obert de la UPC