176,252 research outputs found
Code-Switched Urdu ASR for Noisy Telephonic Environment using Data Centric Approach with Hybrid HMM and CNN-TDNN
Call Centers have huge amount of audio data which can be used for achieving
valuable business insights and transcription of phone calls is manually tedious
task. An effective Automated Speech Recognition system can accurately
transcribe these calls for easy search through call history for specific
context and content allowing automatic call monitoring, improving QoS through
keyword search and sentiment analysis. ASR for Call Center requires more
robustness as telephonic environment are generally noisy. Moreover, there are
many low-resourced languages that are on verge of extinction which can be
preserved with help of Automatic Speech Recognition Technology. Urdu is the
most widely spoken language in the world, with 231,295,440 worldwide
still remains a resource constrained language in ASR. Regional call-center
conversations operate in local language, with a mix of English numbers and
technical terms generally causing a "code-switching" problem. Hence, this paper
describes an implementation framework of a resource efficient Automatic Speech
Recognition/ Speech to Text System in a noisy call-center environment using
Chain Hybrid HMM and CNN-TDNN for Code-Switched Urdu Language. Using Hybrid
HMM-DNN approach allowed us to utilize the advantages of Neural Network with
less labelled data. Adding CNN with TDNN has shown to work better in noisy
environment due to CNN's additional frequency dimension which captures extra
information from noisy speech, thus improving accuracy. We collected data from
various open sources and labelled some of the unlabelled data after analysing
its general context and content from Urdu language as well as from commonly
used words from other languages, primarily English and were able to achieve WER
of 5.2% with noisy as well as clean environment in isolated words or numbers as
well as in continuous spontaneous speech.Comment: 32 pages, 19 figures, 2 tables, preprin
Latent Dirichlet Allocation (LDA) for improving the topic modeling of the official bulletin of the spanish state (BOE)
Since Internet was born most people can access fully free to a lot sources of information. Every day a lot of web pages are created and new content is uploaded and shared. Never in the history the humans has been more informed but also uninformed due the huge amount of information that can be access. When we are looking for something in any search engine the results are too many for reading and filtering one by one. Recommended Systems (RS) was created to help us to discriminate and filter these information according to ours preferences. This contribution analyses the RS of the official agency of publications in Spain (BOE), which is known as "Mi BOE'. The way this RS works was analysed, and all the meta-data of the published documents were analysed in order to know the coverage of the system. The results of our analysis show that more than 89% of the documents cannot be recommended, because they are not well described at the documentary level, some of their key meta-data are empty. So, this contribution proposes a method to label documents automatically based on Latent Dirichlet Allocation (LDA). The results are that using this approach the system could recommend (at a theoretical point of view) more than twice of documents that it now does, 11% vs 23% after applied this approach
Enhancing Content-And-Structure Information Retrieval using a Native XML Database
Three approaches to content-and-structure XML retrieval are analysed in this
paper: first by using Zettair, a full-text information retrieval system; second
by using eXist, a native XML database, and third by using a hybrid XML
retrieval system that uses eXist to produce the final answers from likely
relevant articles retrieved by Zettair. INEX 2003 content-and-structure topics
can be classified in two categories: the first retrieving full articles as
final answers, and the second retrieving more specific elements within articles
as final answers. We show that for both topic categories our initial hybrid
system improves the retrieval effectiveness of a native XML database. For
ranking the final answer elements, we propose and evaluate a novel retrieval
model that utilises the structural relationships between the answer elements of
a native XML database and retrieves Coherent Retrieval Elements. The final
results of our experiments show that when the XML retrieval task focusses on
highly relevant elements our hybrid XML retrieval system with the Coherent
Retrieval Elements module is 1.8 times more effective than Zettair and 3 times
more effective than eXist, and yields an effective content-and-structure XML
retrieval
Collaboration Enabling Internet Resource Collection-Building Software and Technologies
Over the last decade the Library of the University of California, Riverside
and its collaborators have developed a number of systems, service designs,
and projects that utilize innovative technologies to foster better Internet
finding tools in libraries and more cooperative and efficient effort in Internet
link and metadata collection building. The open-source software
and projects discussed represent appropriate technologies and sustainable
strategies that we believe will help Internet portals, digital libraries, virtual libraries,
library catalogs-with-portal-like-capabilities (IPDVLCs), and related
collection-building efforts in academia to better scale and more accurately
anticipate and meet the needs of scholarly and educational users.published or submitted for publicatio
Applying semantic web technologies to knowledge sharing in aerospace engineering
This paper details an integrated methodology to optimise Knowledge reuse and sharing, illustrated with a use case in the aeronautics domain. It uses Ontologies as a central modelling strategy for the Capture of Knowledge from legacy docu-ments via automated means, or directly in systems interfacing with Knowledge workers, via user-defined, web-based forms. The domain ontologies used for Knowledge Capture also guide the retrieval of the Knowledge extracted from the data using a Semantic Search System that provides support for multiple modalities during search. This approach has been applied and evaluated successfully within the aerospace domain, and is currently being extended for use in other domains on an increasingly large scale
Processing and Linking Audio Events in Large Multimedia Archives: The EU inEvent Project
In the inEvent EU project [1], we aim at structuring, retrieving, and sharing large archives of networked, and dynamically changing, multimedia recordings, mainly consisting of meetings, videoconferences, and lectures. More specifically, we are developing an integrated system that performs audiovisual processing of multimedia recordings, and labels them in terms of interconnected “hyper-events ” (a notion inspired from hyper-texts). Each hyper-event is composed of simpler facets, including audio-video recordings and metadata, which are then easier to search, retrieve and share. In the present paper, we mainly cover the audio processing aspects of the system, including speech recognition, speaker diarization and linking (across recordings), the use of these features for hyper-event indexing and recommendation, and the search portal. We present initial results for feature extraction from lecture recordings using the TED talks. Index Terms: Networked multimedia events; audio processing: speech recognition; speaker diarization and linking; multimedia indexing and searching; hyper-events. 1
- …