2,068 research outputs found
Fast and Accurate OOV Decoder on High-Level Features
This work proposes a novel approach to out-of-vocabulary (OOV) keyword search
(KWS) task. The proposed approach is based on using high-level features from an
automatic speech recognition (ASR) system, so called phoneme posterior based
(PPB) features, for decoding. These features are obtained by calculating
time-dependent phoneme posterior probabilities from word lattices, followed by
their smoothing. For the PPB features we developed a special novel very fast,
simple and efficient OOV decoder. Experimental results are presented on the
Georgian language from the IARPA Babel Program, which was the test language in
the OpenKWS 2016 evaluation campaign. The results show that in terms of maximum
term weighted value (MTWV) metric and computational speed, for single ASR
systems, the proposed approach significantly outperforms the state-of-the-art
approach based on using in-vocabulary proxies for OOV keywords in the indexed
database. The comparison of the two OOV KWS approaches on the fusion results of
the nine different ASR systems demonstrates that the proposed OOV decoder
outperforms the proxy-based approach in terms of MTWV metric given the
comparable processing speed. Other important advantages of the OOV decoder
include extremely low memory consumption and simplicity of its implementation
and parameter optimization.Comment: Interspeech 2017, August 2017, Stockholm, Sweden. 201
Scientific Information Extraction with Semi-supervised Neural Tagging
This paper addresses the problem of extracting keyphrases from scientific
articles and categorizing them as corresponding to a task, process, or
material. We cast the problem as sequence tagging and introduce semi-supervised
methods to a neural tagging model, which builds on recent advances in named
entity recognition. Since annotated training data is scarce in this domain, we
introduce a graph-based semi-supervised algorithm together with a data
selection scheme to leverage unannotated articles. Both inductive and
transductive semi-supervised learning strategies outperform state-of-the-art
information extraction performance on the 2017 SemEval Task 10 ScienceIE task.Comment: accepted by EMNLP 201
Automatic Quality Estimation for ASR System Combination
Recognizer Output Voting Error Reduction (ROVER) has been widely used for
system combination in automatic speech recognition (ASR). In order to select
the most appropriate words to insert at each position in the output
transcriptions, some ROVER extensions rely on critical information such as
confidence scores and other ASR decoder features. This information, which is
not always available, highly depends on the decoding process and sometimes
tends to over estimate the real quality of the recognized words. In this paper
we propose a novel variant of ROVER that takes advantage of ASR quality
estimation (QE) for ranking the transcriptions at "segment level" instead of:
i) relying on confidence scores, or ii) feeding ROVER with randomly ordered
hypotheses. We first introduce an effective set of features to compensate for
the absence of ASR decoder information. Then, we apply QE techniques to perform
accurate hypothesis ranking at segment-level before starting the fusion
process. The evaluation is carried out on two different tasks, in which we
respectively combine hypotheses coming from independent ASR systems and
multi-microphone recordings. In both tasks, it is assumed that the ASR decoder
information is not available. The proposed approach significantly outperforms
standard ROVER and it is competitive with two strong oracles that e xploit
prior knowledge about the real quality of the hypotheses to be combined.
Compared to standard ROVER, the abs olute WER improvements in the two
evaluation scenarios range from 0.5% to 7.3%
- …