22,855 research outputs found
An active learning approach for statistical spoken language understanding
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-25085-9_67
In general, a large amount of segmented and labeled data is needed to estimate statistical language understanding systems. In recent years, different approaches have been proposed to reduce the segmentation and labeling effort by means of unsupervised or semi-supervised learning techniques. We propose an active learning approach to the estimation of statistical language understanding models that involves the transcription, labeling and segmentation of a small amount of data, along with the use of raw data. We use this approach to learn the understanding component of a Spoken Dialog System. Some experiments that show the appropriateness of our approach are also presented.
Work partially supported by the Spanish MICINN under contract TIN2008-06856-C05-02, and by the Vicerrectorat d’Investigació, Desenvolupament i Innovació of the Universitat Politècnica de València under contract 20100982.
García Granada, F.; Hurtado Oliver, LF.; Sanchís Arnal, E.; Segarra Soriano, E. (2011). An active learning approach for statistical spoken language understanding. In Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. Springer Verlag (Germany). 7042:565-572. https://doi.org/10.1007/978-3-642-25085-9_67
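The abstract of this entry does not say how utterances are chosen for manual annotation. As a rough illustration of the general active learning idea only, the sketch below assumes that a current model can assign a confidence score to each raw utterance, and that the least confident ones are sent to a human for transcription, segmentation and labeling. The function name `split_by_confidence`, the scores and the budget are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Tuple


def split_by_confidence(
    utterances: List[str],
    score: Callable[[str], float],
    budget: int,
) -> Tuple[List[str], List[str]]:
    """Split raw utterances into (to_annotate, auto_labeled).

    `score` is an assumed confidence function (higher = more confident).
    The `budget` least confident utterances go to manual annotation;
    the rest keep the labels proposed by the current model.
    """
    ranked = sorted(utterances, key=score)  # least confident first
    return ranked[:budget], ranked[budget:]


# Hypothetical confidence scores, for illustration only.
toy_confidence = {
    "i want a ticket to valencia": 0.92,
    "uh the the train erm tomorrow": 0.31,
    "what time does it leave": 0.85,
    "no not that one the other": 0.47,
}

to_annotate, auto_labeled = split_by_confidence(
    list(toy_confidence),
    lambda utterance: toy_confidence[utterance],
    budget=2,
)
print(to_annotate)
print(auto_labeled)
```

In a full loop, the manually annotated utterances would be added to the training set, the model re-estimated, and the selection repeated on the remaining raw data.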
Robustness issues in a data-driven spoken language understanding system
Robustness is a key requirement in spoken language understanding (SLU) systems. Human speech is often ungrammatical and ill-formed, and there will frequently be a mismatch between training and test data. This paper discusses robustness and adaptation issues in a statistically-based SLU system which is entirely data-driven. To test robustness, the system has been evaluated on data from the Air Travel Information Service (ATIS) domain which has been artificially corrupted with varying levels of additive noise. Although the speech recognition performance degraded steadily, the system did not fail catastrophically. Indeed, the rate at which the end-to-end performance of the complete system degraded was significantly slower than that of the actual recognition component. In a second set of experiments, the ability to rapidly adapt the core understanding component of the system to a different application within the same broad domain has been tested. Using only a small amount of training data, experiments have shown that a semantic parser based on the Hidden Vector State (HVS) model originally trained on the ATIS corpus can be straightforwardly adapted to the somewhat different DARPA Communicator task using standard adaptation algorithms. The paper concludes by suggesting that the results presented provide initial support to the claim that an SLU system which is statistically-based and trained entirely from data is intrinsically robust and can be readily adapted to new applications.
A multilingual SLU system based on semantic decoding of graphs of words
In this paper, we present a statistical approach to Language
Understanding that avoids the effort of obtaining new semantic
models when changing the language. This way, it is not necessary to acquire
and label new training corpora in the new language. Our approach
consists of learning all the semantic models in a target language and
performing the semantic decoding of the sentences pronounced in the source
language after a translation process. In order to deal with the errors and
the lack of coverage of the translations, a mechanism to generalize the
result of several translators is proposed. The graph of words generated
in this phase is the input to the semantic decoding algorithm specifically
designed to combine statistical models and graphs of words. Some experiments
that show the good behavior of the proposed approach are also
presented.
Calvo Lance, M.; Hurtado Oliver, LF.; García Granada, F.; Sanchís Arnal, E. (2012). A multilingual SLU system based on semantic decoding of graphs of words. In Advances in Speech and Language Technologies for Iberian Languages. Springer Verlag (Germany). 328:158-167. doi:10.1007/978-3-642-35292-8_17
Combining Several ASR Outputs in a Graph-Based SLU System
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-25751-8_66
In this paper, we present an approach to Spoken Language
Understanding (SLU) where we perform a combination of multiple
hypotheses from several Automatic Speech Recognizers (ASRs) in
order to reduce the impact of recognition errors in the SLU module. This
combination is performed using a Grammatical Inference algorithm that
provides a generalization of the input sentences by means of a weighted
graph of words. We have also developed a specific SLU algorithm that is
able to process these graphs of words according to a stochastic semantic
modelling. The results show that the combinations of several hypotheses
from the ASR module outperform the results obtained by taking just the
1-best transcription.
This work is partially supported by the Spanish MEC under contract TIN2014-54288-C4-3-R and FPU Grant AP2010-4193.
Calvo Lance, M.; Hurtado Oliver, LF.; García-Granada, F.; Sanchís Arnal, E. (2015). Combining Several ASR Outputs in a Graph-Based SLU System. In Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. Springer. 551-558. https://doi.org/10.1007/978-3-319-25751-8_66
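The abstract above describes merging several ASR hypotheses into a weighted graph of words, but it does not detail the Grammatical Inference algorithm used. The sketch below is not that algorithm. It is a much simpler stand-in, assumed here only for illustration, that links consecutive words across all hypotheses and weights each edge by its relative frequency. The function name `build_word_graph`, the `<s>` and `</s>` markers and the example sentences are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List

START, END = "<s>", "</s>"  # assumed sentence boundary markers


def build_word_graph(hypotheses: List[str]) -> Dict[str, Dict[str, float]]:
    """Merge whitespace-tokenized hypotheses into a weighted word graph.

    Nodes are words; the weight of edge (prev -> next) is the fraction
    of times `next` followed `prev` over all hypotheses.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for hypothesis in hypotheses:
        tokens = [START, *hypothesis.split(), END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1

    graph: Dict[str, Dict[str, float]] = {}
    for prev, successors in counts.items():
        total = sum(successors.values())
        graph[prev] = {word: n / total for word, n in successors.items()}
    return graph


# Three hypothetical recognizer outputs for the same utterance.
graph = build_word_graph([
    "a ticket to valencia",
    "a ticket to palencia",
    "the ticket to valencia",
])
print(graph["to"])
```

Even this simple graph generalizes its input: the path "the ticket to palencia" is accepted although no recognizer produced it, which is the kind of generalization a downstream semantic decoder can exploit.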
- …