
    SpellMapper: A non-autoregressive neural spellchecker for ASR customization with candidate retrieval based on n-gram mappings

    Full text link
    Contextual spelling correction models are an alternative to shallow fusion for improving automatic speech recognition (ASR) quality given a user vocabulary. To handle large user vocabularies, most of these models include candidate retrieval mechanisms, usually based on the minimum edit distance between fragments of the ASR hypothesis and user phrases. However, the edit-distance approach is slow, non-trainable, and may have low recall, as it relies only on common letters. We propose: 1) a novel algorithm for candidate retrieval, based on misspelled n-gram mappings, which gives up to 90% recall with just the top 10 candidates on Spoken Wikipedia; 2) a non-autoregressive neural model based on the BERT architecture, where the initial transcript and ten candidates are combined into one input. Experiments on Spoken Wikipedia show a 21.4% word error rate improvement compared to a baseline ASR system.

    Comment: Accepted by INTERSPEECH 2023
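
    A minimal Python sketch of the retrieval idea, for concreteness: hypothesis n-grams are expanded through a mapping table and vote for indexed user phrases. The mapping table here is a hand-written stand-in; the paper learns its n-gram mappings from data, and all names below are illustrative.

        from collections import Counter, defaultdict

        def char_ngrams(s, n=3):
            s = f"#{s}#"  # boundary markers
            return {s[i:i + n] for i in range(len(s) - n + 1)}

        class NgramCandidateRetriever:
            """Toy index from character n-grams of user phrases to phrase ids."""

            def __init__(self, phrases, mapping=None, n=3):
                self.n = n
                self.mapping = mapping or {}   # misspelled n-gram -> intended n-grams
                self.phrases = phrases
                self.index = defaultdict(set)  # n-gram -> set of phrase ids
                for pid, phrase in enumerate(phrases):
                    for g in char_ngrams(phrase, n):
                        self.index[g].add(pid)

            def retrieve(self, fragment, top_k=10):
                votes = Counter()
                for g in char_ngrams(fragment, self.n):
                    # expand each hypothesis n-gram with its known mappings
                    for mapped in {g, *self.mapping.get(g, ())}:
                        for pid in self.index.get(mapped, ()):
                            votes[pid] += 1
                return [self.phrases[pid] for pid, _ in votes.most_common(top_k)]

        retriever = NgramCandidateRetriever(
            ["tchaikovsky", "shostakovich"],
            mapping={"cha": ["tch"]},  # hypothetical learned mapping
        )
        print(retriever.retrieve("chaikovsky"))  # 'tchaikovsky' ranked first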

    A large list of confusion sets for spellchecking assessed against a corpus of real-word errors

    Get PDF
    One of the methods proposed for dealing with real-word errors (errors in which a correctly spelled word is substituted for the one intended) is the "confusion-set" approach, a confusion set being a small group of words that are likely to be confused with one another. Using a list of confusion sets drawn up in advance, a spellchecker, on finding one of these words in a text, can assess whether another member of its set would be a better fit and, if it appears to be so, propose that word as a correction. Much of the research using this approach has suffered from two weaknesses: the small number of confusion sets used, and the fact that systems have largely been tested on artificial errors. In this paper we address both weaknesses. We describe the creation of a realistically sized list of confusion sets, then the assembling of a corpus of real-word errors, and then we assess the potential of that list in relation to that corpus.
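
    A minimal sketch of the confusion-set mechanism described above, with a toy co-occurrence table standing in for a trained context model; the sets, counts, and scoring are hypothetical.

        CONFUSION_SETS = [{"their", "there"}, {"casual", "causal"}]

        # hypothetical (word, next-word) counts standing in for a learned model
        CONTEXT_SCORES = {
            ("casual", "relationship"): 5,
            ("causal", "relationship"): 40,
        }

        def suggest(tokens):
            """Flag set members whose alternatives fit the context better."""
            suggestions = []
            for i, word in enumerate(tokens):
                for cs in CONFUSION_SETS:
                    if word not in cs:
                        continue
                    nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
                    score = lambda w: CONTEXT_SCORES.get((w, nxt), 0)
                    best = max(cs, key=score)
                    if best != word and score(best) > score(word):
                        suggestions.append((i, word, best))
            return suggestions

        print(suggest("a casual relationship between smoking and cancer".split()))
        # [(1, 'casual', 'causal')]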

    Locality enhanced dynamic biasing and sampling strategies for contextual ASR

    Full text link
    Automatic Speech Recognition (ASR) systems still face challenges when recognizing time-variant rare phrases. Contextual biasing (CB) modules bias the ASR model towards such contextually relevant phrases. During training, a list of biasing phrases is selected from a large pool of phrases following a sampling strategy. In this work we first analyse different sampling strategies to provide insights into the training of CB for ASR, using correlation plots between the bias embeddings at various training stages. Secondly, we introduce a neighbourhood attention (NA) mechanism that localizes self-attention (SA) to the nearest neighbouring frames to further refine the CB output. The results show that the proposed approach provides on average a 25.84% relative WER improvement on LibriSpeech sets and rare-word evaluation compared to the baseline.

    Comment: Accepted for IEEE ASRU 2023
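
    The localization idea can be sketched in a few lines: each frame attends only to frames within a fixed window of itself. This is a single-head NumPy toy without learned projections; the paper's NA module and its coupling to the CB output are more involved.

        import numpy as np

        def neighbourhood_attention(q, k, v, window=2):
            """Self-attention restricted to +/- `window` neighbouring frames."""
            T, d = q.shape
            scores = q @ k.T / np.sqrt(d)                        # (T, T) logits
            idx = np.arange(T)
            far = np.abs(idx[:, None] - idx[None, :]) > window   # outside window
            scores[far] = -np.inf                                # block far frames
            w = np.exp(scores - scores.max(axis=-1, keepdims=True))
            w /= w.sum(axis=-1, keepdims=True)                   # row-wise softmax
            return w @ v

        rng = np.random.default_rng(0)
        q, k, v = (rng.standard_normal((8, 16)) for _ in range(3))
        print(neighbourhood_attention(q, k, v).shape)  # (8, 16)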

    Multidimensional Pareto optimization of touchscreen keyboards for speed, familiarity and improved spell checking

    Get PDF
    The paper presents a new optimization technique for keyboard layouts based on Pareto front optimization. We used this multifactorial technique to create two new touchscreen phone keyboard layouts based on three design metrics: minimizing finger travel distance in order to maximize text entry speed; a new metric maximizing spell-correction quality by minimizing neighbouring-key ambiguity; and maximizing familiarity through a similarity function with the standard Qwerty layout. The paper describes the optimization process and the resulting layouts for a standard trapezoid-shaped keyboard and a more rectangular layout. Fitts' law modelling predicts an 11% improvement in entry speed, without taking into account the significantly improved error correction potential and its subsequent effect on speed. In initial user tests, typing speed dropped from approx. 21 wpm with Qwerty to 13 wpm (64%) on first use of our layout, but recovered to 18 wpm (85%) within four short trial sessions and was still improving. NASA TLX forms showed no significant difference in load between Qwerty and our new layout in the fourth session. Together, we believe this shows the new layouts are faster and can be quickly adopted by users.
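
    As a sketch of the multi-objective step, the snippet below filters candidate layouts to the Pareto front over the three metrics; the metric values are invented for illustration and all three are oriented so that higher is better (travel distance would be negated in practice).

        def pareto_front(candidates):
            """candidates: list of (name, (speed, spellcheck, familiarity))."""
            front = []
            for name, scores in candidates:
                dominated = any(
                    other != scores and all(o >= s for o, s in zip(other, scores))
                    for _, other in candidates
                )
                if not dominated:
                    front.append(name)
            return front

        layouts = [  # hypothetical metric values, not the paper's numbers
            ("qwerty",   (0.2, 0.3, 1.0)),
            ("layout_a", (0.9, 0.8, 0.6)),
            ("layout_b", (0.5, 0.4, 0.5)),  # dominated by layout_a
        ]
        print(pareto_front(layouts))  # ['qwerty', 'layout_a']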

    Studying the Effect and Treatment of Misspelled Queries in Cross-Language Information Retrieval

    Get PDF
    The performance of Information Retrieval systems is limited by the linguistic variation present in natural language texts. Word-level Natural Language Processing techniques have been shown to be useful in reducing this variation. In this article, we summarize our work on the extension of these techniques to deal with phrase-level variation in European languages, taking Spanish as a case in point. We propose the use of syntactic dependencies as complex index terms in an attempt to solve the problems deriving from both syntactic and morpho-syntactic variation and, in this way, to obtain more precise index terms. Such dependencies are obtained through a shallow parser based on cascades of finite-state transducers, in order to reduce as far as possible the overhead due to this parsing process. The use of different sources of syntactic information, queries or documents, has also been studied, as has restricting the dependencies to those obtained from noun phrases. Our approaches have been tested using the CLEF corpus, obtaining consistent improvements over classical word-level non-linguistic techniques. Results show, on the one hand, that syntactic information extracted from documents is more useful than that from queries. On the other hand, restricting dependencies to those corresponding to noun phrases yields important reductions in storage and management costs, albeit at the expense of a slight reduction in performance.

    Funding: Ministerio de Economía y Competitividad (FFI2014-51978-C2-1-R, BES-2015-073768, FFI2014-51978-C2-2-); Rede Galega de Procesamento da Linguaxe e Recuperación de Información (CN2014/034)
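
    A toy illustration of complex index terms: head-modifier pairs extracted from tagged text, with two hard-coded patterns standing in for the cascade of finite-state transducers; the tagset and patterns are assumptions for illustration only.

        def dependency_terms(tagged):
            """Extract (head, modifier) pairs from a (tag, word) sequence."""
            terms = []
            for (t1, w1), (t2, w2) in zip(tagged, tagged[1:]):
                if t1 == "ADJ" and t2 == "NOUN":     # adjectival modifier
                    terms.append((w2, w1))
                elif t1 == "NOUN" and t2 == "NOUN":  # noun-noun compound
                    terms.append((w2, w1))
            return terms

        tagged = [("ADJ", "morpho-syntactic"), ("NOUN", "variation")]
        print(dependency_terms(tagged))  # [('variation', 'morpho-syntactic')]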

    Figure Text Extraction in Biomedical Literature

    Get PDF
    Background: Figures are ubiquitous in biomedical full-text articles, and they represent important biomedical knowledge. However, the sheer volume of biomedical publications has made it necessary to develop computational approaches for accessing figures. Therefore, we are developing the Biomedical Figure Search engine.

    A Winnow-Based Approach to Context-Sensitive Spelling Correction

    Full text link
    A large class of machine-learning problems in natural language require the characterization of linguistic context. Two characteristic properties of such problems are that their feature space is of very high dimensionality, and their target concepts refer to only a small subset of the features in the space. Under such conditions, multiplicative weight-update algorithms such as Winnow have been shown to have exceptionally good theoretical properties. We present an algorithm combining variants of Winnow and weighted-majority voting, and apply it to a problem in the aforementioned class: context-sensitive spelling correction. This is the task of fixing spelling errors that happen to result in valid words, such as substituting "to" for "too", "casual" for "causal", etc. We evaluate our algorithm, WinSpell, by comparing it against BaySpell, a statistics-based method representing the state of the art for this task. We find: (1) When run with a full (unpruned) set of features, WinSpell achieves accuracies significantly higher than BaySpell was able to achieve in either the pruned or unpruned condition; (2) When compared with other systems in the literature, WinSpell exhibits the highest performance; (3) The primary reason that WinSpell outperforms BaySpell is that WinSpell learns a better linear separator; (4) When run on a test set drawn from a different corpus than the training set was drawn from, WinSpell is better able than BaySpell to adapt, using a strategy we will present that combines supervised learning on the training set with unsupervised learning on the (noisy) test set.

    Comment: To appear in Machine Learning, Special Issue on Natural Language Learning, 1999. 25 pages
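
    For readers unfamiliar with Winnow, here is a minimal sketch of its multiplicative updates over binary features; WinSpell additionally combines Winnow variants with weighted-majority voting, which this toy omits.

        def winnow_train(examples, n_features, alpha=2.0, epochs=5):
            """examples: (active feature ids, label in {0, 1}) pairs."""
            w = [1.0] * n_features
            theta = n_features / 2          # a common threshold choice
            for _ in range(epochs):
                for x, y in examples:
                    pred = 1 if sum(w[i] for i in x) > theta else 0
                    if pred != y:           # mistake-driven update
                        factor = alpha if y == 1 else 1 / alpha
                        for i in x:
                            w[i] *= factor  # promote or demote active features
            return w, theta

        # toy task: feature 0 alone determines the class
        examples = [({0, 1}, 1), ({1, 2}, 0), ({0, 3}, 1), ({2, 3}, 0)]
        w, theta = winnow_train(examples, n_features=4)
        print([round(v, 2) for v in w], theta)  # weight on feature 0 grows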

    Implementation of an Autocomplete Feature and the Levenshtein Distance Algorithm to Improve the Effectiveness of Word Search in the Kamus Besar Bahasa Indonesia (KBBI)

    Full text link
    This study implements an autocomplete feature and the Levenshtein distance algorithm in a KBBI application and evaluates their effectiveness in the word-search feature. The software was developed using the waterfall method, which consists of five stages: requirement definitions, system and software design, implementation and unit testing, integration and system testing, and operation and maintenance. Black-box testing showed that autocomplete suggestions appeared for every word entered. In the Levenshtein distance tests, suggestions appeared, although not all of them matched expectations, and testing of the overall application produced valid output for every menu tested. The measured effectiveness of the autocomplete implementation was 84.615%, meaning the feature is highly effective, and that of the Levenshtein distance algorithm was 76.04%, meaning it is effective for use in the KBBI application. The study recommends adding search menus for regional words and expressions, foreign words and expressions, and synonyms and acronyms, so that the digital dictionary becomes as complete as the print version.
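
    A minimal sketch of the two features the study evaluates, prefix-based autocomplete and Levenshtein distance for ranking suggestions; the word list is a toy stand-in for the KBBI data.

        def levenshtein(a, b):
            """Edit distance via the standard two-row dynamic program."""
            prev = list(range(len(b) + 1))
            for i, ca in enumerate(a, 1):
                cur = [i]
                for j, cb in enumerate(b, 1):
                    cur.append(min(prev[j] + 1,                 # deletion
                                   cur[j - 1] + 1,              # insertion
                                   prev[j - 1] + (ca != cb)))   # substitution
                prev = cur
            return prev[-1]

        def autocomplete(prefix, words):
            return [w for w in words if w.startswith(prefix)]

        WORDS = ["kamus", "kata", "kalimat"]  # toy KBBI entries
        print(autocomplete("ka", WORDS))                          # all three
        print(min(WORDS, key=lambda w: levenshtein("kamsu", w)))  # 'kamus'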

    Misspelled queries in cross-language IR: analysis and management

    Get PDF
    This paper studies the impact of misspelled queries on the performance of Cross-Language Information Retrieval systems and proposes two strategies for dealing with them: the use of automatic spelling correction techniques, and the use of character n-grams both as index terms and as translation units, thus allowing us to take advantage of their inherent robustness. Our results demonstrate the sensitivity of these systems to such errors and the effectiveness of the proposed solutions. To the best of our knowledge, there is no similar work in the cross-language field.

    Funding: Work partially supported by the Ministerio de Economía y Competitividad and FEDER (projects TIN2010-18552-C03-01 and TIN2010-18552-C03-02) and by the Xunta de Galicia (grants CN 2012/008, CN 2012/317 and CN 2012/319)
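
    The robustness argument for character n-grams can be made concrete: a misspelled query still shares most of its n-grams with the intended index term, so matching degrades gracefully instead of failing outright. The Jaccard similarity below is an illustrative choice, not the paper's retrieval model.

        def ngrams(word, n=4):
            word = f"_{word}_"  # boundary markers
            return {word[i:i + n] for i in range(len(word) - n + 1)}

        def ngram_overlap(query, term, n=4):
            q, t = ngrams(query, n), ngrams(term, n)
            return len(q & t) / len(q | t)  # Jaccard similarity

        # the misspelling still overlaps strongly with the intended term
        print(round(ngram_overlap("goverment", "government"), 2))  # 0.42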