Searching by approximate personal-name matching
We discuss the design, construction and evaluation of a method for accessing a person's information using the name as a search key, even if the name contains deformations. We present a similarity function, the DEA function, based on the probabilities of the edit operations according to the letters involved and their position, and using a variable threshold. Evaluated quantitatively, without human relevance judgments, the efficacy of DEA is far superior to that of known methods. A very efficient approximate search technique for the DEA function, based on a compacted trie structure, is also presented.
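The general idea of an edit distance whose costs depend on the letters involved and on their position, combined with a length-dependent threshold, can be sketched as follows. This is only an illustration under stated assumptions, not the DEA function itself: the confusable letter pairs, the cost values, the first-position penalty and the 0.25 threshold factor are all hypothetical, whereas the abstract describes costs derived from edit-operation probabilities.

```python
# Hypothetical illustration: these costs are NOT the DEA probabilities.
# Letter pairs assumed to be easily confused, so substituting them is cheap.
CONFUSABLE = {frozenset("sz"), frozenset("iy"), frozenset("bv"), frozenset("ck")}


def sub_cost(a, b, pos):
    """Substitution cost depends on the two letters and on the position."""
    if a == b:
        return 0.0
    base = 0.4 if frozenset((a, b)) in CONFUSABLE else 1.0
    # Assumption: an error on the first letter is penalised more heavily.
    return base * (1.5 if pos == 0 else 1.0)


def indel_cost(pos):
    """Insertion/deletion cost, also higher at the first position."""
    return 1.5 if pos == 0 else 1.0


def weighted_distance(s, t):
    """Dynamic-programming edit distance using the costs above."""
    s, t = s.lower(), t.lower()
    # prev[j] = cost of turning the empty prefix of s into t[:j]
    prev = [0.0]
    for j in range(1, len(t) + 1):
        prev.append(prev[-1] + indel_cost(j - 1))
    for i in range(1, len(s) + 1):
        cur = [prev[0] + indel_cost(i - 1)]
        for j in range(1, len(t) + 1):
            cur.append(min(
                prev[j] + indel_cost(i - 1),                     # delete s[i-1]
                cur[j - 1] + indel_cost(j - 1),                  # insert t[j-1]
                prev[j - 1] + sub_cost(s[i - 1], t[j - 1], i - 1),  # substitute
            ))
        prev = cur
    return prev[-1]


def threshold(name):
    """Variable threshold: longer names tolerate more deformation."""
    return 0.25 * len(name)


def matches(query, candidate):
    return weighted_distance(query, candidate) <= threshold(query)
```

With these assumed costs, `matches("gonzalez", "gonzales")` accepts the candidate because the only difference is a cheap s/z substitution late in the name, while `matches("ana", "eva")` rejects it because a short name has a small threshold.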
A summary of the 2012 JHU CLSP Workshop on Zero Resource Speech Technologies and Models of Early Language Acquisition
We summarize the accomplishments of a multi-disciplinary workshop exploring the computational and scientific issues surrounding zero resource (unsupervised) speech technologies and related models of early language acquisition. Centered around the tasks of phonetic and lexical discovery, we consider unified evaluation metrics, present two new approaches for improving speaker independence in the absence of supervision, and evaluate the application of Bayesian word segmentation algorithms to automatic subword unit tokenizations. Finally, we present two strategies for integrating zero resource techniques into supervised settings, demonstrating the potential of unsupervised methods to improve mainstream technologies.
Dyslexic children's reading pattern as input for ASR: Data, analysis, and pronunciation model
To build an automatic speech recognition (ASR) model that can recognize the Bahasa Melayu reading difficulties of dyslexic children, a language corpus has to be created beforehand. For this purpose, data were collected in two public schools from ten dyslexic children aged between seven and fourteen. A total of 114 Bahasa Melayu words, representing 23 consonant-vowel patterns in the spelling system of the language, served as the stimuli. The patterns range from simple to somewhat complex formations of consonant-vowel pairs in words listed in a level one primary school syllabus. An analysis was performed to identify the most frequent errors made by these dyslexic children when reading aloud, and to describe the emerging reading pattern of dyslexic children in general. This paper hence provides an overview of the entire process, from data collection to analysis to modeling the pronunciations of words, which will serve as the active lexicon for the ASR model. It also highlights the challenges of collecting data from dyslexic children reading aloud, and other factors that contribute to the complex nature of the data collected.
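As a rough illustration of what a consonant-vowel pattern of a written word looks like, the sketch below maps a spelling to a string of C and V symbols. It rests on assumptions that the abstract does not confirm: the digraphs ng, ny, sy, kh and gh are each counted as one consonant, every vowel letter is counted separately (diphthongs are not merged), and the function name `cv_pattern` is hypothetical. The paper's own inventory of 23 patterns may be defined differently.

```python
VOWELS = set("aeiou")
# Assumption: these two-letter spellings are treated as single consonants.
DIGRAPHS = ("ng", "ny", "sy", "kh", "gh")


def cv_pattern(word):
    """Return the consonant-vowel pattern of a spelling, e.g. 'CVCV'."""
    word = word.lower()
    pattern = []
    i = 0
    while i < len(word):
        if word[i:i + 2] in DIGRAPHS:
            pattern.append("C")   # digraph counts as one consonant
            i += 2
        elif word[i] in VOWELS:
            pattern.append("V")
            i += 1
        elif word[i].isalpha():
            pattern.append("C")
            i += 1
        else:
            i += 1                # skip hyphens, apostrophes, and so on
    return "".join(pattern)
```

Under these assumptions, "buku" and "bunga" both map to CVCV, while "makan" maps to CVCVC.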