144 research outputs found

    FT Speech: Danish Parliament Speech Corpus

    This paper introduces FT Speech, a new speech corpus created from the recorded meetings of the Danish Parliament, otherwise known as the Folketing (FT). The corpus contains over 1,800 hours of transcribed speech by a total of 434 speakers. It is significantly larger in duration, vocabulary, and amount of spontaneous speech than the existing public speech corpora for Danish, which are largely limited to read-aloud and dictation data. We outline design considerations, including the preprocessing methods and the alignment procedure. To evaluate the quality of the corpus, we train automatic speech recognition systems on the new resource and compare them to the systems trained on the Danish part of Språkbanken, the largest public ASR corpus for Danish to date. Our baseline results show that we achieve a 14.01 WER on the new corpus. A combination of FT Speech with in-domain language data provides comparable results to models trained specifically on Språkbanken, showing that FT Speech transfers well to this data set. Interestingly, our results demonstrate that the opposite is not the case. This shows that FT Speech provides a valuable resource for promoting research on Danish ASR with more spontaneous speech. Comment: Submitted to Interspeech 202
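
    The abstract reports its baseline as a word error rate (WER). As a rough illustration only, and not the authors' evaluation pipeline, WER is conventionally the word-level edit distance between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the choice to return a percentage are assumptions made for this sketch.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage, using whitespace tokenization (an assumption)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# One substitution and one deletion against four reference words.
print(word_error_rate("a b c d", "a x c"))
```

    Real evaluations usually normalize case and punctuation before scoring; that step is omitted here.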

    Morphological, syntactic and diacritics rules for automatic diacritization of Arabic sentences

    The diacritical marks of the Arabic language are characters other than letters and are, in the majority of cases, absent from Arabic writing. This paper presents a hybrid system for automatic diacritization of Arabic sentences that combines linguistic rules and statistical treatments. The approach has four stages. The first stage performs a morphological analysis using the second version of the morphological analyzer Alkhalil Morpho Sys. The morphosyntactic outputs of this step are used in the second stage to eliminate invalid word transitions according to the syntactic rules. The third stage uses a discrete hidden Markov model and the Viterbi algorithm to determine the most probable diacritized sentence. Transitions unseen in the training corpus are handled with smoothing techniques. Finally, the last stage deals with words not analyzed by the Alkhalil analyzer, for which we use letter-based statistical treatments. The word error rate of our system is around 2.58% if we ignore the diacritic of the last letter of the word and around 6.28% when this diacritic is taken into account.
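
    The third stage relies on Viterbi decoding over a discrete hidden Markov model. The sketch below shows the generic algorithm in log space on a toy two-state model. The state names, observation symbols, and probabilities are invented for illustration and have nothing to do with the paper's actual model, and unseen transitions simply get probability zero here, whereas the paper applies smoothing.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for a discrete HMM."""
    if not observations:
        return []

    def log(p):
        # Zero probabilities map to -inf so they never win the max.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at time t.
    best = [{s: log(start_p[s]) + log(emit_p[s].get(observations[0], 0.0))
             for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev_state, score = max(
                ((p, best[t - 1][p] + log(trans_p[p].get(s, 0.0))) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s].get(observations[t], 0.0))
            back[t][s] = prev_state

    # Follow the back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Toy model (all values hypothetical).
states = ["A", "B"]
start_p = {"A": 0.6, "B": 0.4}
trans_p = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit_p = {"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.2, "y": 0.8}}

print(viterbi(["x", "y", "y"], states, start_p, trans_p, emit_p))  # ['A', 'B', 'B']
```

    In a diacritization setting the hidden states would presumably be candidate diacritized forms and the observations the undiacritized words, but that mapping is an assumption rather than something the abstract spells out.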

    Effects of Encoding Practice on Alphabet, Phonemic Awareness, and Spelling Skills of Students with Developmental Delays

    Reading instruction has historically been deemphasized for students in special education, and the limited research on this topic reveals that sight word vocabulary is most commonly taught in special education classrooms (Browder, Wakeman, Spooner, Ahlgrim-Delzell, Algozzine, 2006). However, successful reading instruction must target the five essential components: vocabulary, fluency, comprehension, phonics, and phonemic awareness (National Reading Panel, 2000). The extremely small body of research attempting to teach phonics and phonemic awareness to students with mild to severe disabilities approaches instruction from a decoding framework with mixed success (Browder et al., 2006). Alternatively, this study aims to teach from an encoding framework. Encoding is the process of converting speech sounds to print by applying the alphabetic code (Herron, 2008). Students are actively engaged in the process relying on their current level of knowledge to construct words. Any attempt is viewed as a success that can be gradually improved by feedback and increased phonological and phonemic awareness. This study investigated whether encoding practice embedded in a narrative context would improve participants’ developmental spelling patterns across intervention sessions, and whether scores on measures of phonological awareness, alphabetic knowledge, print knowledge, language abilities, and spelling would improve following the 18 intervention sessions. Prior to any intervention, participants completed multiple baseline probes attempting to spell three lists of target words that were randomly selected from the words that would be targeted during intervention. Immediately before intervention sessions, participants attempted to spell five target words independently. During intervention sessions, the same five words were practiced in a narrative context with scaffolding and feedback (i.e., examiner and Phonic Faces). 
Participants again attempted to spell the same five target words independently immediately following the intervention session. On average, participants’ spelling attempts improved following intervention sessions. One participant made expected positive changes in encoding abilities from baseline to intervention, while the other participants made inconsistent progress. From pretest to posttest, participants made clinically significant gains on standardized measures of phonological awareness, vocabulary, and language measures. Findings of the study suggest that students with developmental disabilities have the potential to learn early reading skills when given direct instruction and practice

    Considerations on some aspects of the relationship between intrinsic and extrinsic time in two 4-year-old children’s and one adult’s speech for duration in Brazilian Portuguese

    This work examines the differences between two children, with a mean age of 4 years and 5 months, and their teacher in the implementation of the acoustic parameter of duration. The children and the teacher are speakers of Brazilian Portuguese (BP) and were recorded in a sentence-repetition experiment based on a model provided by the researcher. Duration has proven to be the main acoustic parameter in the implementation of lexical stress in BP. A drastic reduction of acoustic segments in post-tonic position has been observed in the durational contour of adult speech. However, the literature suggests, and the data presented here point in the same direction, that children under 6 years of age do not yet reduce such segments as adults do, mainly because of the ongoing process of neuromotor maturation. The data show how the two children studied here implement the durational contour of BP differently from the teacher and also differently from each other; the question is always how to handle the reduction of acoustic segments in unstressed positions, given their neuromotor capacities at the time of recording. This brings into discussion the relationship between intrinsic time (that of the dynamic unit adopted in the analysis, the articulatory gesture) and extrinsic time (that of the clocks at the level of the syllable and of the phrase or intonational phrase, which interact with the time of the articulatory gesture), considering, to that end, how the children implement the duration parameter in the segment, in the word, and in the durational contour of the sentence. It is proposed that an extrinsic clock at the phrase level is adjusted before the extrinsic clock at the syllable level, without the need to postulate an extrinsic clock at the level of lexical stress, since words, especially in the speech of the younger child studied here, appear to be treated as intonational phrases. All of this can be accommodated in a dynamic model of rhythm such as the one proposed by Barbosa (2001).

    Map Task Corpus of Heritage BCMS spoken by second-generation speakers in Switzerland

    In this paper, we present a corpus for heritage Bosnian/Croatian/Montenegrin/Serbian (BCMS) spoken in German-speaking Switzerland. The corpus consists of elicited conversations between 29 second-generation speakers originating from different regions of former Yugoslavia. In total, the corpus contains 30 turn-aligned transcripts with an average length of 6 min. It is enriched with extensive speakers’ metadata, annotations, and pre-calculated corpus counts. The corpus can be accessed through an interactive corpus platform that allows for browsing, querying, and filtering, but also for creating and sharing custom annotations. Principal user groups we address with this corpus are researchers of heritage BCMS, as well as students and teachers of BCMS living in diaspora. In addition to introducing the corpus platform and the workflows we adopted to create it, we also present a case study on BCMS spoken by a pair of siblings who participated in the map task, and discuss advantages and challenges of using this corpus platform for linguistic research

    The Relationship Between Predictive Reading and Predictive Spelling Strategies Using Cloze Tests

    The purpose of this research was to determine if there was a relationship between predictive strategies used in reading and predictive strategies used in spelling and to see if both could be measured using cloze tests. A secondary purpose was to see if there was a relationship between a spelling score on a standardized test and a score on a spelling cloze test; a relationship between a reading comprehension score on a standardized test and a score on a reading cloze test; a relationship between a standardized spelling test score and a score on a spelling word selection test; a relationship between a standardized reading comprehension test score and a standardized spelling test score; and a relationship between a spelling cloze score and a spelling word selection test score. The reading cloze test, spelling cloze test and the spelling word selection test were examiner-designed. The reading cloze test, consisting of forty-nine scored blanks, and the spelling cloze test, consisting of eighteen nonsense words with one or two-letter blanks per word, sought to determine if a subject used predictive strategies. The spelling word selection test consisted of twenty-five groups of three pseudo words and one nonsense word. The standardized spelling and reading comprehension test scores were taken from the Stanford Achievement Test. All tests were administered to a total of eighty-nine students: fifty-five regular sixth graders, fourteen gifted sixth graders and eleven gifted fifth graders who were in the same reading class, and nine learning disabled students who were not doing sixth grade work but were of sixth grade age. Class placement of subjects was determined by the school district. The reading cloze test was scored using synonyms as correct answers. The spelling cloze test was constructed using spelling rules and patterns but some answers which did not conform to these were also accepted if the nonsense word could be pronounced and if it looked like a real word. 
In each group in the spelling word selection test the nonsense word was the only correct answer. All test data (examiner-designed and standardized) were analyzed using raw scores. Results showed a significant linear correlation between all relationships studied. An informal analysis using averages was used on the time and the score from the examiner-designed tests. The examiner kept track of the time each test was begun and subjects recorded the time each test was completed. Class lists were used to break down the regular sixth graders into reading groups. The gifted fifth and sixth graders and the learning disabled class were already separate reading groups. The informal data analysis showed that on the average, the high reading group of regular sixth graders and the gifted fifth and sixth graders scored higher on all three tests. The medium reading group of regular sixth graders did not do as well and the low reading group of regular sixth graders and the learning disabled class scored even lower. The time factor appeared to have little impact on scores since often the poor readers took as much time as the better ones but still did not do as well

    A prefix encoding for a constructed language

    This work focuses on the formal and technical analysis of some aspects of a constructed language. The first part studies a possible coding for the language, emphasizing prefix coding, for which an extension of the Huffman algorithm from binary to n-ary is implemented. Because the frequency of use of the words in the language cannot be known a priori, a study is carried out and several strategies are proposed for an open word system, after analyzing the number of words in current natural languages. As a possible improvement to the coding, the work also looks at the loss-of-synchronization problem and its solution, self-synchronization, through a study of T-codes with the number of possible words for the language, as well as other alternatives. Finally, from a less formal perspective, several applications for the language have been developed: a voice synthesizer, a speech recognition system, and a system font for using the language in text processors. For each application, the construction process is detailed, along with the problems encountered and those still to be solved.
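
    The abstract mentions extending the Huffman algorithm from binary to n-ary. A minimal sketch of the textbook generalization follows: pad the symbol set with zero-weight dummies until the count fits a full n-ary tree, then repeatedly merge the n lightest nodes. The function name `nary_huffman`, the digit-string code representation, and the sample frequencies are all assumptions; this is not the implementation described in the work.

```python
import heapq
from itertools import count


def nary_huffman(freqs, n=3):
    """Build an n-ary prefix code; returns {symbol: string of digits 0..n-1}."""
    if not 2 <= n <= 10:
        raise ValueError("this sketch encodes digits as single characters, so 2 <= n <= 10")
    if not freqs:
        return {}

    tiebreak = count()  # unique counter so heap entries never compare nodes
    heap = [(w, next(tiebreak), ("leaf", sym)) for sym, w in freqs.items()]

    # A full n-ary tree needs (leaves - 1) to be divisible by (n - 1);
    # zero-weight padding nodes make up the difference.
    while len(heap) > 1 and (len(heap) - 1) % (n - 1) != 0:
        heap.append((0, next(tiebreak), ("pad",)))
    heapq.heapify(heap)

    # Merge the n lightest nodes until a single root remains.
    while len(heap) > 1:
        group = [heapq.heappop(heap) for _ in range(n)]
        total = sum(weight for weight, _, _ in group)
        children = [node for _, _, node in group]
        heapq.heappush(heap, (total, next(tiebreak), ("node", children)))

    codes = {}

    def walk(node, prefix):
        if node[0] == "leaf":
            codes[node[1]] = prefix or "0"  # a lone symbol still needs one digit
        elif node[0] == "node":
            for digit, child in enumerate(node[1]):
                walk(child, prefix + str(digit))
        # padding nodes receive no code

    walk(heap[0][2], "")
    return codes


# Hypothetical word frequencies, ternary alphabet.
codes = nary_huffman({"a": 5, "b": 2, "c": 1, "d": 1}, n=3)
print(codes)
```

    More frequent symbols end up closer to the root and so get shorter codes, and because only leaves carry symbols, no code is a prefix of another.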