174 research outputs found

    End-to-End Disfluency Detection in Automatic Speech Recognition for Second Language Learners

    Speech from second language (L2) learners is a major challenge for Automatic Speech Recognition (ASR) models: depending on the speaker's proficiency level, it contains many grammatical errors, mispronunciations and disfluencies. Disfluency detection has conventionally been carried out as an added step after the ASR pipeline. This is inconvenient, because additional data must be prepared beyond that used for ASR, and a supplemental model must be fine-tuned and incorporated into the downstream task. Conventional ASR systems comprise separate components: an acoustic model, a language model and a lexicon. End-to-end ASR simplifies this pipeline by mapping acoustic feature sequences directly to word sequences, without additional modules. As end-to-end systems streamline the ASR process, this thesis investigates incorporating disfluency detection into the same low-resource end-to-end ASR task, thus eliminating the need for a separate component and ultimately reducing computation. The disfluency detection models in this work are developed for L2 speakers learning Finnish, and obtain good performance without substantially deviating from an end-to-end L2 Finnish ASR baseline. The best model's ASR performance is promising, reaching a word error rate of 30.41 % and a character error rate of 13.17 %. For disfluency detection, the model obtains a recall of 0.5655 and a precision of 0.6017. The results are encouraging, as the models can successfully extrapolate different disfluency types from low-resource L2 Finnish speech.
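
    For readers unfamiliar with the metrics quoted in this abstract, the following is a minimal sketch of how a word error rate is commonly computed (word-level edit distance divided by the reference length) and how precision and recall combine into an F1 score. The function names and the example sentences are hypothetical, and this is not the thesis's evaluation code. The abstract does not report F1; under the standard harmonic-mean definition, the quoted precision and recall would give roughly 0.58.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (defined as 0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical transcripts, not data from the thesis.
    print(word_error_rate("a b c d", "a x c"))
    # Precision and recall as quoted in the abstract.
    print(f1_score(0.6017, 0.5655))
```

    The character error rate follows the same idea with characters in place of words.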

    Vibrio galatheae sp. nov., a novel member of the Vibrionaceae family isolated from the Solomon Sea.

    Based on genetic, chemotaxonomic and phenotypic characteristics, a novel species belonging to the genus Vibrio is described. The facultatively anaerobic strain S2757T was isolated from a mussel collected in the Solomon Sea (Solomon Islands). Phylogenetic analyses based on sequences of the 16S rRNA and fur genes indicated that the strain belongs to a new species. This observation was supported by a multilocus sequence analysis (MLSA) including sequences of the housekeeping genes 16S rRNA, gyrB, pyrH, recA and topA. In silico DNA-DNA hybridization (DDH) and Average Nucleotide Identity (ANI) values comparing the genomic sequence of strain S2757T with those of closely related type strains were lower than 23 % and 82 %, respectively. The DNA G+C content of the strain was 45.3 mol%. Phenotypic and chemotaxonomic analyses clearly differentiated the strain from other Vibrio species. Hence, strain S2757T should be considered a novel species in the genus Vibrio. The name Vibrio galatheae sp. nov. is proposed, with S2757T (= DSM 100497T = LMG 28895T) as the type strain.

    Unsupervised Learning for Domain Adaptation in automatic classification tasks through Neural Networks

    Machine Learning systems have improved dramatically in recent years for automatic recognition and artificial intelligence tasks. In general, these systems rely on a large amount of labeled data (also called a training set) to learn a model that fits the problem in question. The training set consists of examples of possible inputs to the system and the outputs expected for them. Obtaining this training set is the main obstacle to using Machine Learning systems, since it requires human effort to find possible inputs and map them to their corresponding outputs. The situation is often frustrating because systems learn to solve the task for a specific domain (that is, a type of input with relatively homogeneous conditions) and are not able to generalize to solve the same task correctly in other domains. This project considers the use of Domain Adaptation algorithms, which are capable of learning to adapt a Machine Learning model to work in an unknown domain based only on unlabeled data (unsupervised learning). This facilitates the transfer of systems to new domains, because obtaining unlabeled data is relatively cheap: the main cost lies in labeling it. To date, Domain Adaptation algorithms have been used in very restricted contexts, so this project aims to evaluate these algorithms empirically in a greater number of cases, as well as to propose possible improvements.

    Alain Rabatel, Michèle Monte, Maria das Graças Soares Rodrigues (eds.), Comment les médias parlent des émotions. L’affaire Nafissatou Diallo contre Dominique Strauss-Kahn

    The volume brings together 16 contributions devoted to the study of the emotions that French-language and non-French-language media aroused and echoed in their coverage of an event with worldwide repercussions: the case of Nafissatou Diallo versus Dominique Strauss-Kahn. Following the accusations of sexual assault, attempted rape and unlawful confinement brought by a chambermaid at the Sofitel hotel in New York (Nafissatou Diallo), Dominique Strauss-Kahn, then ..