147 research outputs found

    Designing a Bilingual Speech Corpus for French and German Language Learners: a Two-Step Process

    We present the design of a corpus of native and non-native speech for the language pair French-German, with special emphasis on phonetic and prosodic aspects. To our knowledge, no corpus of suitable size and coverage is currently available for this language pair. To select the target L1-L2 interference phenomena, we prepared a small preliminary corpus (corpus1), which was analyzed for coverage and cross-checked jointly by French and German experts. Based on this analysis, target phenomena at the phonetic and phonological levels were selected according to their expected degree of deviation from native performance and their frequency of occurrence. Fourteen speakers performed both L2 material (either French or German) and L1 material (either German or French). This allowed us to test recording duration, recording material, and the performance of our automatic aligner software. We then built corpus2, taking into account what we had learned from corpus1. The aims are the same, but we adapted the speech material to avoid overly long recording sessions. 100 speakers will be recorded. The corpus (corpus1 and corpus2) will be prepared as a searchable database, available to the scientific community after completion of the project.

    Workshop on a didactic strategy for significantly improving speaking skills in English as a second language, based on William Littlewood's Communicative Language Teaching, with first-grade secondary students at the “José María Arguedas” school in Cacatachi, San Martín province, San Martín region, Peru

    Regarding the problem, English classes show that students struggle to express themselves orally in English: they know only a few words and cannot express their ideas fluently when interacting with the teacher or their classmates. In addition, teachers use a traditional methodology that does not allow students to develop oral communication; few activities encourage oral communication; classes focus more on acquiring vocabulary and grammar rules than on oral production; there is not enough motivation to encourage students to express themselves in English; and communication takes place through short phrases memorized by the students. The general objective was to design and propose a didactic strategy based on William Littlewood's Communicative Language Teaching in order to achieve a significant improvement in the English speaking skills of first-year secondary students at the “José María Arguedas” school in the district of Cacatachi, province and region of San Martín. The title of the study is: A workshop on a didactic strategy for significantly improving speaking skills in English as a second language, based on William Littlewood's Communicative Language Teaching, with first-grade secondary students at the “José María Arguedas” school in the district of Cacatachi, province of San Martín, region of San Martín, Peru. The hypothesis was stated as follows: if we design and propose a didactic strategy based on William Littlewood's Communicative Language Teaching, then the first-year secondary students at the “José María Arguedas” school in the district of Cacatachi, province and region of San Martín, will achieve a significant improvement in their English speaking skills. The research is of the descriptive-propositional type. As a general conclusion, a didactic strategy for improving students' oral communication in English was designed and proposed, thereby confirming the general hypothesis.

    Consolidating the standardization of the Romani language: past, present, and future, respecting dialectal diversity and enabling simple communication in the mother tongue worldwide

    The paper begins by considering the circumstances in which the Romani language is currently developing, compared with the main languages of Europe. It then addresses various stereotypes and explains the Romani dialectal structure not only from the perspective of heritage and as a mother tongue, but also in light of measurements of inter-dialectal distance established by mathematical methods. This process clarifies the term 'dialect'. The natural emergence of the concepts of common, standard, literary, and national language is explained (according to different schools), as is the present situation of the Romani language, with a focus on the prospects for its further development (the 'butterfly principle') in its social context: the role of parents, family, church, society, school, media, and various institutions in codification and normalization, including the factors that act as brakes on development. Some didactic tools are presented as instruments that also contribute to a better understanding of codification and normalization among Roma, leading to the harmonization of different dialects in a spirit of mutual respect for diversity. Such endeavors, however, are useless if users do not understand them and if they are not genuinely rooted in the users' culture. These elements lead to the problem of direct codification and its connection with communication, in particular modern communication on social networks and at various academic levels, because its ultimate purpose is to provide the Roma with a widespread modern language able to express all the nuances of human thought. The paper ends with examples of good practice in Romania and the former Yugoslavia, bearing in mind that negating or destroying a language is only one element in a wider mechanism of ethnic prejudice against the people who speak it.
    Scientific Meetings; vol. 175 / Serbian Academy of Sciences and Arts. Department of Social Sciences; book 4

    The Role of Emotional and Facial Expression in Synthesised Sign Language Avatars

    This thesis explores the role that underlying emotional facial expressions (EFEs) might play in the understandability of sign language avatars. Focusing specifically on Irish Sign Language (ISL), we examine the Deaf community's requirement for a visual-gestural language as well as some linguistic attributes of ISL that we consider fundamental to this research. Unlike spoken languages, visual-gestural languages such as ISL have no standard written representation. Given this, we compare current methods of written representation for signed languages and consider which, if any, is the most suitable transcription method for the medical receptionist dialogue corpus. A growing body of work is emerging from the field of sign language avatar synthesis. These works are now at a point where they can benefit greatly from methods currently used in humanoid animation and, more specifically, from the application of morphs to represent facial expression. The hypothesis underpinning this research is that augmenting an existing avatar (eSIGN) with various combinations of the 7 widely accepted universal emotions identified by Ekman (1999), to deliver underlying facial expressions, will make that avatar more human-like. This research accepts as true that this is a factor in improving usability and understandability for ISL users. Using human evaluation methods (Huenerfauth, et al., 2008), the research compares an augmented set of avatar utterances against a baseline set with regard to 2 key areas: comprehension and naturalness of facial configuration. We outline our approach to the evaluation, including our choice of ISL participants, interview environment, and evaluation methodology. Remarkably, the results of this manual evaluation show very little difference between the comprehension scores of the baseline avatars and those augmented with EFEs. However, when the comprehension results for the synthetic human avatar “Anna” were compared with those for the caricature-type avatar “Luna”, Anna was the clear winner. The qualitative feedback gave us insight into why comprehension scores were not higher for each avatar, and we feel that this feedback will be invaluable to the research community in the future development of sign language avatars. Other questions asked in the evaluation focused on sign language avatar technology more generally. Significantly, participant feedback on these questions indicates a rise in the level of literacy amongst Deaf adults as a result of mobile technology.

    Audio self-supervised learning: a survey

    Inspired by humans' cognitive ability to generalise knowledge and skills, Self-Supervised Learning (SSL) aims to discover general representations from large-scale data without requiring human annotations, which are expensive and time-consuming to produce. Its success in the fields of computer vision and natural language processing has prompted its recent adoption in audio and speech processing. Comprehensive reviews summarising the knowledge in audio SSL are currently missing. To fill this gap, in the present work we provide an overview of the SSL methods used for audio and speech processing applications. We also summarise the empirical works that exploit the audio modality in multi-modal SSL frameworks, and the existing benchmarks suitable for evaluating the power of SSL in the computer audition domain. Finally, we discuss some open problems and point out future directions for the development of audio SSL.

    Dealing with linguistic mismatches for automatic speech recognition

    Recent breakthroughs in automatic speech recognition (ASR) have resulted in a word error rate (WER) on par with human transcribers on the English Switchboard benchmark. However, dealing with linguistic mismatches between the training and testing data remains a significant unsolved challenge. In the monolingual setting, it is well known that the performance of ASR systems degrades significantly when they are presented with speech from speakers whose accents, dialects, and speaking styles differ from those encountered during system training. In the multilingual setting, ASR systems trained on a source language perform even worse when tested on another target language because of mismatches in the number of phonemes, lexical ambiguity, and the power of phonotactic constraints provided by phone-level n-grams. To address linguistic mismatches in current ASR systems, my dissertation investigates both knowledge-gnostic and knowledge-agnostic solutions. In the first part, classic theories of acoustics and articulatory phonetics that can be transferred across a dialect continuum, from local dialects to another standardized language, are revisited. Experiments demonstrate the potential of acoustic correlates in the vicinity of landmarks to help bridge mismatches across different local or global varieties in a dialect continuum. In the second part, we design an end-to-end acoustic modeling approach based on the connectionist temporal classification loss and propose to link the training of acoustics and accent in a manner similar to the learning process in human speech perception. This joint model not only performed well on ASR with multiple accents but also boosted the accuracy of accent identification in comparison to separately trained models.
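The word error rate mentioned in this abstract is conventionally defined as the word-level edit distance (substitutions, deletions, insertions) between a hypothesis and a reference transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition, not code from the dissertation; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumption: words are separated by whitespace; no text normalization.
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For instance, comparing the reference "the cat sat on the mat" with the hypothesis "the cat sit on mat" involves one substitution and one deletion over six reference words. Because insertions count as errors, the value can exceed 1.0 for very long hypotheses.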

    Cognitive Component Analysis


    Papers in Pidgin and Creole Linguistics No. 5

    PETER MÜHLHÄUSLER, Pidgins, creoles and post-contact Aboriginal languages in Western Australia -- ROBERT FOSTER, PETER MÜHLHÄUSLER AND PHILIP CLARKE, 'Give me back my name': The 'classification' of Aboriginal people in colonial South Australia -- TERRY CROWLEY, The Bislama lexicon before the First World War: written attestations -- ANDERS KÄLLGÅRD, A Pitkern word list -- WARREN SHIBLES, The phonetics of pidgin and creole: toward a standard IPA transcription

    Austronesian and other languages of the Pacific and South-east Asia : an annotated catalogue of theses and dissertations


    A novel lip geometry approach for audio-visual speech recognition

    By identifying lip movements and characterizing their associations with speech sounds, the performance of speech recognition systems can be improved, particularly in noisy environments. In recent years, research groups around the world have studied various methods for incorporating lip movements into speech recognition; however, how best to incorporate the additional visual information is still not known. This study aims to extend knowledge of the relationships between visual and speech information, specifically using lip geometry information because of its robustness to head rotation and the smaller number of features required to represent movement. A new method has been developed to extract lip geometry information, to perform classification, and to integrate the visual and speech modalities. This thesis makes several contributions. First, it presents a new method for extracting lip geometry features using a combination of a skin colour filter, a border following algorithm, and a convex hull approach. The proposed method was found to improve lip shape extraction performance compared to existing approaches. Lip geometry features including height, width, ratio, area, perimeter, and various combinations of these were evaluated to determine which performs best when representing speech in the visual domain. Second, a novel template matching technique has been developed that adapts to dynamic differences in the way speakers utter words and determines the best fit of an unseen feature signal to those stored in a database template. Third, following an evaluation of integration strategies, a novel method has been developed based on an alternative decision fusion strategy, in which the outcome from the visual or the speech modality is chosen by measuring the quality of the audio using kurtosis and skewness analysis, driven by white noise confusion. Finally, the performance of the new methods introduced in this work is evaluated using the CUAVE and LUNA-V data corpora under a range of signal-to-noise ratio conditions using the NOISEX-92 dataset.
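The geometry features named in this abstract (height, width, ratio, area, perimeter) can be illustrated on the convex hull of a lip contour. The sketch below is not the thesis's implementation: it assumes the contour points are already available as `(x, y)` pixel coordinates from some earlier segmentation step, uses Andrew's monotone chain algorithm as a stand-in hull method, and takes "ratio" to mean height divided by width. The names `convex_hull` and `lip_geometry_features` are hypothetical.

```python
from math import hypot


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when the turn o -> a -> b is counter-clockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def lip_geometry_features(contour):
    """Height, width, ratio, area and perimeter of the contour's convex hull."""
    hull = convex_hull(contour)
    if len(hull) < 3:
        raise ValueError("contour must contain at least three non-collinear points")
    xs = [x for x, _ in hull]
    ys = [y for _, y in hull]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    # Pair each vertex with its successor, wrapping around to close the polygon.
    edges = list(zip(hull, hull[1:] + hull[:1]))
    area = abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in edges)) / 2  # shoelace formula
    perimeter = sum(hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in edges)
    return {
        "height": height,
        "width": width,
        "ratio": height / width,  # assumed definition: height over width
        "area": area,
        "perimeter": perimeter,
    }
```

Computing the features on the hull rather than on the raw contour discards interior and concave points, which is one plausible reason a hull step would make shape measurements less sensitive to segmentation noise.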