11 research outputs found

    Innovative technologies for under-resourced language documentation: The BULB Project

    The project Breaking the Unwritten Language Barrier (BULB), which brings together linguists and computer scientists, aims to support linguists in documenting unwritten languages. To achieve this we will develop tools tailored to the needs of documentary linguists, building on technology and expertise from natural language processing, most prominently automatic speech recognition and machine translation. As a development and test bed we have chosen three less-resourced African languages from the Bantu family: Basaa, Myene and Embosi. Work within the project is divided into three main steps: 1) Collection of a large corpus of speech (100 h per language) at a reasonable cost. After initial recording, the data is re-spoken by a reference speaker to enhance the signal quality and orally translated into French. 2) Automatic transcription of the Bantu languages at phoneme level and of the French translation at word level. The recognized Bantu phonemes and French words will then be automatically aligned. 3) Tool development. In close cooperation and discussion with the linguists, the speech and language technologists will design and implement tools that support the linguists in their work, taking into account the linguists' needs and the technology's capabilities. Data collection has begun for the three languages. For this we use standard mobile devices and dedicated software, LIG-AIKUMA, which offers a range of speech collection modes (recording, respeaking, translation and elicitation). LIG-AIKUMA's improved features include smart generation and handling of speaker metadata as well as respeaking and parallel audio data mapping.

    Eléments de description de l'orungu: langue bantu du Gabon (B11b)

    This PhD dissertation is a study of Orungu, a Bantu language classified as B11b by M. Guthrie and spoken by one of the Ngwè-myènè peoples (or Myènè, according to the administrative denomination) in the Ogooué Maritime province of western Gabon. It is a first description covering the grammar as a whole, treating the segmental and tonal levels together in a joint analysis of the phonological, morphological and post-lexical levels.
    The first part describes the phonemes of Orungu, its noun class system and the role of that system in the formal modification of lexemes, and its consonant alternations. The second part deals jointly with the morphology and tonology of the grammatical elements of the language. Establishing the tone schemes makes it possible to show the derivational processes underlying the passage from the indefinite to the definite form of nominals. The description of the verb focuses on verbal derivation and on verbal inflection across the different tenses of the conjugation (the TAM system). The third and final part studies the post-lexical tonology: the modifications that lexical tone schemes undergo when they occur in certain tonal environments and certain syntactic constructions, taking into account the tonal type of the lexical units.
    Doctorate in Philosophy and Letters, Linguistics orientation

    De la tonalité des nominaux en orungu (B11b)

    Ambouroue Odette. De la tonalité des nominaux en orungu (B11b). In: Africana Linguistica 12, 2006. pp. 1-23

    Melodic tones in Orungu (Bantu B11b)

    Verbs in Orungu (Bantu B11b) are characterized by the use of melodic tones in verbal paradigms, with particular tone patterns being associated with certain tenses/aspects. We discuss these six patterns along with a new analysis of the verbal tonology of the language. Two of these melodies are responsible for a remarkable neutralization in the intonational phrase. The distribution of the melodies and their predictability are examined.
    Maniacky Jacky, Ambouroue Odette. Melodic tones in Orungu (Bantu B11b). In: Africana Linguistica 20, 2014. pp. 243-261

    The origin and use of a relative clause construction with passive morphology in Orungu (Bantu, Gabon).

    This paper provides an analysis of two relative clause constructions in the Gabonese Bantu language Orungu that are in complementary distribution. The conditioning of the choice between them is typologically interesting, in that it involves the syntactic relation, the thematic role and the referential properties of the target of relativisation. The relative verb form of one of these constructions, which we call the O construction, has passive morphology. We argue that O relatives result from a reanalysis of the initial use of passivisation to relativise certain objects by promoting them to subject position, and we provide formal and semantic evidence that, synchronically, it is a separate relative clause construction that directly targets objects. This is a very rare type of change in relative clause constructions, which usually involves only relative clause markers. However, the origin of O relatives is easily accounted for by the predictions of the accessibility hierarchy if we assume the prior existence of a discontinuity of the Toba Batak type, i.e. one in which positions that cannot be directly relativised are first promoted to a higher position on the accessibility hierarchy, from where they can be relativised.

    La cuisine au bord du fleuve Congo: lexique et recettes en lokele, foma, topoke, heso et mbuza

    This chapter is set in the Democratic Republic of the Congo, in the tropical rainforest, along the Congo River and its tributaries between Bumba and Kisangani. Birgit Ricquier, Odette Ambouroue and Nicolas Mombaya Liwila present culinary lexicons in five languages, namely Lokele (C55), Foma (C56), Topoke (C53), Heso (C52) and Mbuza (C37). The regional menu revolves around diverse cassava preparations and oil palm products, complemented with fish from the rivers and forest ponds, greens, chicken, or forest bounty such as wild plants, bush meat or caterpillars. The recipes present staples such as a porridge of cassava and plantains (lìtúmá), cassava sticks (“chikwangue”, lòmàtà), and the preparation of greens, for instance sweet potato leaves (ɓátɛm̀ bɛĺ ɛ̀).
