113 research outputs found

    Preliminary Experiments on Unsupervised Word Discovery in Mboshi

    The need to document thousands of endangered languages encourages collaboration between linguists and computer scientists in order to provide the documentary linguistics community with the support of automatic processing tools. The French-German ANR-DFG project Breaking the Unwritten Language Barrier (BULB) aims at developing such tools for three mostly unwritten African languages of the Bantu family. For one of them, Mboshi, a language originating from the "Cuvette" region of the Republic of Congo, we investigate unsupervised word discovery techniques from an unsegmented stream of phonemes. We compare different models and algorithms, both monolingual and bilingual, on a new corpus in Mboshi and French, and discuss various ways to represent the data with suitable granularity. An additional French-English corpus allows us to contrast the results obtained on Mboshi and to experiment with more data.

    Images and imagination : automated analysis of priming effects related to autism spectrum disorder and developmental language disorder

    Different aspects of language processing have been shown to be sensitive to priming, but the findings of studies examining priming effects in adolescents with Autism Spectrum Disorder (ASD) and Developmental Language Disorder (DLD) have been inconclusive. We present a study analysing visual and implicit semantic priming in adolescents with ASD and DLD. Based on a dataset of fictional and script-like narratives, we evaluate how often and how extensively the participants use content from two different priming sources. The first priming source was visual, consisting of images shown to the participants to assist them with their storytelling. The second priming source originated from commonsense knowledge, using crowdsourced data containing prototypical script elements. Our results show that individuals with ASD are less sensitive to both types of priming, but show typical usage of primed cues when they use them at all. In contrast, children with DLD show mostly average priming sensitivity, but exhibit a disproportionately high use of the priming cues.

    Optimized stream-cipher-based transciphering by means of functional-bootstrapping

    Fully homomorphic encryption (FHE) suffers from a large expansion in the size of encrypted data, which makes FHE impractical for low-bandwidth networks. Fortunately, transciphering makes it possible to circumvent this issue by involving a symmetric cryptosystem that does not carry the disadvantage of a large expansion factor, while maintaining the ability to recover an FHE ciphertext at the cost of extra homomorphic computations on the receiver side. Recent works have started to investigate the efficiency of TFHE as the FHE layer in transciphering, combined with various symmetric schemes including a NIST finalist for lightweight cryptography, namely Grain128-AEAD. Yet this has so far been done without taking advantage of TFHE's functional bootstrapping abilities, that is, evaluating any discrete function "for free" within the bootstrapping operation. In this work, we thus investigate the use of TFHE functional bootstrapping for implementing Grain128-AEAD in a more efficient base-B representation (B > 2), rather than a binary one. This significantly reduces the overall number of bootstrappings needed in a homomorphic run of the stream cipher, for example reducing the number of bootstrappings required in the warm-up phase by a factor of ≈ 3 when B = 16.

    Innovative technologies for under-resourced language documentation: The BULB Project

    The project Breaking the Unwritten Language Barrier (BULB), which brings together linguists and computer scientists, aims at supporting linguists in documenting unwritten languages. In order to achieve this we will develop tools tailored to the needs of documentary linguists by building upon technology and expertise from the area of natural language processing, most prominently automatic speech recognition and machine translation. As a development and test bed for this we have chosen three less-resourced African languages from the Bantu family: Basaa, Myene and Embosi. Work within the project is divided into three main steps: 1) Collection of a large corpus of speech (100h per language) at a reasonable cost. After initial recording, the data is re-spoken by a reference speaker to enhance the signal quality and orally translated into French. 2) Automatic transcription of the Bantu languages at phoneme level and the French translation at word level. The recognized Bantu phonemes and French words will then be automatically aligned. 3) Tool development. In close cooperation and discussion with the linguists, the speech and language technologists will design and implement tools that will support the linguists in their work, taking into account the linguists' needs and technology's capabilities. The data collection has begun for the three languages. For this we use standard mobile devices and a dedicated application, LIG-AIKUMA, which offers a range of different speech collection modes (recording, respeaking, translation and elicitation). LIG-AIKUMA's improved features include smart generation and handling of speaker metadata as well as respeaking and parallel audio data mapping.

    Potential use of the Bushmint, Hyptis suaveolens, for the Control of Infestation by the Pink Stalk Borer, Sesamia calamistis on Maize in Southern Benin, West Africa

    Maize production in Benin, especially in resource-poor farmers' fields, is constrained by stemborers among other factors. One of the major stemborers in southern Benin is Sesamia calamistis Hampson (Lepidoptera: Noctuidae). African farmers cannot afford to use commercial insecticides for controlling stemborers: they are expensive and, owing to their eco-toxicity, unsuitable for durable pest management systems. There is therefore a need for cheaper and environmentally friendly methods, and botanicals offer an attractive alternative. The bushmint, Hyptis suaveolens (L.) Poit. (Lamiales: Lamiaceae), was compared with the commercial insecticide Furadan (carbofuran) for the control of S. calamistis on maize, Zea mays L. (Poales: Poaceae). Trials were conducted in the screenhouse and in the field during the minor cropping season in 2004 at the International Institute of Tropical Agriculture (IITA)-Benin station. The variables measured included numbers of egg masses per plant, eggs per egg mass (in the screenhouse study), population density of S. calamistis, percentage of infested plants and/or ears, and deadhearts in the field. Irrespective of the variable considered, the aqueous extract of H. suaveolens compared favorably with Furadan, while maize surrounded by live H. suaveolens plants had lower S. calamistis densities.