2 research outputs found

    Bridging the Kuwaiti Dialect Gap in Natural Language Processing

    The available dialectal Arabic linguistic resources are very limited in their coverage of Arabic dialects, particularly the Kuwaiti dialect. This shortage creates difficulties for researchers in Natural Language Processing (NLP) and limits the development of advanced linguistic analysis and processing tools for the Kuwaiti dialect. Many other low-resource Arabic dialects remain unexplored because of the challenges of recruiting annotators for dataset labeling. This paper proposes a weakly supervised classification system, called “q8SentiLabeler”, to address the problem of recruiting human annotators. In addition, we developed a large dataset of over 16.6k posts for sentiment analysis in the Kuwaiti dialect. This dataset covers several themes and timeframes to remove any bias that might affect its content. Furthermore, we evaluated our dataset using multiple traditional machine-learning classifiers and advanced deep-learning language models. Results demonstrate the potential of “q8SentiLabeler” to replace human annotators, with 93% pairwise percent agreement and a Cohen’s Kappa coefficient of 0.87. Using the ARBERT model on our dataset, we achieved 89% accuracy.

    Household income, fetal size and birth weight: an analysis of eight populations

    Background. The age at onset of the association between poverty and poor health is not understood. Our hypothesis was that individuals from the highest household income (HI) category, compared with those from the lowest, would have increased fetal size in the second and third trimesters and at birth. Methods. Second and third trimester fetal ultrasound measurements and birth measurements were obtained from eight cohorts. Results were analysed in cross-sectional two-stage individual patient data (IPD) analyses and also in a longitudinal one-stage IPD analysis. Results. The eight cohorts included 21 714 individuals. In the two-stage (cross-sectional) IPD analysis, individuals from the highest HI category compared with those from the lowest HI category had larger head size at birth (mean difference 0.22 z score (0.07, 0.36)), in the third trimester (0.25 (0.16, 0.33)) and in the second trimester (0.11 (0.02, 0.19)). Weight was higher at birth in the highest HI category. In the one-stage (longitudinal) IPD analysis, which included data from six cohorts (n=11 062), head size was larger (mean difference 0.13 (0.03, 0.23)) for individuals in the highest HI category compared with the lowest, and this difference became greater between the second trimester and birth. Similarly, in the one-stage IPD analysis, weight was higher in the second highest HI category compared with the lowest (mean difference 0.10 (0.00, 0.20)), and the difference widened as pregnancy progressed. Length was not linked to HI category in the longitudinal model. Conclusions. The association between HI, an index of poverty, and fetal size is already present in the second trimester.