39 research outputs found

    How good are Large Language Models on African Languages?

    Recent advancements in natural language processing have led to the proliferation of large language models (LLMs). These models have been shown to yield good performance, using in-context learning, even on unseen tasks and languages. Additionally, they have been widely adopted as language-model-as-a-service commercial APIs like GPT-4 API. However, their performance on African languages is largely unknown. We present an analysis of three popular large language models (mT0, LLaMa 2, and GPT-4) on five tasks (news topic classification, sentiment classification, machine translation, question answering, and named entity recognition) across 30 African languages, spanning different language families and geographical regions. Our results suggest that all LLMs produce below-par performance on African languages, and there is a large gap in performance compared to high-resource languages like English on most tasks. We find that GPT-4 has an average or impressive performance on classification tasks but very poor results on generative tasks like machine translation. Surprisingly, we find that mT0 had the best overall performance on cross-lingual QA, better than the state-of-the-art supervised model (i.e. fine-tuned mT5) and GPT-4 on African languages. Overall, LLaMa 2 records the worst performance due to its limited multilingual capabilities and English-centric pre-training corpus. In general, our findings present a call-to-action to ensure African languages are well represented in large language models, given their growing popularity.

    Endochin-like quinolones (ELQs) and bumped kinase inhibitors (BKIs): Synergistic and additive effects of combined treatments against Neospora caninum infection in vitro and in vivo.

    The apicomplexan parasite Neospora caninum is an important causative agent of congenital neosporosis, resulting in abortion, birth of weak offspring and neuromuscular disorders in cattle, sheep, and many other species. Among several compound classes that are currently being developed, two have been reported to limit the effects of congenital neosporosis: (i) bumped kinase inhibitors (BKIs) target calcium dependent protein kinase 1 (CDPK1), an enzyme that is encoded by an apicoplast-derived gene and found only in apicomplexans and plants. CDPK1 is essential for host cell invasion and egress; (ii) endochin-like quinolones (ELQs) are inhibitors of the cytochrome bc1 complex of the mitochondrial electron transport chain and thus inhibit oxidative phosphorylation. We here report on the in vitro and in vivo activities of BKI-1748, and of ELQ-316 and its respective prodrugs ELQ-334 and ELQ-422, applied either as single-compounds or ELQ-BKI-combinations. In vitro, BKI-1748 and ELQ-316, as well as BKI-1748 and ELQ-334, acted synergistically, while this was not observed for the BKI-1748/ELQ-422 combination treatment. In a N. caninum-infected pregnant BALB/c mouse model, the synergistic effects observed in vitro were not entirely reproduced, but 100% postnatal survival and 100% inhibition of vertical transmission was noted in the group treated with the BKI-1748/ELQ-334 combination. In addition, the combined drug applications resulted in lower neonatal mortality compared to treatments with single drugs

    Effects of control interventions on Clostridium difficile infection in England: an observational study

    Background: The control of Clostridium difficile infections is an international clinical challenge. The incidence of C difficile in England declined by roughly 80% after 2006, following the implementation of national control policies; we tested two hypotheses to investigate their role in this decline. First, if C difficile infection declines in England were driven by reductions in use of particular antibiotics, then incidence of C difficile infections caused by resistant isolates should decline faster than that caused by susceptible isolates across multiple genotypes. Second, if C difficile infection declines were driven by improvements in hospital infection control, then transmitted (secondary) cases should decline regardless of susceptibility. Methods: Regional (Oxfordshire and Leeds, UK) and national data for the incidence of C difficile infections and antimicrobial prescribing data (1998–2014) were combined with whole genome sequences from 4045 national and international C difficile isolates. Genotype (multilocus sequence type) and fluoroquinolone susceptibility were determined from whole genome sequences. The incidence of C difficile infections caused by fluoroquinolone-resistant and fluoroquinolone-susceptible isolates was estimated with negative-binomial regression, overall and per genotype. Selection and transmission were investigated with phylogenetic analyses. Findings: National fluoroquinolone and cephalosporin prescribing correlated highly with incidence of C difficile infections (cross-correlations >0·88), by contrast with total antibiotic prescribing (cross-correlations 0·2). Interpretation: Restricting fluoroquinolone prescribing appears to explain the decline in incidence of C difficile infections, above other measures, in Oxfordshire and Leeds, England. Antimicrobial stewardship should be a central component of C difficile infection control programmes

    MasakhaNEWS: News Topic Classification for African languages

    African languages are severely under-represented in NLP research due to lack of datasets covering several NLP tasks. While there are individual language-specific datasets that are being expanded to different tasks, only a handful of NLP tasks (e.g. named entity recognition and machine translation) have standardized benchmark datasets covering several geographically and typologically diverse African languages. In this paper, we develop MasakhaNEWS -- a new benchmark dataset for news topic classification covering 16 languages widely spoken in Africa. We provide an evaluation of baseline models by training classical machine learning models and fine-tuning several language models. Furthermore, we explore several alternatives to full fine-tuning of language models that are better suited for zero-shot and few-shot learning such as cross-lingual parameter-efficient fine-tuning (like MAD-X), pattern exploiting training (PET), prompting language models (like ChatGPT), and prompt-free sentence transformer fine-tuning (SetFit and Cohere Embedding API). Our evaluation in the zero-shot setting shows the potential of prompting ChatGPT for news topic classification in low-resource African languages, achieving an average performance of 70 F1 points without leveraging additional supervision like MAD-X. In the few-shot setting, we show that with as few as 10 examples per label, we achieved more than 90% (i.e. 86.0 F1 points) of the performance of full supervised training (92.6 F1 points) leveraging the PET approach. Comment: Accepted to IJCNLP-AACL 2023 (main conference)

    Whole-genome sequencing for prediction of Mycobacterium tuberculosis drug susceptibility and resistance : a retrospective cohort study

    BACKGROUND: Diagnosing drug resistance remains an obstacle to the elimination of tuberculosis. Phenotypic drug-susceptibility testing is slow and expensive, and commercial genotypic assays screen only common resistance-determining mutations. We used whole-genome sequencing to characterise common and rare mutations predicting drug resistance, or consistency with susceptibility, for all first-line and second-line drugs for tuberculosis. METHODS: Between Sept 1, 2010, and Dec 1, 2013, we sequenced a training set of 2099 Mycobacterium tuberculosis genomes. For 23 candidate genes identified from the drug-resistance scientific literature, we algorithmically characterised genetic mutations as not conferring resistance (benign), resistance determinants, or uncharacterised. We then assessed the ability of these characterisations to predict phenotypic drug-susceptibility testing for an independent validation set of 1552 genomes. We sought mutations under similar selection pressure to those characterised as resistance determinants outside candidate genes to account for residual phenotypic resistance. FINDINGS: We characterised 120 training-set mutations as resistance determining, and 772 as benign. With these mutations, we could predict 89·2% of the validation-set phenotypes with a mean 92·3% sensitivity (95% CI 90·7–93·7) and 98·4% specificity (98·1–98·7). 10·8% of validation-set phenotypes could not be predicted because uncharacterised mutations were present. With an in-silico comparison, characterised resistance determinants had higher sensitivity than the mutations from three line-probe assays (85·1% vs 81·6%). No additional resistance determinants were identified among mutations under selection pressure in non-candidate genes. INTERPRETATION: A broad catalogue of genetic mutations enables data from whole-genome sequencing to be used clinically to predict drug resistance, drug susceptibility, or to identify drug phenotypes that cannot yet be genetically predicted. This approach could be integrated into routine diagnostic workflows, phasing out phenotypic drug-susceptibility testing while reporting drug resistance early. FUNDING: Wellcome Trust, National Institute of Health Research, Medical Research Council, and the European Union. http://www.thelancet.com/infection

    Mapping local patterns of childhood overweight and wasting in low- and middle-income countries between 2000 and 2017

    A double burden of malnutrition occurs when individuals, household members or communities experience both undernutrition and overweight. Here, we show geospatial estimates of overweight and wasting prevalence among children under 5 years of age in 105 low- and middle-income countries (LMICs) from 2000 to 2017 and aggregate these to policy-relevant administrative units. Wasting decreased overall across LMICs between 2000 and 2017, from 8.4% (62.3 (55.1–70.8) million) to 6.4% (58.3 (47.6–70.7) million), but is predicted to remain above the World Health Organization’s Global Nutrition Target of <5% in over half of LMICs by 2025. Prevalence of overweight increased from 5.2% (30 (22.8–38.5) million) in 2000 to 6.0% (55.5 (44.8–67.9) million) children aged under 5 years in 2017. Areas most affected by double burden of malnutrition were located in Indonesia, Thailand, southeastern China, Botswana, Cameroon and central Nigeria. Our estimates provide a new perspective to researchers, policy makers and public health agencies in their efforts to address this global childhood syndemic

    Measuring progress from 1990 to 2017 and projecting attainment to 2030 of the health-related Sustainable Development Goals for 195 countries and territories: a systematic analysis for the Global Burden of Disease Study 2017

    Background: Efforts to establish the 2015 baseline and monitor early implementation of the UN Sustainable Development Goals (SDGs) highlight both great potential for and threats to improving health by 2030. To fully deliver on the SDG aim of “leaving no one behind”, it is increasingly important to examine the health-related SDGs beyond national-level estimates. As part of the Global Burden of Diseases, Injuries, and Risk Factors Study 2017 (GBD 2017), we measured progress on 41 of 52 health-related SDG indicators and estimated the health-related SDG index for 195 countries and territories for the period 1990–2017, projected indicators to 2030, and analysed global attainment. Methods: We measured progress on 41 health-related SDG indicators from 1990 to 2017, an increase of four indicators since GBD 2016 (new indicators were health worker density, sexual violence by non-intimate partners, population census status, and prevalence of physical and sexual violence [reported separately]). We also improved the measurement of several previously reported indicators. We constructed national-level estimates and, for a subset of health-related SDGs, examined indicator-level differences by sex and Socio-demographic Index (SDI) quintile. We also did subnational assessments of performance for selected countries. To construct the health-related SDG index, we transformed the value for each indicator on a scale of 0–100, with 0 as the 2·5th percentile and 100 as the 97·5th percentile of 1000 draws calculated from 1990 to 2030, and took the geometric mean of the scaled indicators by target. To generate projections through 2030, we used a forecasting framework that drew estimates from the broader GBD study and used weighted averages of indicator-specific and country-specific annualised rates of change from 1990 to 2017 to inform future estimates. 
We assessed attainment of indicators with defined targets in two ways: first, using mean values projected for 2030, and then using the probability of attainment in 2030 calculated from 1000 draws. We also did a global attainment analysis of the feasibility of attaining SDG targets on the basis of past trends. Using 2015 global averages of indicators with defined SDG targets, we calculated the global annualised rates of change required from 2015 to 2030 to meet these targets, and then identified in what percentiles the required global annualised rates of change fell in the distribution of country-level rates of change from 1990 to 2015. We took the mean of these global percentile values across indicators and applied the past rate of change at this mean global percentile to all health-related SDG indicators, irrespective of target definition, to estimate the equivalent 2030 global average value and percentage change from 2015 to 2030 for each indicator. Findings: The global median health-related SDG index in 2017 was 59·4 (IQR 35·4–67·3), ranging from a low of 11·6 (95% uncertainty interval 9·6–14·0) to a high of 84·9 (83·1–86·7). SDG index values in countries assessed at the subnational level varied substantially, particularly in China and India, although scores in Japan and the UK were more homogeneous. Indicators also varied by SDI quintile and sex, with males having worse outcomes than females for non-communicable disease (NCD) mortality, alcohol use, and smoking, among others. Most countries were projected to have a higher health-related SDG index in 2030 than in 2017, while country-level probabilities of attainment by 2030 varied widely by indicator. Under-5 mortality, neonatal mortality, maternal mortality ratio, and malaria indicators had the most countries with at least 95% probability of target attainment. 
Other indicators, including NCD mortality and suicide mortality, had no countries projected to meet corresponding SDG targets on the basis of projected mean values for 2030 but showed some probability of attainment by 2030. For some indicators, including child malnutrition, several infectious diseases, and most violence measures, the annualised rates of change required to meet SDG targets far exceeded the pace of progress achieved by any country in the recent past. We found that applying the mean global annualised rate of change to indicators without defined targets would equate to about 19% and 22% reductions in global smoking and alcohol consumption, respectively; a 47% decline in adolescent birth rates; and a more than 85% increase in health worker density per 1000 population by 2030. Interpretation: The GBD study offers a unique, robust platform for monitoring the health-related SDGs across demographic and geographic dimensions. Our findings underscore the importance of increased collection and analysis of disaggregated data and highlight where more deliberate design or targeting of interventions could accelerate progress in attaining the SDGs. Current projections show that many health-related SDG indicators, NCDs, NCD-related risks, and violence-related indicators will require a concerted shift away from what might have driven past gains—curative interventions in the case of NCDs—towards multisectoral, prevention-oriented policy action and investments to achieve SDG aims. Notably, several targets, if they are to be met by 2030, demand a pace of progress that no country has achieved in the recent past. The future is fundamentally uncertain, and no model can fully predict what breakthroughs or events might alter the course of the SDGs. What is clear is that our actions—or inaction—today will ultimately dictate how close the world, collectively, can get to leaving no one behind by 2030

    Measuring progress from 1990 to 2017 and projecting attainment to 2030 of the health-related Sustainable Development Goals for 195 countries and territories: a systematic analysis for the Global Burden of Disease Study 2017.

    BACKGROUND: Efforts to establish the 2015 baseline and monitor early implementation of the UN Sustainable Development Goals (SDGs) highlight both great potential for and threats to improving health by 2030. To fully deliver on the SDG aim of 'leaving no one behind', it is increasingly important to examine the health-related SDGs beyond national-level estimates. As part of the Global Burden of Diseases, Injuries, and Risk Factors Study 2017 (GBD 2017), we measured progress on 41 of 52 health-related SDG indicators and estimated the health-related SDG index for 195 countries and territories for the period 1990-2017, projected indicators to 2030, and analysed global attainment. METHODS: We measured progress on 41 health-related SDG indicators from 1990 to 2017, an increase of four indicators since GBD 2016 (new indicators were health worker density, sexual violence by non-intimate partners, population census status, and prevalence of physical and sexual violence [reported separately]). We also improved the measurement of several previously reported indicators. We constructed national-level estimates and, for a subset of health-related SDGs, examined indicator-level differences by sex and Socio-demographic Index (SDI) quintile. We also did subnational assessments of performance for selected countries. To construct the health-related SDG index, we transformed the value for each indicator on a scale of 0-100, with 0 as the 2·5th percentile and 100 as the 97·5th percentile of 1000 draws calculated from 1990 to 2030, and took the geometric mean of the scaled indicators by target. To generate projections through 2030, we used a forecasting framework that drew estimates from the broader GBD study and used weighted averages of indicator-specific and country-specific annualised rates of change from 1990 to 2017 to inform future estimates. 
We assessed attainment of indicators with defined targets in two ways: first, using mean values projected for 2030, and then using the probability of attainment in 2030 calculated from 1000 draws. We also did a global attainment analysis of the feasibility of attaining SDG targets on the basis of past trends. Using 2015 global averages of indicators with defined SDG targets, we calculated the global annualised rates of change required from 2015 to 2030 to meet these targets, and then identified in what percentiles the required global annualised rates of change fell in the distribution of country-level rates of change from 1990 to 2015. We took the mean of these global percentile values across indicators and applied the past rate of change at this mean global percentile to all health-related SDG indicators, irrespective of target definition, to estimate the equivalent 2030 global average value and percentage change from 2015 to 2030 for each indicator