Communication
This chapter discusses research on the
capacity and effectiveness of government’s
communications strategy as South Africa
went through the various stages of lockdown
during the Covid-19 pandemic in 2020. It
probes the working relationship between
government communications across all
spheres and community, private, digital,
and social media, as well as organised civil
society, before and during the lockdown, and
assesses its impact and efficacy.
Recognising the multilingual nature of
South African society, the urban–rural
digital divide, and the prohibitive costs of
data, the chapter identifies lessons and
reaffirms the relevance of the development
communications approach to government–
citizen communications. It motivates for the prioritisation of accessible, multilingual digital
communications with a citizen feedback loop
that is transparent and responsive to ensure
people are informed and empowered, as
envisioned in the Constitution.
Such responsiveness needs an enabling
environment from government and from
the public, private, and community media
landscape. Collaboration and cooperation
across these sectors, with government
communications and with the non-governmental
health and communications sectors, are
critical in such an all-encompassing
crisis. The chapter highlights the need to
continue to understand South Africa’s highly
diverse communication space, in which
digital new media platforms exist alongside
loudhailers, and to make accommodations in
legislation, policy, and government coordination
with social partners to reach all people across
the digital, class, and language divides.

This chapter (Chapter 4) was published in the first edition of the South Africa Covid-19 Country Report in June 2021. https://www.gov.za/sites/default/files/gcis_document/202206/sa-covid-19-reporta.pd
MasakhaPOS: Part-of-Speech Tagging for Typologically Diverse African Languages
In this paper, we present MasakhaPOS, the largest part-of-speech (POS) dataset for 20 typologically diverse African languages. We discuss the challenges of annotating POS for these languages using the Universal Dependencies (UD) guidelines. We conducted extensive POS baseline experiments using both conditional random fields and several multilingual pre-trained language models, and applied various cross-lingual transfer models trained with data available in the UD. Evaluating on the MasakhaPOS dataset, we show that choosing the best transfer language(s) in both single-source and multi-source setups greatly improves the POS tagging performance of the target languages, in particular when combined with parameter-fine-tuning methods. Crucially, transferring knowledge from a language that matches the language family and morphosyntactic properties seems to be more effective for POS tagging in unseen languages.
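To make the notion of a POS baseline concrete, the sketch below implements a most-frequent-tag tagger, a standard minimal baseline for this task. It is not the paper's CRF or pre-trained language model setup, and the toy sentences and tag names are invented for illustration:

```python
from collections import Counter, defaultdict

def train_mft_tagger(tagged_sentences):
    """Learn the most frequent tag per word from tagged sentences.

    Returns a word->tag lexicon plus a global majority tag used as a
    fallback for words never seen in training.
    """
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, pos in sentence:
            counts[word][pos] += 1
    # Majority tag over the whole corpus, for out-of-vocabulary words.
    fallback = Counter(
        pos for sent in tagged_sentences for _, pos in sent
    ).most_common(1)[0][0]
    lexicon = {w: c.most_common(1)[0][0] for w, c in counts.items()}
    return lexicon, fallback

def tag(words, lexicon, fallback):
    """Tag each word with its most frequent training tag, or the fallback."""
    return [lexicon.get(w, fallback) for w in words]

train = [
    [("the", "DET"), ("cat", "NOUN"), ("sat", "VERB")],
    [("the", "DET"), ("dog", "NOUN"), ("ran", "VERB")],
]
lexicon, fallback = train_mft_tagger(train)
print(tag(["the", "cat"], lexicon, fallback))  # ['DET', 'NOUN']
```

Such a baseline is useful mainly as a floor: cross-lingual transfer from a well-chosen source language, as the abstract describes, is measured against simple lexical memorisation of this kind.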
MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition
African languages are spoken by over a billion people, but are underrepresented in NLP research and development. The challenges impeding progress include the limited availability of annotated datasets, as well as a lack of understanding of the settings where current methods are effective. In this paper, we make progress towards solutions for these challenges, focusing on the task of named entity recognition (NER). We create the largest human-annotated NER dataset for 20 African languages, and we study the behavior of state-of-the-art cross-lingual transfer methods in an Africa-centric setting, demonstrating that the choice of source language significantly affects performance. We show that choosing the best transfer language improves zero-shot F1 scores by an average of 14 points across 20 languages compared to using English. Our results highlight the need for benchmark datasets and models that cover typologically diverse African languages.
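The F1 scores the abstract reports are entity-level: predicted spans must match gold spans exactly in both type and boundaries. A minimal sketch of that evaluation, assuming BIO-tagged sequences (the tag names here are illustrative, not taken from the dataset):

```python
def bio_to_spans(tags):
    """Convert a BIO tag sequence to (type, start, end) spans, end exclusive."""
    spans, start, etype = [], None, None
    for i, t in enumerate(tags):
        if t.startswith("B-"):
            if start is not None:
                spans.append((etype, start, i))
            start, etype = i, t[2:]
        elif t.startswith("I-") and start is not None and t[2:] == etype:
            continue  # inside the current entity
        else:
            # "O", or an I- tag that does not continue the open entity.
            if start is not None:
                spans.append((etype, start, i))
            start, etype = None, None
    if start is not None:
        spans.append((etype, start, len(tags)))
    return spans

def entity_f1(gold_tags, pred_tags):
    """Entity-level F1: a span counts only if type and boundaries match."""
    gold = set(bio_to_spans(gold_tags))
    pred = set(bio_to_spans(pred_tags))
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

gold = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "O"]
print(round(entity_f1(gold, pred), 3))  # 0.667
```

This strict-matching definition is why a better transfer language can move F1 by many points: partial or mis-typed spans earn no credit at all.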
AfriMTE and AfriCOMET: Empowering COMET to Embrace Under-resourced African Languages
Despite the progress we have recorded in scaling multilingual machine translation (MT) models and evaluation data to several under-resourced African languages, it is difficult to measure this progress accurately, because evaluation is often performed with n-gram matching metrics such as BLEU, which often correlate poorly with human judgments. Embedding-based metrics such as COMET correlate better; however, the lack of evaluation data with human ratings for under-resourced languages, the complexity of annotation guidelines like Multidimensional Quality Metrics (MQM), and the limited language coverage of multilingual encoders have hampered their applicability to African languages. In this paper, we address these challenges by creating high-quality human evaluation data with a simplified MQM guideline for error-span annotation and direct assessment (DA) scoring for 13 typologically diverse African languages. Furthermore, we develop AfriCOMET, a COMET evaluation metric for African languages, by leveraging DA training data from high-resource languages and an African-centric multilingual encoder (AfroXLM-Roberta) to create the state-of-the-art evaluation metric for African-language MT with respect to Spearman-rank correlation with human judgments (+0.406).
AfriMTE and AfriCOMET: Enhancing COMET to Embrace Under-resourced African Languages
Despite the recent progress on scaling multilingual machine translation (MT) to several under-resourced African languages, accurately measuring this progress remains challenging, since evaluation is often performed with n-gram matching metrics such as BLEU, which typically show a weaker correlation with human judgments. Learned metrics such as COMET have higher correlation; however, the lack of evaluation data with human ratings for under-resourced languages, the complexity of annotation guidelines like Multidimensional Quality Metrics (MQM), and the limited language coverage of multilingual encoders have hampered their applicability to African languages. In this paper, we address these challenges by creating high-quality human evaluation data with simplified MQM guidelines for error detection and direct assessment (DA) scoring for 13 typologically diverse African languages. Furthermore, we develop AfriCOMET, COMET evaluation metrics for African languages, by leveraging DA data from well-resourced languages and an African-centric multilingual encoder (AfroXLM-R) to create state-of-the-art MT evaluation metrics for African languages with respect to Spearman-rank correlation with human judgments (0.441).
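The Spearman-rank correlations quoted above (0.406, 0.441) measure how well a metric's ranking of translations agrees with human DA rankings. A minimal pure-Python sketch of that computation, with tie-aware average ranks (the score lists are invented for illustration):

```python
def ranks(values):
    """1-based average ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean rank of the tied block
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(metric_scores, human_scores):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = ranks(metric_scores), ranks(human_scores)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

metric = [0.71, 0.42, 0.85, 0.30]   # hypothetical metric scores
human = [78.0, 55.0, 90.0, 40.0]    # hypothetical DA scores
print(spearman(metric, human))  # 1.0 (identical rankings)
```

In practice one would use `scipy.stats.spearmanr`; the point of the sketch is that the correlation depends only on the order the metric induces, not on its absolute scale.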