46 research outputs found
Benchmarking Arabic AI with Large Language Models
With large Foundation Models (FMs), language technologies (AI in general) are
entering a new paradigm: eliminating the need for developing large-scale
task-specific datasets and supporting a variety of tasks through set-ups
ranging from zero-shot to few-shot learning. However, understanding FMs
capabilities requires a systematic benchmarking effort by comparing FMs
performance with the state-of-the-art (SOTA) task-specific models. With that
goal, past work focused on the English language and included a few efforts with
multiple languages. Our study contributes to ongoing research by evaluating FMs
performance for standard Arabic NLP and Speech processing, including a range of
tasks from sequence tagging to content classification across diverse domains.
We start with zero-shot learning using GPT-3.5-turbo, Whisper, and USM,
addressing 33 unique tasks using 59 publicly available datasets resulting in 96
test setups. For a few tasks, FMs performs on par or exceeds the performance of
the SOTA models but for the majority it under-performs. Given the importance of
prompt for the FMs performance, we discuss our prompt strategies in detail and
elaborate on our findings. Our future work on Arabic AI will explore few-shot
prompting, expand the range of tasks, and investigate additional open-source
models.Comment: Foundation Models, Large Language Models, Arabic NLP, Arabic Speech,
Arabic AI, , CHatGPT Evaluation, USM Evaluation, Whisper Evaluatio
Fighting the COVID-19 Infodemic:Modeling the Perspective of Journalists, Fact-Checkers, Social Media Platforms, Policy Makers, and the Society
With the emergence of the COVID-19 pandemic, the political and the medical aspects of disinformation merged as the problem got elevated to a whole new level to become the first global infodemic. Fighting this infodemic has been declared one of the most important focus areas of the World Health Organization, with dangers ranging from promoting fake cures, rumors, and conspiracy theories to spreading xenophobia and panic. Addressing the issue requires solving a number of challenging problems such as identifying messages containing claims, determining their check-worthiness and factuality, and their potential to do harm as well as the nature of that harm, to mention just a few. To address this gap, we release a large dataset of 16K manually annotated tweets for fine-grained disinformation analysis that (i) focuses on COVID-19, (ii) combines the perspectives and the interests of journalists, fact-checkers, social media platforms, policy makers, and society, and (iii) covers Arabic, Bulgarian, Dutch, and English. Finally, we show strong evaluation results using pretrained Transformers, thus confirming the practical utility of the dataset in monolingual vs. multilingual, and single task vs. multitask settings
Are Feral Cats a New Threat for the Reptiles of Bangladesh? Observations of Predation of Snakes and Lizards by Feral Cats
Physalin H from Solanum nigrum as an Hh signaling inhibitor blocks GLI1–DNA-complex formation
Hedgehog (Hh) signaling plays an important role in embryonic development, cell maintenance and cell proliferation. Moreover, Hh signaling contributes to the growth of cancer cells. Physalins are highly oxidized natural products with a complex structure. Physalins (1–7) were isolated from Solanum nigrum (Solanaceae) collected in Bangladesh by using our cell-based assay. The isolated physalins included the previously reported Hh inhibitors 5 and 6. Compounds 1 and 4 showed strong inhibition of GLI1 transcriptional activity, and exhibited cytotoxicity against cancer cell lines with an aberrant activation of Hh signaling. Compound 1 inhibited the production of the Hh-related proteins patched (PTCH) and BCL2. Analysis of the structures of different physalins showed that the left part of the physalins was important for Hh inhibitory activity. Interestingly, physalin H (1) disrupted GLI1 binding to its DNA binding domain, while the weak inhibitor physalin G (2) did not show inhibition of GLI1-DNA complex formation
QCRI at SemEval-2023 Task 3: News Genre, Framing and Persuasion Techniques Detection using Multilingual Models
Misinformation spreading in mainstream and social media has been misleading
users in different ways. Manual detection and verification efforts by
journalists and fact-checkers can no longer cope with the great scale and quick
spread of misleading information. This motivated research and industry efforts
to develop systems for analyzing and verifying news spreading online. The
SemEval-2023 Task 3 is an attempt to address several subtasks under this
overarching problem, targeting writing techniques used in news articles to
affect readers' opinions. The task addressed three subtasks with six languages,
in addition to three ``surprise'' test languages, resulting in 27 different
test setups. This paper describes our participating system to this task. Our
team is one of the 6 teams that successfully submitted runs for all setups. The
official results show that our system is ranked among the top 3 systems for 10
out of the 27 setups.Comment: Accepted at SemEval-23 (ACL-23, propaganda, disinformation,
misinformation, fake new