46 research outputs found

    Benchmarking Arabic AI with Large Language Models

    With large Foundation Models (FMs), language technologies (and AI in general) are entering a new paradigm: eliminating the need for developing large-scale task-specific datasets and supporting a variety of tasks through set-ups ranging from zero-shot to few-shot learning. However, understanding FMs' capabilities requires a systematic benchmarking effort that compares FM performance with state-of-the-art (SOTA) task-specific models. With that goal, past work focused on the English language and included a few efforts with multiple languages. Our study contributes to ongoing research by evaluating FM performance for standard Arabic NLP and speech processing, including a range of tasks from sequence tagging to content classification across diverse domains. We start with zero-shot learning using GPT-3.5-turbo, Whisper, and USM, addressing 33 unique tasks using 59 publicly available datasets, resulting in 96 test setups. For a few tasks, FMs perform on par with or exceed the SOTA models, but for the majority they underperform. Given the importance of prompts for FM performance, we discuss our prompt strategies in detail and elaborate on our findings. Our future work on Arabic AI will explore few-shot prompting, expand the range of tasks, and investigate additional open-source models. Comment: Foundation Models, Large Language Models, Arabic NLP, Arabic Speech, Arabic AI, ChatGPT Evaluation, USM Evaluation, Whisper Evaluation

    Fighting the COVID-19 Infodemic: Modeling the Perspective of Journalists, Fact-Checkers, Social Media Platforms, Policy Makers, and the Society

    With the emergence of the COVID-19 pandemic, the political and the medical aspects of disinformation merged as the problem got elevated to a whole new level to become the first global infodemic. Fighting this infodemic has been declared one of the most important focus areas of the World Health Organization, with dangers ranging from promoting fake cures, rumors, and conspiracy theories to spreading xenophobia and panic. Addressing the issue requires solving a number of challenging problems, such as identifying messages containing claims, determining their check-worthiness and factuality, and assessing their potential to do harm as well as the nature of that harm, to mention just a few. To address this gap, we release a large dataset of 16K manually annotated tweets for fine-grained disinformation analysis that (i) focuses on COVID-19, (ii) combines the perspectives and the interests of journalists, fact-checkers, social media platforms, policy makers, and society, and (iii) covers Arabic, Bulgarian, Dutch, and English. Finally, we show strong evaluation results using pretrained Transformers, thus confirming the practical utility of the dataset in monolingual vs. multilingual, and single-task vs. multitask settings.

    Bio-active Natural Products with TRAIL-Resistance Overcoming Activity

    Physalin H from Solanum nigrum as an Hh signaling inhibitor blocks GLI1–DNA-complex formation

    Hedgehog (Hh) signaling plays an important role in embryonic development, cell maintenance, and cell proliferation. Moreover, Hh signaling contributes to the growth of cancer cells. Physalins are highly oxidized natural products with a complex structure. Physalins (1–7) were isolated from Solanum nigrum (Solanaceae) collected in Bangladesh by using our cell-based assay. The isolated physalins included the previously reported Hh inhibitors 5 and 6. Compounds 1 and 4 showed strong inhibition of GLI1 transcriptional activity and exhibited cytotoxicity against cancer cell lines with an aberrant activation of Hh signaling. Compound 1 inhibited the production of the Hh-related proteins patched (PTCH) and BCL2. Analysis of the structures of different physalins showed that the left part of the physalins was important for Hh inhibitory activity. Interestingly, physalin H (1) disrupted GLI1 binding to its DNA binding domain, while the weak inhibitor physalin G (2) did not show inhibition of GLI1-DNA complex formation.

    QCRI at SemEval-2023 Task 3: News Genre, Framing and Persuasion Techniques Detection using Multilingual Models

    Misinformation spreading in mainstream and social media has been misleading users in different ways. Manual detection and verification efforts by journalists and fact-checkers can no longer cope with the great scale and quick spread of misleading information. This motivated research and industry efforts to develop systems for analyzing and verifying news spreading online. SemEval-2023 Task 3 is an attempt to address several subtasks under this overarching problem, targeting writing techniques used in news articles to affect readers' opinions. The task addressed three subtasks with six languages, in addition to three "surprise" test languages, resulting in 27 different test setups. This paper describes our participating system in this task. Our team is one of the 6 teams that successfully submitted runs for all setups. The official results show that our system is ranked among the top 3 systems for 10 out of the 27 setups. Comment: Accepted at SemEval-23 (ACL-23), propaganda, disinformation, misinformation, fake news