3,026 research outputs found

    Violet: A Vision-Language Model for Arabic Image Captioning with Gemini Decoder

    Although image captioning has a vast array of applications, it has not reached its full potential in languages other than English. Arabic, for instance, although the native language of more than 400 million people, remains largely underrepresented in this area. This is due to the lack of labeled data and powerful Arabic generative models. We alleviate this issue by presenting a novel vision-language model dedicated to Arabic, dubbed Violet. Our model is based on a vision encoder and a Gemini text decoder that maintains generation fluency while allowing fusion between the vision and language components. To train our model, we introduce a new method for automatically acquiring data from available English datasets. We also manually prepare a new dataset for evaluation. Violet performs sizeably better than our baselines on all of our evaluation datasets. For example, it reaches a CIDEr score of 61.2 on our manually annotated dataset and achieves an improvement of 13 points on Flickr8k. Comment: Accepted in ArabicNLP Conference

    21st Century Ottoman: The Ottoman Turkish Linguistic Revival in Digital Affinity Spaces

    In 2014, the Turkish National Education Council recommended teaching Ottoman Turkish as a mandatory subject in all high schools. Since that time, this historical register of the Turkish language has been making a popular comeback. This is especially true online, where participants are creating and sharing new content written in Ottoman. This article examines evidence of the revival of Ottoman Turkish in digital “affinity spaces” in order to show that it is not only being excavated, but is developing independently from its own historical past. In taking into consideration new calligraphic styles, the political and cultural subtext of memes, and the rewriting of modern Turkish back into the Ottoman lexicon, this paper will identify the form of Ottoman emerging in digital spaces as a unique new iteration of the language.

    How to utilise students' cultural and linguistic experiences to promote language learning : looking beyond the school

    This article will first give an overview of multicultural, multilingual Britain and place it within the context of community literacy. It will then discuss education reforms and policies that have aimed to place multicultural and multiple literacy at their centre before considering the theoretical basis for the current understanding of additional language learning in the UK. The article then describes a project based on ethnographic case studies of an early years centre and its reception classes. Preliminary results and conclusions are discussed.

    Several categories of Large Language Models (LLMs): A Short Survey

    Large Language Models (LLMs) have become effective tools for natural language processing and have been used in many different fields. This essay offers a succinct summary of various LLM subcategories. The survey emphasizes recent developments and efforts made for various LLM kinds, including task-based financial LLMs, multilingual language LLMs, biomedical and clinical LLMs, vision language LLMs, and code language models. The survey gives a general summary of the methods, attributes, datasets, transformer models, and comparison metrics applied in each category of LLMs. Furthermore, it highlights unresolved problems in the field of developing chatbots and virtual assistants, such as boosting natural language processing, enhancing chatbot intelligence, and resolving moral and legal dilemmas. The purpose of this study is to provide readers, developers, academics, and users interested in LLM-based chatbots and virtual intelligent assistant technologies with useful information and future directions.

    Deep Learning: Our Miraculous Year 1990-1991

    In 2020, we will celebrate that many of the basic ideas behind the deep learning revolution were published three decades ago within fewer than 12 months in our "Annus Mirabilis" or "Miraculous Year" 1990-1991 at TU Munich. Back then, few people were interested, but a quarter century later, neural networks based on these ideas were on over 3 billion devices such as smartphones, and used many billions of times per day, consuming a significant fraction of the world's compute. Comment: 37 pages, 188 references, based on work of 4 Oct 201

    Learning About And Becoming Aware Of Reading Strategies And Metacognition In English By Adult Second Language Learners

    The purpose of the present study was to investigate the need to learn and/or become aware of reading strategies and metacognitive strategies by adult English language learners while making sense of English texts. A mixed method grounded theory (MMGT) in a sequential design (quantitative → Qualitative) with a qualitative dominant status was employed to collect and analyze data. In the quantitative phase of this study, data were collected by administering a background questionnaire and the Survey of Reading Strategies (SORS). Data collected by these tools were statistically analyzed with descriptive analysis and one-way Analysis of Variance. In the qualitative phase of this study, data were collected through retrospective miscue analysis (RMA) and semi-structured interviews. Data collected by these methods were coded in order to verify reading patterns among participants. Quantitative results demonstrated that second language learners from different language backgrounds and English proficiency levels perceived the use of reading strategies differently. Qualitative results demonstrated that Saudi-Arabian second language learners tend to transfer their reading strategy of relying on small grain size units while reading in English. These results bring a new perspective to the second language reading field by demonstrating that second language learners from different language backgrounds apply reading strategies differently based on their initial reading development in their first languages.

    Fuṣḥá, ‘āmmīyah, or both?: Towards a theoretical framework for written Cairene Arabic

    The Arabic language is a complex, diglossic language, with varying written (fuṣḥá) and spoken (‘āmmīyah) forms. While the study of mixing between fuṣḥá and ‘āmmīyah in spoken Arabic has received some scholarly attention, far less attention has been paid to mixing in writing, which this study seeks to address. Badawi’s (1973) landmark study of Egyptian Arabic use identified five language levels, assuming naturally that written Arabic exists as either Classical or Modern Standard Arabic, while mixing between written and spoken forms is reserved as a feature of Educated Spoken Arabic (ESA), despite the proliferation of mixed literary works by renowned writers such as Tawfiq al-Hakim, Yusuf Idris and Yusuf Sibai at the time. Since Badawi’s (1973) study, studies of mixed Arabic have centred around ESA (Eid, 1988; Bassiouney, 2006), uncovering to some extent the type and degree of, and motivations for, mixing, which have been used as a backdrop for the examination of mixed writing in this study. More recently, Høigilt & Mejdell (2017), Mejdell (2014), Ibrahim (2010), and Rosenbaum (2000) have identified occurrences of mixing in written Arabic. The aim of this study, therefore, is to take a holistic view of Arabic writing, across different times and media, towards establishing a theoretical framework for Egyptian Arabic writing, including fuṣḥá, ‘āmmīyah and so-called ‘mixed’ forms. The catalyst for this study, as well as for the proliferation of mixed and ‘āmmīyah writing, has been the expansion of the internet and the rapid increase in online writing. For Arabic at least, the Arab Spring, and social media within it, have played an important role in the widespread use of ‘āmmīyah in writing, which this study aims to place within the wider context of Arabic writing.