3,026 research outputs found

Violet: A Vision-Language Model for Arabic Image Captioning with Gemini Decoder

Author: Abdul-Mageed Muhammad
Alwajih Fakhraddin
Inciarte Alcides Alcoba
Mohamed Abdelrahman
Nagoudi El Moatez Billah
Publication date: 15/11/2023

Although image captioning has a vast array of applications, it has not reached its full potential in languages other than English. Arabic, for instance, although the native language of more than 400 million people, remains largely underrepresented in this area. This is due to the lack of labeled data and powerful Arabic generative models. We alleviate this issue by presenting a novel vision-language model dedicated to Arabic, dubbed Violet. Our model is based on a vision encoder and a Gemini text decoder that maintains generation fluency while allowing fusion between the vision and language components. To train our model, we introduce a new method for automatically acquiring data from available English datasets. We also manually prepare a new dataset for evaluation. Violet performs sizeably better than our baselines on all of our evaluation datasets. For example, it reaches a CIDEr score of 61.2 on our manually annotated dataset and achieves an improvement of 13 points on Flickr8k.

Comment: Accepted in ArabicNLP Conference

arXiv.org e-Print Archive

21st Century Ottoman: The Ottoman Turkish Linguistic Revival in Digital Affinity Spaces

Author: Matthew Chovanec
Publication venue: 'Modern Language Association'
Publication date: 01/01/2017

In 2014, the Turkish National Education Council recommended teaching Ottoman Turkish as a mandatory subject in all high schools. Since that time, this historical register of the Turkish language has been making a popular comeback. This is especially true online, where participants are creating and sharing new content written in Ottoman. This article examines evidence of the revival of Ottoman Turkish in digital “affinity spaces” in order to show that it is not only being excavated, but is developing independently from its own historical past. In taking into consideration new calligraphic styles, the political and cultural subtext of memes, and the rewriting of modern Turkish back into the Ottoman lexicon, this paper will identify the form of Ottoman emerging in digital spaces as a unique new iteration of the language.

Humanities Commons

How to utilise students' cultural and linguistic experiences to promote language learning : looking beyond the school

Author: Issa Tözün
Publication venue: 'Universitat Autonoma de Barcelona'
Publication date: 01/01/2009

This article will first give an overview of multicultural, multilingual Britain and place it within the context of community literacy. It will then discuss education reforms and policies that have aimed to place multicultural and multiple literacy at their centre before considering the theoretical basis for the current understanding of additional language learning in the UK. Then the article describes a project based on ethnographic case studies of an early years centre and its reception classes. Preliminary results and conclusions are discussed.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Directory of Open Access Journals

Diposit Digital de Documents de la UAB

Recommended from our members

Leveraging Text-to-Scene Generation for Language Elicitation and Documentation

Author: Ulinski Morgan Elizabeth
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2019

Text-to-scene generation systems take input in the form of a natural language text and output a 3D scene illustrating the meaning of that text. A major benefit of text-to-scene generation is that it allows users to create custom 3D scenes without requiring them to have a background in 3D graphics or knowledge of specialized software packages. This contributes to making text-to-scene useful in scenarios from creative applications to education. The primary goal of this thesis is to explore how we can use text-to-scene generation in a new way: as a tool to facilitate the elicitation and formal documentation of language. In particular, we use text-to-scene generation (a) to assist field linguists studying endangered languages; (b) to provide a cross-linguistic framework for formally modeling spatial language; and (c) to collect language data using crowdsourcing. As a side effect of these goals, we also explore the problem of multilingual text-to-scene generation, that is, systems for generating 3D scenes from languages other than English. The contributions of this thesis are the following. First, we develop a novel tool suite (the WordsEye Linguistics Tools, or WELT) that uses the WordsEye text-to-scene system to assist field linguists with eliciting and documenting endangered languages. WELT allows linguists to create custom elicitation materials and to document semantics in a formal way. We test WELT with two endangered languages, Nahuatl and Arrernte. Second, we explore the question of how to learn a syntactic parser for WELT. We show that an incremental learning method using a small number of annotated dependency structures can produce reasonably accurate results. We demonstrate that using a parser trained in this way can significantly decrease the time it takes an annotator to label a new sentence with dependency information. Third, we develop a framework that generates 3D scenes from spatial and graphical semantic primitives. 
We incorporate this system into the WELT tools for creating custom elicitation materials, allowing users to directly manipulate the underlying semantics of a generated scene. Fourth, we introduce a deep semantic representation of spatial relations and use this to create a new resource, SpatialNet, which formally declares the lexical semantics of spatial relations for a language. We demonstrate how SpatialNet can be used to support multilingual text-to-scene generation. Finally, we show how WordsEye and the semantic resources it provides can be used to facilitate elicitation of language using crowdsourcing.

Columbia University Academic Commons

B!SON: A Tool for Open Access Journal Recommendation

Author: Entrup Elias
Eppelin Anita
Ewerth Ralph
Hartwig Josephine
Hoppe Anett
Tullney Marco
Wohlgemuth Michael
Publication venue: Heidelberg : Springer
Publication date: 01/01/2022

Finding a suitable open access journal to publish scientific work is a complex task: researchers have to navigate a constantly growing number of journals, institutional agreements with publishers, funders’ conditions and the risk of predatory publishers. To help with these challenges, we introduce a web-based journal recommendation system called B!SON. It is developed based on a systematic requirements analysis, built on open data, gives publisher-independent recommendations and works across domains. It suggests open access journals based on title, abstract and references provided by the user. The recommendation quality has been evaluated using a large test set of 10,000 articles. Development by two German scientific libraries ensures the longevity of the project.

Repositorium für Naturwissenschaften und Technik

Several categories of Large Language Models (LLMs): A Short Survey

Author: Chandrasekharan Manoj
Pahune Saurabh
Publication date: 05/07/2023

Large Language Models (LLMs) have become effective tools for natural language processing and have been used in many different fields. This essay offers a succinct summary of various LLM subcategories. The survey emphasizes recent developments and efforts made for various kinds of LLMs, including task-based financial LLMs, multilingual language LLMs, biomedical and clinical LLMs, vision language LLMs, and code language models. The survey gives a general summary of the methods, attributes, datasets, transformer models, and comparison metrics applied in each category of LLMs. Furthermore, it highlights unresolved problems in the field of developing chatbots and virtual assistants, such as boosting natural language processing, enhancing chatbot intelligence, and resolving moral and legal dilemmas. The purpose of this study is to provide readers, developers, academics, and users interested in LLM-based chatbots and virtual intelligent assistant technologies with useful information and future directions.

arXiv.org e-Print Archive

Deep Learning: Our Miraculous Year 1990-1991

Author: Schmidhuber Juergen
Publication date: 12/05/2020

In 2020, we will celebrate that many of the basic ideas behind the deep learning revolution were published three decades ago within fewer than 12 months in our "Annus Mirabilis" or "Miraculous Year" 1990-1991 at TU Munich. Back then, few people were interested, but a quarter century later, neural networks based on these ideas were on over 3 billion devices such as smartphones, and used many billions of times per day, consuming a significant fraction of the world's compute.

Comment: 37 pages, 188 references, based on work of 4 Oct 201

arXiv.org e-Print Archive

Learning About And Becoming Aware Of Reading Strategies And Metacognition In English By Adult Second Language Learners

Author: Biazotto André Aline
Publication venue: ISU ReD: Research and eData
Publication date: 07/06/2018

The purpose of the present study was to investigate the need to learn and/or become aware of reading strategies and metacognitive strategies by adult English language learners while making sense of English texts. A mixed method grounded theory (MMGT) in a sequential design (quantitative → Qualitative) with a qualitative dominant status was employed to collect and analyze data. In the quantitative phase of this study, data were collected by administering a background questionnaire and the Survey of Reading Strategies (SORS). Data collected by these tools were statistically analyzed with descriptive analysis and one-way Analysis of Variance. In the qualitative phase of this study, data were collected through retrospective miscue analysis (RMA) and semi-structured interviews. Data collected by these methods were coded in order to verify reading patterns among participants. Quantitative results demonstrated that second language learners from different language backgrounds and English proficiency levels perceived the use of reading strategies differently. Qualitative results demonstrated that Saudi-Arabian second language learners tend to transfer their reading strategy of relying on small grain size units while reading in English. These results bring a new perspective to the second language reading field by demonstrating that second language learners from different language backgrounds apply reading strategies differently based on their initial reading development in their first languages.

ISU ReD: Research and eData

Fuṣḥá, ‘āmmīyah, or both?: Towards a theoretical framework for written Cairene Arabic

Author: Khalil Saussan
Publication venue: University of Leeds
Publication date: 01/09/2018

The Arabic language is a complex, diglossic language, with varying written (fuṣḥá) and spoken (‘āmmīyah) forms. While the study of mixing between fuṣḥá and ‘āmmīyah in spoken Arabic has received some scholarly attention, far less attention has been paid to mixing in writing, which this study seeks to address. Badawi’s (1973) landmark study of Egyptian Arabic use identified five language levels, assuming naturally that written Arabic exists as either Classical or Modern Standard Arabic, while mixing between written and spoken forms is reserved as a feature of Educated Spoken Arabic (ESA), despite the proliferation of mixed literary works by renowned writers such as Tawfiq al-Hakim, Yusuf Idris and Yusuf Sibai at the time. Since Badawi’s (1973) study, studies of mixed Arabic have centred around ESA (Eid, 1988; Bassiouney, 2006), uncovering to some extent the type and degree of, and motivations for, mixing, which have been used as a backdrop for the examination of mixed writing in this study. More recently, Høigilt & Mejdell (2017), Mejdell (2014), Ibrahim (2010), and Rosenbaum (2000) have identified occurrences of mixing in written Arabic. The aim of this study, therefore, is to take a holistic view of Arabic writing, across different times and media, towards establishing a theoretical framework for Egyptian Arabic writing, including fuṣḥá, ‘āmmīyah and so-called ‘mixed’ forms. The catalyst for this study, as well as for the proliferation of mixed and ‘āmmīyah writing, has been the expansion of the internet and the rapid increase in online writing. For Arabic at least, the Arab Spring, and social media within it, have played an important role in the widespread use of ‘āmmīyah in writing, which this study aims to place within the wider context of Arabic writing.

White Rose E-theses Online