Violet: A Vision-Language Model for Arabic Image Captioning with Gemini Decoder
Although image captioning has a vast array of applications, it has not
reached its full potential in languages other than English. Arabic, for
instance, although the native language of more than 400 million people, remains
largely underrepresented in this area. This is due to the lack of labeled data
and powerful Arabic generative models. We alleviate this issue by presenting a
novel vision-language model dedicated to Arabic, dubbed Violet. Our
model is based on a vision encoder and a Gemini text decoder that maintains
generation fluency while allowing fusion between the vision and language
components. To train our model, we introduce a new method for automatically
acquiring data from available English datasets. We also manually prepare a new
dataset for evaluation. Violet performs sizeably better than our
baselines on all of our evaluation datasets. For example, it reaches a CIDEr
score of on our manually annotated dataset and achieves an improvement
of points on Flickr8k. Comment: Accepted in ArabicNLP Conference
21st Century Ottoman: The Ottoman Turkish Linguistic Revival in Digital Affinity Spaces
In 2014, the Turkish National Education Council recommended teaching Ottoman Turkish as a mandatory subject in all high schools. Since that time, this historical register of the Turkish language has been making a popular comeback. This is especially true online, where participants are creating and sharing new content written in Ottoman. This article examines evidence of the revival of Ottoman Turkish in digital “affinity spaces” in order to show it is not only being excavated, but is developing independently from its own historical past. In taking into consideration new calligraphic styles, the political and cultural subtext of memes, and the rewriting of modern Turkish back into the Ottoman lexicon, this paper will identify the form of Ottoman emerging in digital spaces as a unique new iteration of the language.
How to utilise students' cultural and linguistic experiences to promote language learning : looking beyond the school
This article will first give an overview of multicultural, multilingual Britain and place it within the context of community literacy. It will then discuss education reforms and policies that have aimed to place multicultural and multiple literacy at their centre before considering the theoretical basis for current understanding of additional language learning in the UK. Then the article describes a project, based on ethnographic case studies of an early years centre and its reception classes. Preliminary results and conclusions are discussed.
Leveraging Text-to-Scene Generation for Language Elicitation and Documentation
Text-to-scene generation systems take input in the form of a natural language text and output a 3D scene illustrating the meaning of that text. A major benefit of text-to-scene generation is that it allows users to create custom 3D scenes without requiring them to have a background in 3D graphics or knowledge of specialized software packages. This contributes to making text-to-scene useful in scenarios from creative applications to education. The primary goal of this thesis is to explore how we can use text-to-scene generation in a new way: as a tool to facilitate the elicitation and formal documentation of language. In particular, we use text-to-scene generation (a) to assist field linguists studying endangered languages; (b) to provide a cross-linguistic framework for formally modeling spatial language; and (c) to collect language data using crowdsourcing. As a side effect of these goals, we also explore the problem of multilingual text-to-scene generation, that is, systems for generating 3D scenes from languages other than English.
The contributions of this thesis are the following. First, we develop a novel tool suite (the WordsEye Linguistics Tools, or WELT) that uses the WordsEye text-to-scene system to assist field linguists with eliciting and documenting endangered languages. WELT allows linguists to create custom elicitation materials and to document semantics in a formal way. We test WELT with two endangered languages, Nahuatl and Arrernte. Second, we explore the question of how to learn a syntactic parser for WELT. We show that an incremental learning method using a small number of annotated dependency structures can produce reasonably accurate results. We demonstrate that using a parser trained in this way can significantly decrease the time it takes an annotator to label a new sentence with dependency information. Third, we develop a framework that generates 3D scenes from spatial and graphical semantic primitives. We incorporate this system into the WELT tools for creating custom elicitation materials, allowing users to directly manipulate the underlying semantics of a generated scene. Fourth, we introduce a deep semantic representation of spatial relations and use this to create a new resource, SpatialNet, which formally declares the lexical semantics of spatial relations for a language. We demonstrate how SpatialNet can be used to support multilingual text-to-scene generation. Finally, we show how WordsEye and the semantic resources it provides can be used to facilitate elicitation of language using crowdsourcing.
B!SON: A Tool for Open Access Journal Recommendation
Finding a suitable open access journal to publish scientific work is a complex task: Researchers have to navigate a constantly growing number of journals, institutional agreements with publishers, funders’ conditions and the risk of predatory publishers. To help with these challenges, we introduce a web-based journal recommendation system called B!SON. It is developed based on a systematic requirements analysis, built on open data, gives publisher-independent recommendations and works across domains. It suggests open access journals based on title, abstract and references provided by the user. The recommendation quality has been evaluated using a large test set of 10,000 articles. Development by two German scientific libraries ensures the longevity of the project.
Several categories of Large Language Models (LLMs): A Short Survey
Large Language Models (LLMs) have become effective tools for natural language
processing and have been used in many different fields. This essay offers a
succinct summary of various LLM subcategories. The survey emphasizes recent
developments and efforts made for various LLM kinds, including task-based
financial LLMs, multilingual language LLMs, biomedical and clinical LLMs,
vision language LLMs, and code language models. The survey gives a general
summary of the methods, attributes, datasets, transformer models, and
comparison metrics applied in each category of LLMs. Furthermore, it highlights
unresolved problems in the field of developing chatbots and virtual assistants,
such as boosting natural language processing, enhancing chatbot intelligence,
and resolving moral and legal dilemmas. The purpose of this study is to provide
readers, developers, academics, and users interested in LLM-based chatbots and
virtual intelligent assistant technologies with useful information and future
directions.
Deep Learning: Our Miraculous Year 1990-1991
In 2020, we will celebrate that many of the basic ideas behind the deep
learning revolution were published three decades ago within fewer than 12
months in our "Annus Mirabilis" or "Miraculous Year" 1990-1991 at TU Munich.
Back then, few people were interested, but a quarter century later, neural
networks based on these ideas were on over 3 billion devices such as
smartphones, and used many billions of times per day, consuming a significant
fraction of the world's compute. Comment: 37 pages, 188 references, based on work of 4 Oct 201
Learning About And Becoming Aware Of Reading Strategies And Metacognition In English By Adult Second Language Learners
The purpose of the present study was to investigate the need to learn and/or become aware of reading strategies and metacognitive strategies by adult English language learners while making sense of English texts. A mixed method grounded theory (MMGT) in a sequential design (quantitative → Qualitative) with a qualitative dominant status was employed to collect and analyze data. In the quantitative phase of this study, data were collected by administering a background questionnaire and the Survey of Reading Strategies (SORS). Data collected by these tools were statistically analyzed with descriptive analysis and one-way Analysis of Variance. In the qualitative phase of this study, data were collected through retrospective miscue analysis (RMA) and semi-structured interviews. Data collected by these methods were coded in order to verify reading patterns among participants. Quantitative results demonstrated that second language learners from different language backgrounds and English proficiency levels perceived the use of reading strategies differently. Qualitative results demonstrated that Saudi-Arabian second language learners tend to transfer their reading strategy of relying on small grain-size units while reading in English. These results bring a new perspective to the second language reading field by demonstrating that second language learners from different language backgrounds apply reading strategies differently based on their initial reading development in their first languages.
Fuṣḥá, ‘āmmīyah, or both?: Towards a theoretical framework for written Cairene Arabic
The Arabic language is a complex, diglossic language, with varying written
(fuṣḥá) and spoken (‘āmmīyah) forms. While the study of mixing between
fuṣḥá and ‘āmmīyah in spoken Arabic has received some scholarly attention,
far less attention has been paid to mixing in writing, which this study seeks
to address.
Badawi’s (1973) landmark study of Egyptian Arabic use identified five
language levels, assuming naturally that written Arabic exists as either
Classical or Modern Standard Arabic, while mixing between written and
spoken forms is reserved as a feature of Educated Spoken Arabic (ESA),
despite the proliferation of mixed literary works by renowned writers such as
Tawfiq al-Hakim, Yusuf Idris and Yusuf Sibai at the time. Since Badawi’s
(1973) study, studies of mixed Arabic have centred around ESA (Eid, 1988;
Bassiouney, 2006), uncovering to some extent the type and degree of, and
motivations for, mixing, which have been used as a backdrop for the
examination of mixed writing in this study. More recently, Høigilt & Mejdell
(2017), Mejdell (2014), Ibrahim (2010), and Rosenbaum (2000) have
identified occurrences of mixing in written Arabic.
The aim of this study, therefore, is to take a holistic view of Arabic writing,
across different times and media, towards establishing a theoretical
framework for Egyptian Arabic writing, including fuṣḥá, ‘āmmīyah and so-called
‘mixed’ forms.
The catalyst for this study, as well as for the proliferation of mixed and
‘āmmīyah writing, has been the expansion of the internet and the rapid
increase in online writing. For Arabic at least, the Arab Spring, and social
media within it, have played an important role in the widespread use of
‘āmmīyah in writing, which this study aims to place within the wider context
of Arabic writing.