
147,148 research outputs found

Automatic Dish Name Extraction from User-generated Content Using LLM

Author: Hibschman Johann
Lin Bo
Oshima Kathleen
Publication venue: Technical Disclosure Commons
Publication date: 07/11/2023

Extracting dish names from user-provided content such as food photographs and captions, restaurant reviews, and other free-form text is a challenging task. Rule-based approaches are difficult to maintain and improve. Pattern matching against a predefined dictionary often suffers from low recall. Conventional machine learning models require large amounts of labeled data to perform named entity recognition (e.g., to recognize dish names), which is often costly and does not scale well across multiple languages and countries. This disclosure describes the use of a multimodal large language model (LLM) to automatically extract dish names from user-generated content such as food photographs and associated free-form text (tags, captions, etc.). Dish name extraction from user-provided tags can be formulated as an open-vocabulary dish name entity recognition and discovery task, which fits naturally within the framework of pre-trained LLMs and leverages the model's capability for multilingual, multicultural text understanding.

Technical Disclosure Commons

Bilingual language processing

Author: Desmet Timothy
Duyck Wouter
Publication venue: Wiley
Publication date: 01/01/2007

Ghent University Academic Bibliography

Using Dual-Language Books to Preserve Language & Culture in Alaska Native Communities

Author: Bartles Jonathan
Ohle Kathryn
Publication date: 11/09/2016

“Children learn their language on their mother’s lap.” This conventional wisdom from a Cup’ik Elder describes the approach used by many Alaska Native peoples to promote native language acquisition. Presumably, the children learn by listening to stories and tales from a trusted parent or caregiver. However, what happens when the caregiver does not speak the native language? This chapter describes an effort to address this issue while also promoting better educational outcomes by providing access to diverse dual-language books in Alaska Native languages through the use of a digital children’s library. Potential benefits from these efforts include an increase in resources for schools, a revitalization of Indigenous languages, and an increase in access, with hopes that future work will show evidence that using these dual-language books encourages greater parent support and involvement in education, supports second language acquisition, and promotes a strong sense of identity. Implications and future efforts follow.

ScholarWorks@UA

KnowNER: Incremental Multilingual Knowledge in Named Entity Recognition

Author: Del Corro Luciano
Dembelova Tatiana
Hoffart Johannes
Seyler Dominic
Weikum Gerhard
Publication date: 01/01/2017

KnowNER is a multilingual Named Entity Recognition (NER) system that leverages different degrees of external knowledge. A novel modular framework divides the knowledge into four categories according to the depth of knowledge they convey. Each category consists of a set of features automatically generated from different information sources (such as a knowledge base, a list of names, or document-specific semantic annotations) and is used to train a conditional random field (CRF). Since those information sources are usually multilingual, KnowNER can be easily trained for a wide range of languages. In this paper, we show that the incorporation of deeper knowledge systematically boosts accuracy, and we compare KnowNER with state-of-the-art NER approaches across three languages (English, German, and Spanish), where it performs among the state-of-the-art systems in all of them.

arXiv.org e-Print Archive

MPG.PuRe

Relearning Athabascan languages in Alaska: Creating sustainable language communities through creolization

Author: Holton Gary
Publication venue: Cambridge Scholars Press
Publication date: 01/01/2009

ScholarWorks@UA

The lexical fallacy in emotion research: Mistaking vernacular words for psychological entities.

Author: Fiske Alan Page
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

Vernacular lexemes appear self-evident, so we unwittingly reify them. But the words and phrases of natural languages comprise a treacherous basis for identifying valid psychological constructs, as I illustrate in emotion research. Like other vernacular lexemes, the emotion labels in natural languages do not have definite, stable, mutually transparent meanings, and any one vernacular word may be used to denote multiple scientifically distinct entities. In addition, the consequential choice of one lexeme to name a scientific construct rather than any of its partial synonyms is often arbitrary. Furthermore, a given vernacular lexeme from any one of the world's 7000 languages rarely maps one-to-one into an exactly corresponding vernacular lexeme in other languages. Words related to anger in different languages illustrate this. Since each language constitutes a distinct taxonomy of things in the world, most or all languages must fail to cut nature at its joints. In short, it is pernicious to use one language's dictionary as the source of psychological constructs. So scientists need to coin new technical names for scientifically derived constructs: names precisely defined in terms of the constellation of features or components that characterize the constructs they denote. The development of the kama muta construct illustrates one way to go about this. Kama muta is the emotion evoked by sudden intensification of communal sharing; it is universally experienced but not isomorphic with any vernacular lexeme such as heart warming, moving, touching, collective pride, tender, nostalgic, sentimental, or "Awww, so cute!" (PsycINFO Database Record (c) 2019 APA, all rights reserved)

eScholarship - University of California