
    Automatic Dish Name Extraction from User-generated Content Using LLM

    Extraction of dish names from user-provided content such as food photographs and captions, restaurant reviews, and other free-form text is a challenging task. Rule-based approaches are difficult to maintain and improve. Pattern matching against a predefined dictionary often suffers from low recall. Conventional machine learning models require large amounts of labeled data to perform named entity recognition (e.g., to recognize dish names), which is often costly to obtain and does not scale well across multiple languages and countries. This disclosure describes the use of a multimodal large language model (LLM) to automatically extract dish names from user-generated content such as food photographs and associated free-form text such as tags and captions. Dish name extraction from user-provided tags can be formulated as an open-vocabulary dish name entity recognition and discovery task, which fits naturally within the framework of pre-trained LLMs and leverages the model's capability for multilingual, multicultural text understanding.

    Bilingual language processing


    Using Dual-Language Books to Preserve Language & Culture in Alaska Native Communities

    “Children learn their language on their mother’s lap.” This conventional wisdom from a Cup’ik Elder describes the approach used by many Alaska Native peoples to promote native language acquisition. Presumably, the children learn by listening to stories and tales from a trusted parent or caregiver. However, what happens when the caregiver does not speak the native language? This chapter describes an effort to address this issue while also promoting better educational outcomes by providing access to diverse dual-language books in Alaska Native languages through the use of a digital children’s library. Potential benefits from these efforts include an increase in resources for schools, a revitalization of Indigenous languages, and an increase in access, with hopes that future work will show evidence that using these dual-language books encourages greater parent support and involvement in education, supports second language acquisition, and promotes a strong sense of identity. Implications and future efforts follow.

    KnowNER: Incremental Multilingual Knowledge in Named Entity Recognition

    KnowNER is a multilingual Named Entity Recognition (NER) system that leverages different degrees of external knowledge. A novel modular framework divides the knowledge into four categories according to the depth of knowledge they convey. Each category consists of a set of features automatically generated from different information sources (such as a knowledge base, a list of names, or document-specific semantic annotations) and is used to train a conditional random field (CRF). Since those information sources are usually multilingual, KnowNER can be easily trained for a wide range of languages. In this paper, we show that the incorporation of deeper knowledge systematically boosts accuracy, and we compare KnowNER with state-of-the-art NER approaches across three languages (English, German, and Spanish), performing amongst state-of-the-art systems in all of them.