Refining Implicit Argument Annotation for UCCA

Authors: Cui Ruixiang; Hershcovich Daniel
Publication date: 01/01/2020

Predicate-argument structure analysis is a central component in meaning representations of text. The fact that some arguments are not explicitly mentioned in a sentence gives rise to ambiguity in language understanding and makes it difficult for machines to interpret text correctly. However, only a few resources represent implicit roles for NLU, and existing studies in NLP make only coarse distinctions between categories of arguments omitted from linguistic form. This paper proposes a typology for fine-grained implicit argument annotation on top of Universal Conceptual Cognitive Annotation's foundational layer. The proposed implicit argument categorisation is driven by theories of implicit role interpretation and consists of six types: Deictic, Generic, Genre-based, Type-identifiable, Non-specific, and Iterated-set. We exemplify our design by revisiting part of the UCCA EWT corpus, providing a new dataset annotated with the refinement layer, and making a comparative analysis with other schemes.

Comment: DMR 202

arXiv.org e-Print Archive

Copenhagen University Research Information System

Cultural Adaptation of Recipes

Authors: Cao Yong; Cui Ruixiang; Dare Megan; Donatelli Lucia; Hershcovich Daniel; Karamolegkou Antonia; Kementchedjhieva Yova; Zhou Li
Publication date: 26/10/2023

Building upon the considerable advances in Large Language Models (LLMs), we are now equipped to address more sophisticated tasks demanding a nuanced understanding of cross-cultural contexts. A key example is recipe adaptation, which goes beyond simple translation to include a grasp of ingredients, culinary techniques, and dietary preferences specific to a given culture. We introduce a new task involving the translation and cultural adaptation of recipes between Chinese and English-speaking cuisines. To support this investigation, we present CulturalRecipes, a unique dataset comprised of automatically paired recipes written in Mandarin Chinese and English. This dataset is further enriched with a human-written and curated test set. In this intricate task of cross-cultural recipe adaptation, we evaluate the performance of various methods, including GPT-4 and other LLMs, traditional machine translation, and information retrieval techniques. Our comprehensive analysis includes both automatic and human evaluation metrics. While GPT-4 exhibits impressive abilities in adapting Chinese recipes into English, it still lags behind human expertise when translating English recipes into Chinese. This underscores the multifaceted nature of cultural adaptations. We anticipate that these insights will significantly contribute to future research on culturally-aware language models and their practical application in culturally diverse contexts.

Comment: Accepted to TAC

arXiv.org e-Print Archive

AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models

Authors: Chen Weizhu; Cui Ruixiang; Duan Nan; Guo Yiduo; Liang Yaobo; Lu Shuai; Saied Amin; Wang Yanlin; Zhong Wanjun
Publication date: 13/04/2023

Evaluating the general abilities of foundation models to tackle human-level tasks is a vital aspect of their development and application in the pursuit of Artificial General Intelligence (AGI). Traditional benchmarks, which rely on artificial datasets, may not accurately represent human-level capabilities. In this paper, we introduce AGIEval, a novel benchmark specifically designed to assess foundation models in the context of human-centric standardized exams, such as college entrance exams, law school admission tests, math competitions, and lawyer qualification tests. We evaluate several state-of-the-art foundation models, including GPT-4, ChatGPT, and Text-Davinci-003, using this benchmark. Impressively, GPT-4 surpasses average human performance on SAT, LSAT, and math competitions, attaining a 95% accuracy rate on the SAT Math test and a 92.5% accuracy on the English test of the Chinese national college entrance exam. This demonstrates the extraordinary performance of contemporary foundation models. In contrast, we also find that GPT-4 is less proficient in tasks that require complex reasoning or specific domain knowledge. Our comprehensive analyses of model capabilities (understanding, knowledge, reasoning, and calculation) reveal these models' strengths and limitations, providing valuable insights into future directions for enhancing their general capabilities. By concentrating on tasks pertinent to human cognition and decision-making, our benchmark delivers a more meaningful and robust evaluation of foundation models' performance in real-world scenarios. The data, code, and all model outputs are released at https://github.com/microsoft/AGIEval.

Comment: 19 page

arXiv.org e-Print Archive

Great Service! Fine-grained Parsing of Implicit Arguments

Authors: Cui Ruixiang; Hershcovich Daniel
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2021

Broad-coverage meaning representations in NLP mostly focus on explicitly expressed content. More importantly, the scarcity of datasets annotating diverse implicit roles limits empirical studies into their linguistic nuances. For example, in the web review "Great service!", the provider and consumer are implicit arguments of different types. We examine an annotated corpus of fine-grained implicit arguments (Cui and Hershcovich, 2020) by carefully re-annotating it, resolving several inconsistencies. Subsequently, we present the first transition-based neural parser that can handle implicit arguments dynamically, and experiment with two different transition systems on the improved dataset. We find that certain types of implicit arguments are more difficult to parse than others and that the simpler system is more accurate in recovering implicit arguments, despite having a lower overall parsing score, attesting to the current reasoning limitations of NLP models. This work will facilitate a better understanding of implicit and underspecified language by incorporating it holistically into meaning representations.

Comment: IWPT 202

arXiv.org e-Print Archive

Copenhagen University Research Information System

HUJI-KU at MRP 2020: Two Transition-based Neural Parsers

Authors: Arviv Ofir; Cui Ruixiang; Hershcovich Daniel
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2020

Crossref

Copenhagen University Research Information System

How Conservative are Language Models? Adapting to the Introduction of Gender-Neutral Pronouns

Authors: Brandl Stephanie; Cui Ruixiang; Søgaard Anders
Publication date: 03/05/2022

Gender-neutral pronouns have recently been introduced in many languages a) to include non-binary people and b) to serve as a generic singular. Recent results from psycholinguistics suggest that gender-neutral pronouns (in Swedish) are not associated with human processing difficulties. This, we show, is in sharp contrast with automated processing. We show that gender-neutral pronouns in Danish, English, and Swedish are associated with higher perplexity, more dispersed attention patterns, and worse downstream performance. We argue that such conservativity in language models may limit widespread adoption of gender-neutral pronouns and must therefore be resolved.

Comment: To appear at NAACL 202

arXiv.org e-Print Archive

Analysis of Drought Vulnerability Characteristics and Risk Assessment Based on Information Distribution and Diffusion in Southwest China

Authors: Chuan Liang; Lu Zhao; Ningbo Cui; Ruixiang Yang; Shouzheng Jiang
Publication venue: MDPI AG
Publication date: 01/06/2018

Drought vulnerability characteristics and risk assessment form the basis of drought risk management. In this study, the standardized precipitation index (SPI) and drought damage rates (DDR) were combined to analyze drought vulnerability characteristics and drought risk in Southwest China (SC). The information distribution method was applied to estimate the probability density of the drought strength (DS), and the two-dimensional normal information diffusion method was used to construct the vulnerability relationships between DS and drought damage (DD). The risk was then evaluated by combining the probability function of the DS and the DD vulnerability curve. The results showed that the relationship between the DS and the DD was nonlinear in SC and its provinces. With the increase in DS, the degree of DD increased gradually, stabilized, or decreased toward the end. However, the vulnerability characteristics of the different provinces varied widely due to multiple risk-bearing bodies and abilities to resist disasters. The risk values obtained across the range of time scales of the SPI were not significantly different. The probabilities that drought reduces the yield of the crop area by 10%, 30%, and 70%, compared to a normal year in SC, were 16.04%, 10.29%, and 2.70%, respectively. These results have the potential to provide a reference for agricultural production and drought risk management.

Directory of Open Access Journals

Can AMR Assist Legal and Logical Reasoning?

Authors: Cui Ruixiang; Hershcovich Daniel; López-Acosta Hugo-Andrés; Schrack Nikolaus
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2022

Online Research Database In Technology

Flexible ITO-free organic solar cells over 10% by employing drop-coated conductive PEDOT:PSS transparent anodes

Authors: Cui Huiqin; Fanady Billy; Ge Ziyi; Huang Jiaming; Peng Ruixiang; Song Wei; Zhang Jianfeng
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019


Institutional Repository of Ningbo Institute of Material Technology & Engineering, CAS