Mapping Brains with Language Models: A Survey
Over the years, many researchers have seemingly made the same observation:
Brain and language model activations exhibit some structural similarities,
enabling linear partial mappings between features extracted from neural
recordings and computational language models. To evaluate how much evidence has
been accumulated for this observation, we survey over 30 studies spanning 10
datasets and 8 metrics. What, if anything, is still missing before we can draw
conclusions?
Our analysis of the evaluation methods used in the literature reveals that some
of the metrics are less conservative than others. We also find that the
accumulated evidence, for now, remains ambiguous, but correlations with model
size and quality provide grounds for cautious optimism.
Copyright Violations and Large Language Models
Language models may memorize more than just facts, including entire chunks of
texts seen during training. Fair use exemptions to copyright laws typically
allow for limited use of copyrighted material without permission from the
copyright holder, but typically for extraction of information from copyrighted
materials, rather than {\em verbatim} reproduction. This work explores the
issue of copyright violations and large language models through the lens of
verbatim memorization, focusing on possible redistribution of copyrighted text.
We present experiments with a range of language models over a collection of
popular books and coding problems, providing a conservative characterization of
the extent to which language models can redistribute these materials. Overall,
this research highlights the need for further examination and the potential
impact on future developments in natural language processing to ensure
adherence to copyright regulations. Code is at
\url{https://github.com/coastalcph/CopyrightLLMs}.
Comment: EMNLP 202
Cultural Compass: Predicting Transfer Learning Success in Offensive Language Detection with Cultural Features
The increasing ubiquity of language technology necessitates a shift towards
considering cultural diversity in the machine learning realm, particularly for
subjective tasks that rely heavily on cultural nuances, such as Offensive
Language Detection (OLD). Current understanding underscores that these tasks
are substantially influenced by cultural values; however, a notable gap exists
in determining whether cultural features can accurately predict the success of
cross-cultural transfer learning for such subjective tasks. Addressing this,
our study delves into the intersection of cultural features and transfer
learning effectiveness. The findings reveal that cultural value surveys indeed
possess predictive power for cross-cultural transfer-learning success in OLD
tasks, and that this power can be further improved using offensive word distance. Based
on these results, we advocate for the integration of cultural information into
datasets. Additionally, we recommend leveraging data sources rich in cultural
information, such as surveys, to enhance cultural adaptability. Our research
signifies a step forward in the quest for more inclusive, culturally sensitive
language technologies.
Comment: Findings of EMNLP 202
Large Language Models Converge on Brain-Like Word Representations
One of the greatest puzzles of all time is how understanding arises from
neural mechanics. Our brains are networks of billions of biological neurons
transmitting chemical and electrical signals along their connections. Large
language models are networks of millions or billions of digital neurons,
implementing functions that read the output of other functions in complex
networks. The failure to see how meaning would arise from such mechanics has
led many cognitive scientists and philosophers to various forms of dualism --
and many artificial intelligence researchers to dismiss large language models
as stochastic parrots or JPEG-like compressions of text corpora. We show that
human-like representations arise in large language models. Specifically, the
larger neural language models get, the more structurally similar their
representations are to neural response measurements from brain imaging.
Comment: Work in proces
Cultural Adaptation of Recipes
Building upon the considerable advances in Large Language Models (LLMs), we
are now equipped to address more sophisticated tasks demanding a nuanced
understanding of cross-cultural contexts. A key example is recipe adaptation,
which goes beyond simple translation to include a grasp of ingredients,
culinary techniques, and dietary preferences specific to a given culture. We
introduce a new task involving the translation and cultural adaptation of
recipes between Chinese and English-speaking cuisines. To support this
investigation, we present CulturalRecipes, a unique dataset comprising
automatically paired recipes written in Mandarin Chinese and English. This
dataset is further enriched with a human-written and curated test set. In this
intricate task of cross-cultural recipe adaptation, we evaluate the performance
of various methods, including GPT-4 and other LLMs, traditional machine
translation, and information retrieval techniques. Our comprehensive analysis
includes both automatic and human evaluation metrics. While GPT-4 exhibits
impressive abilities in adapting Chinese recipes into English, it still lags
behind human expertise when translating English recipes into Chinese. This
underscores the multifaceted nature of cultural adaptations. We anticipate that
these insights will significantly contribute to future research on
culturally-aware language models and their practical application in culturally
diverse contexts.
Comment: Accepted to TAC
Argument Mining: Claim Annotation, Identification, Verification
Researchers writing scientific articles summarize their work in abstracts that mention the final outcome of their study. Argumentation mining can be used to extract the claim of the researchers as well as the evidence that could support their claim. The rapid growth of scientific articles demands automated tools that could help in the detection and evaluation of the veracity of scientific claims. However, there are neither many studies focusing on claim identification and verification nor many annotated corpora available to effectively train deep learning models. For this reason, we annotated two argument mining corpora and performed several experiments with state-of-the-art BERT-based models aiming to identify and verify scientific claims. We find that using SciBERT provides optimal results regardless of the dataset. Furthermore, increasing the amount of training data can improve the performance of every model we used. These findings highlight the need for large-scale argument mining corpora, as well as domain-specific pre-trained models.
Investigation of Transfer Languages for Parsing Latin: Italic Branch vs. Hellenic Branch
Choosing a transfer language is a crucial step in transfer learning. In much previous research on dependency parsing, related languages have successfully been used. However, when parsing Latin, it has been suggested that languages such as Ancient Greek could be helpful. In this work we parse Latin in a low-resource scenario, with the main goal of investigating whether Hellenic languages are more helpful for parsing Latin than related Italic languages, and we show that this is indeed the case. We further investigate the influence of other factors, including training set size and content as well as linguistic distances. We find that one explanatory factor seems to be the syntactic similarity between Latin and Ancient Greek. The influence of genres or shared annotation projects seems to have a smaller impact.