LT3: sentiment analysis of figurative tweets: piece of cake #NotReally
This paper describes our contribution to the SemEval-2015 Task 11 on sentiment analysis of figurative language in Twitter. We considered two approaches, classification and regression, to provide fine-grained sentiment scores for a set of tweets that are rich in sarcasm, irony and metaphor. To this end, we combined a variety of standard lexical and syntactic features with specific features for capturing figurative content. All experiments were done using supervised learning with LIBSVM. For both runs, our system ranked fourth among fifteen submissions.
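A rough idea of such a setup (not the authors' actual system) can be sketched with scikit-learn, whose SVR regressor wraps libsvm; the n-gram features below merely stand in for the lexical, syntactic and figurative-language features described in the abstract, and the toy data and score scale are assumptions:

```python
# Minimal sketch: fine-grained sentiment regression with an SVM, in the
# spirit of the LIBSVM setup described above. Feature set and data are toys.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import SVR

# Toy tweets paired with fine-grained sentiment scores (scale assumed).
tweets = [
    "Sentiment analysis of sarcasm: piece of cake #NotReally",
    "What a lovely day to have my flight cancelled twice",
    "Genuinely enjoyed the concert last night",
]
scores = [-3.0, -2.5, 3.0]

# Word and character n-grams stand in for the lexical/syntactic features;
# character n-grams often pick up hashtags and emoticon-like cues.
features = FeatureUnion([
    ("word", TfidfVectorizer(ngram_range=(1, 2))),
    ("char", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))),
])

model = Pipeline([("feats", features), ("svr", SVR(kernel="linear", C=1.0))])
model.fit(tweets, scores)
print(model.predict(["Thanks for nothing, best airline ever #irony"]))
```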
Multilingual Multi-Figurative Language Detection
Figures of speech help people express abstract concepts and evoke stronger
emotions than literal expressions, thereby making texts more creative and
engaging. Due to its pervasive and fundamental character, figurative language
understanding has been addressed in Natural Language Processing, but it's
highly understudied in a multilingual setting and when considering more than
one figure of speech at the same time. To bridge this gap, we introduce
multilingual multi-figurative language modelling, and provide a benchmark for
sentence-level figurative language detection, covering three common figures of
speech and seven languages. Specifically, we develop a framework for figurative
language detection based on template-based prompt learning. In so doing, we
unify multiple detection tasks that are interrelated across multiple figures of
speech and languages, without requiring task- or language-specific modules.
Experimental results show that our framework outperforms several strong
baselines and may serve as a blueprint for the joint modelling of other
interrelated tasks.
Comment: Accepted to ACL 2023 (Findings)
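A minimal sketch of what template-based prompt learning for this kind of detection could look like, assuming a generic multilingual sequence-to-sequence model; the model name, template wording and verbaliser below are placeholders, not the paper's configuration:

```python
# Hedged sketch of template-based prompt learning for multi-figurative
# detection: one template covers every (figure of speech, language) pair,
# so no task- or language-specific module is needed.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL = "google/mt5-small"  # assumption: any multilingual seq2seq model
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL)

def build_prompt(sentence: str, figure: str) -> str:
    # A single fill-in template, parameterised only by the figure of speech.
    return f"Does the following sentence contain {figure}? Sentence: {sentence} Answer:"

def detect(sentence: str, figure: str) -> str:
    inputs = tokenizer(build_prompt(sentence, figure), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=3)
    return tokenizer.decode(output[0], skip_special_tokens=True)

# The same code path serves metaphor, irony, hyperbole, etc., in any covered
# language; fine-tuning on the pooled benchmark is what would make the
# "yes"/"no" verbaliser informative.
print(detect("Il tempo vola quando ci si diverte.", "a metaphor"))
```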
Exploring Metaphorical Senses and Word Representations for Identifying Metonyms
A metonym is a word with a figurative meaning, similar to a metaphor. Because
metonyms are closely related to metaphors, we apply features that are used
successfully for metaphor recognition to the task of detecting metonyms. On the
ACL SemEval 2007 Task 8 data with gold standard metonym annotations, our system
achieved 86.45% accuracy on the location metonyms. Our code can be found on
GitHub.
Comment: 9 pages, 8 pages content
Specializing distributional vectors of all words for lexical entailment
Semantic specialization methods fine-tune distributional word vectors using lexical knowledge from external resources (e.g., WordNet) to accentuate a particular relation between words. However, such post-processing methods suffer from limited coverage as they affect only vectors of words seen in the external resources. We present the first post-processing method that specializes vectors of all vocabulary words – including those unseen in the resources – for the asymmetric relation of lexical entailment (LE) (i.e., the hyponymy-hypernymy relation). Leveraging a partially LE-specialized distributional space, our POSTLE (i.e., post-specialization for LE) model learns an explicit global specialization function, allowing for specialization of vectors of unseen words, as well as word vectors from other languages via cross-lingual transfer. We capture the function as a deep feedforward neural network: its objective re-scales vector norms to reflect the concept hierarchy while simultaneously attracting hyponymy-hypernymy pairs to better reflect semantic similarity. An extended model variant augments the basic architecture with an adversarial discriminator. We demonstrate the usefulness and versatility of POSTLE models with different input distributional spaces in different scenarios (monolingual LE and zero-shot cross-lingual LE transfer) and tasks (binary and graded LE). We report consistent gains over state-of-the-art LE-specialization methods, and successfully LE-specialize word vectors for languages without any external lexical knowledge.
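The post-specialization idea can be illustrated with a simplified sketch: train a feed-forward network to map original vectors to their LE-specialized counterparts on resource-covered words, then apply it to unseen words. The plain MSE loss below is a stand-in for the paper's norm-rescaling and pair-attraction objective, and the adversarial variant is omitted:

```python
# Simplified sketch of post-specialization: learn a global function f from
# original distributional vectors to LE-specialized vectors on words covered
# by the lexical resource, then apply f to unseen (or cross-lingual) words.
import torch
import torch.nn as nn

DIM = 300  # embedding dimensionality (assumption)

class SpecializationNet(nn.Module):
    def __init__(self, dim: int = DIM, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x):
        return self.net(x)

# x_seen: original vectors of words covered by the resource;
# y_seen: their LE-specialized counterparts (toy random data here).
x_seen, y_seen = torch.randn(1000, DIM), torch.randn(1000, DIM)

model = SpecializationNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x_seen), y_seen)
    loss.backward()
    opt.step()

# Once trained, the same function specializes vectors of unseen words,
# including word vectors from other languages mapped into the same space.
specialized_unseen = model(torch.randn(5, DIM))
```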
"With 1 follower I must be AWESOME :P". Exploring the role of irony markers in irony recognition
Conversations in social media often contain the use of irony or sarcasm, when
the users say the opposite of what they really mean. Irony markers are the
meta-communicative clues that inform the reader that an utterance is ironic. We
propose a thorough analysis of theoretically grounded irony markers in two
social media platforms: Twitter and Reddit. Classification and frequency
analysis show that for Twitter, typographic markers such as emoticons and
emojis are the most discriminative markers to recognize ironic utterances,
while for Reddit the morphological markers (e.g., interjections, tag
questions) are the most discriminative.
Comment: ICWSM 2018
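As a toy illustration of such a frequency analysis (the marker inventories below are rough placeholders, not the paper's full set of theoretically grounded markers):

```python
# Illustrative sketch: count typographic markers (emoticons) and a few
# morphological markers (interjections, tag questions) in ironic vs.
# non-ironic posts.
import re
from collections import Counter

EMOTICON = re.compile(r"[:;]-?[)(PDp]")                      # rough pattern
INTERJECTION = re.compile(r"\b(wow|yay|ugh|huh|oh)\b", re.I)
TAG_QUESTION = re.compile(r",\s*(right|isn't it|don't you)\?", re.I)

def marker_counts(posts):
    counts = Counter()
    for post in posts:
        counts["emoticon"] += len(EMOTICON.findall(post))
        counts["interjection"] += len(INTERJECTION.findall(post))
        counts["tag_question"] += len(TAG_QUESTION.findall(post))
    return counts

ironic = ["Wow, another Monday, great :P", "Lovely weather, right?"]
literal = ["The meeting starts at 10.", "I liked the movie."]
print("ironic:", marker_counts(ironic))
print("literal:", marker_counts(literal))
```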
Predicting Concreteness and Imageability of Words Within and Across Languages via Word Embeddings
The notions of concreteness and imageability, traditionally important in
psycholinguistics, are gaining significance in semantic-oriented natural
language processing tasks. In this paper we investigate the predictability of
these two concepts via supervised learning, using word embeddings as
explanatory variables. We perform predictions both within and across languages
by exploiting collections of cross-lingual embeddings aligned to a single
vector space. We show that the notions of concreteness and imageability are
highly predictable both within and across languages, with a moderate loss of up
to 20% in correlation when predicting across languages. We further show that
the cross-lingual transfer via word embeddings is more efficient than the
simple transfer via bilingual dictionaries.
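The prediction setup can be sketched as a plain regression from (cross-lingually aligned) embeddings to ratings; the data below are random stand-ins, and the choice of ridge regression is an assumption rather than the paper's exact model:

```python
# Hedged sketch: a supervised regressor maps word embeddings to concreteness
# (or imageability) ratings; with embeddings aligned into one cross-lingual
# space, a model trained on one language's ratings can score another's words.
import numpy as np
from scipy.stats import spearmanr
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
DIM = 300

# Training language: aligned embeddings + human concreteness ratings (toy).
X_src = rng.normal(size=(500, DIM))
y_src = rng.uniform(1.0, 5.0, size=500)

# Target language: embeddings in the same aligned space, held-out ratings.
X_tgt = rng.normal(size=(100, DIM))
y_tgt = rng.uniform(1.0, 5.0, size=100)

reg = Ridge(alpha=1.0).fit(X_src, y_src)
pred = reg.predict(X_tgt)
rho, _ = spearmanr(pred, y_tgt)  # cross-lingual transfer quality
print(f"Spearman correlation: {rho:.3f}")
```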
OYXOY: A Modern NLP Test Suite for Modern Greek
This paper serves as a foundational step towards the development of a
linguistically motivated and technically relevant evaluation suite for Greek
NLP. We initiate this endeavor by introducing four expert-verified evaluation
tasks, specifically targeted at natural language inference, word sense
disambiguation (through example comparison or sense selection) and metaphor
detection. More than language-adapted replicas of existing tasks, we contribute
two innovations which will resonate with the broader resource and evaluation
community. Firstly, our inference dataset is the first of its kind, marking not
just one, but rather all possible inference labels, accounting for possible
shifts due to, e.g., ambiguity or polysemy. Secondly, we
demonstrate a cost-efficient method to obtain datasets for under-resourced
languages. Using ChatGPT as a language-neutral parser, we transform the
Dictionary of Standard Modern Greek into a structured format, from which we
derive the other three tasks through simple projections. Alongside each task,
we conduct experiments using currently available state-of-the-art machinery.
Our experimental baselines affirm the challenging nature of our tasks and
highlight the need for expedited progress in order for the Greek NLP ecosystem
to keep pace with contemporary mainstream research.
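The projection step can be illustrated with a toy sketch that assumes a made-up entry schema (not the dataset's actual format): once dictionary entries are in a structured form, sense-selection and metaphor-detection instances follow almost mechanically.

```python
# Hypothetical sketch of the "simple projections" idea: structured dictionary
# entries are projected into task instances. The schema and the Greek example
# below are illustrative placeholders only.
from dataclasses import dataclass

@dataclass
class Sense:
    definition: str
    example: str
    figurative: bool

@dataclass
class Entry:
    lemma: str
    senses: list

def to_sense_selection(entry: Entry):
    # WSD as sense selection: pick the right definition for each example.
    return [
        {"lemma": entry.lemma,
         "context": sense.example,
         "options": [s.definition for s in entry.senses],
         "gold": i}
        for i, sense in enumerate(entry.senses)
    ]

def to_metaphor_detection(entry: Entry):
    # Metaphor detection: examples of figurative senses become positives.
    return [{"sentence": s.example, "label": int(s.figurative)}
            for s in entry.senses]

entry = Entry("φωτιά", [
    Sense("fire; burning material", "Άναψε τη φωτιά.", False),
    Sense("intense passion (figurative)", "Η καρδιά του ήταν φωτιά.", True),
])
print(to_sense_selection(entry))
print(to_metaphor_detection(entry))
```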
Correlation-based Intrinsic Evaluation of Word Vector Representations
We introduce QVEC-CCA, an intrinsic evaluation metric for word vector
representations based on correlations of learned vectors with features
extracted from linguistic resources. We show that QVEC-CCA scores are an
effective proxy for a range of extrinsic semantic and syntactic tasks. We also
show that the proposed evaluation obtains higher and more consistent
correlations with downstream tasks, compared to existing approaches to
intrinsic evaluation of word vectors that are based on word similarity.
Comment: RepEval 2016, 5 pages
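The underlying idea can be sketched with canonical correlation analysis, with random matrices standing in for real word vectors and linguistic feature counts; this illustrates the CCA-based scoring, not the released QVEC-CCA implementation:

```python
# Hedged sketch of the QVEC-CCA idea: project the word-vector matrix and a
# matrix of linguistic features (e.g., supersense annotations) for the same
# words onto their top canonical components and report their correlation.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
n_words, dim, n_features = 500, 100, 20  # toy sizes

X = rng.normal(size=(n_words, dim))         # word vectors (rows = words)
Y = rng.normal(size=(n_words, n_features))  # linguistic feature matrix

cca = CCA(n_components=1)
X_c, Y_c = cca.fit_transform(X, Y)
score = np.corrcoef(X_c[:, 0], Y_c[:, 0])[0, 1]
print(f"QVEC-CCA-style score: {score:.3f}")
```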