400 research outputs found
Doctor of Philosophy in Computer Science
Over the last decade, social media has emerged as a revolutionary platform for informal communication and social interactions among people. Publicly expressing thoughts, opinions, and feelings is one of the key characteristics of social media. In this dissertation, I present research on automatically acquiring knowledge from social media that can be used to recognize people's affective state (i.e., what someone feels at a given time) in text. This research addresses two types of affective knowledge: 1) hashtag indicators of emotion, consisting of emotion hashtags and emotion hashtag patterns, and 2) affective understanding of similes (a form of figurative comparison). My research introduces a bootstrapped learning algorithm for learning hashtag indicators of emotions from tweets with respect to five emotion categories: Affection, Anger/Rage, Fear/Anxiety, Joy, and Sadness/Disappointment. With a few seed emotion hashtags per emotion category, the bootstrapping algorithm iteratively learns new hashtags and more generalized hashtag patterns by analyzing emotion in tweets that contain these indicators. Emotion phrases are also harvested from the learned indicators to train additional classifiers that use the surrounding word context of the phrases as features. This is the first work to learn hashtag indicators of emotions. My research also presents a supervised classification method for classifying the affective polarity of similes in Twitter. Using lexical, semantic, and sentiment properties of different simile components as features, supervised classifiers are trained to classify a simile into a positive or negative affective polarity class. The property of comparison is also fundamental to the affective understanding of similes.
My research introduces a novel framework for inferring implicit properties that 1) uses syntactic constructions, statistical association, dictionary definitions, and word embedding vector similarity to generate and rank candidate properties, 2) re-ranks the top properties using influence from multiple simile components, and 3) aggregates the ranks of each property from different methods to create a final ranked list of properties. The inferred properties are used to derive additional features for the supervised classifiers to further improve affective polarity recognition. Experimental results show substantial improvements in affective understanding of similes over the use of existing sentiment resources.
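The rank aggregation step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which aggregation rule the dissertation uses, so the mean-rank rule below is an assumption, as are the function name `aggregate_ranks`, the handling of missing candidates, and the example properties.

```python
from collections import defaultdict

def aggregate_ranks(ranked_lists):
    """Combine several ranked candidate lists by mean rank (lower is better)."""
    candidates = {c for lst in ranked_lists for c in lst}
    totals = defaultdict(float)
    for lst in ranked_lists:
        position = {c: i + 1 for i, c in enumerate(lst)}
        # Assumption: a candidate missing from a list is ranked one past its end.
        default = len(lst) + 1
        for c in candidates:
            totals[c] += position.get(c, default)
    n = len(ranked_lists)
    # Sort by mean rank, breaking ties alphabetically so the result is deterministic.
    return sorted(candidates, key=lambda c: (totals[c] / n, c))

# Hypothetical candidate properties for a simile such as "the room is like an oven".
by_syntax = ["hot", "small", "dark"]
by_association = ["hot", "dark", "stuffy"]
by_embeddings = ["stuffy", "hot", "small"]
final_ranking = aggregate_ranks([by_syntax, by_association, by_embeddings])
```

Mean rank is only one possible choice; any rule that maps per-method ranks to a single ordering would fit the description equally well.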
A hybrid representation based simile component extraction
A simile, a special type of metaphor, can help people express their ideas more clearly. Simile component extraction is the task of extracting tenors and vehicles from sentences. The task has practical significance because it is useful for building cognitive knowledge bases. With the development of deep neural networks, researchers have begun to apply neural models to component extraction. Simile components should come from different domains, and according to our observations, words from different domains usually belong to different concepts. Concept information is therefore important when identifying whether two words are simile components. However, existing models do not integrate concepts, so it is difficult for them to identify the concept of a word. Moreover, corpora for simile component extraction are limited. They contain many rare or unseen words whose representations are often inadequate, so existing models can hardly extract simile components accurately when sentences contain low-frequency words. To solve these problems, we propose a hybrid representation-based component extraction (HRCE) model. Each word in HRCE is represented at three levels: word level, concept level, and character level. Concept-level representations help HRCE identify cross-domain words more accurately. Character-level representations help HRCE represent the meaning of a word more accurately, since words consist of characters and these characters partly convey the meaning of the word. We conduct experiments to compare HRCE with existing models. The experimental results show that HRCE significantly outperforms current models.
ValenTo: Sentiment Analysis of Figurative Language Tweets with Irony and Sarcasm
This paper describes the system used by the ValenTo team in Task 11, Sentiment Analysis of Figurative Language in Twitter, at SemEval 2015. Our system used a regression model and additional external resources to assign polarity values. A distinctive feature of our approach is that we used not only word-sentiment lexicons providing polarity annotations, but also novel resources for dealing with emotions and psycholinguistic information. These are important aspects to tackle in figurative language such as irony and sarcasm, which were represented in the dataset. The system also exploited novel and standard structural features of tweets. Considering the different kinds of figurative language in the dataset, our submission obtained good results in recognizing sentiment polarity in both ironic and sarcastic tweets.
Linguistic-based Patterns for Figurative Language Processing: The Case of Humor Recognition and Irony Detection
Figurative language represents one of the most difficult tasks in natural language processing. Unlike literal language, figurative language makes use of linguistic devices such as irony, humor, sarcasm, metaphor, and analogy, among others, to communicate indirect meanings that most of the time cannot be interpreted in terms of syntactic or semantic information alone. Rather, figurative language reflects patterns of thought that acquire full meaning in communicative and social contexts, which makes both its linguistic representation and its computational processing exceedingly complex tasks.
In this context, this doctoral thesis addresses a problem related to the processing of figurative language on the basis of linguistic patterns. In particular, our efforts focus on building a system capable of automatically detecting instances of humor and irony in texts extracted from social media. Our main hypothesis rests on the premise that language reflects patterns of conceptualization; that is, by studying language, we study those patterns. Therefore, by analyzing these two domains of figurative language, we aim to provide arguments about how people conceive of them and, above all, about how that conception causes both humor and irony to be verbalized in a particular way in various social media. In this context, one of our main interests is to demonstrate how knowledge derived from the analysis of different levels of linguistic study can represent a set of relevant patterns for automatically identifying figurative uses of language. It should be noted that, contrary to most approaches that have focused on the study of figurative language, in our research we do not seek to provide arguments based solely on prototypical examples, but rather on texts whose characteristics …
Reyes Pérez, A. (2012). Linguistic-based Patterns for Figurative Language Processing: The Case of Humor Recognition and Irony Detection [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/16692
On the difficulty of automatically detecting irony: beyond a simple case of negation
The final publication is available at Springer via http://dx.doi.org/10.1007/s10115-013-0652-8
It is well known that irony is one of the most subtle devices used to, in a refined way and without a negation marker, deny what is literally said. As such, its automatic detection would represent valuable knowledge regarding tasks as diverse as sentiment analysis, information extraction, or decision making. The research described in this article is focused on identifying key values of components to represent underlying characteristics of this linguistic phenomenon. In the absence of a negation marker, we focus on representing the core of irony by means of three conceptual layers. These layers involve 8 different textual features. By representing four available data sets with these features, we try to find hints about how to deal with this unexplored task from a computational point of view. Our findings are assessed by human annotators in two strata: isolated sentences and entire documents. The results show how complex and subjective the task of automatically detecting irony could be.
The research work of Paolo Rosso was done in the framework of the European Commission WIQ-EI Web Information Quality Evaluation Initiative (IRSES grant no. 269180) project within the FP 7 Marie Curie People, the DIANA-APPLICATIONS - Finding Hidden Knowledge in Texts: Applications (TIN2012-38603-C02-01) project, and the VLC/CAMPUS Microcluster on Multimodal Interaction in Intelligent Systems.
Reyes Pérez, A.; Rosso, P. (2014). On the difficulty of automatically detecting irony: beyond a simple case of negation. Knowledge and Information Systems. 40(3):595-614. https://doi.org/10.1007/s10115-013-0652-8
From humor recognition to Irony detection: The figurative language of social media
The research described in this paper is focused on analyzing two playful domains of language: humor and irony, in order to identify key values components for their automatic processing. In particular, we are focused on describing a model for recognizing these phenomena in social media, such as "tweets". Our experiments are centered on five data sets retrieved from Twitter taking advantage of user-generated tags, such as "#humor" and "#irony". The model, which is based on textual features, is assessed on two dimensions: representativeness and relevance. The results, apart from providing some valuable insights into the creative and figurative usages of language, are positive regarding humor, and encouraging regarding irony. (C) 2012 Elsevier B.V. All rights reserved.
This work has been done in the framework of the VLC/CAMPUS Microcluster on Multimodal Interaction in Intelligent Systems and it has been partially funded by the European Commission as part of the WIQEI IRSES project (grant no. 269180) within the FP 7 Marie Curie People Framework, and by MICINN as part of the Text-Enterprise 2.0 project (TIN2009-13391-C04-03) within the Plan I + D + I. The National Council for Science and Technology (CONACyT - Mexico) has funded the research work of Antonio Reyes.
Reyes Pérez, A.; Rosso, P.; Buscaldi, D. (2012). From humor recognition to Irony detection: The figurative language of social media. Data and Knowledge Engineering. 74:1-12. https://doi.org/10.1016/j.datak.2012.02.005
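The idea of using user-generated tags such as "#humor" and "#irony" to collect labeled tweets can be sketched as follows. This is a minimal illustration under assumptions: the function name `label_from_hashtag`, the rule of discarding tweets with zero or several label tags, and the removal of the tag from the text are not taken from the paper.

```python
import re

# Tags named in the abstract, mapped to hypothetical class labels.
LABEL_TAGS = {"#humor": "humor", "#irony": "irony"}

def label_from_hashtag(tweet):
    """Return (text_without_label_tag, label), or None if the tweet is unusable."""
    tags = {t.lower() for t in re.findall(r"#\w+", tweet)}
    matched = tags & set(LABEL_TAGS)
    # Assumption: keep only tweets that carry exactly one of the label tags.
    if len(matched) != 1:
        return None
    tag = matched.pop()
    # Remove the label tag so a classifier cannot simply read the answer.
    text = re.sub(re.escape(tag) + r"\b", "", tweet, flags=re.IGNORECASE)
    return " ".join(text.split()), LABEL_TAGS[tag]

example = label_from_hashtag("Great, another Monday #Irony")
```

Stripping the tag matters because a model based on textual features would otherwise learn the tag itself rather than the phenomenon.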
Neural Simile Recognition with Cyclic Multitask Learning and Local Attention
Simile recognition is the task of detecting simile sentences and extracting
simile components, i.e., tenors and vehicles. It involves two subtasks: simile
sentence classification and simile component extraction. Recent work has
shown that standard multitask learning is effective for Chinese simile
recognition, but it is still uncertain whether the mutual effects between the
subtasks have been well captured by simple parameter sharing. We propose a
novel cyclic multitask learning framework for neural simile recognition, which
stacks the subtasks and makes them into a loop by connecting the last to the
first. It iteratively performs each subtask, taking the outputs of the previous
subtask as additional inputs to the current one, so that the interdependence
between the subtasks can be better explored. Extensive experiments show that
our framework significantly outperforms the current state-of-the-art model and
our carefully designed baselines, and the gains are still remarkable using
BERT.
Comment: AAAI 202
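The cyclic control flow described in this abstract, where each subtask receives the output of the previous one and the last feeds back into the first, can be illustrated with a toy, non-neural sketch. The rule-based `classify` and `extract` functions, the comparator word "like", the score values, and the number of rounds are all hypothetical stand-ins for the learned components of the paper.

```python
def classify(tokens, components):
    """Toy sentence classifier that also looks at previously extracted components."""
    score = 0.5 if "like" in tokens else 0.0
    if components["tenor"] and components["vehicle"]:
        score += 0.4  # extracted components raise confidence
    return score

def extract(tokens, simile_score):
    """Toy component extractor that is conditioned on the classifier score."""
    if simile_score < 0.5 or "like" not in tokens:
        return {"tenor": None, "vehicle": None}
    i = tokens.index("like")
    return {"tenor": tokens[0] if i > 0 else None,
            "vehicle": tokens[-1] if i < len(tokens) - 1 else None}

def cyclic_inference(tokens, rounds=2):
    """Run the two subtasks in a loop, passing each output to the next subtask."""
    components = {"tenor": None, "vehicle": None}
    score = 0.0
    for _ in range(rounds):
        score = classify(tokens, components)  # uses the previous extraction
        components = extract(tokens, score)   # uses the current classification
    return score, components

score, components = cyclic_inference("clouds drift like ships".split())
```

In the second round the classifier sees the components found in the first round, which is the kind of interdependence that plain parameter sharing would not make explicit.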
I-WAS: a Data Augmentation Method with GPT-2 for Simile Detection
Simile detection is a valuable task for many natural language processing
(NLP)-based applications, particularly in the field of literature. However,
existing research on simile detection often relies on corpora that are limited
in size and do not adequately represent the full range of simile forms. To
address this issue, we propose a simile data augmentation method based on
Word replacement And Sentence completion using the GPT-2 language
model. Our iterative process, called I-WAS, is designed to improve the quality
of the augmented sentences. To better evaluate the performance of our method in
real-world applications, we have compiled a corpus containing a more diverse
set of simile forms for experimentation. Our experimental results demonstrate
the effectiveness of our proposed data augmentation method for simile
detection.
Comment: 15 pages, 1 figure