1,676 research outputs found

    Irony and Sarcasm Detection in Twitter: The Role of Affective Content

    Full text link
    Thesis by compendium.
    Social media platforms, like Twitter, offer a face-saving ability that allows users to express themselves employing figurative language devices such as irony to achieve different communication purposes. Dealing with this kind of content represents a big challenge for computational linguistics. Irony is closely associated with the indirect expression of feelings, emotions and evaluations. Interest in detecting the presence of irony in social media texts has grown significantly in recent years. In this thesis, we introduce the problem of detecting irony in social media from a computational linguistics perspective. We propose to address this task by focusing, in particular, on the role of affective information for detecting the presence of this figurative language device. Attempting to take advantage of the subjective intrinsic value enclosed in ironic expressions, we present a novel model, called emotIDM, for detecting irony relying on a wide range of affective features. For characterising an ironic utterance, we used an extensive set of resources covering different facets of affect, from sentiment to finer-grained emotions. Results show that emotIDM has a competitive performance across the experiments carried out, validating the effectiveness of the proposed approach.
    Another objective of the thesis is to investigate the differences among tweets labeled with #irony and #sarcasm. Our aim is to contribute to a less investigated topic in computational linguistics, the separation between irony and sarcasm in social media, again with a special focus on affective features. We also studied a less explored hashtag: #not. We find data-driven arguments on the differences among tweets containing these hashtags, suggesting that the above-mentioned hashtags are used to refer to different figurative language devices. We identify promising features based on affect-related phenomena for discriminating among different kinds of figurative language devices. We also analyse the role of polarity reversal in tweets containing ironic hashtags, observing that the impact of this phenomenon varies. In tweets labeled with #sarcasm there is often a full reversal, whereas in those tagged with #irony there is an attenuation of the polarity.
    We analyse the impact of irony and sarcasm on sentiment analysis, observing a drop in the performance of NLP systems developed for this task when irony is present. Therefore, we explored the possible use of our findings in irony detection for the development of an irony-aware sentiment analysis system, assuming that the identification of ironic content could help to improve the correct identification of sentiment polarity. To this aim, we incorporated emotIDM into a pipeline for determining the polarity of a given Twitter message. We compared our results with the state of the art determined by the "Semeval-2015 Task 11" shared task, demonstrating the relevance of considering affective information together with features signalling the presence of irony when performing sentiment analysis of figurative language in this kind of social media text. To summarize, we demonstrated the usefulness of exploiting different facets of affective information for dealing with the presence of irony in Twitter.
    Hernández Farias, DI. (2017). Irony and Sarcasm Detection in Twitter: The Role of Affective Content [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/90544
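    The irony-aware polarity step described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy lexicon, the function names `lexicon_polarity` and `irony_aware_polarity`, the hashtag check that stands in for the emotIDM detector, and the attenuation factor of 0.5 are hypothetical and are not taken from the thesis. The sketch only mirrors the reported observation that #sarcasm often fully reverses polarity while #irony attenuates it.

```python
# Toy polarity lexicon (hypothetical values, not from the thesis).
TOY_LEXICON = {"love": 1.0, "great": 1.0, "hate": -1.0, "awful": -1.0}


def lexicon_polarity(text: str) -> float:
    """Sum toy lexicon scores over the whitespace-separated tokens of the text."""
    tokens = text.lower().split()
    return sum(TOY_LEXICON.get(tok.strip(".,!?"), 0.0) for tok in tokens)


def irony_aware_polarity(text: str, attenuation: float = 0.5) -> float:
    """Adjust the lexicon polarity when an ironic hashtag is present.

    Assumption: a plain hashtag lookup stands in for a real irony detector.
    """
    tokens = {tok.strip(".,!?") for tok in text.lower().split()}
    score = lexicon_polarity(text)
    if "#sarcasm" in tokens:
        # Full reversal, as the abstract reports is often the case for #sarcasm.
        return -score
    if "#irony" in tokens:
        # Attenuation, as reported for #irony; the factor is a made-up value.
        return score * attenuation
    return score
```

    For instance, `irony_aware_polarity("I love Mondays #sarcasm")` flips the positive lexicon score to a negative one, whereas the same call on a tweet tagged #irony only scales the score down.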

    A survey on author profiling, deception, and irony detection for the Arabic language

    Full text link
    "This is the peer reviewed version of the following article: [FULL CITE], which has been published in final form at [Link to final article using the DOI]. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving."
    [EN] The possibility of knowing people's traits on the basis of what they write is a field of growing interest named author profiling. Inferring a user's gender, age, native language, or language variety, or even detecting when the user lies, simply by analyzing their texts opens a wide range of possibilities from the point of view of security. In this paper, we review the state of the art of some of the main author profiling problems, as well as deception and irony detection, focusing especially on the Arabic language.
    Qatar National Research Fund, Grant/Award Number: NPRP 9-175-1-033
    Rosso, P.; Rangel-Pardo, FM.; Hernandez-Farias, DI.; Cagnina, L.; Zaghouani, W.; Charfi, A. (2018). A survey on author profiling, deception, and irony detection for the Arabic language. Language and Linguistics Compass. 12(4):1-20. https://doi.org/10.1111/lnc3.12275

    Feature extraction and classification of movie reviews

    Get PDF

    Automatic Irony Detection using Feature Fusion and Ensemble Classifier

    Get PDF
    With the advent of micro-blogging sites, users are pioneers in expressing their sentiments and emotions on global issues through text. Automatic detection and classification of sentiments such as sarcastic or ironic content in microblogging reviews is a challenging task. It requires a system that manages some kind of knowledge to interpret the sentiment expressed in text. The available approaches are quite limited in their capabilities and scope to detect ironic utterances present in the text. In this regard, the paper proposes feature fusion to provide knowledge to the system through alternative sets of features obtained using linguistic and content-based text features. The proposed work extracts five sets of linguistic features and fuses them with features selected using two stages of a feature selection method. To demonstrate the effectiveness of the proposed method, we conduct extensive experimentation by selecting different feature subsets. The performance of the proposed method is evaluated using Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), Decision Tree (DT) and ensemble classifiers. The experimental results show that the proposed approach significantly outperforms the conventional methods.
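    The two ideas named in this abstract, feature fusion and an ensemble classifier, can be sketched in a few lines. This is a generic illustration under stated assumptions, not the paper's method: the two feature extractors, the three rule-based stand-in classifiers, and all function names are hypothetical, fusion is shown as plain concatenation, the ensemble as a majority vote, and the paper's two-stage feature selection is omitted. Toy rules replace the SVM, LR, RF and DT models so that the example stays self-contained.

```python
from collections import Counter
from typing import Callable, List, Sequence

FeatureExtractor = Callable[[str], List[float]]
Classifier = Callable[[Sequence[float]], int]


def punctuation_features(text: str) -> List[float]:
    """Hypothetical feature set: counts of '!' and '?' characters."""
    return [float(text.count("!")), float(text.count("?"))]


def length_features(text: str) -> List[float]:
    """Hypothetical feature set: number of whitespace-separated tokens."""
    return [float(len(text.split()))]


def fuse_features(text: str, extractors: Sequence[FeatureExtractor]) -> List[float]:
    """Fuse feature sets by concatenating them into one vector."""
    fused: List[float] = []
    for extract in extractors:
        fused.extend(extract(text))
    return fused


def majority_vote(features: Sequence[float], classifiers: Sequence[Classifier]) -> int:
    """Return the label predicted by most classifiers (1 = ironic, 0 = not)."""
    votes = Counter(clf(features) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Toy rule-based stand-ins for trained models (illustrative thresholds only).
TOY_CLASSIFIERS: List[Classifier] = [
    lambda f: 1 if f[0] >= 2 else 0,  # many exclamation marks
    lambda f: 1 if f[1] >= 1 else 0,  # at least one question mark
    lambda f: 1 if f[2] <= 5 else 0,  # short message
]
```

    A call such as `majority_vote(fuse_features(text, [punctuation_features, length_features]), TOY_CLASSIFIERS)` returns the label that most of the stand-in classifiers agree on. Using an odd number of binary classifiers avoids ties in the vote.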