18 research outputs found

    A survey on sentiment analysis in Urdu: A resource-poor language

    © 2020. Background/introduction: The dawn of the internet opened the doors to the easy and widespread sharing of information on subject matters such as products, services, events and political opinions. While the volume of studies conducted on sentiment analysis is rapidly expanding, these studies mostly address English language concerns. The primary goal of this study is to present a state-of-the-art survey that identifies the progress and shortcomings of Urdu sentiment analysis and proposes rectifications. Methods: We describe the advancements made thus far in this area by categorising the studies along three dimensions, namely text pre-processing, lexical resources and sentiment classification. The pre-processing operations include word segmentation, text cleaning, spell checking and part-of-speech tagging. An evaluation of sophisticated lexical resources, including corpora and lexicons, was carried out, and investigations were conducted on sentiment analysis constructs such as opinion words, modifiers and negations. Results and conclusions: Performance is reported for each of the reviewed studies. The experimental results and the proposals put forward in this paper provide the groundwork for further studies on Urdu sentiment analysis.

    Creating large semantic lexical resources for the Finnish language

    Finnish belongs to the Finno-Ugric language family, and it is spoken by the vast majority of the people living in Finland. The motivation for this thesis is to contribute to the development of a semantic tagger for Finnish. This tool is a parallel of the English Semantic Tagger which has been developed at the University Centre for Computer Corpus Research on Language (UCREL) at Lancaster University since the beginning of the 1990s and which has over the years proven to be a very powerful tool in automatic semantic analysis of English spoken and written data. The English Semantic Tagger has various successful applications in the fields of natural language processing and corpus linguistics, and new application areas emerge all the time. The semantic lexical resources that I have created in this thesis provide the knowledge base for the Finnish Semantic Tagger. My main contributions are the lexical resources themselves, along with a set of methods and guidelines for their creation and expansion as a general language resource and as tailored for domain-specific applications. Furthermore, I propose and carry out several methods for evaluating semantic lexical resources. In addition to the English Semantic Tagger, which was developed first, and the Finnish Semantic Tagger second, equivalent semantic taggers have now been developed for Czech, Chinese, Dutch, French, Italian, Malay, Portuguese, Russian, Spanish, Urdu, and Welsh. All these semantic taggers taken together form a program framework called the UCREL Semantic Analysis System (USAS) which enables the development of not only monolingual but also various types of multilingual applications. Large-scale semantic lexical resources designed for Finnish using semantic fields as the organizing principle have not been attempted previously. Thus, the Finnish semantic lexicons created in this thesis are a unique and novel resource.
The lexical coverage on the test corpora containing general modern standard Finnish, which has been the focus of the lexicon development, ranges from 94.58% to 97.91%. However, the results are also very promising in the analysis of domain-specific text (95.36%), older Finnish text (92.11–93.05%), and Internet discussions (91.97–94.14%). The results of the evaluation of lexical coverage are comparable to the results obtained with the English equivalents and thus indicate that the Finnish semantic lexical resources indeed cover the majority of core Finnish vocabulary.

    Sentiment Analysis for Social Media

    Sentiment analysis is a branch of natural language processing concerned with the study of the intensity of the emotions expressed in a piece of text. The automated analysis of the multitude of messages delivered through social media is one of the hottest research fields, both in academia and in industry, due to its extremely high potential applicability in many different domains. This Special Issue describes both technological contributions to the field, mostly based on deep learning techniques, and specific applications in areas like health insurance, gender classification, recommender systems, and cyber aggression detection.

    Beyond the “Bhai-Bhai” Rhetoric: China-India Literary Relations, 1950-1990

    This thesis examines the multi-layered relationship between the literary spheres of the People’s Republic of China (1949-) and the Republic of India (1947-) from the 1950s to the 1980s. Drawing on previously underexplored materials in Chinese, Hindi, and English, this thesis focuses on a range of writerly, textual, and readerly contacts — three aspects of what, following Karen Thornber (2009), I call “literary relations” — between the two newly established Asian nation-states. Considering literary relations as inextricable from political relations, I argue that China and India embarked on similar and related paths since 1950, but in order to understand these relations we need to keep multiple frames in mind: of each country’s national culture and foreign policy; of bilateral relations; and of broader leftist internationalism, the anti-imperialist Third World solidarity movement, and Cold War world politics. Specifically, I identify and analyse five different yet overlapping trajectories that tied modern Chinese and Indian literatures together: first, a bilateral mechanism of writerly contact intended to enhance the China-India friendship; second, a multinational forum of Afro-Asian writers designed to advance cultural self-determination and literary solidarity in the Third World; third, India’s enthusiastic import of modern Chinese fiction under the rubric of “revolutionary” with the Foreign Languages Press acting as the main text provider; fourth, China’s systematic reception of “progressive” Indian fiction as part of the PRC’s model of world literature; and fifth, a counter-intuitive yet strikingly productive and cross-media transplantation of Hindi popular fiction in 1980s China. Although post-1950 China and India shared considerable common ground for developing literary contact, the ways they engaged with each other’s modern literature differed significantly due to their different literary cultures, political systems, and Cold War ideologies.
The result is a landscape of literary relations that is markedly horizontal but nonetheless asymmetrical.

    Developing a large scale FrameNet for Italian - The IFrameNet experience

    In this thesis we present the development and the current status of the IFrameNet project, aimed at the construction of a large-scale lexical semantic resource for the Italian language based on Frame Semantics theories. We will begin by situating our work in the wider context of Frame Semantics and of the FrameNet project, which, since 1997, has attempted to apply these theories to lexicography. We will then analyse and discuss the applicability of the structure of the American resource to Italian, focusing more specifically on the domain of fear, worry, and anxiety. We will finally propose some modifications aimed at improving this domain of the resource with respect to its coherence and its ability to accurately represent the linguistic reality, and in particular at making it applicable to Italian.

    24th Nordic Conference on Computational Linguistics (NoDaLiDa)


    The Proceedings of the European Conference on Social Media ECSM 2014 University of Brighton
