370 research outputs found

    Sentiment Analysis: State of the Art

    We present the state of the art in sentiment analysis, covering its purpose, its levels, and the processes that can be used to measure polarity and classify labels. We also include brief details about some sentiment analysis resources.

    A hybrid model for aspect-based sentiment analysis on customer feedback: research on the mobile commerce sector in Vietnam

    Feedback and comments on mobile commerce applications are valuable information sources that reflect the quality of products and services. Determining whether this feedback is positive or negative helps businesses monitor brand and product sentiment and understand customers’ needs. However, the growing number of comments makes it increasingly difficult to understand customers using manual methods. To solve this problem, this study builds a hybrid research model based on aspect mining and comment classification for aspect-based sentiment analysis (ABSA) to better understand customers and their experiences. Based on previous classification results, we first construct a dictionary of positive and negative words in the e-commerce field. Then, POS tagging is applied to classify Vietnamese words and extract the aspects of mobile commerce associated with positive or negative words. The model is implemented with machine learning and deep learning methods on a corpus of more than 1,000,000 customer opinions collected from Vietnam's four largest mobile commerce applications. Experimental results show that the Bi-LSTM method achieves the highest accuracy, 92.01%, so it is selected for the proposed model to analyze the viewpoint of words in real data. The findings indicate that the proposed hybrid model can be applied to monitor online customer experience in real time, enable administrators to make timely and accurate decisions, and improve the quality of products and services to gain a competitive advantage.
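
    The dictionary-plus-POS-tagging step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy English lexicon, the pre-tagged input, the `window` parameter, and the function name `extract_aspect_sentiments` are all assumptions made for illustration, and the study itself works on Vietnamese text with its own e-commerce dictionary.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy lexicon (assumption); the study builds its own dictionary
# of positive and negative e-commerce words from earlier classification results.
LEXICON: Dict[str, str] = {
    "fast": "positive",
    "great": "positive",
    "slow": "negative",
    "broken": "negative",
}


def extract_aspect_sentiments(
    tagged: List[Tuple[str, str]], window: int = 2
) -> List[Tuple[str, str]]:
    """Pair each noun (candidate aspect) with the polarity of the nearest
    lexicon word found within `window` tokens on either side."""
    pairs: List[Tuple[str, str]] = []
    for i, (word, tag) in enumerate(tagged):
        if tag != "NOUN":
            continue
        best: Optional[Tuple[int, str]] = None  # (distance, polarity)
        lo, hi = max(0, i - window), min(len(tagged), i + window + 1)
        for j in range(lo, hi):
            polarity = LEXICON.get(tagged[j][0].lower())
            if polarity is not None and (best is None or abs(j - i) < best[0]):
                best = (abs(j - i), polarity)
        if best is not None:
            pairs.append((word, best[1]))
    return pairs


# Input is assumed to be already POS-tagged by some external tagger.
tagged_comment = [
    ("Delivery", "NOUN"), ("was", "VERB"), ("fast", "ADJ"), ("but", "CONJ"),
    ("the", "DET"), ("app", "NOUN"), ("is", "VERB"), ("slow", "ADJ"),
]
print(extract_aspect_sentiments(tagged_comment))
```

    A fixed token window is the simplest way to associate an aspect with a sentiment word; a trained classifier such as the Bi-LSTM mentioned in the abstract would replace this heuristic in practice.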

    On the Use of Parsing for Named Entity Recognition

    Parsing is a core natural language processing technique that can be used to obtain the structure underlying sentences in human languages. Named entity recognition (NER) is the task of identifying the entities that appear in a text. NER is a challenging natural language processing task that is essential to extract knowledge from texts in multiple domains, ranging from financial to medical. It is intuitive that the structure of a text can be helpful to determine whether or not a certain portion of it is an entity and if so, to establish its concrete limits. However, parsing has been a relatively little-used technique in NER systems, since most of them have chosen to consider shallow approaches to deal with text. In this work, we study the characteristics of NER, a task that is far from being solved despite its long history; we analyze the latest advances in parsing that make its use advisable in NER settings; we review the different approaches to NER that make use of syntactic information; and we propose a new way of using parsing in NER based on casting parsing itself as a sequence labeling task.
    Funders: Xunta de Galicia (ED431C 2020/11); Xunta de Galicia (ED431G 2019/01).
    This work has been funded by MINECO, AEI and FEDER of UE through the ANSWER-ASAP project (TIN2017-85160-C2-1-R); and by Xunta de Galicia through a Competitive Reference Group grant (ED431C 2020/11). CITIC, as Research Center of the Galician University System, is funded by the Consellería de Educación, Universidade e Formación Profesional of the Xunta de Galicia through the European Regional Development Fund (ERDF/FEDER) with 80%, the Galicia ERDF 2014-20 Operational Programme, and the remaining 20% from the Secretaría Xeral de Universidades (Ref. ED431G 2019/01). Carlos Gómez-Rodríguez has also received funding from the European Research Council (ERC), under the European Union’s Horizon 2020 research and innovation programme (FASTPARSE, Grant No. 714150).
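
    To make "casting parsing itself as a sequence labeling task" concrete, the sketch below encodes a dependency tree as one label per token, using the relative offset to each token's head. The abstract does not say which encoding the authors use, so this relative-offset scheme, the `|` separator, and the function names `encode_tree` and `decode_labels` are assumptions chosen for illustration only.

```python
from typing import List, Tuple


def encode_tree(heads: List[int], relations: List[str]) -> List[str]:
    """Turn a dependency tree into one label per token.

    heads[i] is the 1-based index of the head of token i+1 (0 marks the root).
    Each label is "<relative head offset>|<relation>".
    """
    labels = []
    for position, (head, rel) in enumerate(zip(heads, relations), start=1):
        offset = "ROOT" if head == 0 else f"{head - position:+d}"
        labels.append(f"{offset}|{rel}")
    return labels


def decode_labels(labels: List[str]) -> Tuple[List[int], List[str]]:
    """Invert encode_tree: recover head indices and relations from labels."""
    heads, relations = [], []
    for position, label in enumerate(labels, start=1):
        offset, rel = label.split("|")
        heads.append(0 if offset == "ROOT" else position + int(offset))
        relations.append(rel)
    return heads, relations


# "Mary visited Paris": "visited" is the root, the other two tokens attach to it.
heads = [2, 0, 2]
relations = ["nsubj", "root", "obj"]
labels = encode_tree(heads, relations)
print(labels)
print(decode_labels(labels) == (heads, relations))
```

    Because every token receives exactly one label, any sequence tagger (the same kind of model typically used for NER) can in principle be trained to predict these labels.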

    A Survey of Sentiment Analysis and Sarcasm Detection: Challenges, Techniques, and Trends

    In recent years, more people have been using the internet and social media to express their opinions on various subjects, such as institutions, services, or specific ideas. This increase highlights the importance of developing automated tools for accurate sentiment analysis. Moreover, addressing sarcasm in text is crucial, as it can significantly impact the efficacy of sentiment analysis models. This paper aims to provide a comprehensive overview of research on sentiment analysis and sarcasm detection, focusing on the period from 2018 to 2023. It explores the challenges faced, the methods used to address them, and how those methods compare. It also aims to identify emerging trends that will likely influence the future of sentiment analysis and sarcasm detection. This paper adds to existing knowledge by analyzing 40 research works, evaluating performance, addressing multilingual challenges, and highlighting future trends in sarcasm detection and sentiment analysis. It is a resource for researchers and experts interested in the field, facilitating further advancements in sentiment analysis techniques and applications. It categorizes sentiment analysis methods into ML, lexical, and hybrid approaches, highlighting deep learning, especially Recurrent Neural Networks (RNNs), for effective textual classification with labeled or unlabeled data.

    Understanding Word Embedding Stability Across Languages and Applications

    Despite the recent popularity of word embedding methods, there is only a small body of work exploring the limitations of these representations. In this thesis, we consider several aspects of embedding spaces, including their stability. First, we propose a definition of stability, and show that common English word embeddings are surprisingly unstable. We explore how properties of data, words, and algorithms relate to instability. We extend this work to approximately 100 world languages, considering how linguistic typology relates to stability. Additionally, we consider contextualized output embedding spaces. Using paraphrases, we explore properties and assumptions of BERT, a popular embedding algorithm. Second, we consider how stability and other word embedding properties affect tasks where embeddings are commonly used. We consider both word embeddings used as features in downstream applications and corpus-centered applications, where embeddings are used to study characteristics of language and individual writers. In addition to stability, we also consider other word embedding properties, specifically batching and curriculum learning, and how methodological choices made for these properties affect downstream tasks. Finally, we consider how knowledge of stability affects how we use word embeddings. Throughout this thesis, we discuss strategies to mitigate instability and provide analyses highlighting the strengths and weaknesses of word embeddings in different scenarios and languages. We show areas where more work is needed to improve embeddings, and we show where embeddings are already a strong tool.
    PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/162917/1/lburdick_1.pd

    Influencer analysis on social networks

    Along with the growth of social networks, social marketing, and influencer marketing, the analysis of influencers, especially micro-influencers on social networks, has become a topic that attracts many researchers as well as industry organizations. An ideal micro-influencer is a user who can create positive content relevant to the business and get high engagement from his/her audience, continuously and consistently. Various approaches have been suggested for this problem, but a research gap remains: no existing effort meets these requirements. Resolving this gap is the main aim of this study. This study proposes a novel approach for identifying influencers on social networks using three metrics: the amplification factor to evaluate information propagation, the passion point to measure a user's preference for a brand or its products and services, and the content creation score to estimate a user's ability to create content on a social network. The main hypothesis is that an approach using these metrics will propose high-performing influencers. The approach has been compared with some recent relevant methods as baselines. It has also been tested in the real world, and the experiment shows that the influencers proposed by the method deliver an efficient react-to-purchase conversion rate and a good return on investment in the influencer marketing campaign.
    [Translated from the Czech abstract] Along with the growth of social networks, social marketing, and influencer marketing, the analysis of influencers, especially micro-influencers on social networks, is becoming an interesting subject that attracts many researchers and organizations from various fields. An ideal micro-influencer is a person who can create positive content relevant to the given organization and can obtain a continuous and consistent level of engagement from their audience. Various methods have been proposed for this subject, but studies show that a research gap remains, since none has yet met these requirements. The main aim of this work is to close that gap by proposing effective social marketing methods that involve influencers. This study proposes a new approach to identifying influencers on social networks using the following three metrics: amplification factors to evaluate the propagated content, the area of consumer interest to determine a user's interest in a brand and its products and services, and a content creation score to estimate a user's ability to create content on a social network. The main hypothesis is that methods using these metrics can propose quality, high-performing influencers. The study was compared with current relevant methods. Experimental results show that the influencers proposed by this tool deliver a more efficient conversion rate and a higher return on investment for the influencer marketing campaign.
    460 - Katedra informatiky (Department of Computer Science); vyhově

    Exploring the State of the Art in Legal QA Systems

    Answering questions related to the legal domain is a complex task, primarily due to the intricate nature and diverse range of legal document systems. Providing an accurate answer to a legal query typically necessitates specialized knowledge in the relevant domain, which makes this task all the more challenging, even for human experts. Question answering (QA) systems are designed to generate answers to questions asked in human languages. They use natural language processing to understand questions and search through information to find relevant answers. QA has various practical applications, including customer service, education, research, and cross-lingual communication. However, QA systems face challenges such as improving natural language understanding and handling complex and ambiguous questions. At this time, there is a lack of surveys that discuss legal question answering. To address this problem, we provide a comprehensive survey that reviews 14 benchmark datasets for question answering in the legal field and the state-of-the-art deep learning models for legal question answering. We cover the different architectures and techniques used in these studies and the performance and limitations of these models. Moreover, we have established a public GitHub repository where we regularly upload the most recent articles, open data, and source code. The repository is available at: https://github.com/abdoelsayed2016/Legal-Question-Answering-Review