2,472 research outputs found

    Exploiting BERT and RoBERTa to Improve Performance for Aspect Based Sentiment Analysis

    Sentiment analysis, also known as opinion mining, is a type of text research that analyses people's opinions expressed in written language. It brings together research areas such as Natural Language Processing (NLP), Data Mining, and Text Mining, and is fast becoming of major importance to companies and organizations as they start to incorporate online commerce data into their analysis. Often the data on which sentiment analysis is performed are reviews, whose subject can range from a small product to a big multinational corporation. The goal of performing sentiment analysis is to extract information from those reviews to gauge public opinion for market research, monitor brand and product reputation, and understand customer experiences. Reviews written on online platforms are often free text with no standard structure, and dealing with such unstructured data is a challenging problem. Sentiment analysis can be done at different levels, and the focus of this research is on aspect-level sentiment analysis. In aspect-level sentiment analysis, two tasks need to be addressed. The first is aspect identification, the process of discovering the attributes of the object that people are commenting on. These attributes are called aspects. The second is the sentiment classification of the reviews using the extracted aspects. For the sentiment analysis, transformer-based pre-trained models such as BERT (Bidirectional Encoder Representations from Transformers) and RoBERTa (Robustly Optimized BERT Pretraining Approach) are used in this research because they rely on an embedding vector space that is rich in context. The purpose of this research is to propose a framework for extracting aspects from the data that can be applied to these pre-trained models. In the first part of the experiment, both the BERT and RoBERTa models are developed without the aspect-based approach. In the second part, the aspect-based approach is applied to the same models and their results are compared and evaluated against the equivalent models. The experimental results show that the aspect-based approach increased the performance of the models by almost 1% over the traditional models, and that the BERT model with the aspect-based approach had the highest accuracy and performance among all the models evaluated in this research.

    Analyzing fluctuation of topics and public sentiment through social media data

    Over the past decade, the number of Internet users has expanded rapidly around the world. They form various online social networks through Internet platforms such as Twitter, Facebook and Instagram. These platforms give their users a fast way to receive and disseminate information and to express personal opinions in virtual space. When dealing with massive and chaotic social media data, accurately determining what events or concepts users are discussing is an interesting and important problem. This dissertation consists of two main parts. First, the research mines hidden topics and user interest trends by analyzing real-world social media activity. Topic modeling and sentiment analysis methods are proposed to classify social media posts into different sentiment classes and then discover the trend of sentiment for different topics over time. The presented case study focuses on the COVID-19 pandemic that started in 2019. A large amount of Twitter data is collected and used to discover vaccine-related topics during the pre- and post-vaccine emergency use periods. Using the proposed framework, 11 vaccine-related trend topics are discovered. Ultimately, the discovered topics can be used to improve the readability of confusing messages about vaccines on social media and to provide effective results that support policymakers in making informed decisions about public health. Second, conventional topic models cannot deal with the sparsity problem of short text. A novel topic model, named Topic Noise based-Biterm Topic Model with FastText embeddings (TN-BTMF), is proposed to deal with this problem. Word co-occurrence patterns (i.e. biterms) are directly generated in BTM. A scoring method based on word co-occurrence and semantic similarity is proposed to detect noise biterms. In th
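The biterm idea mentioned above can be illustrated with a minimal sketch. It assumes the usual BTM convention that a biterm is an unordered pair of words co-occurring in the same short text; the function name `extract_biterms` and the example tokens are hypothetical, and the noise-scoring step of TN-BTMF is not reproduced here.

```python
from itertools import combinations


def extract_biterms(tokens):
    """Return all unordered word pairs (biterms) from one short text.

    Each pair is sorted so that (a, b) and (b, a) count as the same biterm.
    """
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


# Hypothetical, already-tokenized short post.
post = ["vaccine", "side", "effects"]
biterms = extract_biterms(post)
print(biterms)
```

A short text with n tokens yields n(n-1)/2 biterms, which is why modeling pairs rather than per-document word counts helps with the sparsity of short posts.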

    Context-Guided BERT for Targeted Aspect-Based Sentiment Analysis

    Aspect-based sentiment analysis (ABSA) and Targeted ABSA (TABSA) allow finer-grained inferences about sentiment to be drawn from the same text, depending on context. For example, a given text can have different targets (e.g., neighborhoods) and different aspects (e.g., price or safety), with different sentiment associated with each target-aspect pair. In this paper, we investigate whether adding context to self-attention models improves performance on (T)ABSA. We propose two variants of Context-Guided BERT (CG-BERT) that learn to distribute attention under different contexts. We first adapt a context-aware Transformer to produce a CG-BERT that uses context-guided softmax-attention. Next, we propose an improved Quasi-Attention CG-BERT (QACG-BERT) model that learns a compositional attention that supports subtractive attention. We train both models with pretrained BERT on two (T)ABSA datasets: SentiHood and SemEval-2014 (Task 4). Both models achieve new state-of-the-art results, with our QACG-BERT model having the best performance. Furthermore, we provide analyses of the impact of context in our proposed models. Our work provides more evidence for the utility of adding context-dependencies to pretrained self-attention-based language models for context-based natural language tasks.

    Multi-task learning for aspect level semantic classification combining complex aspect target semantic enhancement and adaptive local focus

    Aspect-based sentiment analysis (ABSA) is a fine-grained and diverse task in natural language processing. Existing deep learning models for ABSA face the challenge of balancing the demand for finer granularity in sentiment analysis with the scarcity of training corpora at that granularity. To address this issue, we propose an enhanced BERT-based model for multi-dimensional aspect target semantic learning. Our model leverages BERT's pre-training and fine-tuning mechanisms, enabling it to capture rich semantic feature parameters. In addition, we propose a complex semantic enhancement mechanism for aspect targets to enrich and optimize fine-grained training corpora. We also combine the aspect recognition enhancement mechanism with a CRF model to achieve more robust and accurate entity recognition for aspect targets. Furthermore, we propose an adaptive local attention mechanism learning model to focus on sentiment elements around rich aspect target semantics. Finally, to address the varying contributions of each task in the joint training mechanism, we carefully optimize this training approach, allowing the multiple tasks to benefit each other during training. Experimental results on four Chinese and five English datasets demonstrate that our proposed mechanisms and methods effectively improve ABSA models, surpassing some of the latest models in multi-task and single-task scenarios.

    Recolha, extração e classificação de opiniões sobre aplicações lúdicas para saúde e bem-estar

    Nowadays, mobile apps are part of the life of anyone who owns a smartphone. With technological evolution, new apps arrive with new features, which makes users more demanding when they use an application. Moreover, at a time when health and well-being are a priority, more and more apps aim to provide a better user experience, not only in terms of health monitoring but also as a pleasant experience in terms of entertainment and well-being. However, there are still some limitations regarding user experience and usability. What best reflects user satisfaction and experience are application reviews. Therefore, to gain a perception of the most relevant aspects of current applications, reviews and their respective classifications were collected. This thesis aims to develop a system that presents the most relevant aspects of a given health and wellness application after collecting its reviews, extracting the aspects, and classifying them. For the review collection task, two Python libraries were used, one for the Google Play Store and one for the App Store, which provide methods for extracting data about an application. For the extraction and classification of aspects, the LCF-ATEPC model was chosen given its performance in aspect-based sentiment analysis studies. Master's degree in Computer and Telematics Engineering.

    Recuperação e identificação de momentos em imagens

    In our modern society almost anyone is able to capture moments and record events thanks to easy access to smartphones. This leads to the question: if we record so much of our lives, how can we easily retrieve specific moments? The answer to this question would open the door to a big leap in the quality of human life. The possibilities are endless, from trivial problems like finding a photo of a birthday cake to analyzing the progress of mental illnesses in patients or even tracking people with infectious diseases. With so much data being created every day, the answer to this question becomes more complex. There is no streamlined approach to the problem of moment localization in a large dataset of images, and investigations into this problem started only a few years ago. ImageCLEF is one competition where researchers participate every year and try to achieve new and better results in the task of moment retrieval. This complex problem, along with the interest in participating in the ImageCLEF Lifelog Moment Retrieval Task, posed a good challenge for the development of this dissertation. The proposed solution consists of a system capable of automatically retrieving images of moments described in text, without any sort of user interaction and using only state-of-the-art image and text processing methods. The developed retrieval system achieves this objective by extracting and categorizing relevant information from text and computing a similarity score against the labels extracted during the image processing stage. In this way, the system can tell whether images are related to the moment specified in the text and retrieve the pictures accordingly. In the ImageCLEF Lifelog Moment Retrieval 2020 subtask, the proposed automatic retrieval system achieved a score of 0.03 under the F1-measure@10 evaluation methodology. Even though this score is not competitive with the scores of other teams' systems, the system provides a good baseline for future work. Master's degree in Electronics and Telecommunications Engineering.
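To make the F1-measure@10 figure concrete, here is a minimal sketch. It assumes the metric is the harmonic mean of precision@10 and cluster recall@10, which is how lifelog moment retrieval tasks commonly define it; the abstract itself does not state the definition. The function name, image identifiers, and cluster labels are all hypothetical.

```python
def f1_at_k(ranked_ids, relevant_clusters, k=10):
    """Harmonic mean of precision@k and cluster recall@k.

    ranked_ids: image ids in ranked order, as returned by a retrieval system.
    relevant_clusters: dict mapping each relevant image id to its cluster id.
    """
    top_k = ranked_ids[:k]
    hits = [image_id for image_id in top_k if image_id in relevant_clusters]
    all_clusters = set(relevant_clusters.values())
    if not hits or not all_clusters:
        return 0.0
    precision = len(hits) / k
    found_clusters = {relevant_clusters[image_id] for image_id in hits}
    cluster_recall = len(found_clusters) / len(all_clusters)
    return 2 * precision * cluster_recall / (precision + cluster_recall)


# Hypothetical run: 2 relevant images in the top 10, covering 1 of 2 clusters.
ranking = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]
ground_truth = {"a": 0, "c": 0, "x": 1}
score = f1_at_k(ranking, ground_truth)
print(round(score, 4))
```

Under this definition a system is rewarded for covering distinct moments (clusters) rather than returning many near-duplicate images of the same one.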

    Describing Images by Semantic Modeling using Attributes and Tags

    This dissertation addresses the problem of describing images using visual attributes and textual tags, a fundamental task that narrows the semantic gap between the visual reasoning of humans and machines. Automatic image annotation assigns relevant textual tags to images. In this dissertation, we propose a query-specific formulation based on Weighted Multi-view Non-negative Matrix Factorization to perform automatic image annotation. Our proposed technique seamlessly adapts to changes in the training data, naturally solves the problem of feature fusion, and handles the challenge of rare tags. Unlike tags, attributes are category-agnostic, hence their combinations model an exponential number of semantic labels. Motivated by the fact that most attributes describe local properties, we propose exploiting localization cues, through semantic parsing of the human face and body, to improve person-related attribute prediction. We also demonstrate that image-level attribute labels can be effectively used as weak supervision for the task of semantic segmentation. Next, we analyze selfie images using tags and attributes. We collect the first large-scale selfie dataset and annotate it with different attributes covering characteristics such as gender, age, race, facial gestures, and hairstyle. We then study the popularity and sentiment of the selfies given the estimated appearance of various semantic concepts. In brief, we automatically infer what makes a good selfie. Despite the extensive use of Batch Normalization, the deep learning literature falls short in understanding its characteristics and behavior. We conclude this dissertation by providing a fresh view, in light of information geometry and Fisher kernels, on why batch normalization works. We propose Mixture Normalization, which disentangles modes of variation in the underlying distribution of the layer outputs, and confirm that it effectively accelerates training of different batch-normalized architectures, including Inception-V3, Densely Connected Networks, and Deep Convolutional Generative Adversarial Networks, while achieving better generalization error.
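For readers unfamiliar with the baseline being analyzed, the sketch below shows the standard batch normalization transform for a single feature over one mini-batch. It is not the Mixture Normalization proposed in the dissertation, whose details the abstract does not give; the function name and the default values for `gamma`, `beta`, and `eps` are illustrative assumptions.

```python
import math


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift.

    Uses the batch mean and the biased batch variance, as in the standard
    training-time formulation of batch normalization.
    """
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    scale = gamma / math.sqrt(variance + eps)
    return [scale * (v - mean) + beta for v in values]


# Hypothetical activations of one unit for a mini-batch of three examples.
normalized = batch_norm([1.0, 2.0, 3.0])
print(normalized)
```

The transform summarizes the whole batch with a single mean and variance, which is the single-mode assumption that a mixture-based normalization would relax.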