2,472 research outputs found
Exploiting BERT and RoBERTa to Improve Performance for Aspect Based Sentiment Analysis
Sentiment analysis, also known as opinion mining, is a type of text research that analyses people's opinions expressed in written language. It brings together research areas such as Natural Language Processing (NLP), Data Mining, and Text Mining, and is fast becoming of major importance to companies and organizations as they start to incorporate online commerce data into their analysis. The data on which sentiment analysis is performed are often reviews, whose subjects can range from a small product to a big multinational corporation. The goal of performing sentiment analysis is to extract information from those reviews to gauge public opinion for market research, monitor brand and product reputation, and understand customer experiences. Reviews written on online platforms are often free text with no standard structure, and dealing with such unstructured data is a challenging problem. Sentiment analysis can be done at different levels, and the focus of this research is on aspect-level sentiment analysis, which involves two tasks. The first task is aspect identification, the process of discovering the attributes of an object that people are commenting on. These attributes are called aspects. The second task is the sentiment classification of the reviews using the extracted aspects. For the sentiment analysis, transformer-based pre-trained models such as BERT (Bidirectional Encoder Representations from Transformers) and RoBERTa (Robustly Optimized BERT Pretraining Approach) are used in this research, as they make use of an embedding vector space that is rich in context. The purpose of this research is to propose a framework for extracting aspects from the data that can be applied to these pre-trained models. For the first part of the experiment, both the BERT and RoBERTa models are developed without the aspect-based approach.
For the second part of the experiment, the aspect-based approach is applied to the same models, and the results are compared against the equivalent models without it. The experimental results show that the aspect-based approach increased the performance of the models by almost 1% over the traditional models, and that the BERT model with the aspect-based approach had the highest accuracy and performance among all the models evaluated in this research.
Analyzing fluctuation of topics and public sentiment through social media data
Over the past decade, the number of Internet users worldwide has expanded rapidly. They form various online social networks through Internet platforms such as Twitter, Facebook, and Instagram. These platforms give their users a fast way to receive and disseminate information and to express personal opinions in a virtual space. When dealing with massive and chaotic social media data, accurately determining what events or concepts users are discussing is an interesting and important problem.
This dissertation work mainly consists of two parts. First, this research focuses on mining hidden topics and user interest trends by analyzing real-world social media activity. Topic modeling and sentiment analysis methods are proposed to classify social media posts into different sentiment classes and then discover how sentiment on different topics changes over time. The presented case study focuses on the COVID-19 pandemic that started in 2019. A large amount of Twitter data is collected and used to discover vaccine-related topics during the pre- and post-vaccine emergency use period. Using the proposed framework, 11 vaccine-related trending topics are discovered. Ultimately, the discovered topics can be used to improve the readability of confusing messages about vaccines on social media and to provide effective results that support policymakers in making informed decisions about public health. Second, conventional topic models cannot deal with the sparsity problem of short text. A novel topic model, named Topic Noise based-Biterm Topic Model with FastText embeddings (TN-BTMF), is proposed to deal with this problem. Word co-occurrence patterns (i.e., biterms) are directly generated in BTM. A scoring method based on word co-occurrence and semantic similarity is proposed to detect noise biterms. In th
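As a rough illustration of the word co-occurrence patterns mentioned above, the following sketch enumerates biterms (unordered word pairs) from short texts. It assumes whitespace tokenization and treats each whole post as one context window; the function names and sample posts are hypothetical, and the noise scoring and FastText components of TN-BTMF are not shown.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) from one short document."""
    # Sorting each pair makes ("vaccine", "dose") and ("dose", "vaccine")
    # count as the same biterm.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def count_biterms(documents):
    """Count biterms over a corpus of short texts (naive whitespace tokens)."""
    counts = Counter()
    for doc in documents:
        counts.update(extract_biterms(doc.lower().split()))
    return counts


# Hypothetical sample posts, for illustration only.
tweets = ["vaccine dose approved", "second vaccine dose"]
biterm_counts = count_biterms(tweets)
```

Because biterms are counted over the whole corpus rather than per document, a pair that is rare within any single short post can still accumulate evidence across posts, which is the usual motivation for biterm models on sparse text.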
Context-Guided BERT for Targeted Aspect-Based Sentiment Analysis
Aspect-based sentiment analysis (ABSA) and Targeted ABSA (TABSA) allow
finer-grained inferences about sentiment to be drawn from the same text,
depending on context. For example, a given text can have different targets
(e.g., neighborhoods) and different aspects (e.g., price or safety), with
different sentiment associated with each target-aspect pair. In this paper, we
investigate whether adding context to self-attention models improves
performance on (T)ABSA. We propose two variants of Context-Guided BERT
(CG-BERT) that learn to distribute attention under different contexts. We first
adapt a context-aware Transformer to produce a CG-BERT that uses context-guided
softmax-attention. Next, we propose an improved Quasi-Attention CG-BERT model
that learns a compositional attention that supports subtractive attention. We
train both models with pretrained BERT on two (T)ABSA datasets: SentiHood and
SemEval-2014 (Task 4). Both models achieve new state-of-the-art results with
our QACG-BERT model having the best performance. Furthermore, we provide
analyses of the impact of context in our proposed models. Our work provides
more evidence for the utility of adding context-dependencies to pretrained
self-attention-based language models for context-based natural language tasks.
Multi-task learning for aspect level semantic classification combining complex aspect target semantic enhancement and adaptive local focus
Aspect-based sentiment analysis (ABSA) is a fine-grained and diverse task in natural language processing. Existing deep learning models for ABSA face the challenge of balancing the demand for finer granularity in sentiment analysis with the scarcity of training corpora at that granularity. To address this issue, we propose an enhanced BERT-based model for multi-dimensional aspect target semantic learning. Our model leverages BERT's pre-training and fine-tuning mechanisms, enabling it to capture rich semantic feature parameters. In addition, we propose a complex semantic enhancement mechanism for aspect targets to enrich and optimize fine-grained training corpora. We also combine the aspect recognition enhancement mechanism with a CRF model to achieve more robust and accurate entity recognition for aspect targets. Furthermore, we propose an adaptive local attention mechanism learning model to focus on sentiment elements around rich aspect target semantics. Finally, to address the varying contributions of each task in the joint training mechanism, we carefully optimize this training approach, allowing the multiple tasks to benefit one another during training. Experimental results on four Chinese and five English datasets demonstrate that our proposed mechanisms and methods effectively improve ABSA models, surpassing some of the latest models in multi-task and single-task scenarios.
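Recolha, extração e classificação de opiniões sobre aplicações lúdicas para saúde e bem-estar (Collection, extraction, and classification of opinions on recreational health and well-being applications)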
Nowadays, mobile apps are part of the life of anyone who owns a smartphone.
With technological evolution, new apps come with new features, which brings a
greater demand from users when using an application. Moreover, at a time when
health and well-being are a priority, more and more apps provide a better user
experience, not only in terms of health monitoring but also a pleasant experience
in terms of entertainment and well-being. However, there are still some limitations
regarding user experience and usability. What can best translate user satisfaction
and experience are application reviews. Therefore, to have a perception of the most
relevant aspects of the current applications, a collection of reviews and respective
classifications was performed.
This thesis aims to develop a system that allows the presentation of the most relevant
aspects of a given health and wellness application after collecting the reviews
and later extracting the aspects and classifying them. In the reviews collection task,
two Python libraries, one for the Google Play Store and one for the App Store, provide
methods for extracting data about an application. For the extraction and
classification of aspects, the LCF-ATEPC model was chosen given its performance
in aspect-based sentiment analysis studies.
Master's degree in Computer and Telematics Engineering
Recuperação e identificação de momentos em imagens (Retrieval and identification of moments in images)
In our modern society, almost anyone is able to capture moments and record
events thanks to easy access to smartphones. This leads to the question:
if we record so much of our lives, how can we easily retrieve specific
moments? The answer to this question would open the door for a big leap
in human life quality. The possibilities are endless, from trivial problems like
finding a photo of a birthday cake to being capable of analyzing the progress
of mental illnesses in patients or even tracking people with infectious diseases.
With so much data being created every day, the answer to this question becomes
more complex. There is no streamlined approach to solving the problem
of moment localization in a large dataset of images, and investigations into
this problem only started a few years ago. ImageCLEF is one competition
where researchers participate and try to achieve new and better results
in the task of moment retrieval.
This complex problem, along with the interest in participating in the ImageCLEF
Lifelog Moment Retrieval Task posed a good challenge for the
development of this dissertation.
The proposed solution consists in developing a system capable of retrieving
images automatically according to specified moments described in a corpus
of text without any sort of user interaction and using only state-of-the-art
image and text processing methods.
The developed retrieval system achieves this objective by extracting and
categorizing relevant information from text while being able to compute a
similarity score with the extracted labels from the image processing stage. In
this way, the system is capable of telling if images are related to the specified
moment in text and therefore able to retrieve the pictures accordingly.
In the ImageCLEF Life Moment Retrieval 2020 subtask the proposed automatic
retrieval system achieved a score of 0.03 in the F1-measure@10
evaluation methodology. Even though these scores are not competitive when
compared to other teams' systems, the built system presents a good
baseline for future work.
Master's degree in Electronics and Telecommunications Engineering
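For readers unfamiliar with the F1-measure@10 figure reported above, the sketch below shows one common way such a score is computed: the harmonic mean of precision and cluster recall over the top 10 retrieved images. The exact definition used by the task organizers is an assumption here, and the function names, image identifiers, and cluster labels are hypothetical.

```python
def precision_at_k(retrieved, relevant, k=10):
    """Fraction of the top-k retrieved images that are relevant."""
    return sum(1 for img in retrieved[:k] if img in relevant) / k


def cluster_recall_at_k(retrieved, cluster_of, k=10):
    """Fraction of ground-truth clusters hit by the top-k retrieved images."""
    all_clusters = set(cluster_of.values())
    if not all_clusters:
        return 0.0
    found = {cluster_of[img] for img in retrieved[:k] if img in cluster_of}
    return len(found) / len(all_clusters)


def f1_at_k(retrieved, cluster_of, k=10):
    """Harmonic mean of precision@k and cluster recall@k (assumed definition)."""
    p = precision_at_k(retrieved, set(cluster_of), k)
    cr = cluster_recall_at_k(retrieved, cluster_of, k)
    return 2 * p * cr / (p + cr) if (p + cr) > 0 else 0.0


# Hypothetical ground truth: relevant images mapped to their moment cluster.
cluster_of = {"img1": "c1", "img2": "c1", "img3": "c2", "img4": "c3"}
retrieved = ["img1", "x1", "img3", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]
score = f1_at_k(retrieved, cluster_of)
```

Under this definition, a system is rewarded both for returning relevant images and for covering distinct moments rather than many near-duplicates of the same one.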
Describing Images by Semantic Modeling using Attributes and Tags
This dissertation addresses the problem of describing images using visual attributes and textual tags, a fundamental task that narrows the semantic gap between the visual reasoning of humans and machines. Automatic image annotation assigns relevant textual tags to images. In this dissertation, we propose a query-specific formulation based on Weighted Multi-view Non-negative Matrix Factorization to perform automatic image annotation. Our proposed technique seamlessly adapts to changes in the training data, naturally solves the problem of feature fusion, and handles the challenge of rare tags. Unlike tags, attributes are category-agnostic, hence their combination models an exponential number of semantic labels. Motivated by the fact that most attributes describe local properties, we propose exploiting localization cues, through semantic parsing of the human face and body, to improve person-related attribute prediction. We also demonstrate that image-level attribute labels can be effectively used as weak supervision for the task of semantic segmentation. Next, we analyze selfie images by utilizing tags and attributes. We collect the first large-scale selfie dataset and annotate it with different attributes covering characteristics such as gender, age, race, facial gestures, and hairstyle. We then study the popularity and sentiments of the selfies given an estimated appearance of various semantic concepts. In brief, we automatically infer what makes a good selfie. Despite its extensive usage, the deep learning literature falls short in understanding the characteristics and behavior of Batch Normalization. We conclude this dissertation by providing a fresh view, in light of information geometry and Fisher kernels, on why batch normalization works.
We propose Mixture Normalization, which disentangles modes of variation in the underlying distribution of the layer outputs, and confirm that it effectively accelerates training of different batch-normalized architectures, including Inception-V3, Densely Connected Networks, and Deep Convolutional Generative Adversarial Networks, while achieving better generalization error.
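As background for the batch normalization discussion above, here is a minimal sketch of the standard batch normalization transform applied to one feature across a mini-batch. It is not the Mixture Normalization method proposed in the dissertation; the function name, default parameters, and sample values are illustrative assumptions.

```python
import math


def batch_normalize(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Standard batch normalization of one scalar feature over a mini-batch.

    gamma and beta are the learnable scale and shift; eps guards against
    division by zero when the batch variance is tiny.
    """
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n  # biased (population) variance
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]


# Hypothetical layer outputs for a single feature across four samples.
activations = [1.0, 2.0, 3.0, 4.0]
normalized = batch_normalize(activations)
```

The transform uses a single mean and variance per feature, which implicitly treats the layer outputs as coming from one mode. The abstract describes Mixture Normalization as disentangling multiple modes of variation in that distribution instead.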