5 research outputs found

    Tourism Review Sentiment Classification Using a Bidirectional Recurrent Neural Network with an Attention Mechanism and Topic-Enriched Word Vectors

    Sentiment analysis of online tourist reviews plays an increasingly important role in tourism. Accurately capturing tourists' attitudes toward different aspects of a scenic site, or the overall polarity of their reviews, is key to tourism analysis and application. However, current document-level sentiment analysis methods perform unsatisfactorily because they either neglect the topics of the document or ignore the fact that not all words contribute equally to the meaning of the text. In this work, we propose a bidirectional gated recurrent unit neural network model (BiGRULA) for sentiment analysis that combines a topic model (lda2vec) with an attention mechanism. lda2vec is used to discover the main topics of the review corpus, which are then used to enrich the word vector representations with topical context. The attention mechanism learns to assign different weights to words according to their contribution to the overall meaning of the text. Experiments on the 20 Newsgroups and IMDB datasets demonstrate the effectiveness of our model. Furthermore, we applied the model to hotel review data, where it extracts more coherent topics from the reviews and achieves good sentiment classification performance.
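    The abstract gives no implementation details, but the model's core, a bidirectional GRU whose hidden states are pooled by a learned attention layer over topic-enriched word vectors, can be sketched as below. The layer sizes, and the assumption that topic enrichment concatenates an lda2vec-style topic vector onto each word embedding, are illustrative choices, not the published BiGRULA configuration.

```python
import torch
import torch.nn as nn

class BiGRUAttention(nn.Module):
    """Minimal sketch of a BiGRU classifier with additive attention.

    Assumes the input vectors are already topic-enriched (e.g. a word
    embedding concatenated with an lda2vec topic vector); the actual
    BiGRULA configuration is not specified in the abstract.
    """
    def __init__(self, input_dim=300, hidden_dim=128, num_classes=2):
        super().__init__()
        self.gru = nn.GRU(input_dim, hidden_dim, batch_first=True,
                          bidirectional=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)    # scores each time step
        self.fc = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, x):                   # x: (batch, seq_len, input_dim)
        h, _ = self.gru(x)                  # h: (batch, seq_len, 2*hidden)
        weights = torch.softmax(self.attn(h), dim=1)
        context = (weights * h).sum(dim=1)  # attention-weighted sum
        return self.fc(context)

# Toy usage: 4 reviews, 50 tokens each, 300-dim enriched vectors.
model = BiGRUAttention()
print(model(torch.randn(4, 50, 300)).shape)  # torch.Size([4, 2])
```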

    Bittm: A core biterms-based topic model for targeted analysis

    While most existing topic models perform a full analysis of a document collection to discover all of its topics, it has recently been observed that in many situations users are interested only in fine-grained topics related to specific aspects. Targeted (or focused) analysis has been proposed to address this problem: given a corpus of documents from a broad area, it discovers only the topics related to aspects of interest expressed through a set of user-provided query keywords. Existing approaches to targeted analysis suffer from problems such as topic loss and topic suppression because of their inherent assumptions and strategies. Moreover, they are not designed for computational efficiency, even though targeted analysis is expected to answer user queries as quickly as possible. In this paper, we propose a core BiTerms-based Topic Model (BiTTM). By modelling topics from core biterms that are potentially relevant to the target query, BiTTM captures context information across documents, which alleviates topic loss and suppression, while also enabling efficient modelling of topics related to specific aspects. Experiments on nine real-world datasets demonstrate that BiTTM outperforms existing approaches in both effectiveness and efficiency.
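    BiTTM's core-biterm selection is not spelled out in the abstract, but the biterm representation it builds on is standard: every unordered pair of distinct words co-occurring within a short context. A minimal sketch, with the window size and the query filter as illustrative assumptions:

```python
def extract_biterms(tokens, window=15):
    """All unordered pairs of distinct words within a context window.
    Biterm models (BTM, BiTTM) learn topics from these pairs rather
    than from per-document word counts; window=15 is illustrative."""
    biterms = set()
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + window, len(tokens))):
            if tokens[i] != tokens[j]:
                biterms.add(tuple(sorted((tokens[i], tokens[j])))
                            )
    return biterms

def query_biterms(biterms, query_terms):
    """Crude stand-in for BiTTM's core biterms: keep only the pairs
    that touch a user-provided query keyword."""
    q = set(query_terms)
    return {b for b in biterms if q & set(b)}

bts = extract_biterms("cheap hotel near the beach great cheap food".split())
print(query_biterms(bts, ["cheap"]))  # only pairs involving the query term
```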

    StoryNet: A 5W1H-based knowledge graph to connect stories

    Thesis (M.S.), School of Computing and Engineering, University of Missouri--Kansas City, 2021. Thesis advisor: Yugyung Lee. Title from PDF of title page, viewed January 19, 2022. Includes vita and bibliographical references (pages 149-164).

    Stories are a powerful medium through which the human community has exchanged information since the dawn of the information age. They take many forms: articles, movies, books, plays, short films, magazines, mythologies, and so on. With the ever-growing complexity of information representation, exchange, and interaction, it has become important to find ways to convey stories more effectively. In an increasingly divergent world, it is harder to draw parallels and connect information from around the globe. Even though there have been large-scale efforts to consolidate information, such as Wikipedia and Wikidata, they do not capture real-time happenings. Building on recent advances in Natural Language Processing (NLP), we propose a framework that connects stories, making it easier to find, understand, and explore the links between them and the possibilities that revolve around them. Our framework is based on the 5W + 1H format (What, Who, Where, When, Why, and How), which represents stories in a form that is both easily understandable by humans and accurately generated by deep learning models. We used 311-call and cybersecurity datasets as case studies, applying NLP techniques such as classification, topic modelling, question answering, and question generation along with the 5W1H framework to segregate the stories into clusters. The framework is generic and can be applied to any field. We evaluated two approaches for generating results: rule-based and training-based. The rule-based approach uses Stanford NLP parsers to identify patterns for the 5W + 1H terms; the training-based approach uses BERT embeddings. Both were compared using an ensemble score (the average of CoLA, SST-2, MRPC, QQP, STS-B, MNLI, QNLI, and RTE) along with BLEU and ROUGE scores. For the training-based analysis we studied BERT, RoBERTa, XLNet, ALBERT, ELECTRA, and the AllenNLP Transformer QA model on the CVE, NVD, SQuAD v1.1, and SQuAD v2.0 datasets, comparing them against custom annotations for identifying 5W + 1H. The performance and accuracy of both approaches are presented in the results section; training on the 5W+1H annotations boosted the score from 30% (baseline) to 91%.

    Contents: Introduction -- Related work -- The 5W1H framework and the models included -- StoryNet application: evaluation and results -- Conclusion and future work.
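    The thesis's rule-based pass is built on Stanford NLP parse patterns; purely as an illustration (spaCy substituted for the thesis's toolchain, with a much cruder rule set), a 5W1H slot filler might look like this:

```python
import spacy

nlp = spacy.load("en_core_web_sm")

# Crude entity-label-to-slot mapping; the thesis's Stanford-NLP-based
# patterns are far richer than this illustration.
SLOT_BY_LABEL = {"PERSON": "Who", "ORG": "Who",
                 "GPE": "Where", "LOC": "Where", "FAC": "Where",
                 "DATE": "When", "TIME": "When"}

def extract_5w1h(text):
    doc = nlp(text)
    slots = {k: [] for k in ("Who", "What", "Where", "When", "Why", "How")}
    for ent in doc.ents:
        slot = SLOT_BY_LABEL.get(ent.label_)
        if slot:
            slots[slot].append(ent.text)
    for sent in doc.sents:
        slots["What"].append(sent.root.lemma_)  # main predicate as "What"
    # "Why"/"How" need dependency patterns (e.g. "because", "by ...")
    # and are left empty in this sketch.
    return slots

print(extract_5w1h("On Monday, the city crew repaired a pothole on Main Street."))
```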

    Recurrent Attentional Topic Model

    In a document, the topic distribution of a sentence depends both on the topics of the preceding sentences and on its own content, and the preceding sentences typically influence it with different weights. It is therefore natural to treat a document as a sequence of sentences. Most existing work on Bayesian document modeling does not take these points into consideration. To fill this gap, we propose a Recurrent Attentional Topic Model (RATM) for document embedding. RATM not only takes advantage of the sequential order among sentences but also uses an attention mechanism to model the relations among successive sentences. Within RATM, we propose a Recurrent Attentional Bayesian Process (RABP) to handle the sequences; based on RABP, RATM fully utilizes the sequential information of the sentences in a document. Experiments on two corpora show that our model outperforms state-of-the-art methods on document modeling and classification.
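    The abstract states only the dependency structure, so the following is a numerical sketch of the intuition rather than the RABP itself: sentence t's topic prior is an attention-weighted mixture of the preceding sentences' topic vectors, blended with the topics suggested by its own content. The dot-product attention and the 50/50 blend are assumptions.

```python
import numpy as np

def sentence_topic_prior(prev_topics, content_topics):
    """Attention-weighted mixture of preceding sentences' topic vectors,
    blended with the current sentence's own content topics. Illustrative
    only; RATM's RABP defines this dependency as a Bayesian process."""
    prev = np.asarray(prev_topics)        # (t-1, K), rows sum to 1
    scores = prev @ content_topics        # similarity to current content
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax attention
    prior = weights @ prev                # mixture of preceding topics
    mixed = 0.5 * prior + 0.5 * content_topics
    return mixed / mixed.sum()

prev = [[0.7, 0.2, 0.1], [0.3, 0.5, 0.2]]   # topics of sentences 1..t-1
content = np.array([0.2, 0.6, 0.2])         # topics from sentence t's words
print(sentence_topic_prior(prev, content))
```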