3,041 research outputs found

    A hybrid model for aspect-based sentiment analysis on customer feedback: research on the mobile commerce sector in Vietnam

    Get PDF
    Feedback and comments on mobile commerce applications are extremely useful and valuable information sources that reflect the quality of products or services to determine whether data is positive or negative and help businesses monitor brand and product sentiment in customers’ feedback and understand customers’ needs. However, the increasing number of comments makes it increasingly difficult to understand customers using manual methods. To solve this problem, this study builds a hybrid research model based on aspect mining and comment classification for aspect-based sentiment analysis (ABSA) to deeply comprehend the customer and their experiences. Based on previous classification results, we first construct a dictionary of positive and negative words in the e-commerce field. Then, the POS tagging technique is applied for word classification in Vietnamese to extract aspects of model commerce related to positive or negative words. The model is implemented with machine and deep learning methods on a corpus comprising more than 1,000,000 customer opinions collected from Vietnam's four largest mobile commerce applications. Experimental results show that the Bi-LSTM method has the highest accuracy with 92.01%; it is selected for the proposed model to analyze the viewpoint of words on real data. The findings are that the proposed hybrid model can be applied to monitor online customer experience in real time, enable administrators to make timely and accurate decisions, and improve the quality of products and services to take a competitive advantage

    Cultural evolution in Vietnam’s early 20th century: a Bayesian networks analysis of Franco-Chinese house designs

    Get PDF
    The study of cultural evolution has taken on an increasingly interdisciplinary and diverse approach in explicating phenomena of cultural transmission and adoptions. Inspired by this computational movement, this study uses Bayesian networks analysis, combining both the frequentist and the Hamiltonian Markov chain Monte Carlo (MCMC) approach, to investigate the highly representative elements in the cultural evolution of a Vietnamese city’s architecture in the early 20th century. With a focus on the façade design of 68 old houses in Hanoi’s Old Quarter (based on 78 data lines extracted from 248 photos), the study argues that it is plausible to look at the aesthetics, architecture, and designs of the house façade to find traces of cultural evolution in Vietnam, which went through more than six decades of French colonization and centuries of sociocultural influence from China. The in-depth technical analysis, though refuting the presumed model on the probabilistic dependency among the variables, yields several results, the most notable of which is the strong influence of Buddhism over the decorations of the house façade. Particularly, in the top 5 networks with the best Bayesian Information Criterion (BIC) scores and p\u3c0.05, the variable for decorations (DC) always has a direct probabilistic dependency on the variable B for Buddhism. The paper then checks the robustness of these models using Hamiltonian MCMC method and find the posterior distributions of the models’ coefficients all satisfy the technical requirement. Finally, this study suggests integrating Bayesian statistics in the social sciences in general and for the study of cultural evolution and architectural transformation in particular

    A Survey on Deep Learning Techniques for Sentiment Analysis

    Get PDF
    Social media is a rich source of information nowadays. If we look into social media, sentiment analysis is one of the challenging problems. Sentiment analysis is a substantial area of research in the field of Natural Language Processing. This survey paper reviews and provides the comparative study of deep learning approaches CNN, RNN, LSTM and ensemble-based methods

    ViCGCN: Graph Convolutional Network with Contextualized Language Models for Social Media Mining in Vietnamese

    Full text link
    Social media processing is a fundamental task in natural language processing with numerous applications. As Vietnamese social media and information science have grown rapidly, the necessity of information-based mining on Vietnamese social media has become crucial. However, state-of-the-art research faces several significant drawbacks, including imbalanced data and noisy data on social media platforms. Imbalanced and noisy are two essential issues that need to be addressed in Vietnamese social media texts. Graph Convolutional Networks can address the problems of imbalanced and noisy data in text classification on social media by taking advantage of the graph structure of the data. This study presents a novel approach based on contextualized language model (PhoBERT) and graph-based method (Graph Convolutional Networks). In particular, the proposed approach, ViCGCN, jointly trained the power of Contextualized embeddings with the ability of Graph Convolutional Networks, GCN, to capture more syntactic and semantic dependencies to address those drawbacks. Extensive experiments on various Vietnamese benchmark datasets were conducted to verify our approach. The observation shows that applying GCN to BERTology models as the final layer significantly improves performance. Moreover, the experiments demonstrate that ViCGCN outperforms 13 powerful baseline models, including BERTology models, fusion BERTology and GCN models, other baselines, and SOTA on three benchmark social media datasets. Our proposed ViCGCN approach demonstrates a significant improvement of up to 6.21%, 4.61%, and 2.63% over the best Contextualized Language Models, including multilingual and monolingual, on three benchmark datasets, UIT-VSMEC, UIT-ViCTSD, and UIT-VSFC, respectively. Additionally, our integrated model ViCGCN achieves the best performance compared to other BERTology integrated with GCN models

    Multi-dimensional data refining strategy for effective fine-tuning LLMs

    Full text link
    Data is a cornerstone for fine-tuning large language models, yet acquiring suitable data remains challenging. Challenges encompassed data scarcity, linguistic diversity, and domain-specific content. This paper presents lessons learned while crawling and refining data tailored for fine-tuning Vietnamese language models. Crafting such a dataset, while accounting for linguistic intricacies and striking a balance between inclusivity and accuracy, demands meticulous planning. Our paper presents a multidimensional strategy including leveraging existing datasets in the English language and developing customized data-crawling scripts with the assistance of generative AI tools. A fine-tuned LLM model for the Vietnamese language, which was produced using resultant datasets, demonstrated good performance while generating Vietnamese news articles from prompts. The study offers practical solutions and guidance for future fine-tuning models in languages like Vietnamese

    A non-projective greedy dependency parser with bidirectional LSTMs

    Full text link
    The LyS-FASTPARSE team presents BIST-COVINGTON, a neural implementation of the Covington (2001) algorithm for non-projective dependency parsing. The bidirectional LSTM approach by Kipperwasser and Goldberg (2016) is used to train a greedy parser with a dynamic oracle to mitigate error propagation. The model participated in the CoNLL 2017 UD Shared Task. In spite of not using any ensemble methods and using the baseline segmentation and PoS tagging, the parser obtained good results on both macro-average LAS and UAS in the big treebanks category (55 languages), ranking 7th out of 33 teams. In the all treebanks category (LAS and UAS) we ranked 16th and 12th. The gap between the all and big categories is mainly due to the poor performance on four parallel PUD treebanks, suggesting that some `suffixed' treebanks (e.g. Spanish-AnCora) perform poorly on cross-treebank settings, which does not occur with the corresponding `unsuffixed' treebank (e.g. Spanish). By changing that, we obtain the 11th best LAS among all runs (official and unofficial). The code is made available at https://github.com/CoNLL-UD-2017/LyS-FASTPARSEComment: 12 pages, 2 figures, 5 table
    • …
    corecore