89,898 research outputs found

    Prompt Tuning on Graph-augmented Low-resource Text Classification

    Full text link
    Text classification is a fundamental problem in information retrieval with many real-world applications, such as predicting the topics of online articles and the categories of e-commerce product descriptions. However, low-resource text classification, with no or few labeled samples, presents a serious concern for supervised learning. Meanwhile, many text data are inherently grounded on a network structure, such as a hyperlink/citation network for online articles, and a user-item purchase network for e-commerce products. These graph structures capture rich semantic relationships, which can potentially augment low-resource text classification. In this paper, we propose a novel model called Graph-Grounded Pre-training and Prompting (G2P2) to address low-resource text classification in a two-pronged approach. During pre-training, we propose three graph interaction-based contrastive strategies to jointly pre-train a graph-text model; during downstream classification, we explore handcrafted discrete prompts and continuous prompt tuning for the jointly pre-trained model to achieve zero- and few-shot classification, respectively. Besides, for generalizing continuous prompts to unseen classes, we propose conditional prompt tuning on graphs (G2P2^*). Extensive experiments on four real-world datasets demonstrate the strength of G2P2 in zero- and few-shot low-resource text classification tasks, and illustrate the advantage of G2P2^* in dealing with unseen classes.Comment: 14 pages, journal under review. arXiv admin note: substantial text overlap with arXiv:2305.0332

    What attracts vehicle consumers’ buying:A Saaty scale-based VIKOR (SSC-VIKOR) approach from after-sales textual perspective?

    Get PDF
    Purpose: The increasingly booming e-commerce development has stimulated vehicle consumers to express individual reviews through online forum. The purpose of this paper is to probe into the vehicle consumer consumption behavior and make recommendations for potential consumers from textual comments viewpoint. Design/methodology/approach: A big data analytic-based approach is designed to discover vehicle consumer consumption behavior from online perspective. To reduce subjectivity of expert-based approaches, a parallel Naïve Bayes approach is designed to analyze the sentiment analysis, and the Saaty scale-based (SSC) scoring rule is employed to obtain specific sentimental value of attribute class, contributing to the multi-grade sentiment classification. To achieve the intelligent recommendation for potential vehicle customers, a novel SSC-VIKOR approach is developed to prioritize vehicle brand candidates from a big data analytical viewpoint. Findings: The big data analytics argue that “cost-effectiveness” characteristic is the most important factor that vehicle consumers care, and the data mining results enable automakers to better understand consumer consumption behavior. Research limitations/implications: The case study illustrates the effectiveness of the integrated method, contributing to much more precise operations management on marketing strategy, quality improvement and intelligent recommendation. Originality/value: Researches of consumer consumption behavior are usually based on survey-based methods, and mostly previous studies about comments analysis focus on binary analysis. The hybrid SSC-VIKOR approach is developed to fill the gap from the big data perspective

    An Intelligent Hybrid Sentiment Analyzer for Personal Protective Medical Equipments Based on Word Embedding Technique: The COVID-19 Era

    Get PDF
    Due to the accelerated growth of symmetrical sentiment data across different platforms, experimenting with different sentiment analysis (SA) techniques allows for better decision-making and strategic planning for different sectors. Specifically, the emergence of COVID-19 has enriched the data of people’s opinions and feelings about medical products. In this paper, we analyze people’s sentiments about the products of a well-known e-commerce website named Alibaba.com. People’s sentiments are experimented with using a novel evolutionary approach by applying advanced pre-trained word embedding for word presentations and combining them with an evolutionary feature selection mechanism to classify these opinions into different levels of ratings. The proposed approach is based on harmony search algorithm and different classification techniques including random forest, k-nearest neighbor, AdaBoost, bagging, SVM, and REPtree to achieve competitive results with the least possible features. The experiments are conducted on five different datasets including medical gloves, hand sanitizer, medical oxygen, face masks, and a combination of all these datasets. The results show that the harmony search algorithm successfully reduced the number of features by 94.25%, 89.5%, 89.25%, 92.5%, and 84.25% for the medical glove, hand sanitizer, medical oxygen, face masks, and whole datasets, respectively, while keeping a competitive performance in terms of accuracy and root mean square error (RMSE) for the classification techniques and decreasing the computational time required for classification

    Credibility Evaluation of User-generated Content using Novel Multinomial Classification Technique

    Get PDF
    Awareness about the features of the internet, easy access to data using mobile, and affordable data facilities have caused a lot of traffic on the internet. Digitization came with a lot of opportunities and challenges as well. One of the important advantages of digitization is paperless transactions, and transparency in payment, while data privacy, fake news, and cyber-attacks are the evolving challenges. The extensive use of social media networks and e-commerce websites has caused a lot of user-generated information, misinformation, and disinformation on the Internet. The quality of information depends upon various stages (of information) like generation of information, medium of propagation, and consumption of information. Content being user-generated, information needs a quality assessment before consumption. The loss of information is also necessary to be examined by applying the machine learning approach as the volume of content is extremely huge. This research work focuses on novel multinomial classification (based on multinoulli distribution) techniques to determine the quality of the information in the given content. To evaluate the information content a single algorithm with some processing is not sufficient and various approaches are necessary to evaluate the quality of content.  We propose a novel approach to calculate the bias, for which the Machine Learning model will be fitted appropriately to classify the content correctly. As an empirical study, rotten tomatoes’ movie review data set is used to apply the classification techniques. The accuracy of the system is evaluated using the ROC curve, confusion matrix, and MAP

    WEB recommendations for E-commerce websites

    Get PDF
    In this part of the thesis we have investigated how the navigation utilizing web recommendations can be implemented on the e-commerce websites based on integrated data sources. The integrated e-commerce websites are an interesting use case for web recommendations. One of the reasons for this interest is that many modern, large and economically successful e-commerce websites follow the integrated approach. Another reason is that especially in the integrated environment, due to the lack of the pre-defined semantic connections between the data, the web recommendations step forward as means of enabling user navigation. In this chapter we have presented the architecture for the websites based on integrated data sources named EC-Fuice. We have also presented the prototypical implementation of our architecture which serves as a proof-of-concept and investigated the challenges of creating navigation on an integrated website. The following issues were addressed in this part of the thesis: Combination of several state-of-the-art tools and techniques in the fields of databases, data integration, ontology matching and web engineering into one generic architecture for creating integrated websites. Comparative experiments with several techniques for instance matching (also known as record linkage or duplicate detection). Investigation on using the ontology matching to facilitate the instance matching. Comparative experiments with several techniques for ontology matching. Investigations on the instance-based ontology matching and the possibilities for combining instance-based ontology matching with other techniques for ontology matching. Investigation of the possibilities to improve user navigation in the integrated data environment with different types of web recommendations. Review of the related work in the fields of data integration and ontology matching and discussion of the contact points between the research described here and other related projects. The main contributions of the research described in this part of the thesis are the EC-Fuice architecture, the novel method for matching e-commerce ontologies based on combination of instance information and metadata information, the experimental results of ontology and instance matching performed by different matching algorithms and the classification of the types of recommendations which can be used on an integrated e-commerce website

    Automatic domain ontology extraction for context-sensitive opinion mining

    Get PDF
    Automated analysis of the sentiments presented in online consumer feedbacks can facilitate both organizations’ business strategy development and individual consumers’ comparison shopping. Nevertheless, existing opinion mining methods either adopt a context-free sentiment classification approach or rely on a large number of manually annotated training examples to perform context sensitive sentiment classification. Guided by the design science research methodology, we illustrate the design, development, and evaluation of a novel fuzzy domain ontology based contextsensitive opinion mining system. Our novel ontology extraction mechanism underpinned by a variant of Kullback-Leibler divergence can automatically acquire contextual sentiment knowledge across various product domains to improve the sentiment analysis processes. Evaluated based on a benchmark dataset and real consumer reviews collected from Amazon.com, our system shows remarkable performance improvement over the context-free baseline
    corecore