Prompt Tuning on Graph-augmented Low-resource Text Classification
Text classification is a fundamental problem in information retrieval with
many real-world applications, such as predicting the topics of online articles
and the categories of e-commerce product descriptions. However, low-resource
text classification, with no or few labeled samples, presents a serious concern
for supervised learning. Meanwhile, many text data are inherently grounded on a
network structure, such as a hyperlink/citation network for online articles,
and a user-item purchase network for e-commerce products. These graph
structures capture rich semantic relationships, which can potentially augment
low-resource text classification. In this paper, we propose a novel model
called Graph-Grounded Pre-training and Prompting (G2P2) to address low-resource
text classification in a two-pronged approach. During pre-training, we propose
three graph interaction-based contrastive strategies to jointly pre-train a
graph-text model; during downstream classification, we explore handcrafted
discrete prompts and continuous prompt tuning for the jointly pre-trained model
to achieve zero- and few-shot classification, respectively. Besides, for
generalizing continuous prompts to unseen classes, we propose conditional
prompt tuning on graphs (G2P2*). Extensive experiments on four real-world
datasets demonstrate the strength of G2P2 in zero- and few-shot low-resource
text classification tasks, and illustrate the advantage of G2P2* in dealing
with unseen classes.
Comment: 14 pages, journal under review. arXiv admin note: substantial text overlap with arXiv:2305.0332
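The abstract above does not spell out its three contrastive strategies, so the following is only a minimal sketch of one generic way to align a text encoder with a graph encoder: a symmetric InfoNCE-style loss over paired text and node embeddings. The function names, the temperature value, and the use of NumPy arrays in place of real encoder outputs are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def l2_normalize(x):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_contrastive_loss(text_emb, node_emb, temperature=0.07):
    """Row i of text_emb and row i of node_emb are assumed to describe the
    same document/node; every other row in the batch acts as a negative."""
    t = l2_normalize(np.asarray(text_emb, dtype=float))
    g = l2_normalize(np.asarray(node_emb, dtype=float))
    logits = t @ g.T / temperature          # pairwise similarity matrix
    idx = np.arange(len(t))
    loss_text_to_node = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_node_to_text = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_text_to_node + loss_node_to_text)

# Hypothetical usage with random vectors standing in for encoder outputs.
rng = np.random.default_rng(0)
text_vectors = rng.normal(size=(8, 16))
node_vectors = text_vectors + 0.1 * rng.normal(size=(8, 16))
loss = symmetric_contrastive_loss(text_vectors, node_vectors)
```

The loss is small when each text embedding is closest to its own node embedding and grows when the pairing is scrambled, which is the behavior a jointly pre-trained graph-text model would be optimized for.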
What attracts vehicle consumers’ buying: A Saaty scale-based VIKOR (SSC-VIKOR) approach from after-sales textual perspective?
Purpose:
The booming development of e-commerce has encouraged vehicle consumers to post individual reviews on online forums. The purpose of this paper is to probe vehicle consumers’ consumption behavior and to make recommendations for potential consumers from the viewpoint of textual comments.
Design/methodology/approach:
A big data analytics-based approach is designed to discover vehicle consumer consumption behavior from an online perspective. To reduce the subjectivity of expert-based approaches, a parallel Naïve Bayes approach is designed to perform sentiment analysis, and the Saaty scale-based (SSC) scoring rule is employed to obtain a specific sentiment value for each attribute class, contributing to multi-grade sentiment classification. To achieve intelligent recommendation for potential vehicle customers, a novel SSC-VIKOR approach is developed to prioritize vehicle brand candidates from a big data analytics viewpoint.
Findings:
The big data analytics indicate that the “cost-effectiveness” characteristic is the most important factor that vehicle consumers care about, and the data mining results enable automakers to better understand consumer consumption behavior.
Research limitations/implications:
The case study illustrates the effectiveness of the integrated method, contributing to much more precise operations management on marketing strategy, quality improvement and intelligent recommendation.
Originality/value:
Research on consumer consumption behavior is usually based on survey methods, and most previous studies of comment analysis focus on binary analysis. The hybrid SSC-VIKOR approach is developed to fill this gap from the big data perspective.
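The abstract above does not define how its SSC-VIKOR variant differs from the classical method, so the sketch below shows only textbook VIKOR ranking. It assumes every criterion is a benefit criterion (higher is better); the brand scores on a 1 to 9 Saaty-style scale, the criterion weights, and the function name `vikor` are hypothetical.

```python
import numpy as np

def vikor(scores, weights, v=0.5):
    """Classical VIKOR for benefit criteria.
    scores: (alternatives x criteria) matrix; weights: one per criterion;
    v: weight of the group-utility strategy. Lower Q means a better rank."""
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    best = scores.max(axis=0)    # ideal value per criterion
    worst = scores.min(axis=0)   # anti-ideal value per criterion
    span = np.where(best > worst, best - worst, 1.0)  # avoid division by zero
    regret = weights * (best - scores) / span
    S = regret.sum(axis=1)       # group utility (weighted distance to ideal)
    R = regret.max(axis=1)       # largest individual regret

    def rescale(x):
        width = x.max() - x.min()
        return (x - x.min()) / width if width > 0 else np.zeros_like(x)

    Q = v * rescale(S) + (1 - v) * rescale(R)
    return S, R, Q

# Hypothetical sentiment scores for three brands on three attributes.
brand_scores = [[9, 5, 7],
                [5, 7, 5],
                [3, 9, 3]]
attribute_weights = [0.5, 0.3, 0.2]
S, R, Q = vikor(brand_scores, attribute_weights)
ranking = np.argsort(Q)  # indices of brands from best to worst
```

In this toy input the first brand has the smallest Q and is ranked first, because it is ideal on the most heavily weighted attribute.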
An Intelligent Hybrid Sentiment Analyzer for Personal Protective Medical Equipments Based on Word Embedding Technique: The COVID-19 Era
Due to the accelerated growth of symmetrical sentiment data across different platforms,
experimenting with different sentiment analysis (SA) techniques allows for better decision-making
and strategic planning for different sectors. Specifically, the emergence of COVID-19 has enriched
the data of people’s opinions and feelings about medical products. In this paper, we analyze people’s
sentiments about the products of a well-known e-commerce website named Alibaba.com. People’s
sentiments are analyzed using a novel evolutionary approach that applies advanced
pre-trained word embeddings for word representations and combines them with an evolutionary
feature selection mechanism to classify these opinions into different levels of ratings. The proposed
approach is based on the harmony search algorithm and different classification techniques including
random forest, k-nearest neighbor, AdaBoost, bagging, SVM, and REPtree to achieve competitive
results with the least possible features. The experiments are conducted on five different datasets
including medical gloves, hand sanitizer, medical oxygen, face masks, and a combination of all these
datasets. The results show that the harmony search algorithm successfully reduced the number of
features by 94.25%, 89.5%, 89.25%, 92.5%, and 84.25% for the medical glove, hand sanitizer, medical
oxygen, face masks, and whole datasets, respectively, while keeping a competitive performance in
terms of accuracy and root mean square error (RMSE) for the classification techniques and decreasing
the computational time required for classification.
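The abstract above does not give the encoding, parameter settings, or fitness function used with harmony search, so the following is a minimal sketch of a binary harmony search for feature selection under stated assumptions. The parameter names (`hmcr`, `par`), their defaults, and the stand-in `toy_fitness` are hypothetical; in a real pipeline the fitness would be a classifier's validation score on the selected embedding dimensions.

```python
import numpy as np

def harmony_search_select(fitness, n_features, memory_size=10, iterations=200,
                          hmcr=0.9, par=0.1, seed=0):
    """Search for a boolean feature mask that maximizes fitness(mask)."""
    rng = np.random.default_rng(seed)
    # Harmony memory: a small population of random feature masks.
    memory = rng.integers(0, 2, size=(memory_size, n_features)).astype(bool)
    scores = np.array([fitness(mask) for mask in memory], dtype=float)
    columns = np.arange(n_features)
    for _ in range(iterations):
        # Memory consideration: copy each bit from a randomly chosen stored mask.
        donors = rng.integers(0, memory_size, size=n_features)
        candidate = memory[donors, columns]
        # Random consideration: with probability 1 - hmcr, draw the bit afresh.
        randomize = rng.random(n_features) >= hmcr
        random_bits = rng.integers(0, 2, size=n_features).astype(bool)
        candidate = np.where(randomize, random_bits, candidate)
        # Pitch adjustment: flip memory-derived bits with probability par.
        flip = (~randomize) & (rng.random(n_features) < par)
        candidate = np.logical_xor(candidate, flip)
        # Replace the worst stored mask if the new one is strictly better.
        score = fitness(candidate)
        worst = scores.argmin()
        if score > scores[worst]:
            memory[worst] = candidate
            scores[worst] = score
    best = scores.argmax()
    return memory[best].copy(), float(scores[best])

def toy_fitness(mask):
    # Stand-in objective: reward the first three (assumed informative)
    # features and penalize every selected feature slightly.
    return float(mask[:3].sum() - 0.1 * mask.sum())

best_mask, best_score = harmony_search_select(toy_fitness, n_features=20)
```

The per-feature penalty in `toy_fitness` plays the role of the "least possible features" goal: a mask is rewarded for accuracy-like gain and charged for its size.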
Credibility Evaluation of User-generated Content using Novel Multinomial Classification Technique
Awareness of the features of the internet, easy access to data on mobile devices, and affordable data facilities have generated a lot of traffic on the internet. Digitization came with many opportunities as well as challenges. Its important advantages include paperless transactions and transparency in payment, while data privacy, fake news, and cyber-attacks are evolving challenges. The extensive use of social media networks and e-commerce websites has produced a large amount of user-generated information, misinformation, and disinformation on the internet. The quality of information depends on various stages, such as the generation of information, the medium of propagation, and the consumption of information. Because the content is user-generated, the information needs a quality assessment before consumption. Because the volume of content is extremely large, the loss of information also needs to be examined using a machine learning approach. This research work focuses on novel multinomial classification techniques (based on the multinoulli distribution) to determine the quality of the information in the given content. A single algorithm with some processing is not sufficient to evaluate information content; several approaches are needed to assess its quality. We propose a novel approach to calculate the bias, for which the machine learning model is fitted appropriately to classify the content correctly. As an empirical study, the Rotten Tomatoes movie review data set is used to apply the classification techniques. The accuracy of the system is evaluated using the ROC curve, confusion matrix, and MAP.
WEB recommendations for E-commerce websites
In this part of the thesis we have investigated how navigation based on web recommendations can be implemented on e-commerce websites built on integrated data sources. Integrated e-commerce websites are an interesting use case for web recommendations. One reason for this interest is that many modern, large and economically successful e-commerce websites follow the integrated approach. Another reason is that, especially in an integrated environment, the lack of pre-defined semantic connections between the data means that web recommendations step forward as a means of enabling user navigation. In this chapter we have presented EC-Fuice, an architecture for websites based on integrated data sources. We have also presented a prototypical implementation of our architecture, which serves as a proof of concept, and investigated the challenges of creating navigation on an integrated website.
The following issues were addressed in this part of the thesis:
Combination of several state-of-the-art tools and techniques in the fields of databases, data integration, ontology matching and web engineering into one generic architecture for creating integrated websites.
Comparative experiments with several techniques for instance matching (also known as record linkage or duplicate detection). Investigation on using the ontology matching to facilitate the instance matching.
Comparative experiments with several techniques for ontology matching. Investigations on the instance-based ontology matching and the possibilities for combining instance-based ontology matching with other techniques for ontology matching.
Investigation of the possibilities to improve user navigation in the integrated data environment with different types of web recommendations.
Review of the related work in the fields of data integration and ontology matching and discussion of the contact points between the research described here and other related projects.
The main contributions of the research described in this part of the thesis are the EC-Fuice architecture, the novel method for matching e-commerce ontologies based on a combination of instance information and metadata information, the experimental results of ontology and instance matching performed by different matching algorithms, and the classification of the types of recommendations which can be used on an integrated e-commerce website.
Automatic domain ontology extraction for context-sensitive opinion mining
Automated analysis of the sentiments presented in online consumer feedback can facilitate both organizations’ business strategy development and individual consumers’ comparison shopping. Nevertheless, existing opinion mining methods either adopt a context-free sentiment classification approach or rely on a large number of manually annotated training examples to perform context-sensitive sentiment classification. Guided by the design science research methodology, we illustrate the design, development, and evaluation of a novel fuzzy domain ontology based context-sensitive opinion mining system. Our novel ontology extraction mechanism, underpinned by a variant of Kullback-Leibler divergence, can automatically acquire contextual sentiment knowledge across various product domains to improve the sentiment analysis processes. Evaluated based on a benchmark dataset and real consumer reviews collected from Amazon.com, our system shows remarkable performance improvement over the context-free baseline.
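The abstract above mentions a variant of Kullback-Leibler divergence without defining it, so the sketch below shows only the standard divergence between two discrete distributions. The function name, the `eps` guard, and the term counts are hypothetical; one could imagine comparing how often candidate terms occur in positive reviews against how often they occur in all reviews of a product domain.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Standard KL(p || q) in nats for two discrete distributions given as
    non-negative counts or probabilities over the same set of terms."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    support = p > 0  # terms with p = 0 contribute nothing to the sum
    return float(np.sum(p[support] * np.log(p[support] / np.maximum(q[support], eps))))

# Hypothetical term counts: occurrences in positive reviews vs. all reviews.
positive_counts = [30, 5, 15]
overall_counts = [40, 40, 20]
divergence = kl_divergence(positive_counts, overall_counts)
```

A divergence of zero means the term distribution in positive reviews matches the overall one; larger values indicate that some terms are distributed differently in the sentiment-bearing context.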