169 research outputs found
Exploiting Emotions via Composite Pretrained Embedding and Ensemble Language Model
Decisions in the modern era are based on more than just the available data; they also incorporate feedback from online sources. Processing reviews known as Sentiment analysis (SA) or Emotion analysis. Understanding the user's perspective and routines is crucial now-a-days for multiple reasons. It is used by both businesses and governments to make strategic decisions. Various architectural and vector embedding strategies have been developed for SA processing. Accurate representation of text is crucial for automatic SA. Due to the large number of languages spoken and written, polysemy and syntactic or semantic issues were common. To get around these problems, we developed effective composite embedding (ECE), a method that combines the advantages of vector embedding techniques that are either context-independent (like glove & fasttext) or context-aware (like XLNet) to effectively represent the features needed for processing. To improve the performace towards emotion or sentiment we proposed stacked ensemble model of deep lanugae models.ECE with Ensembled model is evaluated on balanced dataset to prove that it is a reliable embedding technique and a generalised model for SA.In order to evaluate ECE, cutting-edge ML and Deep net language models are deployed and comapared. The model is evaluated using benchmark datset such as MR, Kindle along with realtime tweet dataset of user complaints . LIME is used to verify the model's predictions and to provide statistical results for sentence.The model with ECE embedding provides state-of-art results with real time dataset as well
GrabQC: Graph based Query Contextualization for automated ICD coding
Automated medical coding is a process of codifying clinical notes to
appropriate diagnosis and procedure codes automatically from the standard
taxonomies such as ICD (International Classification of Diseases) and CPT
(Current Procedure Terminology). The manual coding process involves the
identification of entities from the clinical notes followed by querying a
commercial or non-commercial medical codes Information Retrieval (IR) system
that follows the Centre for Medicare and Medicaid Services (CMS) guidelines. We
propose to automate this manual process by automatically constructing a query
for the IR system using the entities auto-extracted from the clinical notes. We
propose \textbf{GrabQC}, a \textbf{Gra}ph \textbf{b}ased \textbf{Q}uery
\textbf{C}ontextualization method that automatically extracts queries from the
clinical text, contextualizes the queries using a Graph Neural Network (GNN)
model and obtains the ICD Codes using an external IR system. We also propose a
method for labelling the dataset for training the model. We perform experiments
on two datasets of clinical text in three different setups to assert the
effectiveness of our approach. The experimental results show that our proposed
method is better than the compared baselines in all three settings.Comment: 25th Pacific-Asia Conference on Knowledge Discovery and Data Mining
(PAKDD 2021
Twitter Analysis to Predict the Satisfaction of Saudi Telecommunication Companies’ Customers
The flexibility in mobile communications allows customers to quickly switch from one service provider to
another, making customer churn one of the most critical challenges for the data and voice telecommunication
service industry. In 2019, the percentage of post-paid telecommunication customers in Saudi Arabia
decreased; this represents a great deal of customer dissatisfaction and subsequent corporate fiscal losses.
Many studies correlate customer satisfaction with customer churn. The Telecom companies have depended
on historical customer data to measure customer churn. However, historical data does not reveal current
customer satisfaction or future likeliness to switch between telecom companies. Current methods of analysing
churn rates are inadequate and faced some issues, particularly in the Saudi market.
This research was conducted to realize the relationship between customer satisfaction and customer churn
and how to use social media mining to measure customer satisfaction and predict customer churn.
This research conducted a systematic review to address the churn prediction models problems and their
relation to Arabic Sentiment Analysis. The findings show that the current churn models lack integrating
structural data frameworks with real-time analytics to target customers in real-time. In addition, the findings
show that the specific issues in the existing churn prediction models in Saudi Arabia relate to the Arabic
language itself, its complexity, and lack of resources.
As a result, I have constructed the first gold standard corpus of Saudi tweets related to telecom companies,
comprising 20,000 manually annotated tweets. It has been generated as a dialect sentiment lexicon extracted
from a larger Twitter dataset collected by me to capture text characteristics in social media. I developed a
new ASA prediction model for telecommunication that fills the detected gaps in the ASA literature and fits
the telecommunication field. The proposed model proved its effectiveness for Arabic sentiment analysis and
churn prediction. This is the first work using Twitter mining to predict potential customer loss (churn) in
Saudi telecom companies, which has not been attempted before. Different fields, such as education, have
different features, making applying the proposed model is interesting because it based on text-mining
Natural Language Processing: Emerging Neural Approaches and Applications
This Special Issue highlights the most recent research being carried out in the NLP field to discuss relative open issues, with a particular focus on both emerging approaches for language learning, understanding, production, and grounding interactively or autonomously from data in cognitive and neural systems, as well as on their potential or real applications in different domains
Meta-Learning with Variational Semantic Memory for Word Sense Disambiguation
A critical challenge faced by supervised word sense disambiguation (WSD) is
the lack of large annotated datasets with sufficient coverage of words in their
diversity of senses. This inspired recent research on few-shot WSD using
meta-learning. While such work has successfully applied meta-learning to learn
new word senses from very few examples, its performance still lags behind its
fully supervised counterpart. Aiming to further close this gap, we propose a
model of semantic memory for WSD in a meta-learning setting. Semantic memory
encapsulates prior experiences seen throughout the lifetime of the model, which
aids better generalization in limited data settings. Our model is based on
hierarchical variational inference and incorporates an adaptive memory update
rule via a hypernetwork. We show our model advances the state of the art in
few-shot WSD, supports effective learning in extremely data scarce (e.g.
one-shot) scenarios and produces meaning prototypes that capture similar senses
of distinct words.Comment: 15 pages, 5 figure
- …