280,080 research outputs found
Hybrid Model For Word Prediction Using Naive Bayes and Latent Information
Historically, the Natural Language Processing area has received considerable
attention from many researchers. One of the main motivations behind this
interest is the word prediction problem: given a set of words in a sentence,
recommend the next word. In the literature, this problem is addressed by
methods based on syntactic or semantic analysis. On its own, neither kind of
analysis achieves practical results for end-user applications. For instance,
Latent Semantic Analysis can handle semantic features of text, but cannot
suggest words according to syntactic rules. On the other hand, there are
models that treat both aspects together and achieve state-of-the-art results,
e.g. deep learning models. These models can demand high computational effort,
which can make them infeasible for certain types of applications. With
advances in technology and mathematical models, it is possible to develop
faster and more accurate systems. This work proposes a hybrid word suggestion
model, based on Naive Bayes and Latent Semantic Analysis, that considers the
neighbouring words around unfilled gaps. Results show that this model achieves
44.2% accuracy on the MSR Sentence Completion Challenge.
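The abstract does not give the authors' formulation, but the core idea of interpolating a Naive Bayes score over neighbouring words with an LSA similarity can be sketched as follows. The toy corpus, the `alpha` mixing weight, and the add-one smoothing are all illustrative assumptions, not the paper's method:

```python
import numpy as np
from collections import Counter

# Toy corpus standing in for real training data.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased a mouse",
    "the dog chased the cat",
]
sentences = [s.split() for s in corpus]
vocab = sorted({w for s in sentences for w in s})
idx = {w: i for i, w in enumerate(vocab)}

# Latent component: truncated SVD of the word-by-sentence count matrix.
M = np.zeros((len(vocab), len(sentences)))
for j, s in enumerate(sentences):
    for w in s:
        M[idx[w], j] += 1.0
U, S, _ = np.linalg.svd(M, full_matrices=False)
k = 2                       # latent dimensionality (assumed)
emb = U[:, :k] * S[:k]      # low-rank word embeddings

def lsa_score(word, context):
    """Cosine similarity between a candidate word and the summed context."""
    v = emb[idx[word]]
    c = np.sum([emb[idx[w]] for w in context if w in idx], axis=0)
    denom = np.linalg.norm(v) * np.linalg.norm(c)
    return float(v @ c / denom) if denom > 0 else 0.0

# Naive Bayes component over sentence-level co-occurrence counts.
unigram = Counter(w for s in sentences for w in s)
total = sum(unigram.values())
cooc = Counter((w, c) for s in sentences for w in s for c in s if c != w)

def nb_score(word, context):
    """P(word) * prod_c P(c | word), with add-one smoothing."""
    p = unigram[word] / total
    for c in context:
        p *= (cooc[(word, c)] + 1) / (unigram[word] + len(vocab))
    return p

def predict(context, alpha=0.5):
    """Rank fill-in candidates by a convex mix of the two scores."""
    nb = {w: nb_score(w, context) for w in vocab}
    z = sum(nb.values())
    mixed = {w: alpha * nb[w] / z + (1 - alpha) * lsa_score(w, context)
             for w in vocab}
    return max(mixed, key=mixed.get)
```

The normalized NB term supplies local evidence from the surrounding words, while the LSA term prefers candidates that are topically close to the sentence as a whole.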
Syntax-aware Hybrid prompt model for Few-shot multi-modal sentiment analysis
Multimodal Sentiment Analysis (MSA) has become a popular topic in natural
language processing, at both the sentence and the aspect level. However, most
existing approaches require large labeled datasets, which are costly in time
and resources. It is therefore worthwhile to explore few-shot sentiment
analysis across modalities. Previous work generally operates on the textual
modality using prompt-based methods of two main types: hand-crafted prompts
and learnable prompts. The existing approach to the few-shot multimodal
sentiment analysis task has utilized both methods, but separately. We design a
hybrid pattern that combines one or more fixed hand-crafted prompts with
learnable prompts and uses attention mechanisms to optimize the prompt
encoder. Experiments on both sentence-level and aspect-level datasets show
that our model significantly outperforms previous approaches.
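A minimal sketch of such a hybrid prompt, assuming a frozen embedding table standing in for the pretrained model's embeddings and a parameter-free self-attention step in place of the paper's learned prompt encoder (the template, dimensions, and vocabulary are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16                                              # embedding dimension (assumed)
vocab = {"the": 0, "sentiment": 1, "of": 2, "this": 3, "is": 4, "[MASK]": 5}
embedding_table = rng.normal(size=(len(vocab), d))  # stand-in for PLM embeddings

def embed(tokens):
    """Look up fixed embeddings for a hand-crafted prompt template."""
    return embedding_table[[vocab[t] for t in tokens]]

# Fixed part: a hand-crafted template ending in a mask slot.
hand_crafted = embed(["the", "sentiment", "of", "this", "is", "[MASK]"])

# Learnable part: free prompt vectors that would be updated during training.
n_learnable = 4
learnable = rng.normal(size=(n_learnable, d))

def hybrid_prompt(hand, soft):
    """Concatenate fixed and learnable prompt embeddings, then let the
    prompt positions attend to each other (scaled dot-product attention)."""
    p = np.concatenate([hand, soft], axis=0)        # (L, d)
    attn = p @ p.T / np.sqrt(d)
    attn = np.exp(attn - attn.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)
    return attn @ p                                  # attention-reweighted prompts

prompt = hybrid_prompt(hand_crafted, learnable)
```

In a real system the combined prompt sequence would be prepended to the (multimodal) input and the learnable vectors trained end to end.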
A Novel Hybrid Classification Approach for Sentiment Analysis of Text Document
Sentiment analysis is a popular and highly active research area in natural language processing. It assigns a negative or positive polarity to one or more entities using different natural language processing tools, and it also allows the performance of various sentiment classifiers to be compared. Our approach focuses on the analysis of sentiment in product reviews using text mining techniques. These reviews can be classified as carrying a positive or negative sentiment with respect to certain aspects, relative to a term-based query. In this paper, we chose two machine learning methods for classification, Support Vector Machines (SVM) and Random Forest, and we introduce a novel hybrid approach to classify product reviews from Amazon. This is useful for consumers who want to research the sentiment of products before purchase, or for companies that want to monitor the public sentiment of their brands. The results show that the proposed method outperforms the individual classifiers on this Amazon dataset.
A Comprehensive Scientometric Evaluation of the Field of Information Literacy Using Hybrid Bibliometrics and Full-Text Lexical Analysis Methods
In scientometric studies, hybrid approaches (i.e., the combination of traditional
bibliometric techniques and lexical analysis methods) are used to investigate fields of research.
With the increasing availability of full-text documents in machine-readable formats, advanced
techniques (e.g., natural language processing [NLP]) are becoming common practice. Numerous
bibliometric analyses have been conducted in the field of information literacy (IL). However, the
majority of these investigations focus on citation metadata, while some incorporate lexical
analyses of titles and abstracts.
The purpose of this dissertation work is to contribute to existing scientometrics
knowledge of the IL field using novel and advanced hybrid methods. The primary goal is to
examine IL holistically, using both bibliometric techniques and full-text lexical analyses. The
study aims to answer the following research questions: 1) What are the most important historical
publications in the IL field?; 2) What are the intellectual and collaborative structural
configurations of the IL field?; 3) To what extent are the structural configurations enhanced by
lexical analysis?; and 4) How has the field of IL evolved over time with respect to seminal
concepts and vocabulary?
This poster presents findings from preliminary analyses. Citation metadata and full-text
documents were collected from Web of Science (WoS), Scopus, and Google Scholar. The
methods used include reference publication year spectroscopy (RPYS) to establish the historical
roots of the IL literature, co-word analysis to map the intellectual structure of the IL field, and
co-authorship analysis to examine the collaboration networks of IL researchers.
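At its core, reference publication year spectroscopy counts cited references per publication year and flags years that stand out against their neighbourhood, since such peaks point to a field's historical roots. A minimal sketch with toy citation data and an assumed median-deviation rule (the real method typically uses deviation from a five-year median over much larger data):

```python
from collections import Counter
from statistics import median

# Publication years of references cited by an IL corpus (toy stand-in data).
cited_years = [1974, 1989, 1989, 1989, 1990, 1991, 2000, 2000, 2001,
               2002, 2003, 2003, 2003, 2003, 2004, 2005, 2006]
counts = Counter(cited_years)

def rpys_peaks(counts, window=2):
    """Flag years whose citation count clearly exceeds the median of a
    surrounding window -- candidate 'historical roots' of the field."""
    peaks = []
    for year, n in sorted(counts.items()):
        neighborhood = [counts.get(y, 0)
                        for y in range(year - window, year + window + 1)]
        if n - median(neighborhood) >= 2:
            peaks.append(year)
    return peaks

print(rpys_peaks(counts))
```

With the toy data above, the two over-cited years surface as peaks, which in a real analysis would prompt a closer look at the seminal works published in those years.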
Adapting Sequence to Sequence models for Text Normalization in Social Media
Social media offer an abundant source of valuable raw data, however informal
writing can quickly become a bottleneck for many natural language processing
(NLP) tasks. Off-the-shelf tools are usually trained on formal text and cannot
explicitly handle noise found in short online posts. Moreover, the variety of
frequently occurring linguistic variations presents several challenges, even
for humans who might not be able to comprehend the meaning of such posts,
especially when they contain slang and abbreviations. Text Normalization aims
to transform online user-generated text to a canonical form. Current text
normalization systems rely on string or phonetic similarity and classification
models that operate in a local fashion. We argue that processing contextual
information is crucial for this task and introduce a social media text
normalization hybrid word-character attention-based encoder-decoder model that
can serve as a pre-processing step for NLP applications to adapt to noisy text
in social media. Our character-based component is trained on synthetic
adversarial examples that are designed to capture errors commonly found in
online user-generated text. Experiments show that our model surpasses neural
architectures designed for text normalization and achieves comparable
performance with state-of-the-art related work. (Accepted at the 13th
International AAAI Conference on Web and Social Media, ICWSM 2019.)
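The synthetic-noise idea can be illustrated with a small generator that corrupts clean text using errors common in social-media posts. The specific noise operations and slang table below are illustrative assumptions, not the paper's adversarial procedure:

```python
import random

# Illustrative slang table; a real system would use a much larger lexicon.
SLANG = {"tomorrow": "tmrw", "you": "u", "be": "b", "are": "r", "see": "c"}

def drop_vowels(word):
    """Keep the first character, drop later vowels: 'tomorrow' -> 'tmrrw'."""
    return word[0] + "".join(ch for ch in word[1:] if ch not in "aeiou")

def repeat_last(word, rng):
    """Expressive lengthening: 'cool' -> 'coolll'."""
    return word + word[-1] * rng.randint(1, 3)

def add_noise(sentence, rng, p=0.5):
    """Corrupt a clean sentence into a plausible social-media variant,
    yielding (noisy, clean) pairs for training a character-level model."""
    out = []
    for w in sentence.split():
        if w in SLANG:
            out.append(SLANG[w])
        elif rng.random() < p and len(w) > 3:
            out.append(drop_vowels(w) if rng.random() < 0.5
                       else repeat_last(w, rng))
        else:
            out.append(w)
    return " ".join(out)

rng = random.Random(42)
noisy = add_noise("see you tomorrow that was really cool", rng)
```

Pairing each generated noisy sentence with its clean source gives supervised training data for a character-level normalization component without any manual annotation.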
Table Search Using a Deep Contextualized Language Model
Pretrained contextualized language models such as BERT have achieved
impressive results on various natural language processing benchmarks.
Benefiting from multiple pretraining tasks and large scale training corpora,
pretrained models can capture complex syntactic word relations. In this paper,
we use the deep contextualized language model BERT for the task of ad hoc table
retrieval. We investigate how to encode table content considering the table
structure and input length limit of BERT. We also propose an approach that
incorporates features from prior literature on table retrieval and jointly
trains them with BERT. In experiments on public datasets, we show that our best
approach can outperform the previous state-of-the-art method and BERT baselines
by a large margin under different evaluation metrics. (Accepted at SIGIR 2020,
long paper.)
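One simple way to fit table content within BERT's input limit is to linearize caption, headers, and cells into a single truncated token sequence. A hedged sketch of that idea; the ordering and token budget are assumptions, not the paper's exact encoding:

```python
def linearize_table(caption, headers, rows, max_tokens=64):
    """Flatten a table into a token sequence for a BERT-style encoder:
    caption first, then headers, then cells row by row, truncated to the
    model's input budget so the most salient fields survive."""
    tokens = caption.split()
    tokens += [t for h in headers for t in h.split()]
    for row in rows:
        for cell in row:
            tokens += str(cell).split()
    # A real pairing would also reserve room for [CLS]/[SEP] and the query.
    return tokens[:max_tokens]

table = {
    "caption": "largest cities by population",
    "headers": ["city", "population millions"],
    "rows": [["Tokyo", 37.4], ["Delhi", 31.2], ["Shanghai", 27.8]],
}
seq = linearize_table(table["caption"], table["headers"], table["rows"])
```

Putting the caption and headers before the cell values means that when truncation hits, it discards low-value trailing cells rather than the fields that best describe the table for retrieval.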