BERT-CNN: a Hierarchical Patent Classifier Based on a Pre-Trained Language Model
Automatic text classification is the process of assigning text
documents to predefined categories. An accurate automatic patent classifier is
crucial to patent inventors and patent examiners for intellectual
property protection, patent management, and patent information retrieval. We
present BERT-CNN, a hierarchical patent classifier based on a pre-trained
language model, trained on the national patent application documents collected
from the State Information Center, China. The experimental results show that
BERT-CNN achieves 84.3% accuracy, which is far better than the two compared
baseline methods, Convolutional Neural Networks and Recurrent Neural Networks.
We did not apply our model to the third and fourth hierarchical levels of the
International Patent Classification, "subclass" and "group". The visualization
of the attention mechanism shows that BERT-CNN obtains new state-of-the-art
results in representing vocabularies and semantics. This article demonstrates
the practicality and effectiveness of BERT-CNN in the field of automatic patent
classification.

Comment: in Chinese
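The two-level routing the abstract describes (a coarse IPC "section" first, then a finer "class" within it) can be sketched as follows; the keyword scorers and code tables are hypothetical stand-ins for the BERT-CNN components, purely to illustrate the hierarchical decision:

```python
# Minimal sketch of two-level hierarchical classification: a top-level
# scorer routes each document to a coarse IPC section, and a per-section
# scorer then assigns the finer class. Keyword counting stands in for
# the learned BERT-CNN scores (an illustrative assumption).

def score(text, keywords):
    """Count keyword hits as a toy stand-in for a classifier score."""
    return sum(text.lower().count(k) for k in keywords)

SECTION_KEYWORDS = {
    "A": ["medical", "agriculture"],   # Human necessities
    "G": ["computing", "optics"],      # Physics
}
CLASS_KEYWORDS = {
    "A": {"A61": ["medical"], "A01": ["agriculture"]},
    "G": {"G06": ["computing"], "G02": ["optics"]},
}

def classify(text):
    # Level 1: pick the best-scoring section.
    section = max(SECTION_KEYWORDS,
                  key=lambda s: score(text, SECTION_KEYWORDS[s]))
    # Level 2: pick the best-scoring class, searching that section only.
    cls = max(CLASS_KEYWORDS[section],
              key=lambda c: score(text, CLASS_KEYWORDS[section][c]))
    return section, cls

print(classify("a computing device for image processing"))  # ('G', 'G06')
```

Restricting the second level to the predicted section's classes is what makes the search hierarchical rather than flat.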
Joint Embedding of Words and Category Labels for Hierarchical Multi-label Text Classification
Text classification has become increasingly challenging due to the continuous
refinement of classification label granularity and the expansion of
classification label scale. To address this, some research has explored
strategies that exploit the hierarchical structure of problems with a
large number of categories. At present, hierarchical text classification (HTC)
has received extensive attention and has broad application prospects. Making
full use of the relationship between parent and child categories in a text
classification task can greatly improve classification performance. In
this paper, we propose a joint embedding of text and parent category based on
hierarchical fine-tuning ordered neurons LSTM (HFT-ONLSTM) for HTC. Our method
makes full use of the connection between the upper-level and lower-level
labels. Experiments show that our model outperforms the state-of-the-art
hierarchical model at a lower computation cost.

Comment: The submitted paper (Identifier: arXiv:2004.02555) has a problem of
authorship disputes within the collaboration of authors.
Hierarchical Attention Generative Adversarial Networks for Cross-domain Sentiment Classification
Cross-domain sentiment classification (CDSC) is an important task in domain
adaptation and sentiment classification. Due to the domain discrepancy, a
sentiment classifier trained on source domain data may not work well on target
domain data. In recent years, many researchers have used deep neural network
models for cross-domain sentiment classification task, many of which use
Gradient Reversal Layer (GRL) to design an adversarial network structure to
train a domain-shared sentiment classifier. Different from those methods, we
propose Hierarchical Attention Generative Adversarial Networks (HAGAN), which
alternately trains a generator and a discriminator in order to produce a
document representation which is sentiment-distinguishable but
domain-indistinguishable. Besides, the HAGAN model applies Bidirectional Gated
Recurrent Unit (Bi-GRU) to encode the contextual information of a word and a
sentence into the document representation. In addition, the HAGAN model use
hierarchical attention mechanism to optimize the document representation and
automatically capture the pivots and non-pivots. The experiments on Amazon
review dataset show the effectiveness of HAGAN.
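The hierarchical attention part of such a model (word vectors pooled into sentence vectors, sentence vectors pooled into a document vector) can be sketched as below; the adversarial generator/discriminator training and the Bi-GRU encoders are omitted, and the dot-product scoring against a fixed query vector is a simplifying assumption:

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attend(vectors, query):
    """Attention pooling: score each vector against the query, softmax
    the scores, then return the weighted sum of the vectors."""
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]

def document_vector(sentences, word_query, sent_query):
    # Word-level attention pools each sentence's word vectors into a
    # sentence vector; sentence-level attention then pools those into
    # a single document representation.
    sent_vecs = [attend(words, word_query) for words in sentences]
    return attend(sent_vecs, sent_query)

# A document of two identical one-word sentences collapses to that vector.
print(document_vector([[[2.0, 0.0]], [[2.0, 0.0]]],
                      [1.0, 0.0], [0.0, 1.0]))  # [2.0, 0.0]
```

In the full model these query vectors are learned, which is what lets the attention weights pick out the pivot words and sentences automatically.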
Text Classification Algorithms: A Survey
In recent years, there has been an exponential growth in the number of
complex documents and texts that require a deeper understanding of machine
learning methods to be able to accurately classify texts in many applications.
Many machine learning approaches have achieved state-of-the-art results in natural
language processing. The success of these learning algorithms relies on their
capacity to understand complex models and non-linear relationships within data.
However, finding suitable structures, architectures, and techniques for text
classification is a challenge for researchers. In this paper, a brief overview
of text classification algorithms is discussed. This overview covers different
text feature extractions, dimensionality reduction methods, existing algorithms
and techniques, and evaluation methods. Finally, the limitations of each
technique and their applications to real-world problems are discussed.
Constructing a Natural Language Inference Dataset using Generative Neural Networks
Natural Language Inference is an important task for Natural Language
Understanding. It is concerned with classifying the logical relation between
two sentences. In this paper, we propose several generative neural
networks for generating textual hypotheses, which allow the construction of new
Natural Language Inference datasets. To evaluate the models, we propose a new
metric -- the accuracy of the classifier trained on the generated dataset. The
accuracy obtained by our best generative model is only 2.7% lower than the
accuracy of the classifier trained on the original, human crafted dataset.
Furthermore, the best generated dataset combined with the original dataset
achieves the highest accuracy. The best model learns a mapping embedding for
each training example. By comparing various metrics we show that datasets that
obtain higher ROUGE or METEOR scores do not necessarily yield higher
classification accuracies. We also analyze the characteristics of a good
dataset, including the distinguishability of the generated datasets from the
original one.
Deep Neural Networks Ensemble for Detecting Medication Mentions in Tweets
Objective: After years of research, Twitter posts are now recognized as an
important source of patient-generated data, providing unique insights into
population health. A fundamental step to incorporating Twitter data in
pharmacoepidemiological research is to automatically recognize medication
mentions in tweets. Given that lexical searches for medication names may fail
due to misspellings or ambiguity with common words, we propose a more advanced
method to recognize them. Methods: We present Kusuri, an Ensemble Learning
classifier, able to identify tweets mentioning drug products and dietary
supplements. Kusuri ("medication" in Japanese) is composed of two modules.
First, four different classifiers (lexicon-based, spelling-variant-based,
pattern-based and one based on a weakly-trained neural network) are applied in
parallel to discover tweets potentially containing medication names. Second, an
ensemble of deep neural networks encoding morphological, semantic, and
long-range dependencies of important words in the tweets discovered is used to
make the final decision. Results: On a balanced (50-50) corpus of 15,005
tweets, Kusuri demonstrated performances close to human annotators with 93.7%
F1-score, the best score achieved thus far on this corpus. On a corpus made of
all tweets posted by 113 Twitter users (98,959 tweets, with only 0.26%
mentioning medications), Kusuri obtained a 76.3% F1-score. To our knowledge,
no prior drug extraction system has been evaluated on such an extremely
unbalanced dataset. Conclusion: The system identifies tweets mentioning drug names with
performance high enough to ensure its usefulness and ready to be integrated in
larger natural language processing systems.

Comment: This is a pre-copy-editing, author-produced PDF of an article
accepted for publication in JAMIA following peer review. The definitive
publisher-authenticated version is "D. Weissenbacher, A. Sarker, A. Klein, K.
O'Connor, A. Magge, G. Gonzalez-Hernandez, Deep neural networks ensemble for
detecting medication mentions in tweets, Journal of the American Medical
Informatics Association, ocz156, 2019".
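Kusuri's two-module design can be sketched as a simple pipeline; the four first-stage detectors below are toy keyword rules standing in for the paper's lexicon-based, spelling-variant-based, pattern-based, and weakly-trained-network classifiers:

```python
# Sketch of the two-stage Kusuri pipeline, assuming each stage can be
# modeled as a boolean function on a tweet. The detector rules here are
# hypothetical stand-ins, not the paper's actual models.

def lexicon_match(tweet):
    return any(d in tweet.lower() for d in ("aspirin", "ibuprofen"))

def spelling_variant_match(tweet):
    return "asprin" in tweet.lower()  # toy misspelling rule

def pattern_match(tweet):
    return "mg of" in tweet.lower()   # toy dosage pattern

def weak_nn_match(tweet):
    return False                      # placeholder for the weak network

FIRST_STAGE = (lexicon_match, spelling_variant_match,
               pattern_match, weak_nn_match)

def kusuri(tweet, ensemble):
    # Stage 1: run the four detectors in parallel; any hit flags the tweet.
    if not any(f(tweet) for f in FIRST_STAGE):
        return False
    # Stage 2: the ensemble of deep networks makes the final decision.
    return ensemble(tweet)

accept_all = lambda tweet: True
print(kusuri("took 200 mg of asprin today", accept_all))  # True
print(kusuri("beautiful sunset tonight", accept_all))     # False
```

The first stage is a high-recall filter, so the expensive ensemble only runs on tweets that might contain a medication name.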
Extraction and Analysis of Clinically Important Follow-up Recommendations in a Large Radiology Dataset
Communication of follow-up recommendations when abnormalities are identified
on imaging studies is prone to error. In this paper, we present a natural
language processing approach based on deep learning to automatically identify
clinically important recommendations in radiology reports. Our approach first
identifies the recommendation sentences and then extracts reason, test, and
time frame of the identified recommendations. To train our extraction models,
we created a corpus of 567 radiology reports annotated for recommendation
information. Our extraction models achieved 0.92 f-score for recommendation
sentence, 0.65 f-score for reason, 0.73 f-score for test, and 0.84 f-score for
time frame. We applied the extraction models to a set of over 3.3 million
radiology reports and analyzed the adherence of follow-up recommendations.

Comment: Under Review at American Medical Informatics Association Fall
Symposium'201
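The two-step extraction the paper describes (first find recommendation sentences, then pull out fields such as test and time frame) can be sketched with surface patterns; the cue words and regexes here are illustrative assumptions, where the paper uses deep models:

```python
import re

# Toy sketch of the two-step extraction: spot recommendation sentences by
# cue phrases, then extract test and time frame with surface patterns.
# The cue list and regexes are hypothetical, for illustration only.

CUE = re.compile(r"\b(recommend|follow-?up|suggest)\b", re.I)

def find_recommendations(report):
    """Step 1: keep only sentences containing a recommendation cue."""
    return [s.strip() for s in report.split(".") if CUE.search(s)]

def extract_fields(sentence):
    """Step 2: pull out the recommended test and its time frame."""
    test = re.search(r"\b(ct|mri|ultrasound|x-ray)\b", sentence, re.I)
    time = re.search(r"\bin (\d+ (?:weeks?|months?))\b", sentence, re.I)
    return {"test": test.group(0).upper() if test else None,
            "time_frame": time.group(1) if time else None}

report = "Lungs are clear. Recommend follow-up CT in 3 months to assess the nodule."
sents = find_recommendations(report)
print(extract_fields(sents[0]))  # {'test': 'CT', 'time_frame': '3 months'}
```

Splitting sentence identification from field extraction mirrors the paper's pipeline and keeps each step separately evaluable (the 0.92 vs. 0.65-0.84 f-scores above).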
Topic Spotting using Hierarchical Networks with Self Attention
The success of deep learning techniques has renewed interest in the development
of dialogue systems. However, current systems struggle to hold consistent
long-term conversations with users and fail to build rapport. Topic spotting,
the task of automatically inferring the topic of a conversation, has been shown
to be helpful in making a dialog system more engaging and efficient. We propose
a hierarchical model with self attention for topic spotting. Experiments on the
Switchboard corpus show the superior performance of our model over previously
proposed techniques for topic spotting and deep models for text classification.
Additionally, in contrast to offline processing of dialog, we also analyze the
performance of our model in a more realistic setting i.e. in an online setting
where the topic is identified in real time as the dialog progresses. Results
show that our model is able to generalize even with limited information in the
online setting.

Comment: 5+2 Pages, Accepted at NAACL 201
Multi-Label Classification of Patient Notes: a Case Study on ICD Code Assignment
In the context of the Electronic Health Record, automated diagnosis coding of
patient notes is a useful task, but a challenging one due to the large number
of codes and the length of patient notes. We investigate four models for
assigning multiple ICD codes to discharge summaries taken from both MIMIC II
and III. We present Hierarchical Attention-GRU (HA-GRU), a hierarchical
approach to tag a document by identifying the sentences relevant for each
label. HA-GRU achieves state-of-the-art results. Furthermore, the learned
sentence-level attention layer highlights the model's decision process, allows
easier error analysis, and suggests future directions for improvement.
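The sentence-level tagging idea can be sketched as below, assuming each ICD label has a query vector and each sentence a fixed encoding; HA-GRU's GRU encoders and learned attention are replaced here by plain dot products for brevity, and the label codes are illustrative:

```python
# Sketch of per-label sentence relevance: a label is assigned when some
# sentence matches its query vector strongly enough, and that sentence
# index doubles as the attention highlight for error analysis.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def tag_document(sentence_vecs, label_queries, threshold=0.5):
    """Return {label: supporting_sentence_index} for every label whose
    best-matching sentence scores above the threshold."""
    tags = {}
    for label, q in label_queries.items():
        scores = [dot(s, q) for s in sentence_vecs]
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] > threshold:
            tags[label] = best  # sentence that supports this label
    return tags

sents = [[1.0, 0.0], [0.0, 1.0]]
labels = {"ICD-428": [0.9, 0.0], "ICD-250": [0.0, 0.1]}
print(tag_document(sents, labels))  # {'ICD-428': 0}
```

Returning the supporting sentence index is what makes the model's decision inspectable, which is the interpretability benefit the abstract highlights.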
Dialogue Act Sequence Labeling using Hierarchical encoder with CRF
Dialogue Act recognition associates dialogue acts (i.e., semantic labels) with
utterances in a conversation. The problem of associating semantic labels to
utterances can be treated as a sequence labeling problem. In this work, we
build a hierarchical recurrent neural network using bidirectional LSTM as a
base unit and the conditional random field (CRF) as the top layer to classify
each utterance into its corresponding dialogue act. The hierarchical network
learns representations at multiple levels, i.e., word level, utterance level,
and conversation level. The conversation level representations are input to the
CRF layer, which takes into account not only all previous utterances but also
their dialogue acts, thus modeling the dependency among both, labels and
utterances, an important consideration of natural dialogue. We validate our
approach on two different benchmark data sets, Switchboard and Meeting Recorder
Dialogue Act, and show absolute performance improvements over the
state-of-the-art methods on both. It is worth noting that our method
achieves accuracy close to the inter-annotator agreement on the Switchboard
data set despite being trained on the noisy data.
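The CRF layer's decoding step can be sketched with standard Viterbi over per-utterance emission scores and label-transition scores, which is what lets the model pick the best dialogue-act sequence jointly instead of labeling each utterance independently; the scores below are illustrative, not learned:

```python
# Viterbi decoding for a linear-chain CRF over utterances.
# emissions: list of {label: score}, one dict per utterance.
# transitions: {(prev_label, label): score}.

def viterbi(emissions, transitions, labels):
    # best[y] = score of the best label sequence ending in y so far
    best = {y: emissions[0][y] for y in labels}
    back = []  # backpointers, one dict per step after the first
    for em in emissions[1:]:
        step, ptr = {}, {}
        for y in labels:
            p = max(labels, key=lambda x: best[x] + transitions[(x, y)])
            step[y] = best[p] + transitions[(p, y)] + em[y]
            ptr[y] = p
        back.append(ptr)
        best = step
    # Trace back from the best final label.
    y = max(labels, key=lambda x: best[x])
    path = [y]
    for ptr in reversed(back):
        y = ptr[y]
        path.append(y)
    return path[::-1]

# A transition bonus for Q -> A pulls the second label to "A" even though
# its emission scores are nearly tied.
print(viterbi(
    [{"Q": 1.0, "A": 0.0}, {"Q": 0.2, "A": 0.3}],
    {("Q", "Q"): 0.0, ("Q", "A"): 1.0, ("A", "Q"): 0.0, ("A", "A"): 0.0},
    ("Q", "A"),
))  # ['Q', 'A']
```

The transition table is where the dependency among labels enters; without it, decoding would reduce to a per-utterance argmax.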