3,489 research outputs found
Text Classification: A Review, Empirical, and Experimental Evaluation
The explosive and widespread growth of data necessitates the use of text
classification to extract crucial information from vast amounts of data.
Consequently, there has been a surge of research in both classical and deep
learning text classification methods. Despite the numerous methods proposed in
the literature, there is still a pressing need for a comprehensive and
up-to-date survey. Existing survey papers categorize algorithms for text
classification into broad classes, which can lead to the misclassification of
unrelated algorithms and incorrect assessments of their qualities and behaviors
using the same metrics. To address these limitations, our paper introduces a
novel methodological taxonomy that classifies algorithms hierarchically into
fine-grained classes and specific techniques. The taxonomy includes methodology
categories, methodology techniques, and methodology sub-techniques. Our study
is the first survey to utilize this methodological taxonomy for classifying
algorithms for text classification. Furthermore, our study also conducts
empirical evaluation and experimental comparisons and rankings of different
algorithms that employ the same specific sub-technique, different
sub-techniques within the same technique, different techniques within the same
category, and categorie
Recognizing Conditional Causal Relationships about Emotions and Their Corresponding Conditions
The study of causal relationships between emotions and causes in texts has
recently received much attention. Most works focus on extracting causally
related clauses from documents. However, none of these works has considered
that the causal relationships among the extracted emotion and cause clauses can
only be valid under some specific context clauses. To highlight the context in
such special causal relationships, we propose a new task to determine whether
or not an input pair of emotion and cause has a valid causal relationship under
different contexts and extract the specific context clauses that participate in
the causal relationship. Since the task is new for which no existing dataset is
available, we conduct manual annotation on a benchmark dataset to obtain the
labels for our tasks and the annotations of each context clause's type that can
also be used in some other applications. We adopt negative sampling to
construct the final dataset to balance the number of documents with and without
causal relationships. Based on the constructed dataset, we propose an
end-to-end multi-task framework, where we design two novel and general modules
to handle the two goals of our task. Specifically, we propose a context masking
module to extract the context clauses participating in the causal
relationships. We propose a prediction aggregation module to fine-tune the
prediction results according to whether the input emotion and causes depend on
specific context clauses. Results of extensive comparative experiments and
ablation studies demonstrate the effectiveness and generality of our proposed
framework
- …