A Graph-Based Context-Aware Model to Understand Online Conversations
Online forums that allow for participatory engagement between users have been
transformative for the public discussion of many important issues. However,
such conversations can sometimes escalate into full-blown exchanges of hate and
misinformation. Existing approaches in natural language processing (NLP), such
as deep learning models for classification tasks, use as inputs only a single
comment or a pair of comments depending upon whether the task concerns the
inference of properties of the individual comments or the replies between pairs
of comments, respectively. But in online conversations, comments and replies
may be based on external context beyond the immediately relevant information
that is input to the model. Therefore, being aware of the conversations'
surrounding contexts should improve the model's performance for the inference
task at hand.
We propose GraphNLI, a novel graph-based deep learning architecture that uses
graph walks to incorporate the wider context of a conversation in a principled
manner. Specifically, a graph walk starts from a given comment and samples
"nearby" comments in the same or parallel conversation threads, which results
in additional embeddings that are aggregated together with the initial
comment's embedding. We then use these enriched embeddings for downstream NLP
prediction tasks that are important for online conversations. We evaluate
GraphNLI on two such tasks, polarity prediction and misogynistic hate speech
detection, and find that our model consistently outperforms all relevant
baselines for both tasks. Specifically, GraphNLI with a biased root-seeking
random walk achieves macro-F1 scores 3 and 6 percentage points higher
than the best-performing BERT-based baselines for the polarity prediction and
hate speech detection tasks, respectively.
Comment: 25 pages, 9 figures. arXiv admin note: text overlap with
arXiv:2202.0817
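The graph-walk idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `p_up` bias parameter, the `decay` weighting, and the weighted-average aggregation are all assumptions made for illustration, since the abstract does not specify how the walk is biased or how embeddings are aggregated. The sketch walks a reply tree from a starting comment, preferring the parent (toward the root) with probability `p_up`, and then averages the embeddings of the visited comments with weights that shrink at each step.

```python
import random

def root_seeking_walk(parent, children, start, walk_len, p_up=0.8, rng=None):
    """Sample a walk over a reply tree that prefers moving toward the root.

    parent:   dict mapping comment id -> parent id (None for the root)
    children: dict mapping comment id -> list of reply ids
    p_up:     hypothetical bias; probability of stepping to the parent
    """
    rng = rng or random.Random()
    node = start
    visited = [start]
    for _ in range(walk_len):
        up = parent.get(node)
        down = children.get(node, [])
        if up is not None and (not down or rng.random() < p_up):
            node = up                  # step toward the root
        elif down:
            node = rng.choice(down)    # step to a random reply
        else:
            break                      # isolated comment: nowhere to go
        visited.append(node)
    return visited

def aggregate(embeddings, walk, decay=0.5):
    """Weighted average of embeddings along the walk (assumed aggregation).

    The starting comment gets weight 1; each later step is scaled by `decay`.
    """
    dim = len(embeddings[walk[0]])
    total = [0.0] * dim
    weight_sum = 0.0
    for step, node in enumerate(walk):
        w = decay ** step
        weight_sum += w
        for i, x in enumerate(embeddings[node]):
            total[i] += w * x
    return [t / weight_sum for t in total]
```

In a real system the embeddings would come from a sentence encoder and the aggregated vector would feed a classifier; here plain Python lists stand in for both so the sketch stays self-contained.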
Deep Learning for User Comment Moderation
Experimenting with a new dataset of 1.6M user comments from a Greek news
portal and existing datasets of English Wikipedia comments, we show that an RNN
outperforms the previous state of the art in moderation. A deep,
classification-specific attention mechanism further improves the overall
performance of the RNN. We also compare against a CNN and a word-list baseline,
considering both fully automatic and semi-automatic moderation
- …