2 research outputs found
SemEval-2017 Task 3: Community Question Answering
We describe SemEval-2017 Task 3 on Community Question Answering. This year,
we reran the four subtasks from SemEval-2016: (A) Question-Comment
Similarity, (B) Question-Question Similarity, (C) Question-External Comment
Similarity, and (D) Rerank the correct answers for a new question in Arabic,
providing all the data from 2015 and 2016 for training, and fresh data for
testing. Additionally, we added a new subtask E in order to enable
experimentation with Multi-domain Question Duplicate Detection in a
larger-scale scenario, using StackExchange subforums. A total of 23 teams
participated in the task, and submitted a total of 85 runs (36 primary and 49
contrastive) for subtasks A-D. Unfortunately, no teams participated in subtask
E. A variety of approaches and features were used by the participating systems
to address the different subtasks. The best systems achieved an official score
(MAP) of 88.43, 47.22, 15.46, and 61.16 in subtasks A, B, C, and D,
respectively. These scores are better than the baselines, especially for
subtasks A-C.
Comment: community question answering, question-question similarity,
question-comment similarity, answer reranking, Multi-domain Question
Duplicate Detection, StackExchange, English, Arabic
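The official score quoted in this abstract, mean average precision (MAP), can be illustrated with a minimal sketch. It uses the textbook definition over ranked lists of relevance labels. The function names are hypothetical, and the handling of queries with no relevant item (scored as 0 here) is an assumption that may differ from the official task scorer.

```python
from typing import List


def average_precision(relevance: List[bool]) -> float:
    """Average precision for one ranked list (True marks a relevant item)."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    # Assumption: a list with no relevant item contributes 0.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: List[List[bool]]) -> float:
    """Mean of the per-query average precision values."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Two toy queries; each list is a system ranking, best candidate first.
toy_rankings = [[True, False, True], [False, True]]
print(mean_average_precision(toy_rankings))
```

The abstract reports MAP on a 0-100 scale, so a value from this sketch would be multiplied by 100 to compare.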
Adversarial Training for Community Question Answer Selection Based on Multi-scale Matching
Community-based question answering (CQA) websites represent an important
source of information. As a result, the problem of matching the most valuable
answers to their corresponding questions has become an increasingly popular
research topic. We frame this task as a binary (relevant/irrelevant)
classification problem, and present an adversarial training framework to
alleviate the label imbalance issue. We employ a generative model to iteratively
sample a subset of challenging negative samples to fool our classification
model. Both models are optimized alternately using the REINFORCE algorithm. The
proposed method is completely different from previous ones, in which negative
samples in the training set are used directly or uniformly down-sampled. Further,
we propose using Multi-scale Matching, which explicitly inspects the correlation
between words and ngrams of different levels of granularity. We evaluate the
proposed method on the SemEval 2016 and SemEval 2017 datasets and achieve
state-of-the-art or similar performance.
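The idea of a sampler that learns to pick challenging negatives through REINFORCE can be illustrated with a toy sketch. This is not the paper's implementation: the sampler here is a plain softmax over three candidates rather than a neural generative model, the rewards are fixed made-up numbers standing in for how strongly each negative fools the classifier, and the classifier itself is not trained. All names (`reinforce_step`, `toy_rewards`, `baseline`) are hypothetical.

```python
import math
import random
from typing import List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(
    logits: List[float],
    rewards: List[float],
    rng: random.Random,
    lr: float = 0.5,
    baseline: float = 0.0,
) -> Tuple[List[float], int]:
    """One REINFORCE update of a categorical sampler over negative candidates."""
    probs = softmax(logits)
    # Sample one negative candidate from the current sampler distribution.
    idx = rng.choices(range(len(logits)), weights=probs, k=1)[0]
    # Assumption: the reward is how strongly this negative fools the classifier.
    advantage = rewards[idx] - baseline
    # The gradient of log p(idx) w.r.t. logit j is (1[j == idx] - probs[j]).
    updated = [
        logit + lr * advantage * ((1.0 if j == idx else 0.0) - probs[j])
        for j, logit in enumerate(logits)
    ]
    return updated, idx


rng = random.Random(0)
toy_rewards = [0.1, 0.9, 0.2]  # made-up values; candidate 1 is the "hardest" negative
sampler_logits = [0.0, 0.0, 0.0]
for _ in range(500):
    sampler_logits, _sampled = reinforce_step(
        sampler_logits, toy_rewards, rng, baseline=0.4
    )
print(softmax(sampler_logits))
```

In a full alternating scheme, the classifier would be retrained on the sampled negatives between sampler updates, so the rewards would change over time instead of staying fixed as they do here.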