113 research outputs found
Adapting Visual Question Answering Models for Enhancing Multimodal Community Q&A Platforms
Question categorization and expert retrieval methods have been crucial for
information organization and accessibility in community question & answering
(CQA) platforms. Research in this area, however, has dealt with only the text
modality. With the increasing multimodal nature of web content, we focus on
extending these methods for CQA questions accompanied by images. Specifically,
we leverage the success of representation learning for text and images in the
visual question answering (VQA) domain, and adapt the underlying concept and
architecture for automated category classification and expert retrieval on
image-based questions posted on Yahoo! Chiebukuro, the Japanese counterpart of
Yahoo! Answers.
To the best of our knowledge, this is the first work to tackle the
multimodality challenge in CQA, and to adapt VQA models for tasks on a more
ecologically valid source of visual questions. Our analysis of the differences
between visual QA and community QA data drives our proposal of novel
augmentations of an attention method tailored for CQA, and use of auxiliary
tasks for learning better grounding features. Our final model markedly
outperforms the text-only and VQA model baselines for both tasks of
classification and expert retrieval on real-world multimodal CQA data.Comment: Submitted for review at CIKM 201
The Best Answers? Think Twice: Online Detection of Commercial Campaigns in the CQA Forums
In an emerging trend, more and more Internet users search for information
from Community Question and Answer (CQA) websites, as interactive communication
in such websites provides users with a rare feeling of trust. More often than
not, end users look for instant help when they browse the CQA websites for the
best answers. Hence, it is imperative that they should be warned of any
potential commercial campaigns hidden behind the answers. However, existing
research focuses more on the quality of answers and does not meet the above
need. In this paper, we develop a system that automatically analyzes the hidden
patterns of commercial spam and raises alarms instantaneously to end users
whenever a potential commercial campaign is detected. Our detection method
integrates semantic analysis and posters' track records and utilizes the
special features of CQA websites largely different from those in other types of
forums such as microblogs or news reports. Our system is adaptive and
accommodates new evidence uncovered by the detection algorithms over time.
Validated with real-world trace data from a popular Chinese CQA website over a
period of three months, our system shows great potential towards adaptive
online detection of CQA spams.Comment: 9 pages, 10 figure
Hierarchical Expert Recommendation on Community Question Answering Platforms
The community question answering (CQA) platforms, such as Stack Overflow, have become the primary source of answers to most questions in various topics. CQA platforms offer an opportunity for sharing and acquiring knowledge at a low cost, where users, many of whom are experts in a specific topic, can potentially provide high-quality solutions to a given question. Many recommendation methods have been proposed to match questions to potential good answerers. However, most existing methods have focused on modelling the user-question interaction — a user might answer multiple questions and a question might be answered by multiple users — using simple collaborative filtering approaches, overlooking the rich information in the question’s title and body when modelling the users’ expertise.
This project fills the research gap by thoroughly examining machine learning and deep learning approaches that can be applied to the expert recommendation problem. It proposes a Hierarchical Expert Recommendation (HER) model, a deep learning recommender system that recommends experts to answer a given question in the CQA platform. Although choosing a deep learning over a machine learning solution for this problem can be justified considering the degree of complexity of the available datasets, we assess performance of each family of methods and evaluate the trade-off between them to pick the perfect fit for our problem.
We analyzed various machine learning algorithms to determine their performances in the expert recommendation problem, which narrows down the potential ways for tackling this problem using traditional recommendation methods. Furthermore, we investigate the recommendation models based on matrix factorization to establish the baselines for our proposed model and shed light on the weaknesses and strengths of matrix- based solutions, which shape our final deep learning model. In the last section, we introduce the Hierarchical Expert Recommendation System (HER) that utilizes hierarchical attention-based neural networks to rep- resent the questions better and ultimately model the users’ expertise through user-question interactions. We conducted extensive experiments on a large real-world Stack Overflow dataset and benchmarked HER against the state-of-the-art baselines. The results from our extensive experiments show that HER outperforms the state-of-the-art baselines in recommending experts to answer questions in Stack Overflow
Modeling Tag Prediction based on Question Tagging Behavior Analysis of CommunityQA Platform Users
In community question-answering platforms, tags play essential roles in
effective information organization and retrieval, better question routing,
faster response to questions, and assessment of topic popularity. Hence,
automatic assistance for predicting and suggesting tags for posts is of high
utility to users of such platforms. To develop better tag prediction across
diverse communities and domains, we performed a thorough analysis of users'
tagging behavior in 17 StackExchange communities. We found various common
inherent properties of this behavior in those diverse domains. We used the
findings to develop a flexible neural tag prediction architecture, which
predicts both popular tags and more granular tags for each question. Our
extensive experiments and obtained performance show the effectiveness of our
modelComment: 20 page
- …