Generating Synthetic Data for Neural Keyword-to-Question Models
Search typically relies on keyword queries, but these are often semantically
ambiguous. We propose to overcome this by offering users natural language
questions, based on their keyword queries, to disambiguate their intent. This
keyword-to-question task may be addressed using neural machine translation
techniques. Neural translation models, however, require massive amounts of
training data (keyword-question pairs), which is unavailable for this task. The
main idea of this paper is to generate large amounts of synthetic training data
from a small seed set of hand-labeled keyword-question pairs. Since natural
language questions are available in large quantities, we develop models to
automatically generate the corresponding keyword queries. Further, we introduce
various filtering mechanisms to ensure that synthetic training data is of high
quality. We demonstrate the feasibility of our approach using both automatic
and manual evaluation. This is an extended version of the article published
with the same title in the Proceedings of ICTIR'18.
Comment: Extended version of ICTIR'18 full paper, 11 pages
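The question-to-keyword direction described above can be illustrated with a small sketch. This is not the paper's method: the abstract describes learned models, whereas the code below swaps in a plain stopword-removal heuristic. The stopword list, the function names, and the filter thresholds are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical stopword list (an assumption for this sketch, not from the paper).
STOPWORDS = {
    "what", "is", "the", "a", "an", "of", "how", "do", "i", "to", "in",
    "for", "are", "does", "can", "which", "who", "where", "when", "why",
}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep only alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def question_to_keywords(question: str) -> str:
    """Derive a keyword query by dropping stopwords (heuristic stand-in)."""
    return " ".join(t for t in tokenize(question) if t not in STOPWORDS)


def keep_pair(keywords: str, question: str,
              min_terms: int = 2, max_ratio: float = 0.8) -> bool:
    """Toy quality filter: reject queries that are too short or that
    are nearly as long as the question they came from."""
    n_kw = len(keywords.split())
    n_q = len(tokenize(question))
    return n_q > 0 and n_kw >= min_terms and n_kw / n_q <= max_ratio


def build_synthetic_pairs(questions: list[str]) -> list[tuple[str, str]]:
    """Turn natural language questions into filtered (keywords, question) pairs."""
    pairs = []
    for q in questions:
        kw = question_to_keywords(q)
        if keep_pair(kw, q):
            pairs.append((kw, q))
    return pairs
```

The resulting pairs have the keyword query on the source side and the question on the target side, which is the orientation a keyword-to-question translation model would train on.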
Generating Video Descriptions with Topic Guidance
Generating video descriptions in natural language (a.k.a. video captioning)
is a more challenging task than image captioning as the videos are
intrinsically more complicated than images in two aspects. First, videos cover
a broader range of topics, such as news, music, sports and so on. Second,
multiple topics can coexist in the same video. In this paper, we propose a
novel captioning model, the topic-guided model (TGM), which generates
topic-oriented descriptions for videos in the wild by exploiting topic
information. In
addition to predefined topics, i.e., category tags crawled from the web, we
also mine topics in a data-driven way based on training captions by an
unsupervised topic mining model. We show that data-driven topics reflect a
better topic schema than the predefined topics. To predict topics for test
videos, we treat the topic mining model as a teacher that trains a student,
the topic prediction model, using all modalities in the video, especially
speech. We propose a series of caption models to
exploit topic guidance, including implicitly using the topics as input features
to generate words related to the topic and explicitly modifying the weights in
the decoder with topics to function as an ensemble of topic-aware language
decoders. Comprehensive experiments on MSR-VTT, currently the largest video
caption dataset, demonstrate the effectiveness of our topic-guided model,
which significantly surpasses the winning performance in the 2016 MSR Video to
Language Challenge.
Comment: Appeared at ICMR 201
Using conditional random fields to extract contexts and answers of questions from online forums
Online forum discussions often contain a large number of questions that are the focus of discussion. Extracting contexts and answers together with the questions will yield not only a coherent forum summary but also a valuable QA knowledge base. In this paper, we propose a general framework based on Conditional Random Fields (CRFs) to detect the contexts and answers of questions from forum threads. We extend the basic framework with skip-chain CRFs and 2D CRFs to better accommodate the features of forums and improve performance. Experimental results show that our techniques are very promising.
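As a rough illustration of how a linear-chain CRF labels the sentences of a thread, the sketch below implements Viterbi decoding over per-sentence label scores and label-transition scores. It covers only the basic linear-chain case, not the skip-chain or 2D variants, and the label names, score dictionaries, and function name are assumptions made for this example. In a trained CRF these scores would come from learned feature weights.

```python
def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label sequence.

    emissions:   list of dicts, one per sentence, mapping label -> score
    transitions: dict mapping (previous_label, current_label) -> score
    labels:      list of all label names
    Missing entries are treated as a score of 0.0.
    """
    if not emissions:
        return []

    # best[t][label] = best score of any path ending in `label` at position t
    best = [{lab: emissions[0].get(lab, 0.0) for lab in labels}]
    back = []  # back[t-1][label] = best previous label for `label` at position t

    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in labels:
            prev = max(
                labels,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[t].get(cur, 0.0)
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda lab: best[-1][lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]
```

A transition score that rewards an answer following a context, for example, lets the decoder prefer a coherent context-then-answer sequence over labeling each sentence in isolation.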
Improving Entity Linking by Modeling Latent Entity Type Information
Existing state-of-the-art neural entity linking models employ an attention-based
bag-of-words context model and pre-trained entity embeddings bootstrapped from
word embeddings to assess topic-level context compatibility. However, the
latent entity type information in the immediate context of the mention is
neglected, which often causes the models to link mentions to incorrect entities
of the wrong type. To tackle this problem, we propose to inject latent entity
type information into the entity embeddings based on pre-trained BERT. In
addition, we integrate a BERT-based entity similarity score into the local
context model of a state-of-the-art model to better capture latent entity type
information. Our model significantly outperforms the state-of-the-art entity
linking models on the standard benchmark (AIDA-CoNLL). Detailed experimental analysis
demonstrates that our model corrects most of the type errors produced by the
direct baseline.
Comment: Accepted by AAAI 202
Multi-Level Knowledge Distillation for Out-of-Distribution Detection in Text
Self-supervised representation learning has proved to be a valuable component
for out-of-distribution (OoD) detection with only the texts of in-distribution
(ID) examples. These approaches either train a language model from scratch or
fine-tune a pre-trained language model using ID examples, and then use the
perplexity output by the language model as the OoD score. In this paper, we
analyse the complementary characteristics of both OoD detection methods and
propose a multi-level knowledge distillation approach to integrate their
strengths, while mitigating their limitations. Specifically, we use a
fine-tuned model as the teacher to teach a randomly initialized student model
on the ID examples. Besides the prediction layer distillation, we present a
similarity-based intermediate layer distillation method to facilitate the
student's awareness of the information flow inside the teacher's layers. In
this way, the derived student model gains the teacher's rich knowledge about
the ID data manifold due to pre-training, while benefiting from seeing only ID
examples during parameter learning, which promotes more distinguishable
features for OoD detection. We conduct extensive experiments over multiple
benchmark datasets, i.e., CLINC150, SST, 20 NewsGroups, and AG News, showing
that the proposed method yields new state-of-the-art performance.
Comment: 11 pages
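Two ingredients mentioned above, perplexity as an OoD score and prediction-layer distillation, can be sketched in a few lines. This is a toy illustration under stated assumptions: a smoothed unigram model stands in for the neural language models, and the distillation loss is written as a KL divergence between teacher and student output distributions, which is a common choice but is not confirmed by the abstract. The class and function names are hypothetical.

```python
import math
from collections import Counter


class UnigramLM:
    """Add-one-smoothed unigram model trained on in-distribution (ID) text only."""

    def __init__(self, id_texts):
        self.counts = Counter(
            tok for text in id_texts for tok in text.lower().split()
        )
        self.total = sum(self.counts.values())
        self.vocab_size = len(self.counts) + 1  # one extra slot for unseen tokens

    def log_prob(self, token):
        return math.log(
            (self.counts.get(token, 0) + 1) / (self.total + self.vocab_size)
        )

    def perplexity(self, text):
        """Higher perplexity means the text looks less like the ID data."""
        tokens = text.lower().split()
        if not tokens:
            raise ValueError("cannot score an empty text")
        avg_nll = -sum(self.log_prob(t) for t in tokens) / len(tokens)
        return math.exp(avg_nll)


def distillation_loss(teacher_probs, student_probs, eps=1e-12):
    """KL(teacher || student) over one predicted distribution (assumed loss form)."""
    return sum(
        t * math.log((t + eps) / (s + eps))
        for t, s in zip(teacher_probs, student_probs)
        if t > 0
    )
```

With a model fitted on a handful of travel-booking sentences, an unrelated sentence receives a higher perplexity than an in-domain one, and a threshold on that score separates ID from OoD inputs.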
Enhanced Meta-Learning for Cross-lingual Named Entity Recognition with Minimal Resources
For languages with no annotated resources, transferring knowledge from
rich-resource languages is an effective solution for named entity recognition
(NER). While all existing methods transfer the source-learned model directly
to the target language, in this paper we propose to fine-tune the learned model
with a few similar examples given a test case, which could benefit the
prediction by leveraging the structural and semantic information conveyed in
such similar examples. To this end, we present a meta-learning algorithm to
find a good model parameter initialization that can adapt quickly to the given
test case, and we propose to construct multiple pseudo-NER tasks for meta-training
by computing sentence similarities. To further improve the model's
generalization ability across different languages, we introduce a masking
scheme and augment the loss function with an additional maximum term during
meta-training. We conduct extensive experiments on cross-lingual named entity
recognition with minimal resources over five target languages. The results show
that our approach significantly outperforms existing state-of-the-art methods
across the board.
Comment: This paper is accepted by AAAI2020. Code is available at
https://github.com/microsoft/vert-papers/tree/master/papers/Meta-Cros
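The retrieval step, finding a few similar examples for a given test case, can be sketched as a nearest-neighbour search over sentence vectors. The abstract does not say how sentence similarity is computed, so the sketch below assumes cosine similarity over bag-of-words counts as a stand-in; a real system would more plausibly compare learned sentence representations. The function names are hypothetical.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query: str, pool: list[str], k: int = 2) -> list[str]:
    """Return the k sentences from `pool` most similar to `query`.

    The returned sentences would form the small support set used to
    fine-tune the model for this one test case.
    """
    q_vec = Counter(query.lower().split())
    scored = [(cosine(q_vec, Counter(s.lower().split())), s) for s in pool]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:k]]
```

Applying the same retrieval to each training sentence in turn is one way to build the pseudo-NER tasks: each sentence plays the role of a test case and its neighbours play the role of the adaptation set.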
A computational approach to measuring the correlation between expertise and social media influence for celebrities on microblogs
Social media influence analysis, sometimes also called authority detection, aims to rank users based on their influence scores in social media. Existing approaches to social influence analysis usually focus on developing effective algorithms to quantify users' influence scores. They rarely consider a person's expertise level, which is arguably important to influence measures. In this paper, we propose a computational approach to measuring the correlation between expertise and social media influence, and we take a new perspective on social media influence by incorporating expertise into influence analysis. We carefully constructed a large dataset of 13,684 Chinese celebrities from Sina Weibo (literally "Sina microblogging"). We found a strong correlation between expertise levels and social media influence scores. Our analysis offers an explanation of the phenomenon of "top across-domain influencers". In addition, different expertise levels showed distinct patterns of influence variation: (1) high-expertise celebrities have stronger influence on the "audience" in their expertise domains; (2) expertise seems to be more important than relevance and participation for social media influence; (3) the audiences of top-expertise celebrities are more likely to forward tweets from high-expertise celebrities on topics outside their expertise domains.
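One way to measure a correlation between expertise levels and influence scores is a rank correlation, which depends only on the ordering of users rather than the scale of the scores. The abstract does not state which correlation measure was used, so the Spearman coefficient below is an assumption, and the function names are hypothetical.

```python
import math


def ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (std_x * std_y)


def spearman(expertise, influence):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(expertise), ranks(influence))
```

A coefficient near 1 would indicate that users with higher expertise levels also tend to have higher influence scores, while a value near 0 would indicate no monotonic relationship.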