CasGCN: Predicting future cascade growth based on information diffusion graph
Sudden bursts of information cascades can lead to unexpected consequences
such as extreme opinions, shifts in fashion trends, and the uncontrollable
spread of rumors. Effectively predicting a cascade's future size has therefore
become an important problem, especially for large-scale cascades on social
media platforms such as Twitter and Weibo. However, existing methods fall
short on this challenging prediction problem. Conventional methods rely
heavily on either hand-crafted features or unrealistic assumptions, while
end-to-end deep learning models such as recurrent neural networks cannot work
with graph inputs directly and therefore cannot exploit the structural
information embedded in cascade graphs. In this
paper, we propose a novel deep learning architecture for cascade growth
prediction, called CasGCN, which employs a graph convolutional network to
extract structural features from the graph input and then applies an
attention mechanism to both the extracted structural features and the
temporal information before predicting the cascade size. We conduct
experiments on two real-world cascade growth prediction scenarios (i.e.,
retweet popularity on Sina Weibo and academic paper citations on DBLP); the
results show that CasGCN outperforms several baseline methods, particularly
when the cascades are of large scale.
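To make the described pipeline concrete, below is a minimal, hypothetical sketch of a CasGCN-style model in PyTorch; it is not the authors' implementation. It uses a single graph-convolution layer to extract structural node features, an attention step to fuse them with per-node temporal features, and an MLP head to regress the future cascade size. The layer sizes, the dense-adjacency input, and the specific attention form are illustrative assumptions.

```python
# Hedged sketch of a CasGCN-style architecture (GCN -> attention fusion -> size regression).
# All dimensions and design details below are assumptions, not the paper's exact model.
import torch
import torch.nn as nn


class SimpleGCNLayer(nn.Module):
    """One graph-convolution step: symmetric normalization of A, then a linear map."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, adj, x):
        # adj: (N, N) dense adjacency, x: (N, in_dim) node features
        adj_hat = adj + torch.eye(adj.size(0))        # add self-loops
        deg = adj_hat.sum(dim=1)
        d_inv_sqrt = torch.diag(deg.pow(-0.5))
        norm_adj = d_inv_sqrt @ adj_hat @ d_inv_sqrt  # D^-1/2 (A + I) D^-1/2
        return torch.relu(self.linear(norm_adj @ x))


class CasGCNSketch(nn.Module):
    def __init__(self, node_dim, time_dim, hidden=64):
        super().__init__()
        self.gcn = SimpleGCNLayer(node_dim, hidden)
        self.time_proj = nn.Linear(time_dim, hidden)
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.head = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, adj, node_feats, time_feats):
        struct = self.gcn(adj, node_feats)       # (N, hidden) structural features
        temporal = self.time_proj(time_feats)    # (N, hidden) temporal features
        # Attention fuses the structural and temporal views before prediction.
        fused, _ = self.attn(struct.unsqueeze(0), temporal.unsqueeze(0), temporal.unsqueeze(0))
        graph_repr = fused.squeeze(0).mean(dim=0)  # pool over nodes
        return self.head(graph_repr)               # predicted future cascade growth


# Toy usage: a 5-node cascade graph with random features.
adj = torch.bernoulli(torch.full((5, 5), 0.3))
pred = CasGCNSketch(node_dim=8, time_dim=4)(adj, torch.randn(5, 8), torch.randn(5, 4))
print(pred.shape)  # torch.Size([1])
```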
Meta-KD: A Meta Knowledge Distillation Framework for Language Model Compression across Domains
Pre-trained language models have been applied to various NLP tasks with
considerable performance gains. However, the large model sizes, together with
the long inference time, limit the deployment of such models in real-time
applications. One line of model compression approaches uses knowledge
distillation to distill large teacher models into small student models. Most
of these studies focus on a single domain only, ignoring transferable
knowledge from other domains. We observe that a teacher trained with knowledge
digested across domains generalizes better and thus provides stronger guidance
for knowledge distillation. Hence we propose a
Meta-Knowledge Distillation (Meta-KD) framework to build a meta-teacher model
that captures transferable knowledge across domains and passes such knowledge
to students. Specifically, we explicitly force the meta-teacher to capture
transferable knowledge at both the instance level and the feature level from
multiple domains, and then propose a meta-distillation algorithm to learn
single-domain student models under the meta-teacher's guidance. Experiments
on public multi-domain NLP tasks show the effectiveness and superiority of
the proposed Meta-KD framework. Further, we demonstrate the capability of
Meta-KD in settings where training data is scarce.
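As an illustration of how such a meta-distillation objective might look, the following is a hedged sketch rather than the paper's actual algorithm: the student is trained with its own task loss, an instance-level term that matches the meta-teacher's softened logits, and a feature-level term that aligns intermediate representations. The function name, the weighting coefficients, and the specific loss forms are assumptions introduced for illustration.

```python
# Hedged sketch of a meta-distillation-style objective combining task loss,
# instance-level KD against the meta-teacher, and feature-level alignment.
# Names and hyperparameters (kd_temperature, alpha, beta) are illustrative.
import torch
import torch.nn.functional as F


def meta_distillation_loss(student_logits, student_feats,
                           teacher_logits, teacher_feats,
                           labels, kd_temperature=2.0, alpha=0.5, beta=0.1):
    # Task loss on the student's own domain.
    task_loss = F.cross_entropy(student_logits, labels)

    # Instance-level knowledge: match the meta-teacher's softened output distribution.
    t = kd_temperature
    kd_loss = F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)

    # Feature-level knowledge: align intermediate representations.
    feat_loss = F.mse_loss(student_feats, teacher_feats)

    return task_loss + alpha * kd_loss + beta * feat_loss


# Toy usage with random tensors standing in for model outputs.
loss = meta_distillation_loss(
    student_logits=torch.randn(8, 3), student_feats=torch.randn(8, 128),
    teacher_logits=torch.randn(8, 3), teacher_feats=torch.randn(8, 128),
    labels=torch.randint(0, 3, (8,)),
)
print(loss.item())
```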