65 research outputs found
Low-Resource Response Generation with Template Prior
We study open domain response generation with limited message-response pairs.
The problem exists in real-world applications but is less explored by
existing work. Since paired data alone are no longer sufficient to train a neural
generation model, we consider leveraging large-scale unpaired data, which
are much easier to obtain, and propose response generation with both paired and
unpaired data. The generation model is defined by an encoder-decoder
architecture with templates as prior, where the templates are estimated from
the unpaired data as a neural hidden semi-Markov model. In this way, response
generation learned from the small paired data can be aided by the semantic and
syntactic knowledge in the large unpaired data. To balance the effects of the
prior and the input message on response generation, we propose learning the
whole generation model with an adversarial approach. Empirical studies on
question response generation and sentiment response generation indicate that
when only a few pairs are available, our model can significantly outperform
several state-of-the-art response generation models in terms of both automatic
and human evaluation.
Comment: Accepted by EMNLP 201
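The template prior above is estimated from unpaired responses as a hidden semi-Markov model (HSMM), which segments a response into variable-length spans, each emitted by a latent template state. As a minimal sketch of the underlying machinery (a tabular HSMM forward pass with illustrative, made-up parameters, not the paper's neural parameterization):

```python
# Toy hidden semi-Markov model (HSMM) forward pass: computes the total
# probability of a token sequence under segment-level latent states.
# The paper parameterizes these distributions with neural networks
# ("neural HSMM"); here they are plain probability tables.

def hsmm_forward(obs, pi, A, dur, emit):
    """obs: token ids; pi[s]: initial state probs; A[sp][s]: transitions;
    dur[s][d-1]: prob. state s emits a segment of length d; emit[s][w]: token probs."""
    T, S = len(obs), len(pi)
    max_d = len(dur[0])
    # alpha[t][s] = prob. of obs[:t] with a segment in state s ending at t
    alpha = [[0.0] * S for _ in range(T + 1)]
    for t in range(1, T + 1):
        for s in range(S):
            for d in range(1, min(max_d, t) + 1):
                seg = dur[s][d - 1]
                for u in range(t - d, t):           # segment emission prob.
                    seg *= emit[s][obs[u]]
                if t - d == 0:                      # first segment of the sequence
                    alpha[t][s] += pi[s] * seg
                else:                               # transition from any prior state
                    alpha[t][s] += seg * sum(alpha[t - d][sp] * A[sp][s]
                                             for sp in range(S))
    return sum(alpha[T])                            # marginal likelihood of obs

# Illustrative parameters: 2 template states, 2-token vocabulary, segments of length 1 or 2
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
dur = [[0.5, 0.5], [0.2, 0.8]]
emit = [[0.9, 0.1], [0.3, 0.7]]
likelihood = hsmm_forward([0, 1, 1], pi, A, dur, emit)
```

In the paper's setting, the latent states play the role of template slots learned from the unpaired data; the encoder-decoder then generates responses conditioned on such templates.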
TI-CNN: Convolutional Neural Networks for Fake News Detection
With the development of social networks, fake news for various commercial and
political purposes has been appearing in large numbers and has become
widespread online. With deceptive wording, people can be misled by fake
news very easily and will share it without any fact-checking. For instance,
during the 2016 US presidential election, various kinds of fake news about the
candidates spread widely through both official news media and online social
networks. Such fake news is usually released either to smear the opponents or to
support the candidate on its own side. The erroneous information in fake news
is usually written to stir up voters' irrational emotions and enthusiasm.
Such fake news can sometimes have devastating effects, and an
important goal in improving the credibility of online social networks is to
identify fake news in a timely manner. In this paper, we propose to study the fake news
detection problem. Automatic fake news identification is extremely hard, since
pure model based fact-checking for news is still an open problem, and few
existing models can be applied to solve the problem. With a thorough
investigation of fake news data, many useful explicit features are
identified from both the text and the images used in fake news. Besides
the explicit features, there also exist some hidden patterns in the words and
images used in fake news, which can be captured with a set of latent features
extracted via multiple convolutional layers in our model. A model named
TI-CNN (Text and Image information based Convolutional Neural Network) is
proposed in this paper. By projecting the explicit and latent features into a
unified feature space, TI-CNN is trained with both the text and image
information simultaneously. Extensive experiments carried out on real-world
fake news datasets demonstrate the effectiveness of TI-CNN.
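The core idea, extracting latent features from each modality and projecting them into one unified feature space before classification, can be sketched as follows. This is a schematic with made-up dimensions and random weights, not the paper's actual architecture or hyperparameters; the image branch here is a stand-in linear projection rather than a full CNN:

```python
import numpy as np

rng = np.random.default_rng(0)

def text_branch(token_embs, filters):
    """1D convolution over token embeddings + max-over-time pooling:
    the standard text-CNN latent feature extractor."""
    T, d = token_embs.shape
    feats = []
    for f in filters:                       # each filter: (k, d)
        k = f.shape[0]
        acts = [np.tanh(np.sum(token_embs[i:i + k] * f))
                for i in range(T - k + 1)]
        feats.append(max(acts))             # max pooling over positions
    return np.array(feats)

def image_branch(img, W):
    """Stand-in for the image CNN: a single linear projection of pixels."""
    return np.tanh(W @ img.ravel())

# Toy inputs: 6 tokens with 8-dim embeddings, an 8x8 grayscale "image"
tokens = rng.normal(size=(6, 8))
img = rng.normal(size=(8, 8))

filters = [rng.normal(size=(k, 8)) for k in (2, 3, 4)]   # 3 text features
W_img = rng.normal(size=(3, 64))                          # 3 image features

# Project both modalities into one unified feature space, then classify
unified = np.concatenate([text_branch(tokens, filters),
                          image_branch(img, W_img)])
w_clf = rng.normal(size=unified.shape[0])
p_fake = 1.0 / (1.0 + np.exp(-(w_clf @ unified)))         # sigmoid score
```

Because both branches feed one feature vector, the classifier's gradient trains the text and image parameters simultaneously, which is the "unified feature space" training the abstract describes.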
M2C: Towards Automatic Multimodal Manga Complement
Multimodal manga analysis focuses on enhancing manga understanding with
visual and textual features, which has attracted considerable attention from
both natural language processing and computer vision communities. Currently,
most comics are hand-drawn and prone to problems such as missing pages, text
contamination, and aging, resulting in missing comic text content and seriously
hindering human comprehension. However, the Multimodal Manga Complement
(M2C) task, which aims to handle the aforementioned issues by providing a
shared semantic space for vision and language understanding, has not yet been
investigated. To this end, we first propose the Multimodal Manga Complement
task by establishing a new M2C benchmark dataset covering two languages.
Specifically, we design a manga augmentation method called MCoT to mine event knowledge in
comics with large language models. Then, an effective baseline FVP-M
using fine-grained visual prompts is proposed to support manga complement.
Extensive experimental results show the effectiveness of the FVP-M method for
Multimodal Manga Complement.
Comment: EMNLP 2023. arXiv admin note: text overlap with arXiv:2210.1546
HLT-MT: High-resource Language-specific Training for Multilingual Neural Machine Translation
Multilingual neural machine translation (MNMT) trained on multiple language
pairs has attracted considerable attention due to its fewer model parameters and
lower training costs, which come from sharing knowledge among multiple
languages. Nonetheless, multilingual training is plagued by degeneration in
shared parameters caused by negative interference among different translation
directions, especially for high-resource languages. In
this paper, we propose a multilingual translation model with
high-resource language-specific training (HLT-MT) to alleviate the negative
interference, adopting two-stage training with a language-specific
selection mechanism. Specifically, we first train the multilingual model only
with the high-resource pairs and select the language-specific modules at the
top of the decoder to enhance the translation quality of high-resource
directions. Next, the model is further trained on all available corpora to
transfer knowledge from high-resource languages (HRLs) to low-resource
languages (LRLs). Experimental results show that HLT-MT outperforms various
strong baselines on the WMT-10 and OPUS-100 benchmarks. Furthermore, analytical
experiments validate the effectiveness of our method in mitigating the negative
interference in multilingual training.
Comment: 7 pages, 7 figures, IJCAI-ECAI 202
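The two-stage recipe with language-specific selection can be pictured as a routing decision at the top of the decoder: shared layers process every translation direction, while a small per-language module is selected by the target language id. A toy numpy sketch under assumed shapes and illustrative language codes (not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
D = 16  # hidden size (illustrative)

# Shared decoder stack: parameters updated by every translation direction
shared_layers = [rng.normal(scale=0.1, size=(D, D)) for _ in range(4)]

# Language-specific modules at the top of the decoder, one per target language
lang_modules = {lang: rng.normal(scale=0.1, size=(D, D))
                for lang in ("de", "fr", "hi", "sw")}

def decode_hidden(h, tgt_lang):
    """Run the shared layers, then route through the target language's module."""
    for W in shared_layers:
        h = np.tanh(W @ h)
    return np.tanh(lang_modules[tgt_lang] @ h)   # language-specific selection

# Stage 1: train shared_layers plus the modules of high-resource targets only
stage1_langs = {"de", "fr"}
# Stage 2: continue training on all corpora, transferring knowledge from
# high-resource languages (HRLs) to low-resource languages (LRLs)
stage2_langs = set(lang_modules)

h = rng.normal(size=D)
out_de = decode_hidden(h, "de")
out_sw = decode_hidden(h, "sw")
```

The point of the routing is that high-resource directions keep dedicated top-of-decoder capacity, so stage-2 training on low-resource pairs updates the shared layers without overwriting the high-resource-specific parameters.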