19,563 research outputs found
Video Question Answering: Datasets, Algorithms and Challenges
Video Question Answering (VideoQA) aims to answer natural language questions
according to the given videos. It has earned increasing attention with recent
research trends in joint vision and language understanding. Yet, compared with
ImageQA, VideoQA is largely underexplored and progresses slowly. Although
different algorithms have continually been proposed and shown success on
different VideoQA datasets, we find that there lacks a meaningful survey to
categorize them, which seriously impedes its advancements. This paper thus
provides a clear taxonomy and comprehensive analyses to VideoQA, focusing on
the datasets, algorithms, and unique challenges. We then point out the research
trend of studying beyond factoid QA to inference QA towards the cognition of
video contents, Finally, we conclude some promising directions for future
exploration.Comment: Accepted by EMNLP 202
A Survey of Natural Language Generation
This paper offers a comprehensive review of the research on Natural Language
Generation (NLG) over the past two decades, especially in relation to
data-to-text generation and text-to-text generation deep learning methods, as
well as new applications of NLG technology. This survey aims to (a) give the
latest synthesis of deep learning research on the NLG core tasks, as well as
the architectures adopted in the field; (b) detail meticulously and
comprehensively various NLG tasks and datasets, and draw attention to the
challenges in NLG evaluation, focusing on different evaluation methods and
their relationships; (c) highlight some future emphasis and relatively recent
research issues that arise due to the increasing synergy between NLG and other
artificial intelligence areas, such as computer vision, text and computational
creativity.Comment: Accepted by ACM Computing Survey (CSUR) 202
- …