Short video platforms have become an important channel for news sharing, and
the major short video platforms in China have gradually become new breeding
grounds for fake news. However, short video rumors are hard to identify: each
video carries a large amount of information and many features, and videos are
highly homogeneous, with very similar features across them. To curb the
spread of short video rumors, our group
detects short video rumors by fusing multimodal features and introducing
external knowledge, an approach chosen after weighing the advantages and
disadvantages of existing algorithms. The detection pipeline is as follows: (1)
dataset creation: we build a short video dataset with multiple features; (2)
multimodal rumor detection model: first, we use the TSN (Temporal Segment
Networks) video encoding model to extract video features; then, we combine OCR
(Optical Character Recognition) and ASR (Automatic Speech Recognition) to
extract text, and use the BERT model to fuse the text features with the video
features; (3) distinction through contrastive learning: we first crawl
external knowledge, then use a vector database to introduce that knowledge and
produce the final classification output. Our research process is always oriented to practical
needs, and the results are intended to support practical scenarios such as
short video rumor identification and public opinion management.
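
As a rough illustration of steps (2) and (3), the sketch below fuses a video
feature vector with a text feature vector and then looks up the closest
external knowledge entries by cosine similarity. It is a minimal sketch under
stated assumptions, not the model described above: the random vectors stand in
for TSN and BERT outputs, concatenation stands in for the BERT-based fusion,
an in-memory list stands in for the vector database, and the names fuse and
top_k are hypothetical. Contrastive training and the classifier are omitted.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(video_feat, text_feat):
    """Fuse two modalities by concatenating their normalized vectors.

    Assumption: plain concatenation is used here only for illustration.
    """
    return l2_normalize(l2_normalize(video_feat) + l2_normalize(text_feat))


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def top_k(query, knowledge, k=3):
    """Return indices of the k knowledge vectors most similar to the query."""
    ranked = sorted(
        range(len(knowledge)),
        key=lambda i: cosine(query, knowledge[i]),
        reverse=True,
    )
    return ranked[:k]


random.seed(0)

# Placeholder features; real ones would come from the video and text encoders.
video_feat = [random.gauss(0, 1) for _ in range(16)]
text_feat = [random.gauss(0, 1) for _ in range(32)]
fused = fuse(video_feat, text_feat)

# Placeholder knowledge store: 50 random entries, one planted near the query.
knowledge = [[random.gauss(0, 1) for _ in range(len(fused))] for _ in range(50)]
knowledge[7] = [x + 0.001 * random.gauss(0, 1) for x in fused]

neighbors = top_k(fused, knowledge, k=3)
print("closest knowledge entries:", neighbors)
```

In a full system the retrieved entries would be passed to the classifier
together with the fused representation; a dedicated vector database would
replace the linear scan once the knowledge store grows large.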