
Machine learning based fusion algorithm to perform multimodal summarization

Authors: Harakannanavar Sunil S., Kanabur Vidyashree R., Puranikmath Veena I.
Publication venue: Universidad Tecnica de Manabi
Publication date: 01/04/2022

Video summarization is a rapidly growing research field with commercial and personal applications, driven by the massive surge in the amount of video data available. The proposed approach uses ResNet-18, a convolutional neural network with eighteen layers, for feature extraction and generates a video summary with the help of temporal interest proposals generated for the video sequences. Existing methods do not address the problem of keeping the summary temporally consistent; the proposed work aims to create a temporally consistent summary. The classification and regression modules are implemented to get fixed-length inputs of the combined features. The non-maximum suppression algorithm is then applied to reduce redundancy and remove video segments with poor quality and low confidence scores. Video summaries are generated using the kernel temporal segmentation (KTS) algorithm, which converts a given video segment into video shots. The proposed model is evaluated on two standard datasets, TVSum and SumMe, where it obtains F-scores of 56.13 and 45.06, respectively.
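The non-maximum suppression step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how segments are represented or which thresholds are used, so the `(start, end, score)` tuple layout, the function names `temporal_iou` and `temporal_nms`, and the `iou_threshold` and `min_score` values are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical segment layout: (start, end, confidence score).
Segment = Tuple[float, float, float]


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two 1-D temporal intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def temporal_nms(
    segments: List[Segment],
    iou_threshold: float = 0.5,  # assumed value, not taken from the paper
    min_score: float = 0.0,      # assumed value, not taken from the paper
) -> List[Segment]:
    """Greedy 1-D non-maximum suppression over scored segments."""
    # Drop low-confidence proposals, then visit the rest best-first.
    candidates = sorted(
        (s for s in segments if s[2] >= min_score),
        key=lambda s: s[2],
        reverse=True,
    )
    kept: List[Segment] = []
    for seg in candidates:
        # Keep a segment only if it does not overlap too much
        # with any higher-scoring segment already kept.
        if all(temporal_iou(seg, k) <= iou_threshold for k in kept):
            kept.append(seg)
    return kept


proposals = [(0, 10, 0.9), (1, 11, 0.8), (20, 30, 0.7), (21, 29, 0.1)]
summary_segments = temporal_nms(proposals, iou_threshold=0.5, min_score=0.3)
```

In this toy input the second proposal overlaps heavily with the first and is suppressed, and the fourth falls below the assumed score floor, so two segments remain.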

Neliti

International journal of health sciences