510 research outputs found
Self-Supervised Video Hashing with Hierarchical Binary Auto-encoder
Existing video hash functions are built on three isolated stages: frame
pooling, relaxed learning, and binarization, which have not adequately explored
the temporal order of video frames in a joint binary optimization model,
resulting in severe information loss. In this paper, we propose a novel
unsupervised video hashing framework, dubbed Self-Supervised Video Hashing
(SSVH), which is able to capture the temporal nature of videos in an end-to-end
learning-to-hash fashion. We specifically address two central problems: 1) how
to design an encoder-decoder architecture to generate binary codes for videos;
and 2) how to equip the binary codes with the ability of accurate video
retrieval. We design a hierarchical binary autoencoder to model the temporal
dependencies in videos with multiple granularities, and embed the videos into
binary codes with fewer computations than a stacked architecture. Then, we
encourage the binary codes to simultaneously reconstruct the visual content and
neighborhood structure of the videos. Experiments on two real-world datasets
(FCVID and YFCC) show that our SSVH method significantly outperforms the
state-of-the-art methods, achieving the best performance to date on the task
of unsupervised video retrieval.
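The hierarchical pooling-plus-binarization idea can be illustrated with a toy numpy sketch. This is not the paper's learned autoencoder: the random projection `W`, the two-granularity mean pooling, and the sign-based binarization are illustrative assumptions standing in for trained components.

```python
import numpy as np

rng = np.random.default_rng(0)

def hierarchical_binary_code(frames: np.ndarray, n_bits: int = 64) -> np.ndarray:
    """Toy sketch: pool frame features at two temporal granularities,
    project, and binarize with a sign threshold."""
    T, D = frames.shape
    fine = frames.reshape(2, T // 2, D).mean(axis=1).ravel()  # two-segment pooling
    coarse = frames.mean(axis=0)                              # whole-video pooling
    feat = np.concatenate([fine, coarse])
    W = rng.standard_normal((feat.size, n_bits))              # hypothetical projection
    return (feat @ W >= 0).astype(np.uint8)                   # binary code in {0,1}

def hamming(a: np.ndarray, b: np.ndarray) -> int:
    """Retrieval compares codes by Hamming distance."""
    return int(np.count_nonzero(a != b))

video = rng.standard_normal((16, 128))  # 16 frames, 128-D features each
code = hierarchical_binary_code(video)
print(code.shape, hamming(code, code))  # → (64,) 0
```

In the actual method the projection is learned end-to-end so that the codes can reconstruct both the visual content and the neighborhood structure, rather than being a fixed random map.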
ARCHANGEL: Tamper-proofing Video Archives using Temporal Content Hashes on the Blockchain
We present ARCHANGEL, a novel distributed-ledger-based system for assuring
the long-term integrity of digital video archives. First, we describe a novel
deep network architecture for computing compact temporal content hashes (TCHs)
from audio-visual streams with durations of minutes or hours. Our TCHs are
sensitive to accidental or malicious content modification (tampering) but
invariant to the codec used to encode the video. This is necessary due to the
curatorial requirement for archives to format shift video over time to ensure
future accessibility. Second, we describe how the TCHs (and the models used to
derive them) are secured via a proof-of-authority blockchain distributed across
multiple independent archives. We report on the efficacy of ARCHANGEL within
the context of a trial deployment in which the national government archives of
the United Kingdom, Estonia, and Norway participated. Comment: Accepted to CVPR Blockchain Workshop 201
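The intended behavior of a temporal content hash, sensitive to tampering but invariant to re-encoding, can be sketched in a few lines of numpy. This is a toy stand-in, not ARCHANGEL's learned deep network: the median-threshold hash, the `max_dist` tolerance, and the noise model for format shifting are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def temporal_content_hash(features: np.ndarray, n_bits: int = 128) -> np.ndarray:
    """Toy TCH: threshold the temporally averaged per-dimension features
    against their median, yielding a compact binary fingerprint."""
    pooled = features.mean(axis=0)
    return (pooled > np.median(pooled)).astype(np.uint8)[:n_bits]

def is_tampered(h_archived: np.ndarray, h_current: np.ndarray, max_dist: int = 8) -> bool:
    """Flag tampering when the Hamming distance exceeds a small tolerance;
    codec changes should stay within it, content edits should not."""
    return int(np.count_nonzero(h_archived != h_current)) > max_dist

features = rng.standard_normal((300, 128))        # e.g. one feature vector per second
h_archived = temporal_content_hash(features)
# Format shifting (re-encoding) perturbs the features only slightly:
h_reencoded = temporal_content_hash(features + 0.01 * rng.standard_normal(features.shape))
```

In the deployed system, `h_archived` (and the model that produced it) would be committed to the proof-of-authority blockchain at ingest time, so any later verification compares against a tamper-evident record rather than a locally stored hash.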
Contrastive Masked Autoencoders for Self-Supervised Video Hashing
Self-Supervised Video Hashing (SSVH) models learn to generate short binary
representations for videos without ground-truth supervision, facilitating
large-scale video retrieval efficiency and attracting increasing research
attention. The success of SSVH lies in the understanding of video content and
the ability to capture the semantic relation among unlabeled videos. Typically,
state-of-the-art SSVH methods consider these two points in a two-stage training
pipeline, where they first train an auxiliary network on instance-wise
mask-and-predict tasks and then train a hashing model to preserve the
pseudo-neighborhood structure transferred from the auxiliary network. This
consecutive training strategy is both inflexible and unnecessary. In this
paper, we propose a simple yet effective one-stage SSVH method called ConMH,
which incorporates video semantic information and video similarity relationship
understanding in a single stage. To capture video semantic information for
better hashing learning, we adopt an encoder-decoder structure to reconstruct
the video from its temporally masked frames. In particular, we find that a higher
masking ratio helps video understanding. Besides, we fully exploit the
similarity relationship between videos by maximizing agreement between two
augmented views of a video, which contributes to more discriminative and robust
hash codes. Extensive experiments on three large-scale video datasets (i.e.,
FCVID, ActivityNet and YFCC) indicate that ConMH achieves state-of-the-art
results. Code is available at https://github.com/huangmozhi9527/ConMH.Comment: This work is accepted by the AAAI 2023. 9 pages, 6 figures, 6 table
- …