5,644 research outputs found

    ARCHANGEL: Tamper-proofing Video Archives using Temporal Content Hashes on the Blockchain

    Get PDF
We present ARCHANGEL, a novel distributed ledger-based system for assuring the long-term integrity of digital video archives. First, we describe a novel deep network architecture for computing compact temporal content hashes (TCHs) from audio-visual streams with durations of minutes or hours. Our TCHs are sensitive to accidental or malicious content modification (tampering) but invariant to the codec used to encode the video. This is necessary due to the curatorial requirement for archives to format-shift video over time to ensure future accessibility. Second, we describe how the TCHs (and the models used to derive them) are secured via a proof-of-authority blockchain distributed across multiple independent archives. We report on the efficacy of ARCHANGEL within the context of a trial deployment in which the national government archives of the United Kingdom, Estonia and Norway participated. Comment: Accepted to CVPR Blockchain Workshop 2019
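    As a rough illustration of the verification flow the abstract describes, the sketch below hashes pooled audio-visual segment features into a compact binary code and flags tampering when the Hamming distance to the archived hash exceeds a threshold. The feature extractor, random-projection scheme, and threshold are all stand-ins: ARCHANGEL derives its TCHs with a trained deep network, which is not reproduced here.

```python
# Illustrative sketch only: a stand-in feature extractor and a simple
# binarisation step show the verification flow (hash at ingest, re-hash
# and compare at audit). ARCHANGEL's learned TCH network is NOT shown.
import numpy as np

def extract_segment_features(video_segments):
    """Stand-in (assumption) for the paper's deep audio-visual encoder."""
    return [np.asarray(seg, dtype=np.float64).mean(axis=0) for seg in video_segments]

def temporal_content_hash(video_segments, n_bits=64, seed=0):
    """Project pooled segment features onto fixed random hyperplanes, binarise."""
    feats = np.stack(extract_segment_features(video_segments))
    pooled = feats.mean(axis=0)                    # collapse the temporal axis
    rng = np.random.default_rng(seed)              # fixed projection = reproducible hash
    planes = rng.standard_normal((n_bits, pooled.size))
    return (planes @ pooled > 0).astype(np.uint8)  # compact binary TCH

def is_tampered(archived_tch, recomputed_tch, max_hamming=4):
    """Codec transcodes should stay under the threshold; edits should exceed it."""
    return int(np.sum(archived_tch != recomputed_tch)) > max_hamming
```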

    3D PersonVLAD: Learning Deep Global Representations for Video-based Person Re-identification

    Full text link
In this paper, we introduce a global video representation for video-based person re-identification (re-ID) that aggregates local 3D features across the entire video extent. Most of the existing methods rely on 2D convolutional networks (ConvNets) to extract frame-wise deep features which are pooled temporally to generate the video-level representations. However, 2D ConvNets lose temporal input information immediately after the convolution, and a separate temporal pooling is limited in capturing human motion in shorter sequences. To this end, we present a global video representation (3D PersonVLAD), complementary to 3D ConvNets, as a novel layer to capture the appearance and motion dynamics in full-length videos. However, encoding each video frame in its entirety and computing an aggregate global representation across all frames is tremendously challenging due to occlusions and misalignments. To resolve this, our proposed network is further augmented with a 3D part alignment module to learn local features through a soft-attention module. These attended features are statistically aggregated to yield identity-discriminative representations. Our global 3D features are demonstrated to achieve state-of-the-art results on three benchmark datasets: MARS \cite{MARS}, iLIDS-VID \cite{VideoRanking}, and PRID 2011. Comment: Accepted to appear at IEEE Transactions on Neural Networks and Learning Systems
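    The aggregation step the abstract describes resembles a NetVLAD-style layer: each local spatio-temporal descriptor is soft-assigned to learned clusters, and the residuals to the cluster centroids are pooled into one global vector. The PyTorch sketch below shows that general idea under assumed shapes and hyper-parameters; it is not the paper's 3D PersonVLAD layer, and the part-alignment module is omitted.

```python
# Minimal NetVLAD-style aggregation over local 3D (spatio-temporal)
# features, assuming descriptors were already extracted by a 3D ConvNet.
# Shapes and hyper-parameters are illustrative, not the paper's.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VLADAggregation(nn.Module):
    def __init__(self, dim=512, n_clusters=32):
        super().__init__()
        self.assign = nn.Linear(dim, n_clusters)          # soft-assignment scores
        self.centroids = nn.Parameter(torch.randn(n_clusters, dim))

    def forward(self, x):
        # x: (batch, n_descriptors, dim) local features flattened over T*H*W
        a = F.softmax(self.assign(x), dim=-1)             # (B, N, K) soft assignment
        resid = x.unsqueeze(2) - self.centroids           # (B, N, K, D) residuals
        vlad = (a.unsqueeze(-1) * resid).sum(dim=1)       # (B, K, D) weighted pooling
        vlad = F.normalize(vlad, dim=-1)                  # intra-normalisation
        return F.normalize(vlad.flatten(1), dim=-1)       # (B, K*D) global descriptor

feats = torch.randn(2, 4 * 7 * 7, 512)   # e.g. T=4, H=W=7 local descriptors
global_desc = VLADAggregation()(feats)   # one vector per video clip
```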

    Designing annotation before it's needed

    Get PDF

    Smart PIN: performance and cost-oriented context-aware personal information network

    Get PDF
The next generation of networks will involve the interconnection of heterogeneous individual networks such as WPAN, WLAN, WMAN and cellular networks, adopting IP as the common infrastructural protocol and providing a virtually always-connected network. Furthermore, there are many devices which enable easy acquisition and storage of information such as pictures, movies, emails, etc. The resulting information overload and the divergent characteristics of the content make it difficult for users to manage their data manually. Consequently, there is a need for personalised automatic services which would enable data exchange across heterogeneous networks and devices. To support these personalised services, user-centric approaches for data delivery across the heterogeneous network are also required. In this context, this thesis proposes Smart PIN, a novel performance- and cost-oriented context-aware Personal Information Network. Smart PIN's architecture is detailed, including its network, service and management components. Within the service component, two novel schemes for efficient delivery of context and content data are proposed: the Multimedia Data Replication Scheme (MDRS) and the Quality-oriented Algorithm for Multiple-source Multimedia Delivery (QAMMD). MDRS supports efficient data accessibility among distributed devices using data replication based on a utility function and a minimum data set. QAMMD employs a buffer underflow avoidance scheme for streaming, which achieves high multimedia quality without content adaptation to network conditions. Simulation models for MDRS and QAMMD were built based on various heterogeneous network scenarios. Additionally, multiple-source streaming based on QAMMD was implemented as a prototype and tested in an emulated network environment. Comparative tests show that MDRS and QAMMD perform significantly better than other approaches.
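    The abstract does not spell out the MDRS utility function, so the following is only a plausible sketch of a utility-driven replication decision in its spirit: an item is replicated to a device when the estimated access benefit outweighs transfer cost and storage pressure. All fields, weights, and the threshold below are assumptions, not the thesis's formulation.

```python
# Hedged sketch of a utility-based replication decision in the spirit of
# MDRS. The actual utility function and minimum-data-set rules from the
# thesis are not reproduced; every weight and field here is an assumption.
from dataclasses import dataclass

@dataclass
class Item:
    size_mb: float
    access_freq: float      # requests per day from the candidate device
    transfer_cost: float    # cost per MB over the current network

def replication_utility(item, free_mb, w_access=1.0, w_cost=0.5, w_storage=0.2):
    """Higher when the item is hot, cheap to move, and storage is plentiful."""
    if item.size_mb > free_mb:
        return float("-inf")                        # cannot replicate at all
    benefit = w_access * item.access_freq
    cost = w_cost * item.transfer_cost * item.size_mb
    pressure = w_storage * (item.size_mb / free_mb)
    return benefit - cost - pressure

def should_replicate(item, free_mb, threshold=0.0):
    return replication_utility(item, free_mb) > threshold
```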

    Dual-Modal Attention-Enhanced Text-Video Retrieval with Triplet Partial Margin Contrastive Learning

    Full text link
In recent years, the explosion of web videos has made text-video retrieval increasingly essential and popular for video filtering, recommendation, and search. Text-video retrieval aims to rank relevant texts/videos higher than irrelevant ones. The core of this task is to precisely measure the cross-modal similarity between texts and videos. Recently, contrastive learning methods have shown promising results for text-video retrieval, most of which focus on the construction of positive and negative pairs to learn text and video representations. Nevertheless, they do not pay enough attention to hard negative pairs and lack the ability to model different levels of semantic similarity. To address these two issues, this paper improves contrastive learning using two novel techniques. First, to exploit hard examples for robust discriminative power, we propose a novel Dual-Modal Attention-Enhanced Module (DMAE) to mine hard negative pairs from textual and visual clues. By further introducing a Negative-aware InfoNCE (NegNCE) loss, we are able to adaptively identify these hard negatives and explicitly highlight their impact in the training loss. Second, our work argues that triplet samples can better model fine-grained semantic similarity compared to pairwise samples. We thereby present a new Triplet Partial Margin Contrastive Learning (TPM-CL) module to construct partial-order triplet samples by automatically generating fine-grained hard negatives for matched text-video pairs. The proposed TPM-CL designs an adaptive token masking strategy with cross-modal interaction to model subtle semantic differences. Extensive experiments demonstrate that the proposed approach outperforms existing methods on four widely-used text-video retrieval datasets, including MSR-VTT, MSVD, DiDeMo and ActivityNet. Comment: Accepted by ACM MM 2023
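    A minimal sketch of the negative-aware idea, assuming a symmetric InfoNCE base loss: negatives whose similarity comes within a margin of the matched pair's score are treated as hard and penalised with extra weight. The margin, weight, and temperature values are illustrative; the paper's actual NegNCE and TPM-CL formulations are not reproduced here.

```python
# Hedged sketch: symmetric InfoNCE plus an extra penalty on hard negatives
# (those whose similarity approaches the matched pair's). All hyper-
# parameters are illustrative assumptions, not the paper's NegNCE.
import torch
import torch.nn.functional as F

def neg_aware_infonce(text_emb, video_emb, temperature=0.05, margin=0.1, neg_weight=0.5):
    t = F.normalize(text_emb, dim=-1)
    v = F.normalize(video_emb, dim=-1)
    sim = t @ v.T                                   # (B, B) cosine similarities
    labels = torch.arange(sim.size(0), device=sim.device)
    logits = sim / temperature
    base = (F.cross_entropy(logits, labels) + F.cross_entropy(logits.T, labels)) / 2

    pos = sim.diag().unsqueeze(1)                   # (B, 1) matched-pair scores
    off_diag = ~torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    # negatives within `margin` of the positive score count as hard
    hard = F.relu(sim - pos + margin) * off_diag    # zero for easy negatives
    return base + neg_weight * hard.mean()

loss = neg_aware_infonce(torch.randn(8, 256), torch.randn(8, 256))
```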