Indexing, browsing and searching of digital video
Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a “piece” of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 of this chapter. In modern society, video is ver
Reviews on Technology and Standard of Spatial Audio Coding
Market demand for more impressive entertainment media has motivated the delivery of three-dimensional (3D) audio content to home consumers through Ultra High Definition TV (UHDTV), the next generation of TV broadcasting, in which spatial audio coding plays a fundamental role. This paper reviews the fundamental concepts of spatial audio coding, including technology, standards, and applications. The basic principle of the object-based audio reproduction system is also elaborated and compared with the traditional channel-based system, to give a good understanding of this popular interactive audio reproduction system, which gives end users the flexibility to render their own preferred audio composition.
Keywords: spatial audio, audio coding, multi-channel audio signals, MPEG standard, object-based audi
Towards a Unified Knowledge-Based Approach to Modality Choice
This paper advances a unified knowledge-based approach to the process of choosing the most appropriate modality or combination of modalities in multimodal output generation. We propose a Modality Ontology (MO) that models the knowledge needed to support the two most fundamental processes determining modality choice – modality allocation (choosing the modality or set of modalities that can best support a particular type of information) and modality combination (selecting an optimal final combination of modalities). In the proposed ontology we model the main levels which collectively determine the characteristics of each modality and the specific relationships between different modalities that are important for multi-modal meaning making. This ontology aims to support the automatic selection of modalities and combinations of modalities that are suitable to convey the meaning of the intended message
ARCHANGEL: Tamper-proofing Video Archives using Temporal Content Hashes on the Blockchain
We present ARCHANGEL, a novel distributed-ledger-based system for assuring
the long-term integrity of digital video archives. First, we describe a novel
deep network architecture for computing compact temporal content hashes (TCHs)
from audio-visual streams with durations of minutes or hours. Our TCHs are
sensitive to accidental or malicious content modification (tampering) but
invariant to the codec used to encode the video. This is necessary due to the
curatorial requirement for archives to format shift video over time to ensure
future accessibility. Second, we describe how the TCHs (and the models used to
derive them) are secured via a proof-of-authority blockchain distributed across
multiple independent archives. We report on the efficacy of ARCHANGEL within
the context of a trial deployment in which the national government archives of
the United Kingdom, Estonia and Norway participated.
Comment: Accepted to CVPR Blockchain Workshop 201
Type-IV DCT, DST, and MDCT algorithms with reduced numbers of arithmetic operations
We present algorithms for the type-IV discrete cosine transform (DCT-IV) and
discrete sine transform (DST-IV), as well as for the modified discrete cosine
transform (MDCT) and its inverse, that achieve a lower count of real
multiplications and additions than previously published algorithms, without
sacrificing numerical accuracy. Asymptotically, the operation count is reduced
from ~2NlogN to ~(17/9)NlogN for a power-of-two transform size N, and the exact
count is strictly lowered for all N > 4. These results are derived by
considering the DCT to be a special case of a DFT of length 8N, with certain
symmetries, and then pruning redundant operations from a recent improved fast
Fourier transform algorithm (based on a recursive rescaling of the
conjugate-pair split radix algorithm). The improved algorithms for DST-IV and
MDCT follow immediately from the improved count for the DCT-IV.
Comment: 11 page
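For orientation, the transform discussed in this abstract can be sketched by direct evaluation of the standard unnormalized DCT-IV definition. This is a minimal O(N^2) reference sketch, not the reduced-operation algorithm the abstract describes. The function names dct_iv and leading_term_ops are hypothetical, and the base-2 logarithm is an assumption (the abstract writes only "logN"); the relative saving does not depend on the base.

```python
import math


def dct_iv(x):
    """Direct O(N^2) evaluation of the unnormalized DCT-IV definition.

    X[k] = sum_j x[j] * cos(pi/N * (j + 1/2) * (k + 1/2)).
    Reference sketch only; it is NOT the fast algorithm from the abstract.
    """
    n = len(x)
    return [
        sum(x[j] * math.cos(math.pi / n * (j + 0.5) * (k + 0.5)) for j in range(n))
        for k in range(n)
    ]


def leading_term_ops(n, coeff):
    """Leading-order operation count coeff * N * log2(N) (base 2 is assumed)."""
    return coeff * n * math.log2(n)


# The unnormalized DCT-IV is its own inverse up to a factor of N/2,
# so applying it twice and scaling by 2/N recovers the input.
x = [1.0, 2.0, -3.0, 0.5, 4.0, -1.5, 0.0, 2.5]
roundtrip = [2.0 / len(x) * v for v in dct_iv(dct_iv(x))]

# Relative saving implied by the asymptotic counts quoted in the abstract:
# 1 - (17/9) / 2 = 1/18, i.e. roughly 5.6% fewer operations at leading order.
saving = 1 - leading_term_ops(1024, 17 / 9) / leading_term_ops(1024, 2)
```

The round trip only checks the definition. The saving figure covers the leading asymptotic terms alone, so exact counts for a specific N will differ.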