MASCOT : metadata for advanced scalable video coding tools : final report
The goal of the MASCOT project was to develop new video coding schemes and tools that provide both increased coding efficiency and extended scalability features compared to the technology that was available at the beginning of the project. Toward that goal, the following tools were to be used: metadata-based coding tools, new spatiotemporal decompositions, and new prediction schemes. Although the initial goal was to develop a single codec architecture able to combine all the new coding tools foreseen when the project was formulated, it became clear that this would limit the selection of new tools. The consortium therefore decided to develop two codec frameworks within the project, a standard hybrid DCT-based codec and a 3D wavelet-based codec, which together are able to accommodate all tools developed during the course of the project.
Video Compressive Sensing for Dynamic MRI
We present a video compressive sensing framework, termed kt-CSLDS, to
accelerate the image acquisition process of dynamic magnetic resonance imaging
(MRI). We are inspired by a state-of-the-art model for video compressive
sensing that utilizes a linear dynamical system (LDS) to model the motion
manifold. Given compressive measurements, the state sequence of an LDS can be
first estimated using system identification techniques. We then reconstruct the
observation matrix using a joint structured sparsity assumption. In particular,
we minimize an objective function with a mixture of wavelet sparsity and joint
sparsity within the observation matrix. We derive an efficient convex
optimization algorithm through alternating direction method of multipliers
(ADMM), and provide a theoretical guarantee for global convergence. We
demonstrate the performance of our approach for video compressive sensing, in
terms of reconstruction accuracy. We also investigate the impact of various
sampling strategies. We apply this framework to accelerate the acquisition
process of dynamic MRI and show it achieves the best reconstruction accuracy
with the least computational time compared with existing algorithms in the
literature. Comment: 30 pages, 9 figures
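The abstract above names ADMM as the solver but does not spell out the iteration. As a rough illustration only, the sketch below applies ADMM to a much simpler problem, l1-regularized least squares, and is not the kt-CSLDS objective (which mixes wavelet sparsity and joint sparsity). The function names `soft_threshold` and `admm_lasso` and the parameters `lam`, `rho`, and `n_iter` are assumptions made for this example.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Proximal operator of kappa * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def admm_lasso(A, b, lam, rho=1.0, n_iter=200):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 with scaled-form ADMM.

    Illustrative stand-in for the structured-sparsity problem in the
    abstract; it is NOT the authors' algorithm.
    """
    n = A.shape[1]
    z = np.zeros(n)  # copy of x that carries the l1 penalty
    u = np.zeros(n)  # scaled dual variable
    lhs = A.T @ A + rho * np.eye(n)
    Atb = A.T @ b
    for _ in range(n_iter):
        # x-update: ridge-like least-squares step
        x = np.linalg.solve(lhs, Atb + rho * (z - u))
        # z-update: shrinkage enforces sparsity
        z = soft_threshold(x + u, lam / rho)
        # dual update: accumulate the constraint violation x - z
        u = u + x - z
    return z

# With A = I the minimizer is the soft-thresholded measurement vector.
A = np.eye(3)
b = np.array([3.0, 0.5, -2.0])
x_hat = admm_lasso(A, b, lam=1.0)
```

With an identity measurement matrix the problem separates per coordinate, which makes the result easy to check by hand against plain soft thresholding of `b`.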
An overview Survey on Various Video compressions and its importance
With the rise of digital computing and visual data processing, the need to store and transmit video data became widespread. Storing and transmitting uncompressed raw visual data is impractical because it requires large storage space and high bandwidth. Video compression algorithms can compress this raw visual data or video into smaller files with a small sacrifice in quality. This paper presents an overview and comparison of standardization efforts on video compression algorithms: MPEG-1, MPEG-2, MPEG-4, MPEG-
Motion Estimation and Compensation in the Redundant Wavelet Domain
Despite being the preferred approach for still-image compression for nearly a decade, wavelet-based coding for video has been slow to emerge, due primarily to the fact that the shift variance of the discrete wavelet transform hinders the motion estimation and compensation crucial to modern video coders. Recently it has been recognized that a redundant, or overcomplete, wavelet transform is shift invariant and thus permits motion prediction in the wavelet domain. In this dissertation, other uses for the redundancy of overcomplete wavelet transforms in video coding are explored. First, it is demonstrated that the redundant-wavelet domain facilitates the placement of an irregular triangular mesh on video images, thereby exploiting transform redundancy to implement geometries for motion estimation and compensation more general than the traditional block structure widely employed. As the second contribution of this dissertation, a new form of multihypothesis prediction, redundant wavelet multihypothesis, is presented. This new approach to motion estimation and compensation produces motion predictions that are diverse in transform phase to increase prediction accuracy. Finally, it is demonstrated that the proposed redundant-wavelet strategies complement existing advanced video-coding techniques and produce significant performance improvements in a battery of experimental results.
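The shift-variance argument in this abstract can be seen on a toy signal. The sketch below is an assumption-laden illustration, not code from the dissertation: it uses a one-level Haar filter with circular boundary handling, and the helper names (`haar_detail_decimated`, `haar_detail_redundant`, `is_shifted_copy`) are invented for this example.

```python
import numpy as np

def haar_detail_decimated(x):
    """One-level Haar detail coefficients, downsampled by 2 (critically sampled)."""
    x = np.asarray(x, dtype=float)
    return (x[0::2] - x[1::2]) / np.sqrt(2.0)

def haar_detail_redundant(x):
    """One-level Haar detail coefficients without downsampling (circular boundary)."""
    x = np.asarray(x, dtype=float)
    return (x - np.roll(x, -1)) / np.sqrt(2.0)

def is_shifted_copy(a, b):
    """True if b equals some circular shift of a."""
    return any(np.allclose(np.roll(a, k), b) for k in range(len(a)))

signal = np.array([0, 0, 0, 1, 1, 1, 0, 0], dtype=float)
shifted = np.roll(signal, 1)  # the same pulse, moved by one sample

# Redundant transform: coefficients simply move with the signal.
redundant_ok = is_shifted_copy(haar_detail_redundant(signal),
                               haar_detail_redundant(shifted))

# Decimated transform: a one-sample shift changes the coefficients themselves.
decimated_ok = is_shifted_copy(haar_detail_decimated(signal),
                               haar_detail_decimated(shifted))

print(redundant_ok, decimated_ok)
```

Because the decimated coefficients of the shifted pulse are not a displaced copy of the original coefficients, block matching between them is unreliable, which is the obstacle the abstract describes.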
Graph Spectral Image Processing
The recent advent of graph signal processing (GSP) has spurred intensive studies
of signals that live naturally on irregular data kernels described by graphs
(e.g., social networks, wireless sensor networks). Though a digital image
contains pixels that reside on a regularly sampled 2D grid, if one can design
an appropriate underlying graph connecting pixels with weights that reflect the
image structure, then one can interpret the image (or image patch) as a signal
on a graph, and apply GSP tools for processing and analysis of the signal in
the graph spectral domain. In this article, we review recent graph spectral
techniques in GSP specifically for image / video processing. The topics covered
include image compression, image restoration, image filtering and image
segmentation.
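To make the idea of treating an image patch as a graph signal concrete, here is a minimal sketch. It assumes a 4-connected pixel graph with Gaussian edge weights on intensity differences, which is one common choice and not necessarily the construction used in the article; the function names and the `sigma` parameter are hypothetical.

```python
import numpy as np

def grid_graph_laplacian(patch, sigma=0.1):
    """Combinatorial Laplacian L = D - W of a 4-connected pixel graph.

    Edge weights shrink when neighboring intensities differ, so edges
    across an image boundary are weak (assumed Gaussian kernel).
    """
    h, w = patch.shape
    n = h * w
    W = np.zeros((n, n))
    for r in range(h):
        for c in range(w):
            i = r * w + c
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbors
                rr, cc = r + dr, c + dc
                if rr < h and cc < w:
                    j = rr * w + cc
                    diff = patch[r, c] - patch[rr, cc]
                    W[i, j] = W[j, i] = np.exp(-diff ** 2 / (2 * sigma ** 2))
    return np.diag(W.sum(axis=1)) - W

def graph_fourier_transform(patch, sigma=0.1):
    """Return graph frequencies, the GFT basis, and the patch's GFT coefficients."""
    L = grid_graph_laplacian(patch, sigma)
    eigvals, eigvecs = np.linalg.eigh(L)  # L is symmetric
    coeffs = eigvecs.T @ patch.ravel()
    return eigvals, eigvecs, coeffs

# Piecewise-constant patch: dark left half, bright right half.
patch = np.zeros((4, 4))
patch[:, 2:] = 1.0

eigvals, eigvecs, coeffs = graph_fourier_transform(patch)
reconstructed = (eigvecs @ coeffs).reshape(patch.shape)
low_freq_energy = np.sum(coeffs[:2] ** 2) / np.sum(coeffs ** 2)
```

For this patch nearly all of the signal energy falls in the two lowest graph frequencies, which hints at why a structure-adapted graph transform can be attractive for compression.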
A polar prediction model for learning to represent visual transformations
All organisms make temporal predictions, and their evolutionary fitness level
depends on the accuracy of these predictions. In the context of visual
perception, the motions of both the observer and objects in the scene structure
the dynamics of sensory signals, allowing for partial prediction of future
signals based on past ones. Here, we propose a self-supervised
representation-learning framework that extracts and exploits the regularities
of natural videos to compute accurate predictions. We motivate the polar
architecture by appealing to the Fourier shift theorem and its group-theoretic
generalization, and we optimize its parameters on next-frame prediction.
Through controlled experiments, we demonstrate that this approach can discover
the representation of simple transformation groups acting in data. When trained
on natural video datasets, our framework achieves better prediction performance
than traditional motion compensation and rivals conventional deep networks,
while maintaining interpretability and speed. Furthermore, the polar
computations can be restructured into components resembling normalized simple
and direction-selective complex cell models of primate V1 neurons. Thus, polar
prediction offers a principled framework for understanding how the visual
system represents sensory inputs in a form that simplifies temporal prediction.
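The Fourier shift theorem invoked in this abstract says that translating a signal leaves the amplitude of each Fourier coefficient unchanged and advances its phase by a fixed amount. The sketch below illustrates that principle with a fixed Fourier basis and circular translations; it is not the authors' learned model, and `polar_extrapolate` and `eps` are names assumed for this example.

```python
import numpy as np

def polar_extrapolate(prev_frame, curr_frame, eps=1e-12):
    """Predict the next frame by repeating the per-coefficient phase advance.

    Amplitudes are held fixed and each Fourier coefficient's phase is
    advanced by the step observed between the two previous frames.
    """
    F_prev = np.fft.fft2(prev_frame)
    F_curr = np.fft.fft2(curr_frame)
    # Unit-modulus phase step from prev to curr, guarded against division by zero.
    ratio = F_curr * np.conj(F_prev)
    phase_step = ratio / np.maximum(np.abs(ratio), eps)
    F_next = F_curr * phase_step
    return np.real(np.fft.ifft2(F_next))

rng = np.random.default_rng(0)
frame0 = rng.standard_normal((16, 16))
# Constant circular motion: 1 pixel down and 2 pixels right per frame.
frame1 = np.roll(frame0, shift=(1, 2), axis=(0, 1))
frame2 = np.roll(frame1, shift=(1, 2), axis=(0, 1))

predicted = polar_extrapolate(frame0, frame1)
max_error = np.max(np.abs(predicted - frame2))
```

For a pure circular translation the extrapolation is exact up to rounding; real video violates that assumption, which is presumably part of the motivation for learning the representation instead of fixing it.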