12,074 research outputs found
Navigation domain representation for interactive multiview imaging
Enabling users to interactively navigate through different viewpoints of a
static scene is an interesting new functionality in 3D streaming systems. While
it opens exciting perspectives towards rich multimedia applications, it
requires the design of novel representations and coding techniques in order to
solve the new challenges imposed by interactive navigation. Interactivity
clearly brings new design constraints: the encoder is unaware of the exact
decoding process, while the decoder has to reconstruct information from
incomplete subsets of data, since the server generally cannot transmit images
for all possible viewpoints due to resource constraints. In this paper, we
propose a novel multiview data representation that satisfies bandwidth
and storage constraints in an interactive multiview streaming system. In
particular, we partition the multiview navigation domain into segments, each of
which is described by a reference image and some auxiliary information. The
auxiliary information enables the client to recreate any viewpoint in the
navigation segment via view synthesis. The decoder is then able to navigate
freely within the segment without further data requests to the server; it requests
additional data only when it moves to a different segment. We discuss the
benefits of this novel representation in interactive navigation systems and
further propose a method to optimize the partitioning of the navigation domain
into independent segments, under bandwidth and storage constraints.
Experimental results confirm the potential of the proposed representation;
namely, our system achieves compression performance similar to classical
inter-view coding, while providing the high level of flexibility that is
required for interactive streaming. Hence, our new framework represents a
promising solution for 3D data representation in novel interactive multimedia
services.
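To make the segment-based representation concrete, the following is a minimal sketch (not the authors' actual optimizer) of how a navigation domain could be partitioned under a per-segment transmission budget. The one-dimensional viewpoint path, the linear rate model in segment_cost, and the bandwidth_budget value are illustrative assumptions.

```python
import math

def segment_cost(i, j, ref_rate=1.0, aux_rate_per_view=0.2):
    """Hypothetical rate (e.g., in Mbit) of one navigation segment covering
    viewpoints i..j: one reference image plus auxiliary information that
    grows with the number of views synthesized from it."""
    return ref_rate + aux_rate_per_view * (j - i)

def partition_navigation_domain(num_views, bandwidth_budget=2.0):
    """Split viewpoints 0..num_views-1 into contiguous segments minimizing
    total storage, while keeping each segment small enough to be delivered
    in a single request (cost <= bandwidth_budget)."""
    best = [math.inf] * (num_views + 1)   # best[k] = min storage for first k views
    cut = [0] * (num_views + 1)
    best[0] = 0.0
    for k in range(1, num_views + 1):
        for i in range(k):                # candidate last segment: views i..k-1
            c = segment_cost(i, k - 1)
            if c <= bandwidth_budget and best[i] + c < best[k]:
                best[k], cut[k] = best[i] + c, i
    segments, k = [], num_views           # backtrack the chosen boundaries
    while k > 0:
        segments.append((cut[k], k - 1))
        k = cut[k]
    return list(reversed(segments)), best[num_views]

print(partition_navigation_domain(12))   # e.g. ([(0, 5), (6, 11)], 4.0)
```

The real system optimizes over a multi-dimensional navigation domain with measured rate-distortion costs, but the trade-off is the same: larger segments amortize the reference image over more viewpoints, while smaller segments reduce the data sent per request.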
Human Motion Capture Data Tailored Transform Coding
Human motion capture (mocap) is a widely used technique for digitalizing
human movements. With growing usage, compressing mocap data has received
increasing attention, since compact data size enables efficient storage and
transmission. Our analysis shows that mocap data have some unique
characteristics that distinguish them from images and videos. Therefore,
directly borrowing image or video compression techniques, such as discrete
cosine transform, does not work well. In this paper, we propose a novel
mocap-tailored transform coding algorithm that takes advantage of these
features. Our algorithm segments the input mocap sequences into clips, which
are represented in 2D matrices. Then it computes a set of data-dependent
orthogonal bases to transform the matrices to frequency domain, in which the
transform coefficients have significantly less dependency. Finally, the
compression is obtained by entropy coding of the quantized coefficients and the
bases. Our method has low computational cost and can be easily extended to
compress mocap databases. It also requires neither training nor complicated
parameter setting. Experimental results demonstrate that the proposed scheme
significantly outperforms state-of-the-art algorithms in terms of compression
performance and speed.
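As a rough illustration of the transform step described above (not the paper's exact algorithm), the sketch below represents a mocap clip as a 2D matrix and derives a data-dependent orthogonal basis from the clip itself, here simply the left singular vectors of the mean-removed matrix used as a stand-in for the learned bases; the clip dimensions, quantization step, and synthetic test data are assumptions, and the entropy coder is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
# A mocap clip as a 2D matrix: 60 frames x 93 joint-coordinate channels.
# A cumulative sum of noise stands in for smoothly varying real motion.
clip = rng.standard_normal((60, 93)).cumsum(axis=0)

# Data-dependent orthogonal basis computed from the clip itself
# (left singular vectors of the mean-removed matrix).
mean = clip.mean(axis=0, keepdims=True)
U, s, Vt = np.linalg.svd(clip - mean, full_matrices=False)

# Transform to the new domain; energy concentrates in the first rows,
# so the coefficients have far less dependency than the raw samples.
coeffs = U.T @ (clip - mean)

# Uniform quantization prior to entropy coding (entropy coder not shown).
step = 0.05
q = np.round(coeffs / step).astype(np.int32)
print("nonzero quantized coefficients:", np.count_nonzero(q), "of", q.size)
```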
Ontology based approach for video transmission over the network
With the increase in bandwidth and transmission speed over the internet,
transmission of multimedia objects like video, audio, and images has become
much easier. In this paper we provide an approach that can be useful
for transmission of video objects over the internet without much fuss. The
approach provides an ontology-based framework that is used to establish an
automatic deployment of a video transmission system. Further, the video is
compressed using the structural flow mechanism that uses the wavelet principle
for compression of video frames. Finally, a video transmission algorithm known
as the RRDBFSF algorithm is provided, which makes use of the concept of restrictive
flooding to avoid redundancy, thereby increasing efficiency. Comment: 7 pages, 2 figures, 4 tables
Graph Spectral Image Processing
The recent advent of graph signal processing (GSP) has spurred intensive studies
of signals that live naturally on irregular data kernels described by graphs
(e.g., social networks, wireless sensor networks). Though a digital image
contains pixels that reside on a regularly sampled 2D grid, if one can design
an appropriate underlying graph connecting pixels with weights that reflect the
image structure, then one can interpret the image (or image patch) as a signal
on a graph, and apply GSP tools for processing and analysis of the signal in
graph spectral domain. In this article, we overview recent graph spectral
techniques in GSP specifically for image/video processing. The topics covered
include image compression, image restoration, image filtering, and image
segmentation.
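As a small, self-contained illustration of the idea (not taken from the article), the sketch below builds a 4-connected pixel graph for an image patch with intensity-similarity edge weights, forms the combinatorial graph Laplacian, and projects the patch onto the Laplacian eigenvectors, i.e., computes its graph Fourier transform; the Gaussian weight kernel and the sigma value are assumptions.

```python
import numpy as np

def image_patch_gft(patch, sigma=0.1):
    """Interpret a small image patch as a signal on a 4-connected pixel graph
    with intensity-similarity edge weights, and return its graph frequencies
    and graph Fourier transform (projection onto Laplacian eigenvectors)."""
    h, w = patch.shape
    n = h * w
    x = patch.reshape(n)                       # the graph signal
    W = np.zeros((n, n))
    for r in range(h):
        for c in range(w):
            i = r * w + c
            for dr, dc in ((0, 1), (1, 0)):    # right and down neighbours
                rr, cc = r + dr, c + dc
                if rr < h and cc < w:
                    j = rr * w + cc
                    wgt = np.exp(-((patch[r, c] - patch[rr, cc]) ** 2)
                                 / (2 * sigma ** 2))
                    W[i, j] = W[j, i] = wgt
    L = np.diag(W.sum(axis=1)) - W             # combinatorial graph Laplacian
    evals, U = np.linalg.eigh(L)               # graph frequencies / GFT basis
    return evals, U.T @ x                      # GFT coefficients of the patch

patch = np.linspace(0.0, 1.0, 64).reshape(8, 8)   # toy 8x8 gradient patch
freqs, gft = image_patch_gft(patch)
print(gft[:5])
```

Because the edge weights follow the image structure, a smooth patch yields energy concentrated in the low graph frequencies, which is what graph spectral compression and filtering methods exploit.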
JOINT CODING OF MULTIMODAL BIOMEDICAL IMAGES USING CONVOLUTIONAL NEURAL NETWORKS
The massive volume of data generated daily by the gathering of medical images with
different modalities might be difficult to store in medical facilities and share through
communication networks. To alleviate this issue, efficient compression methods
must be implemented to reduce the amount of storage and transmission resources
required in such applications. However, since the preservation of all image details
is highly important in the medical context, the use of lossless image compression
algorithms is of utmost importance.
This thesis presents the research results on a lossless compression scheme designed
to encode both computerized tomography (CT) and positron emission tomography
(PET) images. Different techniques, such as image-to-image translation, intra prediction,
and inter prediction are used. Redundancies between both image modalities are
also investigated. To perform the image-to-image translation approach, we resort to
lossless compression of the original CT data and apply a cross-modality image translation
generative adversarial network to obtain an estimation of the corresponding
PET.
Two approaches were implemented and evaluated to determine a PET residue
that will be compressed along with the original CT. In the first method, the
residue resulting from the differences between the original PET and its estimation
is encoded, whereas in the second method, the residue is obtained using the
encoder's inter-prediction coding tools. Thus, instead of compressing two independent
picture modalities, i.e., both images of the original PET-CT pair, the proposed
method independently encodes only the CT alongside the PET residue.
Along with the proposed pipeline, a post-processing optimization algorithm that
modifies the estimated PET image by altering the contrast and rescaling the image
is implemented to maximize the compression efficiency.
Four different versions (subsets) of a publicly available PET-CT pair dataset
were tested. The first proposed subset was used to demonstrate that the concept
developed in this work is capable of surpassing traditional compression schemes.
The obtained results showed gains of up to 8.9% using HEVC. On the other
hand, JPEG 2000 proved not to be suitable, as it failed to obtain good results,
reaching only a -9.1% compression gain. For the remaining (more challenging)
subsets, the results reveal that the proposed refined post-processing scheme attains,
when compared to conventional compression methods, up to 6.33% compression gain
using HEVC, and 7.78% using VVC.
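A toy sketch of the first residue approach described above is given below; the array contents and sizes are placeholders (a real pipeline would load the co-registered PET-CT pair and the translation network's PET estimate), and the entropy figure is only a rough proxy for what a lossless HEVC/VVC encoder would achieve.

```python
import numpy as np

# Placeholder data standing in for the original PET slice and the PET
# estimate produced by the cross-modality (CT -> PET) translation network.
rng = np.random.default_rng(0)
original_pet = rng.integers(0, 4096, (256, 256), dtype=np.uint16)
estimated_pet = rng.integers(0, 4096, (256, 256), dtype=np.uint16)

# First approach from the abstract: the residue is the signed difference
# between the original PET and its estimation. Encoding this residue
# losslessly alongside the CT allows exact reconstruction of the PET.
residue = original_pet.astype(np.int32) - estimated_pet.astype(np.int32)
reconstructed = (estimated_pet.astype(np.int32) + residue).astype(np.uint16)
assert np.array_equal(reconstructed, original_pet)

# Rough compressibility proxy: empirical entropy of the residue samples.
values, counts = np.unique(residue, return_counts=True)
p = counts / counts.sum()
print("residue entropy (bits/sample):", float(-(p * np.log2(p)).sum()))
```

The better the network's PET estimate, the more the residue concentrates around zero and the fewer bits the lossless coder needs, which is where the reported gains over coding the two modalities independently come from.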
Low-latency compression of mocap data using learned spatial decorrelation transform
Due to the growing use of human motion capture (mocap) in movies, video
games, sports, etc., it is highly desirable to compress mocap data for efficient
storage and transmission. This paper presents two efficient frameworks for
compressing human mocap data with low latency. The first framework processes
the data in a frame-by-frame manner so that it is ideal for mocap data
streaming and time-critical applications. The second one is clip-based and
provides a flexible tradeoff between latency and compression performance. Since
mocap data exhibits some unique spatial characteristics, we propose a very
effective transform, namely learned orthogonal transform (LOT), for reducing
the spatial redundancy. The LOT problem is formulated as minimizing the squared
error regularized by orthogonality and sparsity, and is solved via alternating
iteration. We also adopt predictive coding and a temporal DCT for temporal
decorrelation in the frame- and clip-based frameworks, respectively.
Experimental results show that the proposed frameworks can produce higher
compression performance at lower computational cost and latency than
state-of-the-art methods. Comment: 15 pages, 9 figures
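For readers curious what such an alternating iteration might look like, the following is a minimal sketch of one plausible formulation, min over T and C of ||X - TC||_F^2 + lam*||C||_1 with T constrained to have orthonormal columns; the objective weights, dimensions, and update order are assumptions, and the paper's exact algorithm may differ.

```python
import numpy as np

def learned_orthogonal_transform(X, k=16, lam=0.1, iters=30, seed=0):
    """Sketch of a learned spatial decorrelation transform:
    min_{T, C} ||X - T C||_F^2 + lam * ||C||_1   s.t.  T^T T = I,
    solved by alternating a soft-thresholding step for the sparse
    coefficients C and an orthogonal Procrustes step for the basis T."""
    d, n = X.shape
    rng = np.random.default_rng(seed)
    T, _ = np.linalg.qr(rng.standard_normal((d, k)))    # orthonormal init
    C = np.zeros((k, n))
    for _ in range(iters):
        # C-step: with T fixed and orthonormal, least squares plus L1
        # reduces to soft-thresholding of the projection T^T X.
        P = T.T @ X
        C = np.sign(P) * np.maximum(np.abs(P) - lam / 2, 0.0)
        # T-step: orthogonal Procrustes, T = U V^T from the SVD of X C^T.
        U, _, Vt = np.linalg.svd(X @ C.T, full_matrices=False)
        T = U @ Vt
    return T, C

# Toy usage: columns of X are per-frame pose vectors of a mocap clip.
X = np.random.default_rng(1).standard_normal((93, 60)).cumsum(axis=1)
T, C = learned_orthogonal_transform(X)
print("relative residual:", np.linalg.norm(X - T @ C) / np.linalg.norm(X))
```

In a low-latency codec the learned basis T would be transmitted (or updated incrementally) along with the quantized sparse coefficients, with the predictive coding or temporal DCT handling the remaining frame-to-frame redundancy.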