225 research outputs found
Network streaming and compression for mixed reality tele-immersion
Bulterman, D.C.A. [Promotor]Cesar, P.S. [Copromotor
Digital Image Access & Retrieval
The 33th Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation with the bulk of the conference focusing on indexing and retrieval.published or submitted for publicatio
Image and Video Coding Techniques for Ultra-low Latency
The next generation of wireless networks fosters the adoption of latency-critical applications such as XR, connected industry, or autonomous driving. This survey gathers implementation aspects of different image and video coding schemes and discusses their tradeoffs. Standardized video coding technologies such as HEVC or VVC provide a high compression ratio, but their enormous complexity sets the scene for alternative approaches like still image, mezzanine, or texture compression in scenarios with tight resource or latency constraints. Regardless of the coding scheme, we found inter-device memory transfers and the lack of sub-frame coding as limitations of current full-system and software-programmable implementations.publishedVersionPeer reviewe
Segmentation based coding of depth Information for 3D video
Increased interest in 3D artifact and the need of transmitting, broadcasting and saving the
whole information that represents the 3D view, has been a hot topic in recent years.
Knowing that adding the depth information to the views will increase the encoding bitrate
considerably, we decided to find a new approach to encode/decode the depth information
for 3D video.
In this project, different approaches to encode/decode the depth information are
experienced and a new method is implemented which its result is compared to the best
previously developed method considering both bitrate and quality (PSNR)
A family of stereoscopic image compression algorithms using wavelet transforms
With the standardization of JPEG-2000, wavelet-based image and video
compression technologies are gradually replacing the popular DCT-based methods. In
parallel to this, recent developments in autostereoscopic display technology is now
threatening to revolutionize the way in which consumers are used to enjoying the
traditional 2D display based electronic media such as television, computer and
movies. However, due to the two-fold bandwidth/storage space requirement of
stereoscopic imaging, an essential requirement of a stereo imaging system is efficient
data compression.
In this thesis, seven wavelet-based stereo image compression algorithms are
proposed, to take advantage of the higher data compaction capability and better
flexibility of wavelets. In the proposed CODEC I, block-based disparity
estimation/compensation (DE/DC) is performed in pixel domain. However, this
results in an inefficiency when DWT is applied on the whole predictive error image
that results from the DE process. This is because of the existence of artificial block
boundaries between error blocks in the predictive error image. To overcome this
problem, in the remaining proposed CODECs, DE/DC is performed in the wavelet
domain. Due to the multiresolution nature of the wavelet domain, two methods of
disparity estimation and compensation have been proposed. The first method is
performing DEJDC in each subband of the lowest/coarsest resolution level and then
propagating the disparity vectors obtained to the corresponding subbands of
higher/finer resolution. Note that DE is not performed in every subband due to the
high overhead bits that could be required for the coding of disparity vectors of all
subbands. This method is being used in CODEC II. In the second method, DEJDC is
performed m the wavelet-block domain. This enables disparity estimation to be
performed m all subbands simultaneously without increasing the overhead bits
required for the coding disparity vectors. This method is used by CODEC III.
However, performing disparity estimation/compensation in all subbands would result
in a significant improvement of CODEC III. To further improve the performance of
CODEC ill, pioneering wavelet-block search technique is implemented in CODEC
IV. The pioneering wavelet-block search technique enables the right/predicted image
to be reconstructed at the decoder end without the need of transmitting the disparity
vectors. In proposed CODEC V, pioneering block search is performed in all subbands
of DWT decomposition which results in an improvement of its performance. Further,
the CODEC IV and V are able to perform at very low bit rates(< 0.15 bpp). In
CODEC VI and CODEC VII, Overlapped Block Disparity Compensation (OBDC) is
used with & without the need of coding disparity vector. Our experiment results
showed that no significant coding gains could be obtained for these CODECs over
CODEC IV & V.
All proposed CODECs m this thesis are wavelet-based stereo image coding
algorithms that maximise the flexibility and benefits offered by wavelet transform
technology when applied to stereo imaging. In addition the use of a baseline-JPEG
coding architecture would enable the easy adaptation of the proposed algorithms
within systems originally built for DCT-based coding. This is an important feature
that would be useful during an era where DCT-based technology is only slowly being
phased out to give way for DWT based compression technology.
In addition, this thesis proposed a stereo image coding algorithm that uses JPEG-2000
technology as the basic compression engine. The proposed CODEC, named RASTER
is a rate scalable stereo image CODEC that has a unique ability to preserve the image
quality at binocular depth boundaries, which is an important requirement in the design
of stereo image CODEC. The experimental results have shown that the proposed
CODEC is able to achieve PSNR gains of up to 3.7 dB as compared to directly
transmitting the right frame using JPEG-2000
Energy efficient hardware acceleration of multimedia processing tools
The world of mobile devices is experiencing an ongoing trend of feature enhancement and generalpurpose multimedia platform convergence. This trend poses many grand challenges, the most pressing being their limited battery life as a consequence of delivering computationally demanding features. The envisaged mobile application features can be considered to be accelerated by a set of underpinning hardware blocks Based on the survey that this thesis presents on modem video compression standards and their associated enabling technologies, it is concluded that tight energy and throughput constraints can still be effectively tackled at algorithmic level in order to design re-usable optimised hardware acceleration cores.
To prove these conclusions, the work m this thesis is focused on two of the basic enabling technologies that support mobile video applications, namely the Shape Adaptive Discrete Cosine Transform (SA-DCT) and its inverse, the SA-IDCT. The hardware architectures presented in this work have been designed with energy efficiency in mind. This goal is achieved by employing high level techniques such as redundant computation elimination, parallelism and low switching computation structures. Both architectures compare favourably against the relevant pnor art in the literature.
The SA-DCT/IDCT technologies are instances of a more general computation - namely, both are Constant Matrix Multiplication (CMM) operations. Thus, this thesis also proposes an algorithm for the efficient hardware design of any general CMM-based enabling technology. The proposed algorithm leverages the effective solution search capability of genetic programming. A bonus feature of the proposed modelling approach is that it is further amenable to hardware acceleration. Another bonus feature is an early exit mechanism that achieves large search space reductions .Results show an improvement on state of the art algorithms with future potential for even greater savings
A practical comparison between two powerful PCC codec’s
Recent advances in the consumption of 3D content creates the necessity of efficient ways to
visualize and transmit 3D content. As a result, methods to obtain that same content have
been evolving, leading to the development of new methods of representations, namely point
clouds and light fields. A point cloud represents a set of points with associated Cartesian coordinates associated with each point(x, y, z), as well as being able to contain even more information inside that point (color, material, texture, etc). This kind of representation changes
the way on how 3D content in consumed, having a wide range of applications, from videogaming to medical ones. However, since this type of data carries so much information within
itself, they are data-heavy, making the storage and transmission of content a daunting task.
To resolve this issue, MPEG created a point cloud coding normalization project, giving birth
to V-PCC (Video-based Point Cloud Coding) and G-PCC (Geometry-based Point Cloud Coding) for static content. Firstly, a general analysis of point clouds is made, spanning from their
possible solutions, to their acquisition. Secondly, point cloud codecs are studied, namely VPCC and G-PCC from MPEG. Then, a state of art study of quality evaluation is performed,
namely subjective and objective evaluation. Finally, a report on the JPEG Pleno Point Cloud,
in which an active colaboration took place, is made, with the comparative results of the two
codecs and used metrics.Os avanços recentes no consumo de conteúdo 3D vêm criar a necessidade de maneiras eficientes de visualizar e transmitir conteúdo 3D. Consequentemente, os métodos de obtenção
desse mesmo conteúdo têm vindo a evoluir, levando ao desenvolvimento de novas maneiras
de representação, nomeadamente point clouds e lightfields. Um point cloud (núvem de pontos) representa um conjunto de pontos com coordenadas cartesianas associadas a cada ponto
(x, y, z), além de poder conter mais informação dentro do mesmo (cor, material, textura,
etc). Este tipo de representação abre uma nova janela na maneira como se consome conteúdo 3D, tendo um elevado leque de aplicações, desde videojogos e realidade virtual a aplicações médicas. No entanto, este tipo de dados, ao carregarem com eles tanta informação,
tornam-se incrivelmente pesados, tornando o seu armazenamento e transmissão uma tarefa
hercúleana. Tendo isto em mente, a MPEG criou um projecto de normalização de codificação de point clouds, dando origem ao V-PCC (Video-based Point Cloud Coding) e G-PCC
(Geometry-based Point Cloud Coding) para conteúdo estático. Esta dissertação tem como
objectivo uma análise geral sobre os point clouds, indo desde as suas possívei utilizações
à sua aquisição. Seguidamente, é efectuado um estudo dos codificadores de point clouds,
nomeadamente o V-PCC e o G-PCC da MPEG, o estado da arte da avaliação de qualidade, objectiva e subjectiva, e finalmente, são reportadas as actividades da JPEG Pleno Point Cloud,
na qual se teve uma colaboração activa
- …