18 research outputs found
DCT-domain spatial transcoding using generalized DCT decimation
In this paper, we propose a generalized DCT-domain spatial downscaling scheme to improve visual quality. We analyze the filtering performance and computational complexity of the proposed scheme and of pixel-domain downscaling schemes. The analysis shows that the proposed scheme can reduce aliasing artifacts compared to the existing schemes, although the computational complexity may increase. We also integrate the proposed decimation scheme into a cascaded DCT-domain transcoder for spatial downscaling of a pre-encoded video to a quarter of its size. Experiments show that the proposed approach achieves better visual quality than the existing schemes.
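As a rough illustration of DCT-domain downscaling in general, the sketch below halves a square block by keeping only its low-frequency DCT coefficients. It is plain low-frequency truncation, not the generalized decimation scheme proposed in the paper; the function names `dct_matrix` and `downscale_block_by_two` are hypothetical, and the 8x8 block size is an assumption.

```python
import numpy as np


def dct_matrix(n):
    """Return the orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)       # DC row uses a different normalization
    return c


def downscale_block_by_two(block):
    """Halve a square pixel block by truncating its DCT coefficients.

    Assumption: simple low-frequency truncation, used here only to
    illustrate the idea of downscaling in the DCT domain.
    """
    n = block.shape[0]
    half = n // 2
    c_full = dct_matrix(n)
    c_half = dct_matrix(half)
    coeffs = c_full @ block @ c_full.T   # forward 2-D DCT
    low = coeffs[:half, :half] / 2.0     # keep low frequencies, rescale energy
    return c_half.T @ low @ c_half       # inverse 2-D DCT at half size
```

Discarding the high-frequency coefficients acts as a low-pass filter before decimation, which is why DCT-domain schemes can limit aliasing. The division by 2 keeps the mean brightness of the block unchanged under the orthonormal transform.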
On transcoding a B-frame to a P-frame in the compressed domain
2007-2008 > Academic research: refereed > Publication in refereed journal. Version of Record. Publishe
An Efficient Motion Estimation Method for H.264-Based Video Transcoding with Arbitrary Spatial Resolution Conversion
As wireless and wired network connectivity is rapidly expanding
and the number of network users is steadily increasing, it has become more
and more important to support universal access of multimedia
content over the whole network. A big challenge, however, is
the great diversity of network devices from full screen computers
to small smart phones. This leads to research on transcoding,
which involves efficiently reformatting compressed data from
its original high resolution to a desired spatial resolution
supported by the displaying device. Particularly, there is a
great momentum in the multimedia industry for H.264-based
transcoding as H.264 has been widely employed as a mandatory
player feature in applications ranging from television broadcast
to video for mobile devices.
While H.264 contains many new features for effective video
coding with excellent rate distortion (RD) performance, a major issue
for transcoding H.264 compressed video from one spatial resolution
to another is the computational complexity. Specifically, the bottleneck is
the motion compensated prediction (MCP) part. MCP is the main
contributor to the excellent RD performance
of H.264 video compression, yet it is very time consuming. In general,
a brute-force search is used to find the best motion vectors for MCP.
In the scenario of transcoding, however, an immediate idea for
improving the MCP efficiency for the re-encoding procedure is to
utilize the motion vectors in the original compressed stream.
Intuitively, motion in the high resolution scene is highly related
to that in the down-scaled scene.
In this thesis, we study homogeneous video transcoding from H.264
to H.264. Specifically, for the video transcoding with arbitrary
spatial resolution conversion, we propose a motion vector estimation
algorithm based on a multiple linear regression model, which
systematically utilizes the motion information in the original scenes.
We also propose a practical solution for efficiently determining a
reference frame to take advantage of the new feature of multiple
references in H.264. The performance of the algorithm was assessed
in an H.264 transcoder. Experimental results show that, as compared
with a benchmark solution, the proposed method significantly reduces
the transcoding complexity without much degradation of the video quality.
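The regression idea in this abstract can be sketched as follows: one motion vector component of a downscaled block is predicted as a weighted sum of the corresponding components from the source macroblocks that cover it, with weights fitted by least squares. This is a minimal sketch under assumptions; the abstract does not specify the regressors or the training procedure, and `fit_mv_weights` and `predict_mv` are hypothetical names.

```python
import numpy as np


def fit_mv_weights(source_mvs, target_mvs):
    """Fit multiple linear regression weights by least squares.

    source_mvs: array of shape (samples, k), one motion vector component
                from each of k source macroblocks (assumed regressors).
    target_mvs: array of shape (samples,), the matching component found
                by a full search at the target resolution.
    Returns k weights followed by one intercept term.
    """
    design = np.column_stack([source_mvs, np.ones(len(source_mvs))])
    weights, *_ = np.linalg.lstsq(design, target_mvs, rcond=None)
    return weights


def predict_mv(weights, source_mv_row):
    """Predict one target motion vector component from k source components."""
    return float(np.dot(weights[:-1], source_mv_row) + weights[-1])
```

In a transcoder, the predicted vector would typically seed a small refinement search rather than replace motion estimation outright; that step is omitted here.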
Adaptive intra refresh for robust wireless multi-view video
This thesis was submitted for the award of PhD and was awarded by Brunel University London.
Mobile wireless communication technology is a fast-developing field, and new mobile communication techniques and means become available every day. In this thesis, multi-view video (MVV) is also referred to as 3D video. 3D video signals carried over wireless links are thus shaping the telecommunication industry and academia. However, wireless channels are prone to high levels of bit and burst errors that severely degrade the quality of service (QoS). Noise along the wireless transmission path can introduce distortion or cause a compressed bitstream to lose vital information. Because of prediction, an error caused by noise progressively spreads to subsequent frames and across multiple views. Such an error may compel the receiver to pause momentarily and wait for the next INTRA picture before decoding can continue. Pausing the video stream affects the user's Quality of Experience (QoE). An error resilience strategy is therefore needed to protect the compressed bitstream against transmission errors. This thesis focuses on the Adaptive Intra Refresh (AIR) error resilience technique. The AIR method is developed to make compressed 3D video more robust to channel errors. The process involves periodic injection of Intra-coded macroblocks in a cyclic pattern using the H.264/AVC standard. The algorithm takes into account the individual features of each macroblock and the feedback sent by the decoder about the channel condition in order to generate an MVV-AIR map. MVV-AIR map generation regulates the order of packet arrival and identifies the motion activity in each macroblock. Based on the level of motion activity in each macroblock, the MVV-AIR map classifies macroblocks as high or low motion. A proxy MVV-AIR transcoder is used to validate the efficiency of the generated MVV-AIR map. The MVV-AIR transcoding algorithm uses a spatial and view downscaling scheme to convert from MVV to single view. Various experimental results indicate that the proposed error-resilient MVV-AIR transcoder technique effectively improves the quality of reconstructed 3D video in wireless networks. A comparison of the MVV-AIR transcoder algorithm with some traditional error resilience techniques demonstrates that the MVV-AIR algorithm performs better in an error-prone channel. Simulation results revealed significant improvements in both objective and subjective quality. The scheme adds no computational complexity, while the QoS and QoE requirements are still fully met.
Tertiary Institution Trust Fund (TETFund) of Nigeri
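To make the cyclic refresh idea concrete, the sketch below picks which macroblocks to intra-code in a given frame: a fixed-size window that cycles through the frame, plus any macroblock whose motion activity exceeds a threshold. It is an assumption-laden illustration, not the thesis's MVV-AIR map; the function name, the motion scores, and the threshold rule are all hypothetical, and decoder feedback is left out.

```python
def select_intra_macroblocks(frame_index, motion, refresh_per_frame, motion_threshold):
    """Return sorted indices of macroblocks to intra-code in this frame.

    motion: per-macroblock motion activity scores (assumed to be given).
    refresh_per_frame: size of the cyclic refresh window.
    motion_threshold: scores above this are treated as high motion.
    """
    total = len(motion)
    if total == 0:
        return []
    # Cyclic part: the window advances every frame and wraps around.
    start = (frame_index * refresh_per_frame) % total
    cyclic = {(start + offset) % total for offset in range(refresh_per_frame)}
    # Adaptive part: always refresh high-motion macroblocks.
    adaptive = {i for i, score in enumerate(motion) if score > motion_threshold}
    return sorted(cyclic | adaptive)
```

With six macroblocks and a window of two, every macroblock is refreshed at least once every three frames, which bounds how long a transmission error can propagate through prediction.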
Efficient HEVC-based video adaptation using transcoding
In a video transmission system, it is important to take into account the great diversity of the network/end-user constraints. On the one hand, video content is typically streamed over a network that is characterized by different bandwidth capacities. In many cases, the bandwidth is insufficient to transfer the video at its original quality. On the other hand, a single video is often played by multiple devices like PCs, laptops, and cell phones. Obviously, a single video would not satisfy their different constraints.
This diversity in network and device capacity creates the need for video adaptation techniques, e.g., a reduction of the bit rate or spatial resolution. Video transcoding, which modifies a property of the video without changing the coding format, is well known as an efficient adaptation solution. However, this approach comes with a high computational complexity, resulting in high energy consumption in the network and possibly network latency.
This presentation provides several optimization strategies for the transcoding process of HEVC (the latest High Efficiency Video Coding standard) video streams. First, the computational complexity of a bit rate transcoder (transrater) is reduced. We propose several techniques to speed up the encoder of a transrater, notably a machine-learning-based approach and a novel coding-mode evaluation strategy. Moreover, the motion estimation process of the encoder is optimized using decision theory and the proposed fast search patterns. Second, the issues and challenges of a spatial transcoder are addressed using machine-learning algorithms. Thanks to their performance, the proposed techniques are expected to significantly help HEVC gain popularity in a wide range of modern multimedia applications.
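Etude et mise en place d'une plateforme d'adaptation multiservice embarquée pour la gestion de flux multimédia à différents niveaux logiciels et matériels (Study and implementation of an embedded multiservice adaptation platform for managing multimedia streams at different software and hardware levels)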
On the one hand, technological advances have led to the expansion of the handheld device market. Thanks to this expansion, people are more and more connected, and more and more data are exchanged over the Internet. On the other hand, this huge amount of data imposes drastic constraints in order to achieve sufficient quality. The Internet is now showing its limits in assuring such quality. To address today's limitations, a next-generation Internet is envisioned. This new network takes into account the nature of the content (video, audio, ...) and the context (network state, terminal capabilities, ...) to better manage its own resources. To this end, video manipulation is one of the key concepts highlighted in this emerging context. Video content is consumed more and more and at the same time requires more and more resources. Adapting videos to the network state (reducing the bitrate to match the available bandwidth) or to the terminal capabilities (screen size, supported codecs, ...) appears mandatory and is foreseen to take place in real time in networking devices such as home gateways. However, video adaptation is a resource-intensive task and must be implemented using hardware accelerators to meet the desired low-cost and real-time constraints.
In this thesis, content- and context-awareness is first analyzed for consideration at the network side. Secondly, a generic low-cost video adaptation system is proposed and compared to existing solutions as a trade-off between system complexity and quality. Then, hardware design is tackled, as this system is implemented in an FPGA-based architecture. Finally, this system is used to evaluate the indirect effects of video adaptation: energy consumption is reduced at the terminal side by reducing video characteristics, thus permitting an improved user experience for end users.