1,297 research outputs found
Unequal Error Protected JPEG 2000 Broadcast Scheme with Progressive Fountain Codes
This paper proposes a novel scheme, based on progressive fountain codes, for
broadcasting JPEG 2000 multimedia. In such a broadcast scheme, progressive
resolution levels of images/video have been unequally protected when
transmitted using the proposed progressive fountain codes. With progressive
fountain codes applied in the broadcast scheme, the resolutions of images (JPEG
2000) or videos (MJPEG 2000) received by different users can be automatically
adapted to their channel qualities, i.e. users with good channel qualities
can receive high-resolution images/video while users with
bad channel qualities may receive low-resolution images/video. Finally, the
performance of the proposed scheme is evaluated with the MJPEG 2000 broadcast
prototype
Algorithms & implementation of advanced video coding standards
Advanced video coding standards are widely deployed in numerous products, such as broadcast, video conferencing, mobile television and Blu-ray Disc. New compression techniques are gradually included in video coding standards so that a 50% compression rate reduction is achievable every five years. However, this trend has also brought many problems, such as dramatically increased computational complexity, multiple co-existing standards and gradually increasing development time. To address these problems, this thesis investigates efficient algorithms for the latest video coding standard, H.264/AVC. Two aspects of the H.264/AVC standard are examined: (1) speeding up intra4x4 prediction with a parallel architecture; (2) applying an efficient rate control algorithm based on a deviation measure to intra frames. Another aim of this thesis is to develop low-complexity algorithms for an MPEG-2 to H.264/AVC transcoder. The thesis focuses on three main mapping algorithms and a computational complexity reduction algorithm: motion vector mapping, block mapping, field-frame mapping and an efficient mode ranking algorithm. Finally, a new video coding framework methodology to reduce development time is examined. This thesis explores the implementation of the MPEG-4 simple profile with the RVC framework. A key technique, automatically generating the variable length decoder table, is solved in this thesis. Moreover, another important video coding standard, DV/DVCPRO, is modeled with the RVC framework. Consequently, besides the available MPEG-4 simple profile and the China audio/video standard, a new member is added to the RVC framework family. The research work presented in this thesis targets algorithms for and implementations of video coding standards. Within this wide topic, three main problems are investigated. The results show that the methodologies presented in this thesis are efficient and encourage
Symbolic inductive bias for visually grounded learning of spoken language
A widespread approach to processing spoken language is to first automatically
transcribe it into text. An alternative is to use an end-to-end approach:
recent works have proposed to learn semantic embeddings of spoken language from
images with spoken captions, without an intermediate transcription step. We
propose to use multitask learning to exploit existing transcribed speech within
the end-to-end setting. We describe a three-task architecture which combines
the objectives of matching spoken captions with corresponding images, speech
with text, and text with images. We show that the addition of the speech/text
task leads to substantial performance improvements on image retrieval when
compared to training the speech/image task in isolation. We conjecture that
this is due to a strong inductive bias transcribed speech provides to the
model, and offer supporting evidence for this.
Comment: ACL 201
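The three-task objective described in this abstract can be illustrated with a minimal sketch. The abstract does not state the loss function, so everything below is an assumption: a margin-based ranking loss over cosine similarities is used for each pairing, and the names `matching_loss`, `multitask_loss`, `margin` and `weights` are hypothetical. Embeddings are plain lists of floats standing in for encoder outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def matching_loss(a, b, margin=0.2):
    """Margin-based ranking loss (an assumed choice, not taken from the paper).

    a[i] and b[i] form a matching pair; every a[i]/b[j] with i != j is a
    mismatched pair that should score at least `margin` lower.
    """
    n = len(a)
    total = 0.0
    for i in range(n):
        pos = cosine(a[i], b[i])
        for j in range(n):
            if j == i:
                continue
            # Penalise mismatches in both retrieval directions.
            total += max(0.0, margin + cosine(a[i], b[j]) - pos)
            total += max(0.0, margin + cosine(a[j], b[i]) - pos)
    return total / n


def multitask_loss(speech, image, text, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the speech/image, speech/text and text/image tasks."""
    w_si, w_st, w_ti = weights
    return (w_si * matching_loss(speech, image)
            + w_st * matching_loss(speech, text)
            + w_ti * matching_loss(text, image))
```

Setting `weights=(1.0, 0.0, 0.0)` reduces the sketch to the speech/image task trained in isolation, which is the baseline the abstract compares against.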
Perception-aware low-power audio processing techniques for portable devices
Ph.D DOCTOR OF PHILOSOPH
Representation Learning: A Review and New Perspectives
The success of machine learning algorithms generally depends on data
representation, and we hypothesize that this is because different
representations can entangle and hide more or less the different explanatory
factors of variation behind the data. Although specific domain knowledge can be
used to help design representations, learning with generic priors can also be
used, and the quest for AI is motivating the design of more powerful
representation-learning algorithms implementing such priors. This paper reviews
recent work in the area of unsupervised feature learning and deep learning,
covering advances in probabilistic models, auto-encoders, manifold learning,
and deep networks. This motivates longer-term unanswered questions about the
appropriate objectives for learning good representations, for computing
representations (i.e., inference), and the geometrical connections between
representation learning, density estimation and manifold learning
Complexity management of H.264/AVC video compression.
The H.264/AVC video coding standard offers significantly improved compression efficiency and flexibility compared to previous standards. However, the high computational complexity of H.264/AVC is a problem for codecs running on low-power handheld devices and general purpose computers. This thesis presents new techniques to reduce, control and manage the computational complexity of an H.264/AVC codec. A new complexity reduction algorithm for H.264/AVC is developed. This algorithm predicts "skipped" macroblocks prior to motion estimation by estimating a Lagrange rate-distortion cost function. Complexity savings are achieved by not processing the macroblocks that are predicted as "skipped". The Lagrange multiplier is adaptively modelled as a function of the quantisation parameter and video sequence statistics. Simulation results show that this algorithm achieves significant complexity savings with a negligible loss in rate-distortion performance. The complexity reduction algorithm is further developed to achieve complexity-scalable control of the encoding process. The Lagrangian cost estimation is extended to incorporate computational complexity. A target level of complexity is maintained by using a feedback algorithm to update the Lagrange multiplier associated with complexity. Results indicate that scalable complexity control of the encoding process can be achieved whilst maintaining near optimal complexity-rate-distortion performance. A complexity management framework is proposed for maximising the perceptual quality of coded video in a real-time processing-power constrained environment. A real-time frame-level control algorithm and a per-frame complexity control algorithm are combined in order to manage the encoding process such that a high frame rate is maintained without significantly losing frame quality.
Subjective evaluations show that the managed complexity approach results in higher perceptual quality compared to a reference encoder that drops frames in computationally constrained situations. These novel algorithms are likely to be useful in implementing real-time H.264/AVC standard encoders in computationally constrained environments such as low-power mobile devices and general purpose computers
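The two mechanisms named in this abstract, skip prediction from a Lagrangian cost and feedback control of a complexity multiplier, can be sketched as follows. Only the general form J = D + lambda * R is standard; the thesis's actual estimators, its model of lambda as a function of the quantisation parameter, and its feedback rule are not given here. The function names, the zero-bit assumption for skip mode, and the proportional update with `gain` are all hypothetical illustrations.

```python
def rd_cost(distortion, rate_bits, lam):
    """Standard Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def predict_skip(skip_distortion, est_coded_distortion, est_coded_bits, lam):
    """Predict a macroblock as 'skipped' before motion estimation.

    Assumption: skip mode costs roughly zero bits, so its cost is its
    distortion alone. The block is skipped when that cost does not exceed
    the estimated cost of coding it.
    """
    j_skip = rd_cost(skip_distortion, 0.0, lam)
    j_coded = rd_cost(est_coded_distortion, est_coded_bits, lam)
    return j_skip <= j_coded


def update_complexity_multiplier(lam_c, measured, target, gain=0.5):
    """Proportional feedback on the complexity multiplier (assumed rule).

    Raises lam_c when measured complexity exceeds the target and lowers it
    when the encoder is under budget; the result is clamped at zero.
    """
    error = (measured - target) / target
    return max(0.0, lam_c * (1.0 + gain * error))
```

In this sketch a larger `lam` makes bits more expensive, so more macroblocks are predicted as skipped and less motion estimation is performed.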
A Survey on Semantic Communications for Intelligent Wireless Networks
With the deployment of 6G technology, it is envisioned that the competitive edge of
wireless networks will be sustained and the next decade's communication
requirements will be stratified. 6G will also aim to aid the development of a human
society which is ubiquitous and mobile, while providing solutions to
key challenges such as coverage and capacity. In addition, 6G will focus on
providing intelligent use cases and applications using higher data rates over
millimeter-wave and terahertz frequencies. However, at higher frequencies,
multiple undesired phenomena such as atmospheric absorption and blocking
occur, which create a bottleneck owing to resource (spectrum and energy)
scarcity. Hence, continuing the trend of striving to reproduce at the
receiver the exact information sent by the transmitter will result in a
never-ending need for higher bandwidth. A possible solution to this challenge
lies in semantic communications, which focus on the meaning (context) of received
data as opposed to only reproducing the transmitted data correctly. This in turn will
require less bandwidth and will reduce the bottleneck due to the various undesired
phenomena. In this respect, the current article presents a detailed survey of
recent technological trends in semantic communications for
intelligent wireless networks. We focus on semantic communications architecture,
including the model and source and channel coding. Next, we detail cross-layer
interaction and various goal-oriented communication applications. We also
present overall semantic communications trends in detail and identify
challenges that need timely solutions before the practical implementation of
semantic communications within 6G wireless technology. Our survey article is an
attempt to contribute significantly towards initiating future research
directions in the area of semantic communications for intelligent 6G wireless
networks