Mobile Audiovisual Terminal: System Design and Subjective Testing in DECT and UMTS networks
It is anticipated that there will shortly be a requirement
for multimedia terminals that operate via mobile
communications systems. This paper presents a functional specification
for such a terminal operating at 32 kb/s in a digital
European cordless telecommunications (DECT) and universal
mobile telecommunications system (UMTS) radio network. A terminal
has been built, based on a PC with digital signal processor
(DSP) boards for audio and video coding and decoding. Speech
coding is by a phonetically driven code-excited linear prediction
(CELP) speech coder and video coding by a block-oriented hybrid
discrete cosine transform (DCT) coder. Separate channel coding
is provided for the audio and video data. The paper describes the
techniques used for audio and video coding, channel coding, and
synchronization. Methods of subjective testing in a DECT network
and in a UMTS network are also described. These consisted of
subjective tests of first impressions of the mobile audio–visual
terminal (MAVT) quality, interactive tests, and the completion
of an exit questionnaire. The test results showed that the quality
of the audio was sufficiently good for comprehension and the
video was sufficiently good for following and repeating simple
mechanical tasks. However, the quality of the MAVT was not
good enough for general use where high-quality audio and video
were needed, especially when transmission was in a noisy radio
environment.
Automatic differentiation in machine learning: a survey
Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in
machine learning. Automatic differentiation (AD), also called algorithmic
differentiation or simply "autodiff", is a family of techniques similar to but
more general than backpropagation for efficiently and accurately evaluating
derivatives of numeric functions expressed as computer programs. AD is a small
but established field with applications in areas including computational fluid
dynamics, atmospheric sciences, and engineering design optimization. Until very
recently, the fields of machine learning and AD have largely been unaware of
each other and, in some cases, have independently discovered each other's
results. Despite its relevance, general-purpose AD has been missing from the
machine learning toolbox, a situation slowly changing with its ongoing adoption
under the names "dynamic computational graphs" and "differentiable
programming". We survey the intersection of AD and machine learning, cover
applications where AD has direct relevance, and address the main implementation
techniques. By precisely defining the main differentiation techniques and their
interrelationships, we aim to bring clarity to the usage of the terms
"autodiff", "automatic differentiation", and "symbolic differentiation" as
these are encountered more and more in machine learning settings.
Comment: 43 pages, 5 figures
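To make the distinction from symbolic and numerical differentiation concrete, the following is a minimal sketch of forward-mode AD using dual numbers. It is an illustration only, not code from the survey: the names `Dual`, `derivative`, and the test function `f` are hypothetical, and only addition, multiplication, and sine are supported.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative (tangent) with respect to the input."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (zero derivative)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def sin(x):
    """Sine with the chain rule applied to the tangent."""
    x = _lift(x)
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)


def derivative(f, x):
    """Evaluate f'(x) by seeding the input tangent with 1."""
    return f(Dual(x, 1.0)).dot


def f(x):
    # Hypothetical example function: x^3 + 2x + sin(x)
    return x * x * x + 2 * x + sin(x)
```

Calling `derivative(f, 1.5)` runs the ordinary program `f` once and propagates derivatives alongside values, so the result is exact up to floating-point rounding rather than a finite-difference approximation.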
Steered mixture-of-experts for light field images and video : representation and coding
Research in light field (LF) processing has heavily increased over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, these 2-D regular grids are less suited for high-dimensional data, such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays at any angle arriving at a certain region. The global model thus consists of a set of kernels which define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application for 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparably to the state of the art for low-to-mid range bitrates with respect to subjective visual quality of 4-D LF images. In the case of 5-D LF video, we observe superior decorrelation and coding performance with coding gains of a factor of 4x in bitrate for the same quality. At least equally important is the fact that our method inherently has desired functionality for LF rendering which is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution.
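The idea of a set of kernels defining a continuous approximation can be illustrated with a heavily simplified 1-D sketch. The assumptions here are not taken from the abstract: Gaussian kernels act as soft gates, each expert is a constant value, and the function name `smoe_reconstruct` and its parameters are hypothetical. The actual SMoE model works in higher dimensions with steered kernels.

```python
import math


def smoe_reconstruct(x, centers, widths, values):
    """Evaluate a 1-D mixture-of-experts model at coordinate x.

    Each kernel i has a center, a width, and a constant expert value.
    The output is the gate-weighted average of the expert values, where
    the gates are normalized Gaussian responses (an assumed simplification).
    """
    responses = [
        math.exp(-0.5 * ((x - c) / w) ** 2)
        for c, w in zip(centers, widths)
    ]
    total = sum(responses)
    return sum(r * v for r, v in zip(responses, values)) / total
```

Because the model is a function of a continuous coordinate, any sample can be evaluated independently of the others (for example `smoe_reconstruct(2.5, [0.0, 10.0], [1.0, 1.0], [1.0, 5.0])`), which is the property behind the random access, pixel-parallel reconstruction, and interpolation claims in the abstract.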
Study of information transfer optimization for communication satellites
Results are presented from a study of source coding, modulation/channel coding, and systems techniques for application to teleconferencing over high data rate digital communication satellite links. Simultaneous transmission of video, voice, data, and/or graphics is possible in various teleconferencing modes, and one-way, two-way, and broadcast modes are considered. A satellite channel model including filters, a limiter, a TWT, detectors, and an optimized equalizer is treated in detail. A complete analysis is presented for one set of system assumptions which exclude nonlinear gain and phase distortion in the TWT. Modulation, demodulation, and channel coding are considered, based on an additive white Gaussian noise channel model which is an idealization of an equalized channel. Source coding with emphasis on video data compression is reviewed, and the experimental facility utilized to test promising techniques is fully described.
Optimising Spatial and Tonal Data for PDE-based Inpainting
Some recent methods for lossy signal and image compression store only a few
selected pixels and fill in the missing structures by inpainting with a partial
differential equation (PDE). Suitable operators include the Laplacian, the
biharmonic operator, and edge-enhancing anisotropic diffusion (EED). The
quality of such approaches depends substantially on the selection of the data
that is kept. Optimising this data in the domain and codomain gives rise to
challenging mathematical problems that shall be addressed in our work.
In the 1D case, we prove results that provide insights into the difficulty of
this problem, and we give evidence that splitting the optimisation into spatial
and tonal (i.e. function value) steps hardly deteriorates the results. In the
2D setting, we present generic algorithms that achieve a high reconstruction
quality even if the specified data is very sparse. To optimise the spatial
data, we use a probabilistic sparsification, followed by a nonlocal pixel
exchange that avoids getting trapped in bad local optima. After this spatial
optimisation we perform a tonal optimisation that modifies the function values
in order to reduce the global reconstruction error. For homogeneous diffusion
inpainting, this comes down to a least squares problem for which we prove that
it has a unique solution. We demonstrate that it can be found efficiently with
a gradient descent approach that is accelerated with fast explicit diffusion
(FED) cycles. Our framework allows the desired density of the inpainting mask
to be specified a priori. Moreover, it is more generic than other data
optimisation approaches for the sparse inpainting problem, since it can also be
extended to nonlinear inpainting operators such as EED. This is exploited to
achieve reconstructions with state-of-the-art quality.
We also give an extensive literature survey on PDE-based image compression
methods.
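As a rough illustration of the inpainting step only, the sketch below reconstructs a 1-D signal from a sparse mask by homogeneous diffusion, that is, by solving the discrete Laplace equation at the unknown samples while keeping the stored samples fixed. It assumes reflecting boundaries and uses plain Gauss-Seidel sweeps as a stand-in solver rather than the FED scheme named in the abstract. The function name `inpaint_1d` and the iteration count are hypothetical, and neither the spatial nor the tonal optimisation is covered.

```python
def inpaint_1d(signal, mask, iterations=2000):
    """Homogeneous diffusion inpainting of a 1-D signal.

    signal: list of floats; only entries where mask is True are used.
    mask:   list of bools marking the stored (known) samples.
    Returns a list in which unknown samples satisfy the discrete
    Laplace equation u[i-1] - 2 u[i] + u[i+1] = 0.
    """
    n = len(signal)
    known = [signal[i] for i in range(n) if mask[i]]
    if not known:
        raise ValueError("mask must contain at least one known sample")

    # Initialise unknown samples with the mean of the known ones.
    mean = sum(known) / len(known)
    u = [signal[i] if mask[i] else mean for i in range(n)]

    for _ in range(iterations):
        for i in range(n):
            if mask[i]:
                continue  # stored samples act as Dirichlet data
            # Reflecting (Neumann) boundaries: mirror the neighbour.
            left = u[i - 1] if i > 0 else u[i + 1]
            right = u[i + 1] if i < n - 1 else u[i - 1]
            u[i] = 0.5 * (left + right)
    return u
```

In 1-D this reduces to linear interpolation between stored samples and constant extension beyond the outermost ones, which makes it easy to see why the choice of mask positions and stored values determines the reconstruction error.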