17,291 research outputs found
Steered mixture-of-experts for light field images and video : representation and coding
Research in light field (LF) processing has heavily increased over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as it is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, these 2-D regular grids are less suited for high-dimensional data, such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays at any angle arriving at a certain region. The global model consists thus of a set of kernels which define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application for 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparable to the state of the art for low-to-mid range bitrates with respect to subjective visual quality of 4-D LF images. In case of 5-D LF video, we observe superior decorrelation and coding performance with coding gains of a factor of 4x in bitrate for the same quality. At least equally important is the fact that our method inherently has desired functionality for LF rendering which is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, of advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as it relates to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensin
MeshfreeFlowNet: A Physics-Constrained Deep Continuous Space-Time Super-Resolution Framework
We propose MeshfreeFlowNet, a novel deep learning-based super-resolution
framework to generate continuous (grid-free) spatio-temporal solutions from the
low-resolution inputs. While being computationally efficient, MeshfreeFlowNet
accurately recovers the fine-scale quantities of interest. MeshfreeFlowNet
allows for: (i) the output to be sampled at all spatio-temporal resolutions,
(ii) a set of Partial Differential Equation (PDE) constraints to be imposed,
and (iii) training on fixed-size inputs on arbitrarily sized spatio-temporal
domains owing to its fully convolutional encoder. We empirically study the
performance of MeshfreeFlowNet on the task of super-resolution of turbulent
flows in the Rayleigh-Benard convection problem. Across a diverse set of
evaluation metrics, we show that MeshfreeFlowNet significantly outperforms
existing baselines. Furthermore, we provide a large scale implementation of
MeshfreeFlowNet and show that it efficiently scales across large clusters,
achieving 96.80% scaling efficiency on up to 128 GPUs and a training time of
less than 4 minutes.Comment: Supplementary Video: https://youtu.be/mjqwPch9gDo. Accepted to SC2
Phase Retrieval with Application to Optical Imaging
This review article provides a contemporary overview of phase retrieval in
optical imaging, linking the relevant optical physics to the information
processing methods and algorithms. Its purpose is to describe the current state
of the art in this area, identify challenges, and suggest vision and areas
where signal processing methods can have a large impact on optical imaging and
on the world of imaging at large, with applications in a variety of fields
ranging from biology and chemistry to physics and engineering
- …