1,763 research outputs found
The Visual Centrifuge: Model-Free Layered Video Representations
True video understanding requires making sense of non-lambertian scenes where
the color of light arriving at the camera sensor encodes information about not
just the last object it collided with, but about multiple mediums -- colored
windows, dirty mirrors, smoke or rain. Layered video representations have the
potential of accurately modelling realistic scenes but have so far required
stringent assumptions on motion, lighting and shape. Here we propose a
learning-based approach for multi-layered video representation: we introduce
novel uncertainty-capturing 3D convolutional architectures and train them to
separate blended videos. We show that these models then generalize to single
videos, where they exhibit interesting abilities: color constancy, factoring
out shadows and separating reflections. We present quantitative and qualitative
results on real world videos.Comment: Appears in: 2019 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR 2019). This arXiv contains the CVPR Camera Ready version of
the paper (although we have included larger figures) as well as an appendix
detailing the model architectur
Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment
We present a deep neural network-based approach to image quality assessment
(IQA). The network is trained end-to-end and comprises ten convolutional layers
and five pooling layers for feature extraction, and two fully connected layers
for regression, which makes it significantly deeper than related IQA models.
Unique features of the proposed architecture are that: 1) with slight
adaptations it can be used in a no-reference (NR) as well as in a
full-reference (FR) IQA setting and 2) it allows for joint learning of local
quality and local weights, i.e., relative importance of local quality to the
global quality estimate, in an unified framework. Our approach is purely
data-driven and does not rely on hand-crafted features or other types of prior
domain knowledge about the human visual system or image statistics. We evaluate
the proposed approach on the LIVE, CISQ, and TID2013 databases as well as the
LIVE In the wild image quality challenge database and show superior performance
to state-of-the-art NR and FR IQA methods. Finally, cross-database evaluation
shows a high ability to generalize between different databases, indicating a
high robustness of the learned features
- …