DEEP LEARNING FOR IMAGE RESTORATION AND ROBOTIC VISION
Traditional model-based approaches require formulating a mathematical model, and such models often deliver limited performance. The quality of an image may degrade for a variety of reasons: the scene content may be affected by weather conditions such as haze, rain, and snow, or noise may be introduced during image processing or transmission (e.g., artifacts generated during compression). The goal of image restoration is to restore the image to a desirable quality, both subjectively and objectively. Agricultural robotics is gaining interest because most agricultural work is lengthy and repetitive. Computer vision is crucial to robots, especially autonomous ones. However, it is challenging to formulate a precise mathematical model that describes the aforementioned problems. Compared with the traditional approach, a learning-based approach has an edge because it does not require an explicit model of the problem. Moreover, learning-based approaches now deliver best-in-class performance on most vision problems, such as image dehazing, super-resolution, and image recognition.
In this dissertation, we address the problems of image restoration and robotic vision with deep learning. These two problems are closely related from a network architecture perspective: it is essential to select an appropriate network for each problem. Specifically, we address single image dehazing, High Efficiency Video Coding (HEVC) loop filtering and super-resolution, and computer vision for an autonomous robot. Our technical contributions are threefold. First, we propose to reformulate haze as a signal-dependent noise, which allows us to uncover it by learning a structural residual. Based on this reformulation, we solve dehazing with a recursive deep residual network and a generative adversarial network, which emphasize objective and perceptual quality, respectively. Second, we replace traditional filters in HEVC with a Convolutional Neural Network (CNN) filter. We show that our CNN filter can achieve a 7% BD-rate saving compared with traditional filters such as the bilateral and deblocking filters. We also propose to incorporate a multi-scale CNN super-resolution module into HEVC. Such a post-processing module can improve visual quality under extremely low bandwidth. Third, a transfer learning technique is implemented to support the vision and autonomous decision making of a precision pollination robot. Good experimental results are reported with real-world data.
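The first contribution above treats haze as a signal-dependent noise that can be removed by estimating a residual. The following is a minimal sketch of that idea, assuming the standard atmospheric scattering model I = J·t + A·(1 − t); the abstract does not state which haze model the dissertation uses, so the model, the function names, and the toy data are all illustrative assumptions. The sketch only shows that the residual I − J depends on the clean image J, and that subtracting it recovers J; in a learning-based method a network would estimate this residual from the hazy input alone.

```python
import numpy as np


def add_haze(clean, transmission, airlight):
    """Assumed atmospheric scattering model: I = J*t + A*(1 - t)."""
    return clean * transmission + airlight * (1.0 - transmission)


def haze_residual(clean, transmission, airlight):
    """Residual I - J = (A - J) * (1 - t); note that it depends on J itself."""
    return (airlight - clean) * (1.0 - transmission)


rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 1.0, size=(4, 4, 3))          # hypothetical clean image J
transmission = rng.uniform(0.2, 0.9, size=(4, 4, 1))   # hypothetical transmission map t
airlight = 0.8                                          # hypothetical global airlight A

hazy = add_haze(clean, transmission, airlight)
residual = haze_residual(clean, transmission, airlight)

# Subtracting the (here, exactly known) residual restores the clean image.
restored = hazy - residual
```

Here the residual is computed from ground truth purely for illustration; it is not available at inference time, which is why it has to be learned.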
CARNet: Compression Artifact Reduction for Point Cloud Attribute
A learning-based adaptive loop filter is developed for the Geometry-based
Point Cloud Compression (G-PCC) standard to reduce attribute compression
artifacts. The proposed method first generates multiple Most-Probable Sample
Offsets (MPSOs) as potential compression distortion approximations, and then
linearly weights them for artifact mitigation. As such, we drive the filtered
reconstruction as close to the uncompressed point cloud attribute (PCA) as possible. To this end, we
devise a Compression Artifact Reduction Network (CARNet) which consists of two
consecutive processing phases: MPSOs derivation and MPSOs combination. The
MPSOs derivation uses a two-stream network to model local neighborhood
variations from direct spatial embedding and frequency-dependent embedding,
where sparse convolutions are utilized to best aggregate information from
sparsely and irregularly distributed points. The MPSOs combination is guided by
the least square error metric to derive weighting coefficients on the fly to
further capture content dynamics of input PCAs. The CARNet is implemented as an
in-loop filtering tool of G-PCC, where the linear weighting coefficients
are encapsulated into the bitstream with negligible bit rate overhead.
Experimental results demonstrate significant improvement over the latest G-PCC
both subjectively and objectively.
Comment: 13 pages, 8 figures
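The MPSOs combination phase described in this abstract (linearly weighting candidate offsets under a least square error criterion) can be sketched as an ordinary least-squares fit. This is an illustrative sketch only, not the CARNet implementation: the function name `combine_offsets`, the array shapes, and the synthetic data are assumptions, and the candidate offsets are random stand-ins for what the MPSOs derivation network would produce.

```python
import numpy as np


def combine_offsets(reconstructed, original, offsets):
    """Fit weights w minimizing ||original - (reconstructed + offsets @ w)||^2.

    reconstructed, original: (N,) attribute values for N points.
    offsets: (N, K) matrix holding K candidate sample offsets per point.
    Returns the (K,) weights and the (N,) filtered reconstruction.
    """
    target = original - reconstructed              # distortion to approximate
    weights, *_ = np.linalg.lstsq(offsets, target, rcond=None)
    filtered = reconstructed + offsets @ weights
    return weights, filtered


rng = np.random.default_rng(0)
n_points = 200
original = rng.uniform(0.0, 255.0, size=n_points)       # hypothetical uncompressed attribute
distortion = rng.normal(0.0, 4.0, size=n_points)        # hypothetical compression error
reconstructed = original - distortion

# Hypothetical candidate offsets: two noisy views of the distortion plus pure noise.
offsets = np.stack(
    [
        0.5 * distortion + rng.normal(0.0, 1.0, size=n_points),
        2.0 * distortion + rng.normal(0.0, 1.0, size=n_points),
        rng.normal(0.0, 1.0, size=n_points),
    ],
    axis=1,
)

weights, filtered = combine_offsets(reconstructed, original, offsets)
mse_before = np.mean((original - reconstructed) ** 2)
mse_after = np.mean((original - filtered) ** 2)
```

Because the fit needs the uncompressed attribute, it can only run at the encoder, which is consistent with the abstract's statement that the weighting coefficients are carried in the bitstream. With only K coefficients to transmit, the overhead stays small.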