A temporal phase coherence estimation algorithm and its application on DInSAR pixel selection
Pixel selection is a crucial step in all advanced Differential Interferometric Synthetic Aperture Radar (DInSAR) techniques and has a direct impact on the quality of the final DInSAR products. In this paper, a full-resolution phase quality estimator, the temporal phase coherence (TPC), is proposed for DInSAR pixel selection. The method works with both distributed scatterers (DSs) and permanent scatterers (PSs). The influence on TPC of different neighboring window sizes and types of interferogram combinations [both single-master (SM) and multi-master (MM)] has been studied. The relationship between TPC and the phase standard deviation (STD) of the selected pixels has also been derived. Together with the classical coherence and amplitude dispersion methods, the TPC pixel selection algorithm has been tested on 37 VV-polarization Radarsat-2 images of Barcelona Airport. Results show the feasibility and effectiveness of the TPC pixel selection algorithm. Besides a clear increase in the number of selected pixels, the new method shows further advantages over the two classical approaches. The proposed algorithm has an affordable computational cost and is easy to implement and incorporate into any advanced DInSAR processing chain for the identification of high-quality pixels.
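For readers who want the gist of the estimator, a minimal sketch of temporal coherence computed over an interferogram stack is given below; the array names, the use of a pre-computed spatially filtered phase as the model, and the selection threshold are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def temporal_phase_coherence(ifg_phase, model_phase):
    """Temporal phase coherence (TPC) per pixel.

    ifg_phase, model_phase: (N, H, W) arrays holding the observed and the
    estimated (e.g., spatially filtered) phases of N interferograms.
    Returns an (H, W) map in [0, 1]; values near 1 mark phase-stable pixels.
    """
    residual = ifg_phase - model_phase   # per-interferogram phase residual
    return np.abs(np.mean(np.exp(1j * residual), axis=0))

# Pixels whose TPC exceeds a quality threshold are kept for DInSAR processing:
# tpc = temporal_phase_coherence(phases, filtered)
# selected = tpc > 0.7   # threshold value is an illustrative choice
```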
Solving Inverse Problems with Piecewise Linear Estimators: From Gaussian Mixture Models to Structured Sparsity
A general framework for solving image inverse problems is introduced in this
paper. The approach is based on Gaussian mixture models, estimated via a
computationally efficient MAP-EM algorithm. A dual mathematical interpretation
of the proposed framework with structured sparse estimation is described, which
shows that the resulting piecewise linear estimate stabilizes the estimation
when compared to traditional sparse inverse problem techniques. This
interpretation also suggests an effective dictionary-motivated initialization
for the MAP-EM algorithm. We demonstrate that in a number of image inverse
problems, including inpainting, zooming, and deblurring, the same algorithm
produces results that are comparable to, often significantly better than, and
at worst marginally inferior to the best published ones, at a lower computational cost.
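A minimal sketch of the piecewise linear estimator for the denoising case (identity degradation operator) is shown below, assuming the GMM parameters have already been produced by the EM iterations; function and variable names are illustrative. Each patch is assigned to the Gaussian model with the highest marginal likelihood and filtered with that model's Wiener (linear MAP) estimator.

```python
import numpy as np

def gmm_map_denoise(patches, means, covs, sigma):
    """Piecewise linear MAP estimate of clean patches under a GMM prior.

    patches: (M, d) noisy patches; means: (K, d) and covs: (K, d, d) are the
    Gaussian model parameters; sigma is the noise standard deviation. One
    linear (Wiener) filter per model, selected per patch, makes the overall
    estimator piecewise linear.
    """
    M, d = patches.shape
    K = means.shape[0]
    noise_cov = sigma ** 2 * np.eye(d)
    loglik = np.empty((K, M))
    estimates = np.empty((K, M, d))
    for k in range(K):
        S = covs[k] + noise_cov               # covariance of the noisy patch
        diff = patches - means[k]
        sol = np.linalg.solve(S, diff.T).T    # rows: S^{-1} (y - mu_k)
        _, logdet = np.linalg.slogdet(S)
        loglik[k] = -0.5 * (np.sum(diff * sol, axis=1) + logdet)
        estimates[k] = means[k] + sol @ covs[k]  # mu + Sigma S^{-1} (y - mu)
    best = np.argmax(loglik, axis=0)             # model selection per patch
    return estimates[best, np.arange(M)]
```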
JND-Based Perceptual Video Coding for 4:4:4 Screen Content Data in HEVC
The JCT-VC standardized Screen Content Coding (SCC) extension in the HEVC HM
RExt + SCM reference codec offers an impressive coding efficiency performance
when compared with HM RExt alone; however, it is not significantly perceptually
optimized. For instance, it does not include advanced HVS-based perceptual
coding methods, such as JND-based spatiotemporal masking schemes. In this
paper, we propose a novel JND-based perceptual video coding technique for HM
RExt + SCM. The proposed method is designed to further improve the compression
performance of HM RExt + SCM when applied to YCbCr 4:4:4 SC video data. In the
proposed technique, luminance masking and chrominance masking are exploited to
perceptually adjust the Quantization Step Size (QStep) at the Coding Block (CB)
level. Compared with HM RExt 16.10 + SCM 8.0, the proposed method considerably
reduces bitrates (kbps), with a maximum reduction of 48.3%. In addition,
subjective evaluations reveal that the proposed technique, SC-PAQ, achieves
visually lossless coding at very low bitrates.
Comment: Preprint, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018).
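As a rough illustration of JND-driven quantization control (not the SC-PAQ model itself), the sketch below scales a coding block's QStep by a toy luminance-masking factor: sensitivity is assumed highest at mid-grey and lower at the luminance extremes, so darker and brighter blocks can absorb a larger quantization step.

```python
import numpy as np

def luminance_masking_scale(cb_luma):
    """Toy luminance-masking factor for one coding block (luma in [0, 255]).

    Visibility thresholds are assumed lowest at mid-grey and higher at the
    luminance extremes. The curve and its constants are illustrative
    assumptions, not the model from the paper.
    """
    mean_luma = float(np.mean(cb_luma))
    return 1.0 + 0.5 * ((mean_luma - 128.0) / 128.0) ** 2   # 1.0 .. ~1.5

def perceptual_qstep(base_qstep, cb_luma):
    # Scale the encoder's QStep per coding block before quantization.
    return base_qstep * luminance_masking_scale(cb_luma)
```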
Subjective Annotation for a Frame Interpolation Benchmark using Artefact Amplification
Current benchmarks for optical flow algorithms evaluate the estimation either
directly by comparing the predicted flow fields with the ground truth or
indirectly by using the predicted flow fields for frame interpolation and then
comparing the interpolated frames with the actual frames. In the latter case,
objective quality measures such as the mean squared error are typically
employed. However, it is well known that for image quality assessment, the
actual quality experienced by the user cannot be fully deduced from such simple
measures. Hence, we conducted a subjective quality assessment crowdsourcing
study for the interpolated frames provided by one of the optical flow
benchmarks, the Middlebury benchmark. We collected forced-choice paired
comparisons between interpolated images and corresponding ground truth. To
increase the sensitivity of observers when judging minute differences in paired
comparisons we introduced a new method to the field of full-reference quality
assessment, called artefact amplification. From the crowdsourcing data, we
reconstructed absolute quality scale values according to Thurstone's model. As
a result, we obtained a re-ranking of the 155 participating algorithms w.r.t.
the visual quality of the interpolated frames. This re-ranking not only shows
the necessity of visual quality assessment as a further evaluation metric for
optical flow and frame interpolation benchmarks; the results also provide the
ground truth for designing novel image quality assessment (IQA) methods
dedicated to perceptual quality of interpolated images. As a first step, we
proposed such a new full-reference method, called WAE-IQA. By weighting the
local differences between an interpolated image and its ground truth, WAE-IQA
performed slightly better than the currently best FR-IQA approach from the
literature.
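Reconstructing absolute quality scale values from forced-choice paired comparisons is a standard procedure; a minimal Thurstone Case V sketch is given below, assuming a square win-count matrix as input. The clipping constant is an illustrative choice that keeps the z-scores finite.

```python
import numpy as np
from scipy.stats import norm

def thurstone_case_v(wins):
    """Quality scale values from a paired-comparison win-count matrix.

    wins[i, j] = number of times condition i was preferred over condition j.
    Win proportions are mapped to z-scores (Thurstone Case V) and averaged.
    """
    totals = wins + wins.T                            # comparisons per pair
    p = np.where(totals > 0, wins / np.maximum(totals, 1), 0.5)
    p = np.clip(p, 0.01, 0.99)                        # keep z-scores finite
    z = norm.ppf(p)
    np.fill_diagonal(z, 0.0)
    return z.mean(axis=1)                             # higher = better quality
```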
Real-Time RGB-D Camera Pose Estimation in Novel Scenes using a Relocalisation Cascade
Camera pose estimation is an important problem in computer vision. Common
techniques either match the current image against keyframes with known poses,
directly regress the pose, or establish correspondences between keypoints in
the image and points in the scene to estimate the pose. In recent years,
regression forests have become a popular alternative to establish such
correspondences. They achieve accurate results, but have traditionally needed
to be trained offline on the target scene, preventing relocalisation in new
environments. Recently, we showed how to circumvent this limitation by adapting
a pre-trained forest to a new scene on the fly. The adapted forests achieved
relocalisation performance that was on par with that of offline forests, and
our approach was able to estimate the camera pose in close to real time. In
this paper, we present an extension of this work that achieves significantly
better relocalisation performance whilst running fully in real time. To achieve
this, we make several changes to the original approach: (i) instead of
accepting the camera pose hypothesis without question, we make it possible to
score the final few hypotheses using a geometric approach and select the most
promising; (ii) we chain several instantiations of our relocaliser together in
a cascade, allowing us to try faster but less accurate relocalisation first,
only falling back to slower, more accurate relocalisation as necessary; and
(iii) we tune the parameters of our cascade to achieve effective overall
performance. These changes allow us to significantly improve upon the
performance our original state-of-the-art method was able to achieve on the
well-known 7-Scenes and Stanford 4 Scenes benchmarks. As additional
contributions, we present a way of visualising the internal behaviour of our
forests and show how to entirely circumvent the need to pre-train a forest on a
generic scene.
Comment: Tommaso Cavallari, Stuart Golodetz, Nicholas Lord and Julien Valentin assert joint first authorship.
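A minimal sketch of the cascade logic described above follows; the `Hypothesis` container, the relocaliser call signature, and the per-stage score thresholds are hypothetical stand-ins for the authors' components.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

import numpy as np

@dataclass
class Hypothesis:
    pose: np.ndarray   # 4x4 camera-to-world transform
    score: float       # geometric verification score (e.g., inlier count)

def relocalise_cascade(frame,
                       relocalisers: Sequence[Callable[[object], Optional[Hypothesis]]],
                       thresholds: Sequence[float]) -> Optional[Hypothesis]:
    """Run fast relocalisers first and fall back to slower, more accurate
    ones only when the geometrically verified score is not yet good enough."""
    best: Optional[Hypothesis] = None
    for relocalise, threshold in zip(relocalisers, thresholds):
        hypothesis = relocalise(frame)
        if hypothesis is not None and (best is None or hypothesis.score > best.score):
            best = hypothesis
        if best is not None and best.score >= threshold:
            return best        # early exit: this stage was good enough
    return best                # all stages tried; return the best found
```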
Recurrent Scene Parsing with Perspective Understanding in the Loop
Objects may appear at arbitrary scales in perspective images of a scene,
posing a challenge for recognition systems that process images at a fixed
resolution. We propose a depth-aware gating module that adaptively selects the
pooling field size in a convolutional network architecture according to the
object scale (inversely proportional to the depth) so that small details are
preserved for distant objects while larger receptive fields are used for those
nearby. The depth gating signal is provided by stereo disparity or estimated
directly from monocular input. We integrate this depth-aware gating into a
recurrent convolutional neural network to perform semantic segmentation. Our
recurrent module iteratively refines the segmentation results, leveraging the
depth and semantic predictions from the previous iterations.
Through extensive experiments on four popular large-scale RGB-D datasets, we
demonstrate this approach achieves competitive semantic segmentation
performance with a model which is substantially more compact. We carry out
extensive analysis of this architecture including variants that operate on
monocular RGB but use depth as side-information during training, unsupervised
gating as a generic attentional mechanism, and multi-resolution gating. We find
that gated pooling for joint semantic segmentation and depth yields
state-of-the-art results for quantitative monocular depth estimation.
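The following sketch illustrates the gating idea with a hard, depth-derived selection among pooled feature maps; the paper's module is a learned soft gate inside the network, and the depth range and index mapping here are illustrative assumptions.

```python
import numpy as np

def depth_aware_gating(pyramid, depth, d_min=0.5, d_max=20.0):
    """Hard per-pixel gate over pooled feature maps of growing field size.

    pyramid: list of (C, H, W) feature maps at a common resolution, with
    pyramid[0] pooled over the smallest field and pyramid[-1] the largest.
    depth: (H, W) depth map. Nearby pixels (small depth, large apparent
    object scale) take coarsely pooled features; distant pixels keep fine
    detail.
    """
    n = len(pyramid)
    d = np.clip(depth, d_min, d_max)
    t = (np.log(d) - np.log(d_min)) / (np.log(d_max) - np.log(d_min))
    scale = np.rint((1.0 - t) * (n - 1)).astype(int)   # near -> coarse scale
    out = np.empty_like(pyramid[0])
    for s in range(n):
        mask = scale == s
        out[:, mask] = pyramid[s][:, mask]
    return out
```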
Foreground Detection in Camouflaged Scenes
Foreground detection has been widely studied for decades due to its
importance in many practical applications. Most of the existing methods assume
foreground and background show visually distinct characteristics and thus the
foreground can be detected once a good background model is obtained. However,
there are many situations where this is not the case. Of particular interest in
video surveillance is the camouflage case. For example, an active attacker
camouflages by intentionally wearing clothes that are visually similar to the
background. In such cases, even given a decent background model, it is not
trivial to detect foreground objects. This paper proposes a texture guided
weighted voting (TGWV) method which can efficiently detect foreground objects
in camouflaged scenes. The proposed method employs the stationary wavelet
transform to decompose the image into frequency bands. We show that the small
and hardly noticeable differences between foreground and background in the
image domain can be effectively captured in certain wavelet frequency bands. To
make the final foreground decision, a weighted voting scheme is developed based
on intensity and texture of all the wavelet bands with weights carefully
designed. Experimental results demonstrate that the proposed method achieves
superior performance compared to the current state-of-the-art results.
Comment: IEEE International Conference on Image Processing, 201
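A toy version of band-wise voting on a stationary wavelet decomposition (using PyWavelets) is sketched below; the per-band vote rule and the weights are placeholders rather than the carefully designed TGWV weighting from the paper.

```python
import numpy as np
import pywt  # PyWavelets

def wavelet_foreground_vote(frame, background, wavelet="haar", level=2,
                            weights=None, vote_thresh=0.5):
    """Toy band-wise vote on a stationary wavelet transform (SWT).

    frame, background: greyscale images whose sides are divisible by
    2**level (an SWT requirement). Each approximation/detail band casts a
    vote where the frame departs strongly from the background model; the
    weighted vote is thresholded into a foreground mask.
    """
    f_bands = pywt.swt2(frame.astype(float), wavelet, level=level)
    b_bands = pywt.swt2(background.astype(float), wavelet, level=level)
    weights = weights or [1.0] * level
    votes = np.zeros(frame.shape)
    total = 0.0
    for (fa, fd), (ba, bd), w in zip(f_bands, b_bands, weights):
        for f_sub, b_sub in [(fa, ba)] + list(zip(fd, bd)):
            diff = np.abs(f_sub - b_sub)
            votes += w * (diff > diff.mean() + 2.0 * diff.std())
            total += w
    return (votes / total) > vote_thresh
```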
Geometry-based spherical JND modeling for 360° display
360° videos have received widespread attention due to their realistic
and immersive experiences for users. To date, how to accurately model user
perceptions on 360° displays is still a challenging issue. In this paper,
we exploit the visual characteristics of 360° projection and display and
extend the popular just noticeable difference (JND) model to spherical JND
(SJND). First, we propose a quantitative 2D-JND model by jointly considering
spatial contrast sensitivity, luminance adaptation and texture masking effect.
In particular, our model introduces an entropy-based region classification and
utilizes different parameters for different types of regions for better
modeling performance. Second, we extend our 2D-JND model to SJND by jointly
exploiting latitude projection and field of view during 360° display.
With this operation, SJND reflects the characteristics of both the human
visual system and the 360° display. Third, our SJND model is more consistent
with user perceptions in subjective tests and tolerates more distortion at
lower bit rates during 360° video compression. To further examine the
effectiveness of our SJND model, we embed it in Versatile Video Coding (VVC)
compression. Compared with the state of the art, our SJND-VVC framework
significantly reduces the bit rate with negligible loss in visual quality.
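As an example of the luminance-adaptation ingredient of a 2D-JND model, the sketch below implements the classic Chou-and-Li-style visibility threshold curve; the paper's full model additionally includes contrast sensitivity, texture masking, entropy-based region classification, and the spherical (SJND) extension.

```python
import numpy as np

def luminance_jnd(bg_luma):
    """Classic luminance-adaptation visibility threshold (Chou & Li style).

    bg_luma: mean background luminance in [0, 255] (scalar or array).
    Returns the JND threshold in grey levels: high in dark regions, minimal
    around mid-grey, and slowly rising for bright backgrounds.
    """
    bg = np.asarray(bg_luma, dtype=float)
    dark = 17.0 * (1.0 - np.sqrt(bg / 127.0)) + 3.0   # dark backgrounds
    bright = 3.0 / 128.0 * (bg - 127.0) + 3.0         # bright backgrounds
    return np.where(bg <= 127.0, dark, bright)
```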