Beyond Geo-localization: Fine-grained Orientation of Street-view Images by Cross-view Matching with Satellite Imagery
Street-view imagery provides us with novel experiences to explore different
places remotely. Carefully calibrated street-view images (e.g. Google Street
View) can be used for different downstream tasks, e.g. navigation and map feature
extraction. As personal high-quality cameras have become much more affordable
and portable, enormous numbers of crowdsourced street-view images are
uploaded to the internet, commonly with missing or noisy sensor
information. To prepare this hidden treasure for "ready-to-use" status,
determining the missing location and the camera orientation angles are two
equally important tasks. Recent methods have achieved high performance on
geo-localization of street-view images by cross-view matching with a pool of
geo-referenced satellite imagery. However, most of the existing works focus
more on geo-localization than estimating the image orientation. In this work,
we restate the importance of finding fine-grained orientation for street-view
images, formally define the problem and provide a set of evaluation metrics to
assess the quality of the orientation estimation. We propose two methods to
improve the granularity of the orientation estimation, achieving 82.4% and
72.3% accuracy for images with estimated angle errors below 2 degrees for CVUSA
and CVACT datasets, corresponding to 34.9% and 28.2% absolute improvement
compared to previous works. Integrating fine-grained orientation estimation in
training also improves geo-localization performance, yielding top-1 recall of
95.5%/85.5% and 86.8%/80.4% for the orientation-known/unknown tests on the two
datasets.
Comment: This paper has been accepted by ACM Multimedia 2022. This version
contains additional supplementary material.
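The headline metric above (fraction of images whose estimated orientation error is below 2 degrees) can be sketched as follows. This is a minimal illustration, not code from the paper; the function names are our own, and the error measure accounts for the 360-degree wrap-around of orientation angles:

```python
import numpy as np

def angle_error_deg(pred, true):
    """Smallest absolute difference between two angles in degrees,
    accounting for wrap-around at 360 (e.g. 359.5 vs 0.5 is 1 degree)."""
    diff = np.abs(np.asarray(pred, float) - np.asarray(true, float)) % 360.0
    return np.minimum(diff, 360.0 - diff)

def accuracy_at(pred, true, threshold_deg=2.0):
    """Fraction of samples whose orientation error is below the threshold."""
    err = angle_error_deg(pred, true)
    return float(np.mean(err < threshold_deg))

# Example: three estimates, two within 2 degrees of ground truth
pred = [359.5, 10.0, 182.5]
true = [0.5, 15.0, 181.0]
print(accuracy_at(pred, true))  # → 0.666...
```

The wrap-around handling matters: a naive absolute difference would score 359.5 vs 0.5 as a 359-degree error rather than 1 degree.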
Using Mutual Information To Combine Object Models
This paper introduces a randomized method for combining different object models. By determining a configuration of the models which maximizes their mutual information, the proposed method creates a unified hypothesis from multiple object models on the fly, without prior training. To validate the effectiveness of the proposed method, experiments are conducted in which human faces are detected and localized in images by combining different face models.
1 Introduction. All object models have their specific strengths and weaknesses depending on the context and the environment dynamics. Since no single object model is robust and general enough to cover all potential contexts, we propose to combine different models using mutual information. The ultimate goal of the proposed approach is to overcome the limitations of the individual models by the combination of multiple models on the fly, using information-theoretic concepts. While there are many computer vision algorithms for computing va..
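The core quantity the method maximizes can be illustrated with a small sketch: mutual information between the discrete outputs of two models, here reduced to binary detection masks for clarity. All names are illustrative assumptions, not from the paper, and real object models would produce richer outputs than binary maps:

```python
import numpy as np

def mutual_information(a, b, bins=2):
    """Mutual information (in nats) between two discrete label maps."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()             # joint distribution p(x, y)
    px = pxy.sum(axis=1, keepdims=True)   # marginal of a
    py = pxy.sum(axis=0, keepdims=True)   # marginal of b
    nz = pxy > 0                          # avoid log(0) on empty cells
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px * py)[nz])))

# Two face models agreeing on the same detection mask share maximal
# information; an unrelated response pattern shares far less.
mask_a = np.array([0, 0, 1, 1, 0, 1])
mask_b = mask_a.copy()                    # perfect agreement
mask_c = np.array([1, 0, 1, 0, 1, 0])    # unrelated pattern
print(mutual_information(mask_a, mask_b))  # ≈ log 2 ≈ 0.693
```

In the spirit of the abstract, one would search over configurations (e.g. relative placements or parameter settings of the models) and keep the configuration whose outputs score highest under such a measure.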
Fast and Robust Face Finding via Local Context
In visual surveillance, face detection can be an important cue for initializing tracking algorithms. Recent work in psychophysics hints at the importance of the local context of a face, such as head contours and torso, for robust detection. This paper describes a detector that actively utilizes the idea of local context. The promise is to gain robustness that goes beyond the capabilities of traditional face detection, making it particularly interesting for surveillance. The performance of the proposed detector, in terms of accuracy and speed, is evaluated on data sets from PETS 2000 and PETS 2003 and compared to the object-centered approach. Particular attention is paid to the role of available image resolution.