
    Beyond Geo-localization: Fine-grained Orientation of Street-view Images by Cross-view Matching with Satellite Imagery

    Street-view imagery provides us with novel experiences to explore different places remotely. Carefully calibrated street-view images (e.g. Google Street View) can be used for different downstream tasks, e.g. navigation and map feature extraction. As personal high-quality cameras have become much more affordable and portable, an enormous amount of crowdsourced street-view images are uploaded to the internet, but commonly with missing or noisy sensor information. To prepare this hidden treasure for "ready-to-use" status, determining the missing location information and the camera orientation angles are two equally important tasks. Recent methods have achieved high performance on geo-localization of street-view images by cross-view matching with a pool of geo-referenced satellite imagery. However, most existing works focus more on geo-localization than on estimating the image orientation. In this work, we restate the importance of finding fine-grained orientation for street-view images, formally define the problem, and provide a set of evaluation metrics to assess the quality of the orientation estimation. We propose two methods to improve the granularity of the orientation estimation, achieving 82.4% and 72.3% accuracy for images with estimated angle errors below 2 degrees on the CVUSA and CVACT datasets, corresponding to 34.9% and 28.2% absolute improvements over previous works. Integrating fine-grained orientation estimation into training also improves geo-localization performance, giving top-1 recall of 95.5%/85.5% and 86.8%/80.4% for orientation-known/unknown tests on the two datasets.
    Comment: This paper has been accepted by ACM Multimedia 2022. This version contains additional supplementary material.
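    The abstract's headline metric is the fraction of images whose estimated orientation deviates from the ground truth by less than 2 degrees. Below is a minimal sketch of such an accuracy@threshold metric; the function names are illustrative and not taken from the paper, and the key detail is that heading errors must be computed circularly, since 359° and 1° are only 2° apart.

```python
import numpy as np

def angular_error_deg(pred, true):
    """Smallest absolute difference between two headings, in degrees (0-180)."""
    diff = np.abs(np.asarray(pred, dtype=float) - np.asarray(true, dtype=float)) % 360.0
    return np.minimum(diff, 360.0 - diff)

def accuracy_at(pred, true, threshold=2.0):
    """Fraction of images whose orientation error is below `threshold` degrees,
    the kind of metric behind the reported 82.4% / 72.3% figures."""
    return float(np.mean(angular_error_deg(pred, true) < threshold))

# Toy usage: the three errors are 1.5, 5.0 and 1.5 degrees, so accuracy@2 is 2/3.
print(accuracy_at([359.5, 10.0, 182.5], [1.0, 15.0, 181.0]))
```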

    Using Mutual Information To Combine Object Models

    This paper introduces a randomized method for combining different object models. By determining a configuration of the models that maximizes their mutual information, the proposed method creates a unified hypothesis from multiple object models on the fly, without prior training. To validate the effectiveness of the proposed method, experiments are conducted in which human faces are detected and localized in images by combining different face models.
    1 Introduction. All object models have their specific strengths and weaknesses depending on the context and the environment dynamics. Since no single object model is robust and general enough to cover all potential contexts, we propose to combine different models using mutual information. The ultimate goal of the proposed approach is to overcome the limitations of the individual models through the combination of multiple models on the fly, using information-theoretic concepts. While there are many computer vision algorithms for computing va..
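    The snippet says the method maximizes the mutual information of a model configuration but does not say how MI is estimated. The following is a common histogram-based sketch, assuming the model outputs have been discretized (here, binary face/no-face labels); all names are illustrative, not from the paper.

```python
import numpy as np

def mutual_information(x, y, bins=2):
    """Histogram-based MI estimate (in nats) between two discretized
    model outputs; higher MI means the models agree more consistently."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal distribution of x
    py = pxy.sum(axis=0, keepdims=True)   # marginal distribution of y
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

# Toy usage: binary labels from two hypothetical face models over 1000
# image windows; model_b agrees with model_a 80% of the time, so the
# estimated MI is clearly above zero.
rng = np.random.default_rng(0)
model_a = rng.integers(0, 2, 1000)
model_b = np.where(rng.random(1000) < 0.8, model_a, 1 - model_a)
print(mutual_information(model_a, model_b))
```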

    Skin Patch Detection in Real-World Images


    Fast and Robust Face Finding via Local Context

    In visual surveillance, face detection can be an important cue for initializing tracking algorithms. Recent work in psychophysics hints at the importance of the local context of a face, such as head contours and the torso, for robust detection. This paper describes a detector that actively utilizes the idea of local context. The promise is to gain robustness beyond the capabilities of traditional face detection, making it particularly interesting for surveillance. The performance of the proposed detector in terms of accuracy and speed is evaluated on data sets from PETS 2000 and PETS 2003 and compared to the object-centered approach. Particular attention is paid to the role of available image resolution.
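    The abstract does not specify the geometry of the context window, so the sketch below only illustrates the core idea: grow a candidate face box so that it also covers the head contour and upper torso before classification. All scale factors are hypothetical.

```python
def expand_to_local_context(x, y, w, h, img_w, img_h,
                            width_scale=2.0, torso_scale=2.5):
    """Grow a face box (x, y, w, h) into a local-context window covering
    head contour and upper torso, clipped to the image bounds. The scale
    factors are illustrative and not taken from the paper."""
    cx = x + w / 2.0
    new_w = w * width_scale
    new_x = max(0.0, cx - new_w / 2.0)
    new_y = max(0.0, y - 0.5 * h)        # headroom for hair / head contour
    new_h = h * (1.0 + torso_scale)      # extend downward over the torso
    x2 = min(float(img_w), new_x + new_w)
    y2 = min(float(img_h), new_y + new_h)
    return new_x, new_y, x2 - new_x, y2 - new_y

# A 40x40 face at (100, 80) in a 320x240 frame becomes the larger window
# (80, 60, 80, 140) that a context-trained classifier would score instead.
print(expand_to_local_context(100, 80, 40, 40, 320, 240))
```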