
    Single-Shot Global Localization via Graph-Theoretic Correspondence Matching

    This paper describes a method for global localization based on graph-theoretic association of instances between a query and a prior map. The proposed framework performs correspondence matching by solving a maximum clique problem (MCP). Whereas many existing global localization methods require the query and the dataset to be in the same modality, the graph-based abstraction makes the framework potentially applicable to other map and/or query modalities. We implement it with a semantically labeled 3D point cloud map and a semantic segmentation image as the query. Leveraging the graph-theoretic framework, the proposed method achieves global localization using only the map and the query. The method shows promising results on multiple large-scale simulated maps of urban scenes.
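    As a concrete illustration of the correspondence-matching step, the sketch below builds a consistency graph over candidate instance correspondences and extracts a maximum clique with networkx. It is simplified to instances with planar (2-D) positions, and the instance representation, the distance-consistency test, and the tolerance parameter are assumptions for illustration, not the paper's implementation.

```python
# Illustrative sketch of correspondence matching via a maximum clique.
# The consistency test and data layout are assumptions; the paper's MCP
# formulation and solver may differ.
import itertools
import networkx as nx
import numpy as np

def max_clique_correspondences(map_instances, query_instances, tol=0.5):
    """map_instances / query_instances: lists of (label, 2-D np.ndarray position)."""
    # Nodes: candidate correspondences with matching semantic labels.
    candidates = [
        (i, j)
        for i, (lm, _) in enumerate(map_instances)
        for j, (lq, _) in enumerate(query_instances)
        if lm == lq
    ]
    G = nx.Graph()
    G.add_nodes_from(range(len(candidates)))
    # Edges: pairs of correspondences whose inter-instance distances agree.
    for a, b in itertools.combinations(range(len(candidates)), 2):
        (i1, j1), (i2, j2) = candidates[a], candidates[b]
        if i1 == i2 or j1 == j2:
            continue  # enforce one-to-one matching
        d_map = np.linalg.norm(map_instances[i1][1] - map_instances[i2][1])
        d_query = np.linalg.norm(query_instances[j1][1] - query_instances[j2][1])
        if abs(d_map - d_query) < tol:
            G.add_edge(a, b)
    # Maximum clique = largest mutually consistent set of correspondences
    # (NP-hard in general; networkx's exact solver is fine for small graphs).
    clique, _ = nx.max_weight_clique(G, weight=None)
    return [candidates[k] for k in clique]
```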

    Learning based automatic face annotation for arbitrary poses and expressions from frontal images only

    Statistical approaches for building non-rigid deformable models, such as the active appearance model (AAM), have enjoyed great popularity in recent years, but typically require tedious manual annotation of training images. In this paper, a learning-based approach for the automatic annotation of visually deformable objects from a single annotated frontal image is presented and demonstrated on the example of automatically annotating face images that can be used for building AAMs for fitting and tracking. The approach first learns the correspondences between landmarks in a frontal image and those in a set of training images containing faces in arbitrary poses. Using this learner, virtual images of unseen faces at any pose for which the learner was trained can be reconstructed by predicting the new landmark locations and warping the texture from the frontal image. View-based AAMs are then built from the virtual images and used to automatically annotate unseen images, including images with different facial expressions, at any pose within the range spanned by the virtually reconstructed images. The approach is experimentally validated by automatically annotating face images from three different databases.
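    A minimal sketch of the frontal-to-pose correspondence learning described above: a per-pose linear regressor mapping frontal landmark coordinates to the coordinates observed at that pose. The linear model and the function names are assumptions for illustration; the paper's learner may be more sophisticated.

```python
# Per-pose landmark regressor sketch (assumed linear model, not the paper's).
import numpy as np

def fit_pose_regressor(frontal_landmarks, posed_landmarks):
    """frontal_landmarks, posed_landmarks: arrays of shape (n_faces, n_points * 2)."""
    X = np.hstack([frontal_landmarks, np.ones((len(frontal_landmarks), 1))])  # add bias column
    W, *_ = np.linalg.lstsq(X, posed_landmarks, rcond=None)  # least-squares fit
    return W  # shape (n_points * 2 + 1, n_points * 2)

def predict_landmarks(W, frontal_landmarks_single):
    """Predict landmark locations of one unseen face at the learned pose."""
    x = np.append(frontal_landmarks_single, 1.0)
    return x @ W
```

    As the abstract notes, the predicted landmarks would then drive a texture warp from the annotated frontal image to synthesize the virtual training image for that pose.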

    $\mathcal{X}$-Metric: An N-Dimensional Information-Theoretic Framework for Groupwise Registration and Deep Combined Computing

    This paper presents a generic probabilistic framework for estimating the statistical dependency and finding the anatomical correspondences among an arbitrary number of medical images. The method builds on a novel formulation of the $N$-dimensional joint intensity distribution by representing the common anatomy as latent variables and estimating the appearance model with nonparametric estimators. Through connection to maximum likelihood and the expectation-maximization algorithm, an information-theoretic metric called the $\mathcal{X}$-metric and a co-registration algorithm named $\mathcal{X}$-CoReg are induced, allowing groupwise registration of the $N$ observed images with computational complexity of $\mathcal{O}(N)$. Moreover, the method naturally extends to a weakly-supervised scenario where anatomical labels of certain images are provided. This leads to a combined-computing framework implemented with deep learning, which performs registration and segmentation simultaneously and collaboratively in an end-to-end fashion. Extensive experiments were conducted to demonstrate the versatility and applicability of our model, including multimodal groupwise registration, motion correction for dynamic contrast-enhanced magnetic resonance images, and deep combined computing for multimodal medical images. Results show the superiority of our method in various applications in terms of both accuracy and efficiency, highlighting the advantage of the proposed representation of the imaging process.
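    A generic form of the latent-anatomy intensity model that the abstract alludes to is sketched below; the notation and the exact factorization are assumptions for illustration, not the paper's definition.

```latex
% Sketch of a latent-anatomy joint intensity model (notation assumed):
%   z       : latent common-anatomy label at spatial location x
%   I_n     : intensity of the n-th image, n = 1, ..., N
%   \phi_n  : spatial transformation applied to the n-th image
%   \theta  : appearance (intensity) model parameters
\[
p\bigl(I_1,\dots,I_N \mid \phi_1,\dots,\phi_N; \theta\bigr)
  = \prod_{x} \sum_{z} p(z)\, \prod_{n=1}^{N} p_{\theta}\bigl(I_n(\phi_n(x)) \mid z\bigr)
\]
```

    Because the images are conditionally independent given the latent anatomy, an expectation-maximization scheme for the transformations touches each of the $N$ images once per iteration, which is consistent with the $\mathcal{O}(N)$ complexity stated above.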

    Automated Complexity-Sensitive Image Fusion

    To construct a complete representation of a scene despite environmental obstacles such as fog, smoke, darkness, or textural homogeneity, multisensor video streams captured in different modalities are considered. A computational method for automatically fusing multimodal image streams into a highly informative and unified stream is proposed. The method consists of the following steps:
    1. Image registration is performed to align video frames in the visible band over time, adapting to the nonplanarity of the scene by automatically subdividing the image domain into regions approximating planar patches.
    2. Wavelet coefficients are computed for each of the input frames in each modality.
    3. Corresponding regions and points are compared using spatial and temporal information across various scales.
    4. Decision rules based on the results of multimodal image analysis are used to combine the wavelet coefficients from different modalities.
    5. The combined wavelet coefficients are inverted to produce an output frame containing useful information gathered from the available modalities.
    Experiments show that the proposed system is capable of producing fused output that retains the characteristics of color visible-spectrum imagery while adding information exclusive to infrared imagery, with attractive visual and informational properties.
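    A minimal sketch of the wavelet-domain fusion in steps 2, 4, and 5, assuming two already co-registered single-channel frames and using PyWavelets. The maximum-magnitude selection rule for detail coefficients (and averaging for approximation coefficients) stands in for the complexity-sensitive decision rules of step 4 and is an assumption, not the paper's rule.

```python
# Wavelet fusion sketch: decompose, combine coefficients, reconstruct.
# The selection rule is a common placeholder, not the paper's decision logic.
import numpy as np
import pywt

def fuse_frames(visible, infrared, wavelet="db2", level=3):
    """visible, infrared: co-registered 2-D float arrays of equal shape."""
    c_vis = pywt.wavedec2(visible, wavelet, level=level)
    c_ir = pywt.wavedec2(infrared, wavelet, level=level)

    # Approximation coefficients: average the two modalities.
    fused = [(c_vis[0] + c_ir[0]) / 2.0]
    # Detail coefficients: keep whichever modality has the larger magnitude.
    for dv, di in zip(c_vis[1:], c_ir[1:]):
        fused.append(tuple(
            np.where(np.abs(v) >= np.abs(i), v, i) for v, i in zip(dv, di)
        ))
    # Invert the combined coefficients to obtain the fused frame (step 5).
    return pywt.waverec2(fused, wavelet)
```

    For color visible-band input, a natural extension would be to fuse the infrared frame into a luminance channel and keep the chrominance, though the specific handling of color in the proposed system is not detailed in the abstract.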