    noRANSAC for fundamental matrix estimation

    The estimation of the fundamental matrix from a set of corresponding points is a relevant topic in epipolar stereo geometry [10]. Due to the high number of outliers among the matches, RANSAC-based approaches [7, 13, 29] have been used to obtain the fundamental matrix. In this paper two new contributions are presented: a new normalized epipolar error measure which takes into account the shape of the features used as matches [17], and a new strategy to compare fundamental matrices. The proposed error measure gives good results and does not depend on the image scale. Moreover, the new evaluation strategy provides a valid tool to compare different RANSAC-based methods because it does not rely on the inlier ratio, which may not correspond to the best attainable fundamental matrix estimate, but instead makes use of a reference ground truth fundamental matrix obtained by a set of corresponding points given by the use
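
    For context, the kind of quantity such an error measure builds on can be sketched as follows. This is the standard symmetric epipolar distance, not the normalized, feature-shape-aware measure proposed in the paper, whose definition is not given in this abstract. The function names and the rectified-stereo example matrix are assumptions made for illustration.

```python
import math

def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def transpose(m):
    """Transpose a 3x3 matrix."""
    return [[m[j][i] for j in range(3)] for i in range(3)]

def point_line_distance(p, line):
    """Distance from 2D point p to the line a*x + b*y + c = 0.
    Assumes the line is not degenerate (a and b not both zero)."""
    a, b, c = line
    return abs(a * p[0] + b * p[1] + c) / math.hypot(a, b)

def symmetric_epipolar_distance(F, p1, p2):
    """Mean of the two point-to-epipolar-line distances for a match (p1, p2),
    using the convention x2^T F x1 = 0."""
    x1 = [p1[0], p1[1], 1.0]
    x2 = [p2[0], p2[1], 1.0]
    d2 = point_line_distance(p2, mat_vec(F, x1))             # p2 vs. line F x1
    d1 = point_line_distance(p1, mat_vec(transpose(F), x2))  # p1 vs. line F^T x2
    return 0.5 * (d1 + d2)

# Illustrative F for a rectified pair: epipolar lines are image rows.
F_RECTIFIED = [[0.0, 0.0, 0.0],
               [0.0, 0.0, -1.0],
               [0.0, 1.0, 0.0]]
```

    With `F_RECTIFIED`, a match lying on the same row has zero error, and a match whose rows differ by two pixels has an error of two pixels. Note that this plain distance is expressed in pixels, so it scales with the image size.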

    Rethinking the sGLOH descriptor

    sGLOH (shifting GLOH) is a histogram-based keypoint descriptor that can be associated with multiple quantized rotations of the keypoint patch without any recomputation. This property can be exploited to define the best distance between two descriptor vectors, thus avoiding computing the dominant orientation. In addition, sGLOH can reject incongruous correspondences by adding a global constraint on the rotations, either as a priori knowledge or based on the data. This paper thoroughly reconsiders sGLOH and improves it in terms of robustness, speed and descriptor dimension. The revised sGLOH embeds more quantized rotations, thus yielding more correct matches. A novel fast matching scheme is also designed, which significantly reduces both computation time and memory usage. In addition, a new binarization technique based on comparisons inside each descriptor histogram is defined, yielding a more compact, faster, yet robust alternative. Results of an exhaustive comparative experimental evaluation show that the revised sGLOH descriptor, incorporating the above ideas and combining them according to task requirements, improves on the state of the art in most cases in both image matching and object recognition
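
    The idea of obtaining rotated descriptors by shifting, and taking the best distance over the shifts, can be sketched as below. The layout is a deliberately simplified assumption (a single ring with m sectors and m orientation bins per sector, compared with the L1 distance); the actual sGLOH grid, the distances it uses and the fast matching scheme are not described in this abstract, and all names here are hypothetical.

```python
def rotate_descriptor(desc, k):
    """Rotate a toy descriptor by k quantized steps via cyclic shifts.
    desc[s][b]: value of orientation bin b in sector s (m sectors, m bins).
    A rotation moves both the sector index and the bin index by k."""
    m = len(desc)
    return [[desc[(s - k) % m][(b - k) % m] for b in range(m)] for s in range(m)]

def l1_distance(d1, d2):
    """Sum of absolute differences between two descriptors."""
    return sum(abs(x - y) for r1, r2 in zip(d1, d2) for x, y in zip(r1, r2))

def min_shift_distance(d1, d2):
    """Best distance over all quantized rotations of d2.
    Returns (distance, shift); no dominant orientation is estimated."""
    m = len(d1)
    return min((l1_distance(d1, rotate_descriptor(d2, k)), k) for k in range(m))
```

    For example, if `d2` is `d1` rotated by one step out of four, the minimum distance is zero and is reached at the shift that undoes the rotation (three steps). The returned shift is what a global rotation constraint could be applied to when filtering matches.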

    Guest editorial: Local image descriptors in computer vision

    ...This Special Issue includes seven original research papers that cover diverse and significant aspects of local image descriptor research. In particular, the order in which papers appear reflects the main phase they address, in an ideal computational pipeline starting with the localisation of salient points in an image and ending with the incorporation of spatial and temporal features in descriptor construction...

    Estimating the best reference homography for planar mosaics from videos

    This paper proposes a novel strategy to find the best reference homography in mosaics from video sequences. The reference homography globally minimizes the distortions induced on each image frame by the mosaic homography itself. This method is designed for planar mosaics, for which a bad choice of the first reference image frame can lead to severe distortions after concatenating several successive homographies. This often happens in the case of underwater mosaics with a non-flat seabed and no georeferencing information available. Given a video sequence of an almost planar surface, sub-mosaics with low distortions of temporally close image frames are computed and successively merged according to a hierarchical clustering procedure. A robust and effective feature tracker using an approximate global position map between image frames also allows the mosaic to be built between frames that are spatially close but not temporally consecutive. Sub-mosaics are successively merged by concatenating their relative homographies with another reference homography which minimizes the distortion on each frame of the fused image. Experimental results on challenging real underwater videos show the validity of the proposed method
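
    The concatenation of successive homographies mentioned above can be sketched as a chain of 3x3 matrix products. This only illustrates the composition step, under the assumed convention that the i-th homography maps frame i into frame i-1; it does not implement the paper's search for the best reference homography, and the helper names are hypothetical.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply_homography(h, p):
    """Map a 2D point through homography h (with homogeneous normalization)."""
    x, y = p
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)

def chain(homographies):
    """Compose [H_1, ..., H_n] into H_1 * H_2 * ... * H_n, which maps
    frame n into frame 0 when H_i maps frame i into frame i-1."""
    out = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for h in homographies:
        out = matmul3(out, h)
    return out
```

    Because every frame is mapped through the product of all the homographies that separate it from the reference, small per-frame scale or perspective changes compound along the chain, which is why the choice of reference matters.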


    The extraction of reliable and repeatable interest points among images is a fundamental step for automatic image orientation (Structure-From-Motion). Despite recent progress, open issues remain in challenging conditions, such as wide baselines and strong light variations. Over the years, traditional hand-crafted methods have been joined by learning-based approaches, which have progressively advanced the state of the art according to recent benchmarks. Notwithstanding these advancements, learning-based methods are often not suitable for real photogrammetric surveys due to their lack of rotation invariance, a fundamental requirement for these specific applications. This paper proposes a novel hybrid image matching pipeline which employs both hand-crafted and deep-based components to extract reliable rotation-invariant keypoints optimized for wide-baseline scenarios. The proposed hybrid pipeline was compared with other hand-crafted and learning-based state-of-the-art approaches on several photogrammetric datasets using metric ground-truth data. Results show that the proposed hybrid matching pipeline achieves high accuracy and was the only method among those evaluated able to register images in the most challenging wide-baseline scenarios

    Selective visual odometry for accurate AUV localization

    In this paper we present a stereo visual odometry system developed for autonomous underwater vehicle localization tasks. The main idea is to make use of only highly reliable data in the estimation process, employing a robust keypoint tracking approach and an effective keyframe selection strategy, so that camera movements are estimated with high accuracy even for long paths. Furthermore, in order to limit the drift error, camera pose estimation is referred to the last keyframe, selected by analyzing the feature temporal flow. The proposed system was tested on the KITTI evaluation framework and on the New Tsukuba stereo dataset to assess its effectiveness on long tracks and different illumination conditions. Results of a live archaeological campaign in the Mediterranean Sea, on an AUV equipped with a stereo camera pair, show that our solution can effectively work in underwater environments

    Fast adaptive frame preprocessing for 3D reconstruction

    This paper presents a new online preprocessing strategy to detect and discard bad frames in video sequences on the fly. These include frames where accurate localization of corresponding points is difficult, such as blurred frames, or which do not provide relevant information with respect to the previous frames in terms of texture, image contrast and non-flat areas. Unlike keyframe selectors and deblurring methods, the proposed approach is a fast preprocessing step working on a simple gradient statistic. It requires neither complex, time-consuming image processing, such as the computation of image feature keypoints, previous poses and 3D structure, nor a priori knowledge of the input sequence. The presented method provides a fast and useful frame pre-analysis which can be used to improve further image analysis tasks, including keyframe selection or blur detection, or to directly filter the video sequence as shown in the paper, improving the final 3D reconstruction by discarding noisy frames and decreasing the final computation time by removing some redundant frames. This scheme is adaptive, fast and works at runtime by exploiting the image gradient statistic of the last few frames of the video sequence. Experimental results show that the proposed frame selection strategy is robust and improves the final 3D reconstruction both in terms of the number of 3D points obtained and the reprojection error, while also reducing the computational time
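
    A minimal sketch of an adaptive, gradient-based frame filter in this spirit is given below. The specific statistic (mean absolute forward difference), the sliding-window median and the `window` and `ratio` parameters are assumptions chosen for illustration; the abstract does not specify the statistic or the decision rule actually used.

```python
from collections import deque
from statistics import median

def mean_gradient(img):
    """Mean absolute forward difference of a grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                total += abs(img[y][x + 1] - img[y][x])
                count += 1
            if y + 1 < h:
                total += abs(img[y + 1][x] - img[y][x])
                count += 1
    return total / count if count else 0.0

def select_frames(frames, window=5, ratio=0.5):
    """Keep frame i if its gradient statistic is at least `ratio` times the
    median statistic of the last `window` frames (hypothetical rule).
    The window includes discarded frames, so the threshold adapts to the
    sequence. Returns the indices of the kept frames."""
    recent = deque(maxlen=window)
    kept = []
    for i, frame in enumerate(frames):
        g = mean_gradient(frame)
        if not recent or g >= ratio * median(recent):
            kept.append(i)
        recent.append(g)
    return kept
```

    Only a running window of scalar statistics is stored, so the filter needs no keypoints, poses or knowledge of the whole sequence, which matches the online setting described above.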

    Accurate keyframe selection and keypoint tracking for robust visual odometry

    This paper presents a novel stereo visual odometry (VO) framework based on structure from motion, where robust keypoint tracking and matching are combined with an effective keyframe selection strategy. In order to track and find correct feature correspondences, a robust loop chain matching scheme on two consecutive stereo pairs is introduced. Keyframe selection is based on the proportion of features with high temporal disparity. This criterion relies on the observation that the error in the pose estimation propagates from the uncertainty of the 3D points, which is higher for distant points that have low 2D motion. Comparative results based on three VO datasets show that the proposed solution is remarkably effective and robust even for very long path lengths
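
    The keyframe criterion described above can be sketched as a simple ratio test on the 2D motion of tracked features. The thresholds `min_flow` and `min_fraction` and the function name are hypothetical; the abstract does not give the values or the exact definition of temporal disparity used in the paper.

```python
import math

def is_keyframe(prev_pts, curr_pts, min_flow=20.0, min_fraction=0.5):
    """Decide whether the current frame should become a keyframe.
    prev_pts[i] and curr_pts[i] are the (x, y) positions of the same tracked
    feature in the last keyframe and in the current frame. The frame is
    selected when at least `min_fraction` of the features moved by at least
    `min_flow` pixels (illustrative thresholds)."""
    if not prev_pts:
        return False
    moving = sum(
        1
        for (x0, y0), (x1, y1) in zip(prev_pts, curr_pts)
        if math.hypot(x1 - x0, y1 - y0) >= min_flow
    )
    return moving / len(prev_pts) >= min_fraction
```

    Features with little 2D motion typically belong to distant points, whose triangulated positions are more uncertain, so waiting until enough features show large motion favors pose estimates computed from better-conditioned 3D points.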

    Restoration and enhancement of historical stereo photos

    Restoration of digital visual media acquired from repositories of historical photographic and cinematographic material is of key importance for the preservation, study and transmission of the legacy of past cultures to the coming generations. In this paper, a fully automatic approach to the digital restoration of historical stereo photographs is proposed, referred to as Stacked Median Restoration plus (SMR+). The approach exploits the content redundancy in stereo pairs for detecting and fixing scratches, dust, dirt spots and many other defects in the original images, as well as improving contrast and illumination. This is done by estimating the optical flow between the images, and using it to register one view onto the other both geometrically and photometrically. Restoration is then accomplished in three steps: (1) image fusion according to the stacked median operator, (2) low-resolution detail enhancement by guided supersampling, and (3) iterative visual consistency checking and refinement. Each step implements an original algorithm specifically designed for this work. The restored image is fully consistent with the original content, thus improving over the methods based on image hallucination. Comparative results on three different datasets of historical stereograms show the effectiveness of the proposed approach, and its superiority over single-image denoising and super-resolution methods. Results also show that the performance of the state-of-the-art single-image deep restoration network Bringing Old Photos Back to Life (BOPBtL) can be strongly improved when the input image is pre-processed by SMR+
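
    The fusion step (1) can be illustrated with a per-pixel median over a stack of already registered images. This is a generic sketch of median stacking only: how SMR+ builds its stack from the two views and how the operator is defined in detail are not stated in this abstract, and the function name is hypothetical.

```python
from statistics import median

def stacked_median(stack):
    """Per-pixel median of a stack of registered grayscale images.
    Each image is a list of rows and all images share the same size.
    A defect present in only a minority of the images is suppressed."""
    h, w = len(stack[0]), len(stack[0][0])
    return [[median(img[y][x] for img in stack) for x in range(w)]
            for y in range(h)]
```

    For instance, if one of three aligned images has a bright scratch at a pixel where the other two agree, the median at that pixel takes the agreed value, so the scratch does not reach the fused image.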
