20 research outputs found

    Understanding the Limitations of CNN-based Absolute Camera Pose Regression

    Visual localization is the task of accurate camera pose estimation in a known scene. It is a key problem in computer vision and robotics, with applications including self-driving cars, Structure-from-Motion, SLAM, and Mixed Reality. Traditionally, the localization problem has been tackled using 3D geometry. Recently, end-to-end approaches based on convolutional neural networks have become popular. These methods learn to directly regress the camera pose from an input image. However, they do not achieve the same level of pose accuracy as 3D structure-based methods. To understand this behavior, we develop a theoretical model for camera pose regression. We use our model to predict failure cases for pose regression techniques and verify our predictions through experiments. We furthermore use our model to show that pose regression is more closely related to pose approximation via image retrieval than to accurate pose estimation via 3D structure. A key result is that current approaches do not consistently outperform a handcrafted image retrieval baseline. This clearly shows that additional research is needed before pose regression algorithms are ready to compete with structure-based methods. Comment: Initial version of a paper accepted to CVPR 2019.
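    The image retrieval baseline referred to above can be sketched as a nearest-neighbour search over global image descriptors, where the pose of the closest database image is taken as an approximation of the query pose. A minimal illustration (the descriptors and poses below are hypothetical placeholders, not the paper's actual baseline):

```python
import math

def retrieve_pose(query_desc, db):
    """Approximate the query pose by the pose of the database image
    whose global descriptor is closest in Euclidean distance."""
    best_pose, best_dist = None, float("inf")
    for desc, pose in db:
        dist = math.dist(query_desc, desc)
        if dist < best_dist:
            best_dist, best_pose = dist, pose
    return best_pose

# Toy database: (global descriptor, (x, y, z) camera position) pairs.
db = [
    ([0.0, 1.0], (0.0, 0.0, 0.0)),
    ([1.0, 0.0], (2.0, 0.0, 0.0)),
]
print(retrieve_pose([0.9, 0.1], db))  # -> (2.0, 0.0, 0.0), the closer entry
```

    The paper's point is that a pose regression network behaves much like this retrieval step: it interpolates between training poses rather than reasoning about 3D structure.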

    To Learn or Not to Learn: Visual Localization from Essential Matrices

    Visual localization is the problem of estimating the camera pose within a scene and a key component in computer vision applications such as self-driving cars and Mixed Reality. State-of-the-art approaches for accurate visual localization use scene-specific representations, resulting in the overhead of constructing these models when applying the techniques to new scenes. Recently, deep learning-based approaches based on relative pose estimation have been proposed, carrying the promise of easily adapting to new scenes. However, it has been shown that such approaches are currently significantly less accurate than state-of-the-art approaches. In this paper, we are interested in analyzing this behavior. To this end, we propose a novel framework for visual localization from relative poses. Using a classical feature-based approach within this framework, we show state-of-the-art performance. Replacing the classical approach with learned alternatives at various levels, we then identify the reasons why deep-learned approaches do not perform well. Based on our analysis, we make recommendations for future work. Comment: Accepted to ICRA 2020.
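    The relative-pose framework described above ultimately composes an estimated relative pose with the known absolute pose of a reference image. Assuming the usual world-to-camera convention (R, t), composition reads R_q = R_rel · R_ref and t_q = R_rel · t_ref + t_rel; a minimal pure-Python sketch (the example poses are illustrative only):

```python
def mat_mul(A, B):
    """3x3 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def mat_vec(A, v):
    """3x3 matrix times 3-vector."""
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def compose(R_rel, t_rel, R_ref, t_ref):
    """Absolute query pose from a known reference pose and an estimated
    reference-to-query relative pose (world-to-camera convention)."""
    R_q = mat_mul(R_rel, R_ref)
    t_q = [a + b for a, b in zip(mat_vec(R_rel, t_ref), t_rel)]
    return R_q, t_q

I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
Rz90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]  # 90 degree rotation about z
R_q, t_q = compose(Rz90, [0, 0, 0], I, [1, 0, 0])
print(R_q, t_q)  # query rotated by Rz90, translation rotated accordingly
```

    In practice the relative pose comes from decomposing an essential matrix, and multiple reference images are needed to resolve the unknown translation scale.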

    Is Geometry Enough for Matching in Visual Localization?

    In this paper, we propose to go beyond the well-established approach to vision-based localization that relies on visual descriptor matching between a query image and a 3D point cloud. While matching keypoints via visual descriptors makes localization highly accurate, it has significant storage demands, raises privacy concerns, and requires updating the descriptors over the long term. To elegantly address these practical challenges for large-scale localization, we present GoMatch, an alternative to visual-based matching that relies solely on geometric information for matching image keypoints to maps, represented as sets of bearing vectors. Our novel bearing-vector representation of 3D points significantly relieves the cross-modal challenge in geometry-based matching that prevented prior work from tackling localization in realistic environments. With additional careful architecture design, GoMatch improves over prior geometry-based matching work with reductions of (10.67 m, 95.7 deg) and (1.43 m, 34.7 deg) in average median pose errors on Cambridge Landmarks and 7-Scenes, while requiring as little as 1.5/1.7% of the storage capacity of the best visual-based matching methods. This confirms its potential and feasibility for real-world localization and opens the door to future efforts in advancing city-scale visual localization methods that do not require storing visual descriptors. Comment: ECCV 2022 Camera Ready.
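    The bearing-vector representation at the heart of GoMatch can be made concrete: an image keypoint (u, v) is back-projected through the pinhole intrinsics and normalized to a unit direction, so matching operates on geometry alone with no visual descriptor. A brief sketch (the intrinsics below are made-up values, not from the paper):

```python
import math

def bearing_vector(u, v, fx, fy, cx, cy):
    """Unit bearing vector for pixel (u, v) under a pinhole camera model:
    back-project with the intrinsics, then normalize to unit length."""
    x, y, z = (u - cx) / fx, (v - cy) / fy, 1.0
    n = math.sqrt(x * x + y * y + z * z)
    return (x / n, y / n, z / n)

# The principal point back-projects onto the optical axis.
b = bearing_vector(320.0, 240.0, 500.0, 500.0, 320.0, 240.0)
print(b)  # -> (0.0, 0.0, 1.0)
```

    Storing only such unit directions (a few floats per point) instead of high-dimensional visual descriptors is what yields the large storage savings reported above.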

    Functional connectivity of the right inferior frontal gyrus and orbitofrontal cortex in depression

    The orbitofrontal cortex extends into the laterally adjacent inferior frontal gyrus. We analyzed how voxel-level functional connectivity of the inferior frontal gyrus and orbitofrontal cortex is related to depression in 282 people with major depressive disorder (125 of whom were unmedicated) and 254 controls, using FDR correction (P < 0.05) for pairs of voxels. In the unmedicated group, higher functional connectivity was found between the right inferior frontal gyrus and voxels in the lateral and medial orbitofrontal cortex, cingulate cortex, temporal lobe, angular gyrus, precuneus, hippocampus, and frontal gyri. In medicated patients, these functional connectivities were lower and closer to those in controls. Functional connectivities between the lateral orbitofrontal cortex and the precuneus, posterior cingulate cortex, inferior frontal gyrus, ventromedial prefrontal cortex, and the angular and middle frontal gyri were higher in unmedicated patients, and closer to controls in medicated patients. Medial orbitofrontal cortex voxels had lower functional connectivity with temporal cortex areas, the parahippocampal gyrus and fusiform gyrus, and medication did not bring these closer to controls. These findings are consistent with the hypothesis that the orbitofrontal cortex is involved in depression and can influence mood and behavior via the right inferior frontal gyrus, which projects to premotor cortical areas.
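    The FDR correction (P < 0.05) over voxel pairs mentioned above is commonly implemented with the Benjamini-Hochberg step-up procedure; a minimal sketch of that procedure on a toy list of p-values:

```python
def bh_fdr(pvals, q=0.05):
    """Benjamini-Hochberg step-up: return the (sorted) indices of the
    p-values declared significant at false discovery rate q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value falls under the BH line q*rank/m
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:
            k = rank
    return sorted(order[:k])

# Toy p-values: the first two survive, the rest do not.
print(bh_fdr([0.001, 0.008, 0.039, 0.041, 0.6]))  # -> [0, 1]
```

    Note how 0.039 fails here even though it is below 0.05: its BH threshold at rank 3 of 5 is 0.03, which is the step-up behaviour that controls the expected proportion of false discoveries across the very many voxel pairs tested.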

    Temperature dependence of the Rayleigh Brillouin spectrum linewidth in air and nitrogen

    The relations between the spontaneous Rayleigh-Brillouin (SRB) spectrum linewidth, gas temperature, and pressure are analyzed over the temperature range 220–340 K and the pressure range 0.1–1 bar, covering the stratosphere and troposphere conditions relevant for the Earth’s atmosphere and for atmospheric lidar missions. Based on this analysis, a model retrieving gas temperature from the directly measured linewidth is established and its accuracy limitations are estimated. Furthermore, experimental data for air and nitrogen are used to verify the accuracy of the model. The retrieved temperature shows good agreement with the reference temperature, with an absolute difference of less than 3 K, indicating that this method provides a useful tool for satellite retrieval, extracting gaseous properties of atmospheres on-line by directly measuring the SRB spectrum linewidth.
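    As a simplified illustration of linewidth-based temperature retrieval: in the low-pressure (Knudsen) limit the SRB lineshape approaches a Gaussian whose Doppler FWHM is Δν = (ν0/c)·sqrt(8·ln2·kB·T/m), which inverts analytically to T. The sketch below shows only that limiting case; the model in the paper also accounts for the pressure-dependent Brillouin contribution, and the laser frequency and N2 molecular mass used here are illustrative assumptions:

```python
import math

KB = 1.380649e-23   # Boltzmann constant, J/K
C = 2.99792458e8    # speed of light, m/s

def doppler_fwhm(t_kelvin, nu0_hz, mass_kg):
    """Gaussian Doppler FWHM of a scattering line at frequency nu0."""
    return (nu0_hz / C) * math.sqrt(8 * math.log(2) * KB * t_kelvin / mass_kg)

def doppler_temperature(fwhm_hz, nu0_hz, mass_kg):
    """Analytic inversion: gas temperature from the measured Doppler FWHM."""
    return mass_kg * (C * fwhm_hz / nu0_hz) ** 2 / (8 * math.log(2) * KB)

# Round trip for N2 (mass ~4.65e-26 kg) at an assumed nu0 ~8.19e14 Hz:
nu0, m_n2 = 8.19e14, 4.65e-26
w = doppler_fwhm(300.0, nu0, m_n2)
print(round(doppler_temperature(w, nu0, m_n2), 6))  # ~300.0
```

    At tropospheric pressures the actual linewidth deviates from this Gaussian limit, which is why the paper fits an empirical linewidth-temperature-pressure model rather than using the analytic inversion alone.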

    All-Sputtering, High-Transparency, Good-Stability Coplanar Top-Gate Thin Film Transistors

    In this work, transparent, stable coplanar top-gate thin film transistors (TFTs) with an active layer of neodymium-doped indium zinc oxide (Nd-IZO) were successfully fabricated on a glass substrate by all-sputtering processes. The devices post-annealed at 400 °C exhibited good electrical performance, with a saturation mobility (μsat) of 4.25 cm²·V⁻¹·s⁻¹, an Ion/Ioff ratio of about 10⁶, a Vth of −0.97 V, and a subthreshold swing (SS) of about 0.34 V/decade. Furthermore, the devices exhibited excellent negative and positive bias stability (NBS, PBS), with ΔVth shifts of only about −0.04 V and 0.05 V after 1 h, respectively. In addition, the devices showed high transparency of about 96% over the visible-light region of 400–700 nm, which indicates great potential for transparent displays.
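    For context, the saturation mobility quoted above is conventionally extracted from the slope of sqrt(Id) versus Vg in saturation, using Id = (W·Ci/2L)·μsat·(Vg − Vth)². A sketch with synthetic data (the geometry W/L and gate capacitance Ci below are illustrative values, not taken from the paper):

```python
import math

def mu_sat(vg, id_sat, w_over_l, ci):
    """Saturation mobility from the least-squares slope of sqrt(Id) vs Vg,
    using Id = (W*Ci / 2L) * mu * (Vg - Vth)^2 above threshold."""
    s = [math.sqrt(i) for i in id_sat]
    n = len(vg)
    vg_m, s_m = sum(vg) / n, sum(s) / n
    slope = (sum((v - vg_m) * (x - s_m) for v, x in zip(vg, s))
             / sum((v - vg_m) ** 2 for v in vg))
    return 2.0 * slope ** 2 / (w_over_l * ci)

# Synthetic transfer curve generated with mu = 4.25 cm^2/(V*s), Vth = -0.97 V,
# and illustrative W/L = 10, Ci = 1e-8 F/cm^2 (not the paper's values).
mu, vth, wl, ci = 4.25, -0.97, 10.0, 1e-8
vg = [1.0, 2.0, 3.0, 4.0, 5.0]
ids = [(wl * ci / 2.0) * mu * (v - vth) ** 2 for v in vg]
print(round(mu_sat(vg, ids, wl, ci), 2))  # recovers ~4.25
```

    The quadratic law makes sqrt(Id) linear in Vg above threshold, so the fitted slope directly encodes μsat once the device geometry and gate capacitance are known.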

    Sensory, somatomotor and internal mentation networks emerge dynamically in the resting brain with internal mentation predominating in older age

    Age-related changes in the brain are associated with a decline in functional flexibility. Intrinsic functional flexibility is evident in the brain's dynamic ability to switch between alternative spatiotemporal states during the resting state. However, the relationships between brain connectivity states, the psychological functions associated with them during the resting state, and the changes that occur in normal aging remain poorly understood. In this study, we analyzed resting-state functional magnetic resonance imaging (rs-fMRI) data from the Human Connectome Project (HCP; N = 812) and the UK Biobank (UKB; N = 6,716). Using signed community clustering to identify distinct states of dynamic functional connectivity, and text-mining a large existing literature for functional annotation of each state, our findings from the HCP dataset indicated that the resting brain spontaneously transitions between three functionally specialized states: sensory, somatomotor, and internal mentation networks. The occurrence, transition-rate, and persistence-time parameters for each state were correlated with behavioural scores using canonical correlation analysis. We estimated the same brain states and parameters in the UKB dataset, subdivided into three distinct age ranges: 50–55, 56–67, and 68–78 years. We found that the internal mentation network was more frequently expressed in people aged 71 and older, whereas people younger than 55 more frequently expressed the sensory and somatomotor networks. Furthermore, analysis of the functional entropy (a measure of the uncertainty of functional connectivity) also supported this finding across the three age ranges. Our study demonstrates that dynamic functional connectivity analysis can expose time-varying patterns of transition between functionally specialized brain states, which are strongly tied to increasing age.
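    The occurrence, transition-rate, and persistence-time parameters referred to above can be made concrete for a single labelled state sequence (a toy illustration of the summary statistics only, not the clustering or the canonical correlation analysis; the state labels are illustrative):

```python
def state_stats(seq, tr=1.0):
    """Per-state occurrence fraction, overall transition rate (switches per
    unit time), and mean persistence (dwell) time for a label sequence,
    given the sampling interval tr (e.g. the fMRI repetition time)."""
    n = len(seq)
    occurrence = {s: seq.count(s) / n for s in set(seq)}
    transitions = sum(a != b for a, b in zip(seq, seq[1:]))
    rate = transitions / ((n - 1) * tr)
    # Persistence: average length of runs of identical consecutive labels.
    dwell, run = {s: [] for s in set(seq)}, 1
    for a, b in zip(seq, seq[1:] + [None]):
        if a == b:
            run += 1
        else:
            dwell[a].append(run * tr)
            run = 1
    persistence = {s: sum(v) / len(v) for s, v in dwell.items()}
    return occurrence, rate, persistence

occ, rate, per = state_stats(["sensory"] * 4 + ["mentation"] * 2 + ["sensory"] * 2)
print(occ, rate, per)  # sensory occupies 75% of the time in two runs
```

    In the study these statistics are computed per participant from the dynamic connectivity state sequence and then related to behaviour and age group.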