Understanding the Limitations of CNN-based Absolute Camera Pose Regression
Visual localization is the task of accurate camera pose estimation in a known
scene. It is a key problem in computer vision and robotics, with applications
including self-driving cars, Structure-from-Motion, SLAM, and Mixed Reality.
Traditionally, the localization problem has been tackled using 3D geometry.
Recently, end-to-end approaches based on convolutional neural networks have
become popular. These methods learn to directly regress the camera pose from an
input image. However, they do not achieve the same level of pose accuracy as 3D
structure-based methods. To understand this behavior, we develop a theoretical
model for camera pose regression. We use our model to predict failure cases for
pose regression techniques and verify our predictions through experiments. We
furthermore use our model to show that pose regression is more closely related
to pose approximation via image retrieval than to accurate pose estimation via
3D structure. A key result is that current approaches do not consistently
outperform a handcrafted image retrieval baseline. This clearly shows that
additional research is needed before pose regression algorithms are ready to
compete with structure-based methods.
Comment: Initial version of a paper accepted to CVPR 201
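The comparison to pose approximation via image retrieval can be made concrete. A minimal retrieval baseline (a sketch only, assuming global image descriptors and database poses are already available; `retrieve_pose` is a hypothetical helper, not from the paper) simply returns the pose of the most similar database image:

```python
import numpy as np

def retrieve_pose(query_desc, db_descs, db_poses):
    """Approximate the query pose by the pose of the nearest database image.

    query_desc : (D,) global descriptor of the query image
    db_descs   : (N, D) descriptors of the mapped database images
    db_poses   : list of N camera poses of the database images
    """
    # Cosine similarity between the query and all database descriptors
    sims = db_descs @ query_desc / (
        np.linalg.norm(db_descs, axis=1) * np.linalg.norm(query_desc) + 1e-12)
    best = int(np.argmax(sims))
    return db_poses[best]  # the retrieved pose approximates the query pose
```

Because the output is always an existing database pose, such a baseline can only interpolate between mapped viewpoints, which is exactly the limitation the paper's model attributes to pose regression.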
To Learn or Not to Learn: Visual Localization from Essential Matrices
Visual localization is the problem of estimating the camera pose within a scene and is
a key component in computer vision applications such as self-driving cars and
Mixed Reality. State-of-the-art approaches for accurate visual localization use
scene-specific representations, resulting in the overhead of constructing these
models when applying the techniques to new scenes. Recently, deep
learning-based approaches based on relative pose estimation have been proposed,
carrying the promise of easily adapting to new scenes. However, it has been
shown that such approaches are currently significantly less accurate than
state-of-the-art methods. In this paper, we are interested in analyzing this
behavior. To this end, we propose a novel framework for visual localization
from relative poses. Using a classical feature-based approach within this
framework, we show state-of-the-art performance. Replacing the classical
approach with learned alternatives at various levels, we then identify the
reasons for why deep learned approaches do not perform well. Based on our
analysis, we make recommendations for future work.
Comment: Accepted to ICRA 202
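The geometric object underlying this framework, the essential matrix, can be illustrated in a few lines. Assuming calibrated cameras with a relative pose (R, t) mapping view-1 coordinates to view-2 via p2 = R p1 + t, the essential matrix is E = [t]x R, and any point seen in both views satisfies the epipolar constraint x2' E x1 = 0 (a textbook illustration, not the paper's pipeline):

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def essential_matrix(R, t):
    """Essential matrix E = [t]_x R for the relative pose p2 = R @ p1 + t."""
    return skew(np.asarray(t, dtype=float)) @ R
```

Estimating E from image correspondences and decomposing it back into (R, t) is the classical route; the paper's framework builds visual localization on top of exactly such relative poses.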
Is Geometry Enough for Matching in Visual Localization?
In this paper, we propose to go beyond the well-established approach to
vision-based localization that relies on visual descriptor matching between a
query image and a 3D point cloud. While matching keypoints via visual
descriptors makes localization highly accurate, it has significant storage
demands, raises privacy concerns, and requires updating the descriptors in the
long term. To elegantly address these practical challenges for large-scale
localization, we present GoMatch, an alternative to visual-based matching that
solely relies on geometric information for matching image keypoints to maps,
represented as sets of bearing vectors. Our novel bearing-vector
representation of 3D points significantly relieves the cross-modal challenge
in geometry-based matching that prevented prior work from tackling localization in
a realistic environment. With additional careful architecture design, GoMatch
improves over prior geometric-based matching work with a reduction of
(10.67m, 95.7deg) and (1.43m, 34.7deg) in average median pose errors on
Cambridge Landmarks and 7-Scenes, while requiring as little as 1.5/1.7% of
storage capacity in comparison to the best visual-based matching methods. This
confirms its potential and feasibility for real-world localization and opens
the door to future efforts in advancing city-scale visual localization methods
that do not require storing visual descriptors.
Comment: ECCV2022 Camera Read
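The bearing-vector representation itself is easy to sketch (hypothetical helper functions, assuming a pinhole camera with intrinsics K and a world-to-camera pose (R, t); GoMatch's learned matcher then operates on such vectors, which this sketch does not cover):

```python
import numpy as np

def point_bearings(points_w, R, t):
    """Unit bearing vectors of 3D world points in a camera frame.

    points_w : (N, 3) world points; (R, t) is the world-to-camera pose,
    i.e. p_cam = R @ p_world + t.
    """
    p_c = points_w @ R.T + t
    return p_c / np.linalg.norm(p_c, axis=1, keepdims=True)

def keypoint_bearings(keypoints, K):
    """Unit bearing vectors of 2D pixel keypoints via the inverse intrinsics."""
    homogeneous = np.hstack([keypoints, np.ones((len(keypoints), 1))])
    rays = homogeneous @ np.linalg.inv(K).T
    return rays / np.linalg.norm(rays, axis=1, keepdims=True)
```

Matching then reduces to associating two sets of unit vectors, so no visual descriptors need to be stored with the map.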
Functional connectivity of the right inferior frontal gyrus and orbitofrontal cortex in depression
The orbitofrontal cortex extends into the laterally adjacent inferior frontal gyrus. We analyzed how voxel-level functional connectivity of the inferior frontal gyrus and orbitofrontal cortex is related to depression in 282 people with major depressive disorder (125 were unmedicated) and 254 controls, using FDR correction P < 0.05 for pairs of voxels. In the unmedicated group, higher functional connectivity was found of the right inferior frontal gyrus with voxels in the lateral and medial orbitofrontal cortex, cingulate cortex, temporal lobe, angular gyrus, precuneus, hippocampus and frontal gyri. In medicated patients, these functional connectivities were lower and closer to those in controls. Functional connectivities between the lateral orbitofrontal cortex and the precuneus, posterior cingulate cortex, inferior frontal gyrus, ventromedial prefrontal cortex and the angular and middle frontal gyri were higher in unmedicated patients, and closer to controls in medicated patients. Medial orbitofrontal cortex voxels had lower functional connectivity with temporal cortex areas, the parahippocampal gyrus and fusiform gyrus, and medication did not result in these being closer to controls. These findings are consistent with the hypothesis that the orbitofrontal cortex is involved in depression, and can influence mood and behavior via the right inferior frontal gyrus, which projects to premotor cortical areas.
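The two statistical ingredients named in the abstract, voxel-level functional connectivity and FDR correction across voxel pairs, can be sketched minimally (a simplified illustration; the study's actual preprocessing and group statistics are considerably more involved):

```python
import numpy as np

def functional_connectivity(ts):
    """Functional connectivity as Pearson correlation between voxel time series.

    ts : (T, V) array, T time points for V voxels; returns a (V, V) matrix.
    """
    return np.corrcoef(ts.T)

def fdr_bh(pvals, q=0.05):
    """Benjamini-Hochberg FDR correction; returns a boolean rejection mask."""
    p = np.asarray(pvals, dtype=float)
    order = np.argsort(p)
    m = len(p)
    # Compare the sorted p-values against the BH step-up thresholds q*i/m
    below = p[order] <= q * (np.arange(1, m + 1) / m)
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])   # largest rank passing the threshold
        reject[order[:k + 1]] = True
    return reject
```

With millions of voxel pairs, the BH step-up rule controls the expected fraction of false-positive connectivities at level q, which is why it is standard for analyses of this scale.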
Temperature dependence of the Rayleigh Brillouin spectrum linewidth in air and nitrogen
The relation between the spontaneous Rayleigh Brillouin (SRB) spectrum linewidth, gas temperature, and pressure is analyzed over the temperature range from 220 to 340 K and the pressure range from 0.1 to 1 bar, covering the stratosphere and troposphere relevant for the Earth’s atmosphere and for atmospheric Lidar missions. Based on this analysis, a model retrieving gas temperature from the directly measured linewidth is established and its accuracy limitations are estimated. Furthermore, experimental data for air and nitrogen are used to verify the accuracy of the model. The retrieved temperature shows good agreement with the reference temperature, with an absolute difference of less than 3 K, which indicates that this method provides a useful tool for satellite retrieval, extracting gaseous properties of the atmosphere online by directly measuring the SRB spectrum linewidth.
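As a simplified illustration of the retrieval idea: in the low-pressure (Knudsen) limit the SRB spectrum approaches a Gaussian whose Doppler width scales as the square root of temperature, so the linewidth model can be inverted analytically. This is a toy version only; the paper's model additionally accounts for the pressure-dependent Brillouin contribution, which this sketch ignores.

```python
import math

K_B = 1.380649e-23                      # Boltzmann constant, J/K
M_N2 = 28.0134e-3 / 6.02214076e23       # mass of one N2 molecule, kg

def doppler_fwhm(T, wavelength, mass=M_N2):
    """Gaussian Doppler FWHM (Hz) of a line at `wavelength` (m); scales as sqrt(T)."""
    return (1.0 / wavelength) * math.sqrt(8.0 * math.log(2.0) * K_B * T / mass)

def temperature_from_fwhm(fwhm, wavelength, mass=M_N2):
    """Invert the Gaussian linewidth model to retrieve the gas temperature (K)."""
    return mass * (fwhm * wavelength) ** 2 / (8.0 * math.log(2.0) * K_B)
```

The square-root dependence means a 3 K temperature error corresponds to only a ~0.5% linewidth error near 300 K, which is why the linewidth must be measured precisely.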
All-Sputtering, High-Transparency, Good-Stability Coplanar Top-Gate Thin Film Transistors
In this work, transparent, stable coplanar top-gate thin film transistors (TFTs) with an active layer of neodymium-doped indium zinc oxide (Nd-IZO) were successfully fabricated on a glass substrate by all-sputtering processes. Devices post-annealed at 400 °C exhibited good electrical performance, with a saturation mobility (μsat) of 4.25 cm²·V⁻¹·s⁻¹, an Ion/Ioff ratio of about 10⁶, a Vth of −0.97 V, and a subthreshold swing (SS) of about 0.34 V/decade. Furthermore, the devices exhibited excellent negative and positive bias stability (NBS, PBS), with ΔVth shifts of only about −0.04 V and 0.05 V, respectively, after 1 h. In addition, the devices showed high transparency of about 96% over the visible-light region of 400–700 nm, indicating great potential for transparent displays.
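Figures of merit like μsat and Vth are conventionally extracted from the saturation-regime transfer curve, where Id = (μsat·Cox·W / 2L)·(Vg − Vth)², so that √Id is linear in Vg. A sketch of that standard extraction (illustrative only; the device parameters in the test are invented, not the paper's):

```python
import numpy as np

def extract_sat_params(vg, id_sat, W, L, C_ox):
    """Fit sqrt(Id) vs Vg in saturation to obtain (mu_sat, Vth).

    Model: Id = (mu_sat * C_ox * W / (2 * L)) * (Vg - Vth)**2.
    Units: SI throughout (m, F/m^2, A, V -> mobility in m^2 V^-1 s^-1).
    """
    slope, intercept = np.polyfit(vg, np.sqrt(id_sat), 1)
    mu_sat = 2.0 * L * slope ** 2 / (W * C_ox)
    vth = -intercept / slope               # x-intercept of the sqrt(Id) line
    return mu_sat, vth
```

In practice only the clearly linear portion of √Id vs Vg (well above threshold) is fitted, since contact effects distort the curve near Vth.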
Sensory, somatomotor and internal mentation networks emerge dynamically in the resting brain with internal mentation predominating in older age
Age-related changes in the brain are associated with a decline in functional flexibility. Intrinsic functional flexibility is evident in the brain's dynamic ability to switch between alternative spatiotemporal states during resting state. However, the relationship between brain connectivity states, associated psychological functions during resting state, and the changes in normal aging remain poorly understood. In this study, we analyzed resting-state functional magnetic resonance imaging (rsfMRI) data from the Human Connectome Project (HCP; N = 812) and the UK Biobank (UKB; N = 6,716). Using signed community clustering to identify distinct states of dynamic functional connectivity, and text-mining of a large existing literature for functional annotation of each state, our findings from the HCP dataset indicated that the resting brain spontaneously transitions between three functionally specialized states: sensory, somatomotor, and internal mentation networks. The occurrence, transition-rate, and persistence-time parameters for each state were correlated with behavioural scores using canonical correlation analysis. We estimated the same brain states and parameters in the UKB dataset, subdivided into three distinct age ranges: 50–55, 56–67, and 68–78 years. We found that the internal mentation network was more frequently expressed in people aged 71 and older, whereas people younger than 55 more frequently expressed sensory and somatomotor networks. Furthermore, analysis of the functional entropy — a measure of uncertainty of functional connectivity — also supported this finding across the three age ranges. Our study demonstrates that dynamic functional connectivity analysis can expose the time-varying patterns of transition between functionally specialized brain states, which are strongly tied to increasing age
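The core of such a dynamic analysis, windowed connectivity followed by per-state statistics, can be sketched minimally (an illustrative reduction; the study applies signed community clustering, text-mining-based annotation, and canonical correlation analysis on top of these ingredients):

```python
import numpy as np

def sliding_window_fc(ts, win, step):
    """Dynamic functional connectivity: one correlation matrix per window.

    ts : (T, N) region time series; returns (num_windows, N, N).
    """
    mats = [np.corrcoef(ts[s:s + win].T)
            for s in range(0, ts.shape[0] - win + 1, step)]
    return np.array(mats)

def occurrence(labels):
    """Occurrence rate of each state from a per-window state-label sequence."""
    labels = np.asarray(labels)
    states, counts = np.unique(labels, return_counts=True)
    return dict(zip(states.tolist(), (counts / len(labels)).tolist()))
```

Clustering the windowed matrices yields the state labels; occurrence, transition rate, and persistence time are then simple statistics of that label sequence, which is what the age comparison operates on.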