78 research outputs found

    Lifting GIS Maps into Strong Geometric Context for Scene Understanding

    Contextual information can have a substantial impact on the performance of visual tasks such as semantic segmentation, object detection, and geometric estimation. Data stored in Geographic Information Systems (GIS) offers a rich source of contextual information that has been largely untapped by computer vision. We propose to leverage such information for scene understanding by combining GIS resources with large sets of unorganized photographs using Structure from Motion (SfM) techniques. We present a pipeline to quickly generate strong 3D geometric priors from 2D GIS data using SfM models aligned with minimal user input. Given an image resectioned against this model, we generate robust predictions of depth, surface normals, and semantic labels. We show that the predicted geometry is substantially more accurate than that of other single-image depth estimation methods. We then demonstrate the utility of these contextual constraints for re-scoring pedestrian detections, and use these GIS contextual features alongside object detection score maps to improve a CRF-based semantic segmentation framework, boosting accuracy over baseline models.
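    As an illustration of how such a rendered depth prior could be used for re-scoring, the following minimal Python sketch down-weights pedestrian boxes whose pixel height disagrees with the height a roughly 1.7 m tall person would project to at the prior's depth. The function name, the Gaussian scoring rule, and all parameters are illustrative assumptions, not the paper's actual re-scoring model.

    import numpy as np

    def rescore_pedestrians(boxes, scores, depth_prior, fy, person_height_m=1.7, sigma=0.5):
        """Hypothetical re-scoring: compare each box's pixel height with the height a
        ~1.7 m pedestrian would have at the GIS/SfM depth prior sampled at its feet."""
        rescored = []
        for (x1, y1, x2, y2), s in zip(boxes, scores):
            u = int(round((x1 + x2) / 2))                  # column at the box centre
            v = int(round(y2)) - 1                         # row where the feet meet the ground
            z = depth_prior[np.clip(v, 0, depth_prior.shape[0] - 1),
                            np.clip(u, 0, depth_prior.shape[1] - 1)]
            if not np.isfinite(z) or z <= 0:
                rescored.append(s)                         # no prior available: keep the score
                continue
            expected_h = fy * person_height_m / z          # pinhole projection of 1.7 m at depth z
            mismatch = np.log((y2 - y1) / expected_h)      # symmetric size error in log space
            rescored.append(s * np.exp(-mismatch ** 2 / (2 * sigma ** 2)))
        return np.array(rescored)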

    Parallelized computational 3D video microscopy of freely moving organisms at multiple gigapixels per second

    To study the behavior of freely moving model organisms such as zebrafish (Danio rerio) and fruit flies (Drosophila) across multiple spatial scales, it would be ideal to use a light microscope that can resolve 3D information over a wide field of view (FOV) at high speed and high spatial resolution. However, it is challenging to design an optical instrument to achieve all of these properties simultaneously. Existing techniques for large-FOV microscopic imaging and for 3D image measurement typically require many sequential image snapshots, thus compromising speed and throughput. Here, we present 3D-RAPID, a computational microscope based on a synchronized array of 54 cameras that can capture high-speed 3D topographic videos over a 135-cm^2 area, achieving up to 230 frames per second at throughputs exceeding 5 gigapixels (GPs) per second. 3D-RAPID features a 3D reconstruction algorithm that, for each synchronized temporal snapshot, simultaneously fuses all 54 images seamlessly into a globally-consistent composite that includes a coregistered 3D height map. The self-supervised 3D reconstruction algorithm itself trains a spatiotemporally-compressed convolutional neural network (CNN) that maps raw photometric images to 3D topography, using stereo overlap redundancy and ray-propagation physics as the only supervision mechanism. As a result, our end-to-end 3D reconstruction algorithm is robust to generalization errors and scales to arbitrarily long videos from arbitrarily sized camera arrays. The scalable hardware and software design of 3D-RAPID addresses a longstanding problem in the field of behavioral imaging, enabling parallelized 3D observation of large collections of freely moving organisms at high spatiotemporal throughputs, which we demonstrate in ants (Pogonomyrmex barbatus), fruit flies, and zebrafish larvae.
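    The stereo-overlap supervision can be illustrated with a small PyTorch-style consistency term: height maps predicted independently for two neighboring cameras are resampled into a common frame and penalized where they disagree. This is a toy sketch under assumed tensor shapes and a precomputed sampling grid, not the 3D-RAPID training code.

    import torch
    import torch.nn.functional as F

    def overlap_consistency_loss(height_a, height_b, grid_b_to_a):
        """height_a, height_b: (N, 1, H, W) height maps from the shared CNN.
        grid_b_to_a: (N, H, W, 2) normalized grid mapping camera-A pixels into camera B
        (assumed known from the array's calibrated geometry)."""
        height_b_in_a = F.grid_sample(height_b, grid_b_to_a, align_corners=True)
        # Correspondences that fall outside camera B's image contribute nothing.
        valid = (grid_b_to_a.abs() <= 1).all(dim=-1).unsqueeze(1).float()
        return (valid * (height_a - height_b_in_a).abs()).sum() / valid.sum().clamp(min=1)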

    Ground Plane Matters: Picking Up Ground Plane Prior in Monocular 3D Object Detection

    The ground plane prior is a very informative geometric cue in monocular 3D object detection (M3OD), yet it has been neglected by most mainstream methods. In this paper, we identify two key factors that limit the applicability of the ground plane prior: the projection point localization issue and the ground plane tilt issue. To pick up the ground plane prior for M3OD, we propose a Ground Plane Enhanced Network (GPENet) which resolves both issues in one go. For the projection point localization issue, instead of using the bottom vertices or bottom center of the 3D bounding box (BBox), we leverage the object's ground contact points, which are explicit pixels in the image and easy for the neural network to detect. For the ground plane tilt problem, GPENet estimates the horizon line in the image and derives a novel mathematical expression to accurately estimate the ground plane equation. An unsupervised vertical edge mining algorithm is also proposed to address occlusion of the horizon line. Furthermore, we design a novel 3D bounding box deduction method based on a dynamic back projection algorithm, which takes advantage of the accurate contact points and the ground plane equation. Additionally, using only M3OD labels, contact point and horizon line pseudo labels can be generated with no extra data collection or annotation cost. Extensive experiments on the popular KITTI benchmark show that GPENet outperforms other methods and achieves state-of-the-art performance, demonstrating the effectiveness and superiority of the proposed approach. Moreover, GPENet performs better than other methods in cross-dataset evaluation on the nuScenes dataset. Our code and models will be published. Comment: 13 pages, 10 figures.
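    The horizon-to-ground-plane step has a textbook counterpart that can be sketched briefly: the vanishing line l of a plane with unit normal n satisfies l ~ K^{-T} n, so n ~ K^T l, and a contact pixel is lifted to 3D by intersecting its viewing ray with that plane. The NumPy sketch below assumes a known camera height and standard camera coordinates; it is not GPENet's derivation or code, and all names are illustrative.

    import numpy as np

    def ground_plane_from_horizon(K, horizon_line, cam_height):
        """Plane n . X = d in camera coordinates (x right, y down, z forward),
        from the horizon line (vanishing line of the ground) and the camera height."""
        n = K.T @ np.asarray(horizon_line, dtype=float)
        n /= np.linalg.norm(n)
        if n[1] < 0:            # orient the normal toward the ground (+y is down)
            n = -n
        return n, cam_height    # ground points X satisfy n . X = cam_height

    def backproject_contact_point(K, uv, n, d):
        """Intersect the viewing ray of a ground-contact pixel with the ground plane."""
        ray = np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])
        t = d / (n @ ray)       # solve n . (t * ray) = d for the ray depth t
        return t * ray          # 3D contact point in camera coordinates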

    Freehand 2D Ultrasound Probe Calibration for Image Fusion with 3D MRI/CT

    The aim of this work is to implement a simple freehand ultrasound (US) probe calibration technique. This will enable us to visualize US image data during surgical procedures using augmented reality. The performance of the system was evaluated in several experiments using two different pose estimation techniques. Near-millimeter accuracy can be achieved with the proposed approach. The developed system is cost-effective, simple, and rapid, with low calibration error.

    Constant Velocity Constraints for Self-Supervised Monocular Depth Estimation

    We present a new method for self-supervised monocular depth estimation. Contemporary monocular depth estimation methods use a triplet of consecutive video frames to estimate the central depth image. We make the assumption that the ego-centric view progresses linearly in the scene, based on the kinematic and physical properties of the camera. During the training phase, we can exploit this assumption to create a depth estimation for each image in the triplet. We then apply a new geometry constraint that supports novel synthetic views, thus providing a strong supervisory signal. Our contribution is simple to implement, requires no additional trainable parameters, and produces competitive results when compared with other state-of-the-art methods on the popular KITTI corpus.
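    One way to write the constant-velocity assumption down is as a penalty that forces the predicted relative pose from frame t-1 to t to match the pose from t to t+1. The PyTorch sketch below is a hedged illustration of that idea, not the paper's exact loss; in the paper the assumption is used to synthesize additional views for supervision rather than as a direct pose penalty.

    import torch

    def constant_velocity_loss(T_prev_to_cur, T_cur_to_next):
        """T_prev_to_cur, T_cur_to_next: (N, 4, 4) SE(3) poses from a pose network.
        Under constant velocity the two relative motions should be identical."""
        diff = torch.linalg.inv(T_prev_to_cur) @ T_cur_to_next   # identity if velocities match
        eye = torch.eye(4, device=diff.device).expand_as(diff)
        return (diff - eye).abs().mean()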

    Image motion estimation for 3D model based video conferencing.

    Cheung Man-kin. Thesis (M.Phil.)--Chinese University of Hong Kong, 2000. Includes bibliographical references (leaves 116-120). Abstracts in English and Chinese.
    Table of contents:
    1) Introduction (p.1)
        1.1) Building of the 3D Wireframe and Facial Model (p.2)
        1.2) Description of 3D Model Based Video Conferencing (p.3)
        1.3) Wireframe Model Fitting or Conformation (p.6)
        1.4) Pose Estimation (p.8)
        1.5) Facial Motion Estimation and Synthesis (p.9)
        1.6) Thesis Outline (p.10)
    2) Wireframe Model Fitting (p.11)
        2.1) Algorithm of WFM Fitting (p.12): 2.1.1) Global Deformation (p.14): a) Scaling (p.14), b) Shifting (p.15); 2.1.2) Local Deformation (p.15): a) Shifting (p.16), b) Scaling (p.17); 2.1.3) Fine Updating (p.17)
        2.2) Steps of Fitting (p.18)
        2.3) Functions of Different Deformation (p.18)
        2.4) Experimental Results (p.19): 2.4.1) Output wireframe in each step (p.19); 2.4.2) Examples of mis-fitted wireframe with incoming image (p.22); 2.4.3) Fitted 3D facial wireframe (p.23); 2.4.4) Effect of mis-fitted wireframe after compensation of motion (p.24)
        2.5) Summary (p.26)
    3) Epipolar Geometry (p.27)
        3.1) Pinhole Camera Model and Perspective Projection (p.28)
        3.2) Concepts in Epipolar Geometry (p.31): 3.2.1) Working with normalized image coordinates (p.33); 3.2.2) Working with pixel image coordinates (p.35); 3.2.3) Summary (p.37)
        3.3) 8-point Algorithm (Essential and Fundamental Matrix) (p.38): 3.3.1) Outline of the 8-point algorithm (p.38); 3.3.2) Modification on obtained Fundamental Matrix (p.39); 3.3.3) Transformation of Image Coordinates (p.40): a) Translation to mean of points (p.40), b) Normalizing transformation (p.41); 3.3.4) Summary of 8-point algorithm (p.41)
        3.4) Estimation of Object Position by Decomposition of Essential Matrix (p.43): 3.4.1) Algorithm Derivation (p.43); 3.4.2) Algorithm Outline (p.46)
        3.5) Noise Sensitivity (p.48): 3.5.1) Rotation vector of model (p.48); 3.5.2) The projection of rotated model (p.49); 3.5.3) Noisy image (p.51); 3.5.4) Summary (p.51)
    4) Pose Estimation (p.54)
        4.1) Linear Method (p.55): 4.1.1) Theory (p.55); 4.1.2) Normalization (p.57); 4.1.3) Experimental Results (p.58): a) Synthesized image by linear method without normalization (p.58), b) Performance of linear method with and without normalization (p.60), c) Performance of linear method under quantization noise with different transformation components (p.62), d) Performance of normalized case without transformation in z-component (p.63); 4.1.4) Summary (p.64)
        4.2) Two Stage Algorithm (p.66): 4.2.1) Introduction (p.66); 4.2.2) The Two Stage Algorithm (p.67): a) Stage 1 (Iterative Method) (p.68), b) Stage 2 (Non-linear Optimization) (p.71); 4.2.3) Summary of the Two Stage Algorithm (p.72); 4.2.4) Experimental Results (p.72); 4.2.5) Summary (p.80)
    5) Facial Motion Estimation and Synthesis (p.81)
        5.1) Facial Expression based on face muscles (p.83): 5.1.1) Review of Action Unit Approach (p.83); 5.1.2) Distribution of Motion Unit (p.85); 5.1.3) Algorithm (p.89): a) For Unidirectional Motion Unit (p.89), b) For Circular Motion Unit (eyes) (p.90), c) For Another Circular Motion Unit (mouth) (p.90); 5.1.4) Experimental Results (p.91); 5.1.5) Summary (p.95)
        5.2) Detection of Facial Expression by Muscle-based Approach (p.96): 5.2.1) Theory (p.96); 5.2.2) Algorithm (p.97): a) For Sheet Muscle (p.97), b) For Circular Muscle (p.98), c) For Mouth Muscle (p.99); 5.2.3) Steps of Algorithm (p.100); 5.2.4) Experimental Results (p.101); 5.2.5) Summary (p.103)
    6) Conclusion (p.104)
        6.1) WFM Fitting (p.104)
        6.2) Pose Estimation (p.105)
        6.3) Facial Estimation and Synthesis (p.106)
        6.4) Discussion on Future Improvements (p.107): 6.4.1) WFM Fitting (p.107); 6.4.2) Pose Estimation (p.109); 6.4.3) Facial Motion Estimation and Synthesis (p.110)
    7) Appendix (p.111)
        7.1) Newton's Method or Newton-Raphson Method (p.111)
        7.2) H.261 (p.113)
        7.3) 3D Measurement (p.114)
    Bibliography (p.116)
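    Since Chapter 3 of this outline walks through the normalized 8-point algorithm (coordinate normalization, the rank-2 correction of the fundamental matrix, and denormalization), a compact NumPy sketch of that standard procedure is included here for reference; it follows the textbook formulation rather than the thesis's own implementation.

    import numpy as np

    def normalize_points(pts):
        """Translate points to their centroid and scale so the mean distance is sqrt(2)."""
        centroid = pts.mean(axis=0)
        scale = np.sqrt(2) / np.linalg.norm(pts - centroid, axis=1).mean()
        T = np.array([[scale, 0, -scale * centroid[0]],
                      [0, scale, -scale * centroid[1]],
                      [0, 0, 1.0]])
        pts_h = np.column_stack([pts, np.ones(len(pts))])
        return (T @ pts_h.T).T, T

    def eight_point(pts1, pts2):
        """Normalized 8-point estimate of the fundamental matrix from >= 8 correspondences."""
        x1, T1 = normalize_points(pts1)
        x2, T2 = normalize_points(pts2)
        # Each correspondence x2^T F x1 = 0 gives one row of the linear system A f = 0.
        A = np.column_stack([x2[:, 0] * x1[:, 0], x2[:, 0] * x1[:, 1], x2[:, 0],
                             x2[:, 1] * x1[:, 0], x2[:, 1] * x1[:, 1], x2[:, 1],
                             x1[:, 0], x1[:, 1], np.ones(len(x1))])
        F = np.linalg.svd(A)[2][-1].reshape(3, 3)
        U, S, Vt = np.linalg.svd(F)          # enforce rank 2
        F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
        F = T2.T @ F @ T1                    # undo the normalizing transformations
        return F / F[2, 2]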