RecRecNet: Rectangling Rectified Wide-Angle Images by Thin-Plate Spline Model and DoF-based Curriculum Learning
The wide-angle lens shows appealing applications in VR technologies, but it
introduces severe radial distortion into its captured image. To recover the
realistic scene, previous works have focused on rectifying the content of the
wide-angle image. However, such a rectification solution inevitably distorts
the image boundary, which changes related geometric distributions and misleads
the current vision perception models. In this work, we explore constructing a
win-win representation on both content and boundary by contributing a new
learning model, i.e., Rectangling Rectification Network (RecRecNet). In
particular, we propose a thin-plate spline (TPS) module to formulate the
non-linear and non-rigid transformation for rectangling images. By learning the
control points on the rectified image, our model can flexibly warp the source
structure to the target domain and achieves an end-to-end unsupervised
deformation. To reduce the complexity of structure approximation, we then
guide our RecRecNet to learn gradual deformation rules with a DoF (Degree
of Freedom)-based curriculum learning. By increasing the DoF in each curriculum
stage, namely, from similarity transformation (4-DoF) to homography
transformation (8-DoF), the network is capable of investigating more detailed
deformations, offering fast convergence on the final rectangling task.
Experiments show the superiority of our solution over the compared methods on
both quantitative and qualitative evaluations. The code and dataset are
available at https://github.com/KangLiao929/RecRecNet.
Comment: Accepted to ICCV 2023
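The TPS module's core operation can be pictured with a generic NumPy sketch (not the authors' code; a standard thin-plate spline interpolant, with invented example points): solve for the warp that maps a set of source control points exactly onto target control points, then evaluate it at arbitrary query locations.

```python
import numpy as np

def tps_warp(src_pts, dst_pts, query_pts):
    """Thin-plate spline warp: finds the minimum-bending-energy mapping
    that sends src_pts exactly onto dst_pts, then applies it to query_pts."""
    def U(r2):
        # TPS radial basis U(r) = r^2 * log(r^2), with U(0) = 0
        with np.errstate(divide="ignore", invalid="ignore"):
            return np.nan_to_num(r2 * np.log(r2))

    n = len(src_pts)
    d2 = np.sum((src_pts[:, None] - src_pts[None]) ** 2, axis=-1)
    P = np.hstack([np.ones((n, 1)), src_pts])      # affine part [1, x, y]
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = U(d2)
    A[:n, n:] = P
    A[n:, :n] = P.T
    b = np.zeros((n + 3, 2))
    b[:n] = dst_pts
    coeffs = np.linalg.solve(A, b)                 # kernel weights + affine terms
    q2 = np.sum((query_pts[:, None] - src_pts[None]) ** 2, axis=-1)
    Pq = np.hstack([np.ones((len(query_pts), 1)), query_pts])
    return U(q2) @ coeffs[:n] + Pq @ coeffs[n:]
```

By construction the warp interpolates the control points exactly, which is the property a model exploits when it predicts control points on the rectified image and warps the source structure toward the target domain.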
Automatic Detection of Calibration Grids in Time-of-Flight Images
It is convenient to calibrate time-of-flight cameras by established methods,
using images of a chequerboard pattern. The low resolution of the amplitude
image, however, makes it difficult to detect the board reliably. Heuristic
detection methods, based on connected image-components, perform very poorly on
this data. An alternative, geometrically-principled method is introduced here,
based on the Hough transform. The projection of a chequerboard is represented
by two pencils of lines, which are identified as oriented clusters in the
gradient-data of the image. A projective Hough transform is applied to each of
the two clusters, in axis-aligned coordinates. The range of each transform is
properly bounded, because the corresponding gradient vectors are approximately
parallel. Each of the two transforms contains a series of collinear peaks; one
for every line in the given pencil. This pattern is easily detected, by
sweeping a dual line through the transform. The proposed Hough-based method is
compared to the standard OpenCV detection routine, by application to several
hundred time-of-flight images. It is shown that the new method detects
significantly more calibration boards, over a greater variety of poses, without
any overall loss of accuracy. This conclusion is based on an analysis of both
geometric and photometric error.
Comment: 11 pages, 11 figures, 1 table
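For background, the classical line Hough transform underlying the method can be sketched in NumPy (a generic accumulator, not the paper's projective variant; the resolution parameters are arbitrary): each point votes for every line passing through it, and points on a common line produce a dominant peak.

```python
import numpy as np

def hough_lines(points, n_theta=180, n_rho=101, rho_max=2.0):
    """Vote in (theta, rho) space: each point (x, y) votes for every line
    x*cos(theta) + y*sin(theta) = rho that passes through it."""
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    rhos = np.linspace(-rho_max, rho_max, n_rho)
    acc = np.zeros((n_theta, n_rho), dtype=int)
    for x, y in points:
        r = x * np.cos(thetas) + y * np.sin(thetas)   # rho for every theta
        idx = np.round((r + rho_max) / (2 * rho_max) * (n_rho - 1)).astype(int)
        acc[np.arange(n_theta), idx] += 1
    return acc, thetas, rhos
```

In the paper a projective variant of this transform is applied to each of the two gradient clusters, and a pencil of chequerboard lines appears as a series of collinear peaks in the accumulator.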
DarSwin: Distortion Aware Radial Swin Transformer
Wide-angle lenses are commonly used in perception tasks requiring a large
field of view. Unfortunately, these lenses produce significant distortions
making conventional models that ignore the distortion effects unable to adapt
to wide-angle images. In this paper, we present a novel transformer-based model
that automatically adapts to the distortion produced by wide-angle lenses. We
leverage the physical characteristics of such lenses, which are analytically
defined by the radial distortion profile (assumed to be known), to develop a
distortion aware radial swin transformer (DarSwin). In contrast to conventional
transformer-based architectures, DarSwin comprises a radial patch partitioning,
a distortion-based sampling technique for creating token embeddings, and a
polar position encoding for radial patch merging. We validate our method on
classification tasks using synthetically distorted ImageNet data and show
through extensive experiments that DarSwin can perform zero-shot adaptation to
unseen distortions of different wide-angle lenses. Compared to other baselines,
DarSwin achieves the best results (in terms of Top-1 and Top-5 accuracy) when
tested on in-distribution data, with almost a 2% (6%) gain in Top-1 accuracy
under medium (high) distortion levels, and is comparable to the state of the art
under low and very low distortion levels (perspective-like images).
Comment: 8 pages, 8 figures
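A toy version of radial patch partitioning can be written in a few lines of NumPy. This is a simplified stand-in (the function name and parameters are invented, and it ignores the known distortion profile that DarSwin additionally uses to place ring boundaries): each pixel is assigned to a (ring, sector) patch in polar coordinates around the image center.

```python
import numpy as np

def radial_partition(h, w, n_rings=4, n_sectors=8):
    """Assign every pixel of an h-by-w image to a (ring, sector) patch
    around the image center; returns an integer patch id per pixel."""
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2, (w - 1) / 2
    r = np.hypot(ys - cy, xs - cx)                  # radius from center
    theta = np.arctan2(ys - cy, xs - cx)            # angle in (-pi, pi)
    ring = np.minimum((r / (r.max() + 1e-9) * n_rings).astype(int),
                      n_rings - 1)
    sector = ((theta + np.pi) / (2 * np.pi) * n_sectors).astype(int) % n_sectors
    return ring * n_sectors + sector
```

Tokens built over such polar patches follow the lens's radial symmetry, which is what allows a distortion-aware sampling scheme to adapt to different distortion profiles.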
Geometric Inference with Microlens Arrays
This dissertation explores an alternative to traditional fiducial markers, in which geometric information is inferred from the observed positions of 3D points seen in an image. We offer an approach that instead enables geometric inference based on the relative orientation of markers in an image. We present markers fabricated from microlenses whose appearance changes depending on the marker's orientation relative to the camera. First, we show how to manufacture and calibrate chromo-coding lenticular arrays to create a known relationship between the observed hue and the orientation of the array. Second, we use two small chromo-coding lenticular arrays to estimate the pose of an object. Third, we use three large chromo-coding lenticular arrays to calibrate a camera from a single image. Finally, we create another type of fiducial marker from lenslet arrays that encode orientation with discrete black-and-white appearances. Collectively, these approaches offer new opportunities for pose estimation and camera calibration that are relevant to robotics, virtual reality, and augmented reality.
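The hue-to-orientation calibration can be pictured as a simple inverse lookup. The sketch below uses piecewise-linear interpolation over an invented calibration table (the dissertation builds the actual relationship from calibration images of the lenticular array):

```python
import numpy as np

def calibrate_hue_to_angle(hues_deg, angles_deg):
    """Build a lookup from observed hue to array orientation by
    piecewise-linear interpolation of (hue, angle) calibration samples."""
    order = np.argsort(hues_deg)
    h = np.asarray(hues_deg, float)[order]
    a = np.asarray(angles_deg, float)[order]
    return lambda hue: float(np.interp(hue, h, a))
```

Once such a lookup is calibrated, the hue observed at each marker directly yields the marker's orientation relative to the camera, which is the measurement the pose-estimation and calibration methods build on.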
Algorithms for trajectory integration in multiple views
This thesis addresses the problem of deriving a coherent and accurate localization
of moving objects from partial visual information when data are generated by cameras
placed at different view angles with respect to the scene. The framework is built around
applications of scene monitoring with multiple cameras. Firstly, we demonstrate how a
geometric-based solution exploits the relationships between corresponding feature points
across views and improves accuracy in object localization. Then, we improve the estimation
of object locations with geometric transformations that account for lens distortions.
Additionally, we study the integration of the partial visual information generated by each
individual sensor and their combination into one single frame of observation that considers
object association and data fusion. Our approach is fully image-based, only relies on 2D
constructs and does not require any complex computation in 3D space. We exploit the
continuity and coherence in objects' motion when crossing cameras' fields of view. Additionally,
we work under the assumptions of a planar ground plane and a wide baseline (i.e.,
cameras' viewpoints are far apart). The main contributions are: i) the development of a
framework for distributed visual sensing that accounts for inaccuracies in the geometry
of multiple views; ii) the reduction of trajectory mapping errors using a statistical-based
homography estimation; iii) the integration of a polynomial method for correcting inaccuracies
caused by the cameras' lens distortion; iv) a global trajectory reconstruction
algorithm that associates and integrates fragments of trajectories generated by each camera.
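As background for contribution (ii), a homography relating two views of the planar ground can be estimated from point correspondences with the standard DLT (direct linear transform) algorithm. The NumPy sketch below is generic, with invented correspondences, and is not the thesis's statistics-based estimator:

```python
import numpy as np

def estimate_homography(src, dst):
    """DLT: least-squares 3x3 homography H with dst ~ H @ src in
    homogeneous coordinates, from >= 4 point correspondences."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    H = Vt[-1].reshape(3, 3)            # null-space vector of A
    return H / H[2, 2]

def apply_h(H, pts):
    """Apply a homography to an (n, 2) array of points."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return pts_h[:, :2] / pts_h[:, 2:3]
```

Mapping each camera's trajectory fragments through such a homography places them in a single common ground-plane frame, the prerequisite for the association and fusion steps described above.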
Deep learning applied to 2D video data for the estimation of clamp reaction forces acting on running prosthetic feet and experimental validation after bench and track tests
Carbon fiber Running Specific Prostheses (RSPs) have allowed athletes with lower-extremity amputations to recover their functional capability of running. RSPs are designed to replicate the spring-like nature of biological legs: they are passive components that mimic the tendons' elastic storage and release of potential energy during ground contact.
Knowledge of the loads acting on the prosthesis is crucial for evaluating athletes' running technique, preventing injuries, and designing Running Prosthetic Feet (RPF).
The aim of the present work is to investigate a method for estimating the forces acting on an RPF from its geometric configuration. First, kinematic data acquired from 2D videos were assessed to determine whether they are a good approximation to the gold standard of motion capture (MOCAP). This was done by evaluating steps acquired during two running sessions (OS1 and OS3) with elite Paralympic athletes. The problem was then formulated with a deep learning approach, training a neural network on data collected from in vitro bench tests carried out on a hydraulic test bench. Two models were built: the first was trained on data from standard procedures and validated on two steps of OS1; then, to improve the performance of the prototype, a second model was built and trained on data from newly studied procedures and validated on three steps from OS3.
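The abstract does not specify the network architecture or the bench-test data, so purely as an illustration of the regression setup (synthetic data, an invented feature-to-force relationship, and a tiny one-hidden-layer network), a training loop of this kind might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for bench-test data: 3 geometric features -> reaction force.
X = rng.uniform(-1.0, 1.0, (200, 3))
w_true = np.array([0.5, -0.3, 0.2])                 # invented relationship
y = X @ w_true + 0.01 * rng.normal(size=200)

# One-hidden-layer MLP trained by full-batch gradient descent on MSE loss.
W1 = rng.normal(scale=0.5, size=(3, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.5, size=16);      b2 = 0.0
lr = 0.05
for _ in range(3000):
    h = np.tanh(X @ W1 + b1)                        # hidden activations
    pred = h @ W2 + b2
    g = 2.0 * (pred - y) / len(y)                   # dMSE/dpred
    gh = np.outer(g, W2) * (1.0 - h ** 2)           # backprop through tanh
    W2 -= lr * (h.T @ g); b2 -= lr * g.sum()
    W1 -= lr * (X.T @ gh); b1 -= lr * gh.sum(axis=0)

mse = float(np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - y) ** 2))
```

In the work described above, the inputs would instead be the prosthesis's geometric configuration extracted from video or bench instrumentation, and the targets the clamp reaction forces measured on the hydraulic test bench.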