132 research outputs found
Sensor Networks TDOA Self-Calibration: 2D Complexity Analysis and Solutions
Given a network of receivers and transmitters, the process of determining
their positions from measured pseudo-ranges is known as network
self-calibration. In this paper we consider 2D networks with synchronized
receivers but unsynchronized transmitters and the corresponding calibration
techniques, known as TDOA techniques. Despite previous work, TDOA
self-calibration remains computationally challenging: iterative algorithms are
very sensitive to initialization, causing convergence issues. In this paper, we
present a novel approach, which gives an algebraic solution to three previously
unsolved scenarios. Our solvers can lead to a position error below 1.2% and are
robust to noise.
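The pseudorange model underlying such networks can be stated compactly: with synchronized receivers and unsynchronized transmitters, each measurement is the true receiver-transmitter distance plus that transmitter's unknown clock offset. A minimal sketch of this model (pure Python; the toy positions and offsets are made up for illustration):

```python
import math

def pseudorange(receiver, transmitter, offset):
    """TDOA measurement model: true distance plus the
    transmitter's unknown clock offset (in distance units)."""
    return math.dist(receiver, transmitter) + offset

# Toy 2D network: 3 synchronized receivers, 2 unsynchronized transmitters.
receivers = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
transmitters = [(1.0, 1.0), (3.0, 2.0)]
offsets = [0.7, -0.3]  # one unknown offset per transmitter

measurements = {
    (i, j): pseudorange(r, t, offsets[j])
    for i, r in enumerate(receivers)
    for j, t in enumerate(transmitters)
}

# Differencing two receivers for the same transmitter cancels its offset,
# which is exactly what makes this a time-difference-of-arrival problem:
tdoa = measurements[(0, 0)] - measurements[(1, 0)]
true_diff = (math.dist(receivers[0], transmitters[0])
             - math.dist(receivers[1], transmitters[0]))
print(abs(tdoa - true_diff) < 1e-9)  # True: the offset cancels
```

The unknowns are then the receiver and transmitter positions (and offsets), which is what makes minimal configurations and algebraic solvers interesting.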
Massive MIMO-based Localization and Mapping Exploiting Phase Information of Multipath Components
In this paper, we present a robust multipath-based localization and mapping
framework that exploits the phases of specular multipath components (MPCs)
using a massive multiple-input multiple-output (MIMO) array at the base
station. Utilizing the phase information related to the propagation distances
of the MPCs enables the possibility of localization with extraordinary accuracy
even with limited bandwidth. The specular MPC parameters along with the
parameters of the noise and the dense multipath component (DMC) are tracked
using an extended Kalman filter (EKF), which preserves the
distance-related phase changes of the MPC complex amplitudes. The DMC comprises
all non-resolvable MPCs, which occur due to finite measurement aperture. The
estimation of the DMC parameters enhances the estimation quality of the
specular MPCs and therefore also the quality of localization and mapping. The
estimated MPC propagation distances are subsequently used as input to a
distance-based localization and mapping algorithm. This algorithm does not need
prior knowledge about the surrounding environment and base station position.
The performance is demonstrated with real radio-channel measurements using an
antenna array with 128 ports at the base station side and a standard cellular
signal bandwidth of 40 MHz. The results show that high accuracy localization is
possible even with such a low bandwidth.
Comment: 14 pages (two columns), 13 figures. This work has been submitted to
the IEEE Transactions on Wireless Communications for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
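The tracking idea can be illustrated with a heavily simplified stand-in: a linear, constant-velocity Kalman filter following a single propagation distance from noisy distance observations. The real EKF tracks many MPCs jointly, their complex amplitudes, and the DMC/noise parameters; all names and tuning values below are illustrative assumptions, not the paper's.

```python
import random

dt = 0.01               # time between channel snapshots [s] (assumed)
q = 1e-3                # velocity process noise per step (assumed)
r_var = 0.05 ** 2       # observation noise variance, 5 cm std (assumed)

x = [9.0, 0.0]                      # state: [distance m, velocity m/s]
P = [[1.0, 0.0], [0.0, 1.0]]        # state covariance

random.seed(0)
true_d, true_v = 10.0, 1.5          # reflector recedes at 1.5 m/s

for _ in range(200):
    true_d += true_v * dt
    z = true_d + random.gauss(0.0, 0.05)   # phase-derived distance obs.

    # Predict with F = [[1, dt], [0, 1]]: x <- F x, P <- F P F^T + Q.
    x = [x[0] + dt * x[1], x[1]]
    p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1]
    p01 = P[0][1] + dt * P[1][1]
    p10 = P[1][0] + dt * P[1][1]
    p11 = P[1][1] + q
    P = [[p00, p01], [p10, p11]]

    # Update with H = [1, 0] (only distance is observed).
    s = P[0][0] + r_var
    k0, k1 = P[0][0] / s, P[1][0] / s
    innov = z - x[0]
    x = [x[0] + k0 * innov, x[1] + k1 * innov]
    P = [[(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
         [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]

print(round(x[0], 2), round(x[1], 2))  # tracked distance and velocity
```

The filtered distance estimates play the role of the MPC propagation distances fed to the downstream localization and mapping algorithm.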
3D Face Appearance Model
We build a 3D face shape model, including inter- and intra-shape variations, derive the analytical Jacobian of its resulting 2D rendered image, and show examples of its fitting performance under light, pose, identity, expression, and texture variations.
Beyond Gr\"obner Bases: Basis Selection for Minimal Solvers
Many computer vision applications require robust estimation of the underlying
geometry, in terms of camera motion and 3D structure of the scene. These robust
methods often rely on running minimal solvers in a RANSAC framework. In this
paper we show how we can make polynomial solvers based on the action matrix
method faster, by careful selection of the monomial bases. These monomial bases
have traditionally been based on a Gr\"obner basis for the polynomial ideal.
Here we describe how we can enumerate all such bases in an efficient way. We
also show that going beyond Gr\"obner bases leads to more efficient solvers in
many cases. We present a novel basis sampling scheme that we evaluate on a
number of problems.
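The action matrix method turns polynomial root finding into an eigenvalue problem: the action matrix represents multiplication by a variable in the quotient ring, expressed in a chosen monomial basis. In the univariate case this is the classical companion matrix. A toy illustration (pure Python; power iteration stands in for a proper eigensolver, and the polynomial is my own example):

```python
def matvec(M, v):
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

# p(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3).
# Companion (action) matrix: multiplication by x in R[x]/(p),
# written in the monomial basis {1, x, x^2}.
C = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [6.0, -11.0, 6.0],   # last row: [-c0, -c1, -c2]
]

# Power iteration recovers the dominant eigenvalue, i.e. the largest root.
v = [1.0, 0.0, 0.0]
for _ in range(200):
    w = matvec(C, v)
    n = max(abs(x) for x in w)
    v = [x / n for x in w]

lam = (sum(matvec(C, v)[i] * v[i] for i in range(3))
       / sum(x * x for x in v))
print(round(lam, 6))  # 3.0, the largest root of p
```

In multivariate minimal solvers the basis {1, x, x^2} is replaced by a basis of the quotient ring, and the paper's point is that this basis need not come from a Gröbner basis.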
Generalised epipolar constraints
The frontier of a curved surface is the envelope of contour generators showing the boundary, at least locally, of the visible region swept out under viewer motion. In general, the outlines of curved surfaces (apparent contours) from different viewpoints are generated by different contour generators on the surface and hence do not provide a constraint on viewer motion. Frontier points, however, have projections which correspond to a real point on the surface and can be used to constrain viewer motion by the epipolar constraint. We show how to recover viewer motion from frontier points and generalise the ordinary epipolar constraint to deal with points, curves and apparent contours of surfaces. This is done for both continuous and discrete motion, known or unknown orientation, calibrated and uncalibrated, perspective, weak perspective and orthographic cameras. Results of an iterative scheme to recover the epipolar line structure from real image sequences using only the outlines of curved surfaces are presented. A statistical evaluation is performed to estimate the stability of the solution. It is also shown how the full motion of the camera from a sequence of images can be obtained from the relative motion between image pairs.
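The ordinary point-based constraint that the paper generalises can be checked directly: a correspondence x1, x2 between two views satisfies x2^T E x1 = 0 for the essential matrix E. A minimal sketch (pure Python; the camera motion and 3D point are toy values of my choosing):

```python
def epipolar_residual(x1, x2, E):
    """x2^T E x1; zero for a correct correspondence."""
    Ex1 = [sum(E[i][k] * x1[k] for k in range(3)) for i in range(3)]
    return sum(x2[i] * Ex1[i] for i in range(3))

# Essential matrix for a pure translation t = (1, 0, 0) with R = I:
# E = [t]_x, the cross-product matrix of t.
E = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]

# A 3D point X = (2, 1, 4) seen from camera 1 at the origin and from
# camera 2 translated by t (normalized image coordinates).
X = (2.0, 1.0, 4.0)
x1 = (X[0] / X[2], X[1] / X[2], 1.0)
x2 = ((X[0] - 1.0) / X[2], X[1] / X[2], 1.0)

print(epipolar_residual(x1, x2, E))  # 0.0
```

Frontier points matter because they are the special outline points for which such a true point correspondence exists, so this residual becomes meaningful even for apparent contours.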
Learning Future Object Prediction with a Spatiotemporal Detection Transformer
We explore future object prediction -- a challenging problem where all
objects visible in a future video frame are to be predicted. We propose to
tackle this problem end-to-end by training a detection transformer to directly
output future objects. In order to make accurate predictions about the future,
it is necessary to capture the dynamics in the scene, both of other objects and
of the ego-camera. We extend existing detection transformers in two ways to
capture the scene dynamics. First, we experiment with three different
mechanisms that enable the model to spatiotemporally process multiple frames.
Second, we feed ego-motion information to the model via cross-attention. We
show that both of these cues substantially improve future object prediction
performance. Our final approach learns to capture the dynamics and make
predictions on par with an oracle for 100 ms prediction horizons, and
outperform baselines for longer prediction horizons.
Comment: 15 pages, 6 figures.
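Feeding ego-motion via cross-attention means the object queries attend over encoded ego-motion tokens. A minimal single-query sketch (pure Python; the vectors and the two-token setup are invented for illustration, not the paper's architecture):

```python
import math

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [x / s for x in e]

def cross_attention(query, keys, values):
    """Single-head cross-attention with one query vector:
    scaled dot-product scores, softmax weights, weighted sum of values."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

# Toy setup: one object query attends over two ego-motion tokens
# (e.g. encoded velocity and yaw rate). All vectors are made up.
object_query = [1.0, 0.0]
ego_keys = [[1.0, 0.0], [0.0, 1.0]]
ego_values = [[0.5, 0.5], [-0.5, 0.5]]

out = cross_attention(object_query, ego_keys, ego_values)
print(out)
```

In a real detection transformer the same operation runs batched over many queries and heads, but the information flow from ego-motion into each object prediction is this weighted mixing.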
LidarCLIP or: How I Learned to Talk to Point Clouds
Research connecting text and images has recently seen several breakthroughs,
with models like CLIP, DALL-E 2, and Stable Diffusion. However, the connection
between text and other visual modalities, such as lidar data, has received less
attention, hindered by the lack of text-lidar datasets. In this work, we
propose LidarCLIP, a mapping from automotive point clouds to a pre-existing
CLIP embedding space. Using image-lidar pairs, we supervise a point cloud
encoder with the image CLIP embeddings, effectively relating text and lidar
data with the image domain as an intermediary. We show the effectiveness of
LidarCLIP by demonstrating that lidar-based retrieval is generally on par with
image-based retrieval, but with complementary strengths and weaknesses. By
combining image and lidar features, we improve upon both single-modality
methods and enable a targeted search for challenging detection scenarios under
adverse sensor conditions. We also explore zero-shot classification and show
that LidarCLIP outperforms existing attempts to use CLIP for point clouds by a
large margin. Finally, we leverage our compatibility with CLIP to explore a
range of applications, such as point cloud captioning and lidar-to-image
generation, without any additional training. Code and pre-trained models are
available at https://github.com/atonderski/lidarclip
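The retrieval mechanism works because lidar, image, and text embeddings end up in one shared space, ranked by cosine similarity. A hedged sketch with made-up low-dimensional vectors standing in for CLIP's high-dimensional embeddings (scene names and values are hypothetical):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical 4-d embeddings. LidarCLIP trains the point-cloud encoder
# so lidar embeddings land near the CLIP image embeddings of the paired
# camera frames, which is what makes text queries against lidar work.
text_query = [0.9, 0.1, 0.0, 0.1]          # e.g. "a truck in heavy rain"
lidar_scenes = {
    "scene_a": [0.8, 0.2, 0.1, 0.0],
    "scene_b": [0.0, 0.1, 0.9, 0.3],
    "scene_c": [0.3, 0.3, 0.3, 0.3],
}

ranked = sorted(lidar_scenes,
                key=lambda s: cosine(text_query, lidar_scenes[s]),
                reverse=True)
print(ranked[0])  # scene_a
```

Combining modalities then amounts to fusing the image and lidar similarity scores before ranking.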
Multiple Offsets Multilateration : A New Paradigm for Sensor Network Calibration with Unsynchronized Reference Nodes
Positioning using wave signal measurements is used in several applications, such as GPS, structure from sound, and WiFi-based positioning. Mathematically, such problems require the computation of the positions of receivers and/or transmitters, as well as time offsets if the devices are unsynchronized. In this paper, we expand the previous state of the art on positioning formulations by introducing Multiple Offsets Multilateration (MOM), a new mathematical framework to compute receiver positions from pseudoranges to unsynchronized reference transmitters at known positions. This applies to several scenarios, for example structure from sound and positioning with LEO satellites. We describe MOM mathematically, determine how many receivers and transmitters are needed for the network to be solvable, present a study of the number of possible distinct solutions, and derive stable solvers based on homotopy continuation. The solvers are shown to be efficient and robust to noise on both synthetic and real audio data. ©2022 IEEE.
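A back-of-the-envelope version of the solvability question can be written down directly; note this simple equation-counting is only a necessary condition and is not the paper's actual derivation, which also characterizes the number of distinct solutions:

```python
# Counting heuristic for MOM in 2D: the unknowns are the 2m receiver
# coordinates plus the n transmitter clock offsets (transmitter
# positions are known); each receiver-transmitter pair contributes one
# pseudorange equation. Necessary condition: m*n >= 2m + n.
def enough_equations(m, n):
    return m * n >= 2 * m + n

pairs = [(m, n) for m in range(1, 6) for n in range(1, 8)
         if enough_equations(m, n)]
print(pairs[:5])  # e.g. (3, 3) is a candidate minimal 2D configuration
```

Whether a configuration satisfying the count is actually solvable, and how many solutions it admits, is exactly what the paper's algebraic analysis settles.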
Trust Your IMU: Consequences of Ignoring the IMU Drift
In this paper, we argue that modern pre-integration methods for inertial
measurement units (IMUs) are accurate enough to ignore the drift for short time
intervals. This allows us to consider a simplified camera model, which in turn
admits further intrinsic calibration. We develop the first-ever solver to
jointly solve the relative pose problem with unknown and equal focal length and
radial distortion profile while utilizing the IMU data. Furthermore, we show
significant speed-up compared to state-of-the-art algorithms, with small or
negligible loss in accuracy for partially calibrated setups. The proposed
algorithms are tested on both synthetic and real data, where the latter is
focused on navigation using unmanned aerial vehicles (UAVs). We evaluate the
proposed solvers on different commercially available low-cost UAVs, and
demonstrate that the novel assumption on IMU drift is feasible in real-life
applications. The extended intrinsic auto-calibration enables us to use
distorted input images directly, making the tedious calibration processes
required by current state-of-the-art methods obsolete.
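The key geometric consequence of trusting short-interval IMU pre-integration is that gravity is known in both camera frames, so the relative rotation reduces to a single unknown angle about the gravity axis. A small sketch of that reduction (pure Python; frame alignment and angle are illustrative):

```python
import math

def rot_z(theta):
    """Rotation about the (gravity-aligned) z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply(R, v):
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]

# After aligning both camera frames with the IMU's gravity estimate,
# the remaining relative rotation has one degree of freedom (yaw).
g = [0.0, 0.0, 1.0]  # gravity direction in the aligned frames
R = rot_z(0.7)       # arbitrary yaw angle

# Any yaw rotation leaves gravity fixed, which is what lets the
# relative pose solver use a single rotation unknown:
print(apply(R, g))  # [0.0, 0.0, 1.0]
```

Fewer rotation unknowns leave room in a minimal problem for the extra intrinsics (shared focal length, radial distortion) the paper solves for.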
Improving a real-time object detector with compact temporal information
Neural networks designed for real-time object detection have recently improved significantly, but in practice, looking at only a single RGB image at a time may not be ideal. For example, when detecting objects in videos, a foreground detection algorithm can be used to obtain compact temporal data, which can be fed into a neural network alongside RGB images. We propose an approach for doing this, based on an existing object detector, that re-uses pretrained weights for the processing of RGB images. The neural network was tested on the VIRAT dataset with annotations for object detection, a problem this approach is well suited for. The accuracy was found to improve significantly (up to 66%), with a roughly 40% increase in computational time.
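The "compact temporal data" idea can be illustrated with the simplest possible foreground detector, frame differencing, whose binary mask is stacked with the RGB channels as an extra input plane. This is a stand-in; the paper's foreground detection algorithm may differ:

```python
def foreground_mask(prev_frame, frame, threshold=30):
    """Per-pixel frame differencing on grayscale frames (nested lists):
    pixels that changed by more than `threshold` are marked foreground."""
    return [[1 if abs(a - b) > threshold else 0
             for a, b in zip(row_p, row_c)]
            for row_p, row_c in zip(prev_frame, frame)]

# Two tiny 3x3 grayscale frames; one pixel changes a lot.
prev = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
curr = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]

mask = foreground_mask(prev, curr)
print(mask)  # [[0, 0, 0], [0, 1, 0], [0, 0, 0]]

# In the detector, this mask would be concatenated with the RGB image
# as a fourth input channel, giving the network compact motion cues.
```

The appeal is that the mask is cheap to compute yet summarizes motion across frames, which a single RGB image cannot convey.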