Deformable 3-D Modelling from Uncalibrated Video Sequences
Submitted for the degree of Doctor of Philosophy, Queen Mary, University of London
Manifold Constrained Low-Rank Decomposition
Low-rank decomposition (LRD) is a state-of-the-art method for visual data
reconstruction and modelling. However, it is a very challenging problem when
the image data contains significant occlusion, noise, illumination variation,
and misalignment from rotation or viewpoint changes. We leverage the specific
structure of data in order to improve the performance of LRD when the data are
not ideal. To this end, we propose a new framework that embeds manifold priors
into LRD. To implement the framework, we design an alternating direction method
of multipliers (ADMM) scheme that efficiently integrates the manifold
constraints during the optimization process. The proposed approach is
successfully used to calculate low-rank models from face images, hand-written
digits and planar surface images. The results show a consistent increase in
performance over the state of the art across a wide range of
realistic image misalignments and corruptions.
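For orientation, the baseline that such frameworks usually build on can be written down compactly. The block below is a sketch of the standard robust low-rank decomposition and its textbook ADMM updates, not the manifold-constrained formulation of the paper, which the abstract does not spell out. The symbols D (data matrix), L (low-rank part), S (sparse corruption), Y (multiplier), and the weights lambda and mu are assumptions introduced here for illustration.

```latex
% Standard robust low-rank decomposition (assumed baseline, no manifold prior):
\min_{L,\,S}\; \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{subject to} \quad D = L + S

% Augmented Lagrangian with multiplier Y and penalty \mu > 0:
\mathcal{L}_{\mu}(L, S, Y) = \|L\|_{*} + \lambda \|S\|_{1}
  + \langle Y,\, D - L - S \rangle
  + \tfrac{\mu}{2}\, \|D - L - S\|_{F}^{2}

% ADMM iterations (SVT = singular value thresholding,
% soft = element-wise soft thresholding):
L^{k+1} = \operatorname{SVT}_{1/\mu}\!\left(D - S^{k} + Y^{k}/\mu\right)
S^{k+1} = \operatorname{soft}_{\lambda/\mu}\!\left(D - L^{k+1} + Y^{k}/\mu\right)
Y^{k+1} = Y^{k} + \mu \left(D - L^{k+1} - S^{k+1}\right)
```

A manifold prior would enter as an additional constraint or penalty term on L or on the alignment parameters; how exactly it is embedded is specific to the paper.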
Human-centric light sensing and estimation from RGBD images: The invisible light switch
Lighting design in indoor environments is of primary importance for at least
two reasons: 1) people should perceive adequate light; 2) an effective
lighting design yields substantial energy savings. We present the Invisible Light
Switch (ILS) to address both aspects. ILS dynamically adjusts the room
illumination level to save energy while keeping the light level perceived by
the users constant, so the energy saving is invisible to them. Our
proposed ILS leverages a radiosity model to estimate the light level
perceived by a person within an indoor environment, taking into account the
person's position and viewing frustum (head pose). ILS can therefore dim
the luminaires that the user does not see, resulting in an effective
energy saving, especially in large open offices (where lights may otherwise be
on everywhere for a single person). To quantify the system performance, we have
collected a new dataset in which people wear luxmeters while working in
office rooms. The luxmeters measure the amount of light (in lux) reaching each
person's gaze, which we consider a proxy for their perceived illumination level.
Our initial results are promising: in a room with 8 LED luminaires, the energy
consumption in a day may be reduced from 18585 to 6206 watts with ILS
(currently needing 1560 watts for operations). While doing so, the
perceived lighting decreases by just 200 lux, a value considered negligible
when the original illumination level is above 1200 lux, as is normally the case
in offices.
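The reported figures can be turned into relative savings with a few lines of arithmetic. The sketch below uses only the numbers quoted in the abstract. Whether the 1560 figure for operations is already included in the 6206 figure is not stated, so both readings are computed and labelled as assumptions; the variable and function names are illustrative.

```python
# Figures quoted in the abstract (reported there in watts per day).
BASELINE = 18585      # consumption without ILS
WITH_ILS = 6206       # consumption with ILS
OPERATIONS = 1560     # overhead of running ILS itself


def saving_fraction(baseline, reduced):
    """Fraction of the baseline consumption that is saved."""
    return (baseline - reduced) / baseline


# Reading A (assumption): the 6206 figure already includes the overhead.
saving_a = saving_fraction(BASELINE, WITH_ILS)
# Reading B (assumption): the overhead is paid on top of the 6206 figure.
saving_b = saving_fraction(BASELINE, WITH_ILS + OPERATIONS)

# Relative drop in perceived light: 200 lux out of a 1200 lux baseline.
lux_drop = 200 / 1200

print(f"saving, overhead included in 6206: {saving_a:.1%}")
print(f"saving, overhead added on top:     {saving_b:.1%}")
print(f"perceived light drop at 1200 lux:  {lux_drop:.1%}")
```

Under either reading the saving stays well above half of the baseline consumption, while the perceived light level falls by at most about one sixth when the starting level is 1200 lux or higher.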
The Visual Social Distancing Problem
One of the main and most effective measures to contain the recent viral
outbreak is maintaining so-called Social Distancing (SD). To comply
with this constraint, workplaces, public institutions, transport and schools
will likely adopt restrictions on the minimum inter-personal distance between
people. Given the current scenario, it is crucial to measure compliance with
this physical constraint at scale, in order to identify the reasons for
possible breaches of the distance limits and to understand whether
they pose a threat given the scene context. All of this must be done while complying
with privacy policies and keeping the measurement acceptable. To this end, we
introduce the Visual Social Distancing (VSD) problem, defined as the automatic
estimation of the inter-personal distance from an image, and the
characterization of the related people aggregations. VSD is pivotal for a
non-invasive analysis of whether people comply with the SD restriction, and for
providing statistics about the level of safety of specific areas whenever this
constraint is violated. We then discuss how VSD relates to previous
literature in Social Signal Processing and indicate which existing Computer
Vision methods can be used to address this problem. We conclude with future
challenges related to the effectiveness of VSD systems, ethical implications
and future application scenarios.
Comment: 9 pages, 5 figures. All the authors contributed equally to this
manuscript and are listed in alphabetical order. Under submission
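One common way to estimate inter-personal distance from a single image is to map each person's foot position to the ground plane with a homography and then compare pairwise metric distances. The sketch below illustrates that idea only; it is not the method of the paper. The homography `H`, the 2.0 m threshold, the pixel coordinates, and the function names `to_ground` and `close_pairs` are all hypothetical.

```python
import math
from itertools import combinations


def to_ground(H, u, v):
    """Map pixel (u, v) to ground-plane coordinates with a 3x3 homography H."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return (x / w, y / w)


def close_pairs(H, feet_px, min_dist_m):
    """Return (i, j, distance) for every pair closer than min_dist_m metres."""
    ground = [to_ground(H, u, v) for (u, v) in feet_px]
    pairs = []
    for (i, p), (j, q) in combinations(enumerate(ground), 2):
        d = math.dist(p, q)
        if d < min_dist_m:
            pairs.append((i, j, d))
    return pairs


# Hypothetical calibration: 100 pixels per metre, no perspective distortion.
H = [[0.01, 0.0, 0.0],
     [0.0, 0.01, 0.0],
     [0.0, 0.0, 1.0]]

# Hypothetical foot positions (in pixels) of three detected people.
feet = [(100, 100), (220, 100), (600, 400)]
violations = close_pairs(H, feet, min_dist_m=2.0)
print(violations)
```

With these made-up inputs only the first two people fall below the threshold. In practice the homography would come from camera calibration or scene geometry, and the foot positions from a person detector or pose estimator.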
MX-LSTM: mixing tracklets and vislets to jointly forecast trajectories and head poses
Recent approaches to trajectory forecasting use tracklets to predict the
future positions of pedestrians, exploiting Long Short-Term Memory (LSTM)
architectures. This paper shows that adding vislets, that is, short sequences
of head pose estimations, significantly improves trajectory
forecasting performance. We then propose to use vislets in a novel framework
called MX-LSTM, capturing the interplay between tracklets and vislets thanks to
a joint unconstrained optimization of full covariance matrices during the LSTM
backpropagation. At the same time, MX-LSTM predicts the future head poses,
extending the standard capabilities of long-term trajectory forecasting
approaches. With standard head pose estimators and an attention-based social
pooling, MX-LSTM sets a new state of the art in trajectory forecasting on all
the considered datasets (Zara01, Zara02, UCY, and TownCentre), with a dramatic
margin when the pedestrians slow down, a case where most forecasting
approaches struggle to provide an accurate solution.
Comment: 10 pages, 3 figures to appear in CVPR 201
Data augmentation for NeRF: a geometric consistent solution based on view morphing
NeRF aims to learn a continuous neural scene representation by using a finite
set of input images taken from different viewpoints. The fewer the number of
viewpoints, the higher the likelihood of overfitting on them. This paper
mitigates this limitation by presenting a novel data augmentation approach that
generates geometrically consistent image transitions between viewpoints using
view morphing. View morphing is a highly versatile technique that does not
require any prior knowledge about the 3D scene because it is based on general
principles of projective geometry. A key novelty of our method is to use the
very same depths predicted by NeRF to generate the image transitions that are
then added to NeRF training. We experimentally show that this procedure enables
NeRF to improve the quality of its synthesised novel views in the case of
datasets with few training viewpoints. We improve PSNR by up to 1.8 dB and 10.5 dB
when eight and four views are used for training, respectively. To the best of
our knowledge, this is the first data augmentation strategy for NeRF that
explicitly synthesises additional new input images to improve the model's
generalisation.
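To put the reported PSNR gains in perspective, the sketch below uses the standard definition PSNR = 10 log10(MAX^2 / MSE) and converts a gain in decibels into the factor by which the mean squared error shrinks. The function names and the assumption of pixel values normalised to [0, 1] are illustrative and not taken from the paper.

```python
import math


def psnr(mse, max_value=1.0):
    """PSNR in dB for a given mean squared error and peak pixel value."""
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio(gain_db):
    """Factor by which the MSE shrinks for a PSNR gain of gain_db decibels."""
    return 10.0 ** (gain_db / 10.0)


# Gains quoted in the abstract for eight and four training views.
ratio_eight_views = mse_ratio(1.8)
ratio_four_views = mse_ratio(10.5)

print(f"1.8 dB gain  -> MSE reduced by a factor of {ratio_eight_views:.2f}")
print(f"10.5 dB gain -> MSE reduced by a factor of {ratio_four_views:.2f}")
```

Because PSNR is logarithmic, the 1.8 dB gain corresponds to roughly a 1.5-fold reduction in MSE, while the 10.5 dB gain corresponds to more than a tenfold reduction.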