Towards multiple 3D bone surface identification and reconstruction using few 2D X-ray images for intraoperative applications
This article discusses a possible method for using a small number (e.g. 5) of conventional 2D X-ray images to reconstruct multiple 3D bone surfaces intraoperatively. Each bone's edge contours in the X-ray images are automatically identified. Sparse 3D landmark points of each bone are automatically reconstructed by pairing the 2D X-ray images. The distribution of the reconstructed landmark points on a surface is approximately optimal, covering the main characteristics of the surface. A statistical shape model, the dense point distribution model (DPDM), is then fitted to the reconstructed optimal landmark vertices to reconstruct a full surface of each bone separately. The reconstructed surfaces can then be visualised and manipulated by surgeons or used by surgical robotic systems.
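As a rough illustration of how pairing two X-ray views can yield a 3D landmark, the sketch below triangulates one point as the midpoint of the shortest segment between two back-projected rays. This is a generic textbook technique, not necessarily the method used in the article. The function name `triangulate_midpoint`, the ray parameterisation (X-ray source position plus a direction through the detected 2D landmark), and the sample coordinates are all assumptions made for illustration.

```python
def triangulate_midpoint(o1, d1, o2, d2):
    """Midpoint of the shortest segment between two 3D rays.

    o1, o2: ray origins (assumed X-ray source positions).
    d1, d2: ray directions (source -> detected landmark on the detector).
    Hypothetical helper; not taken from the article.
    """
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    w = [p - q for p, q in zip(o1, o2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; cannot triangulate")

    # Parameters of the closest points on each ray.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [o + s * di for o, di in zip(o1, d1)]
    p2 = [o + t * di for o, di in zip(o2, d2)]
    return [(x + y) / 2.0 for x, y in zip(p1, p2)]


# Two rays that both pass through the point (1, 2, 3).
landmark = triangulate_midpoint([0, 0, 0], [1, 2, 3], [10, 0, 0], [-9, 2, 3])
print(landmark)
```

With noisy contour detections the two rays will not intersect exactly, which is why the midpoint of the closest approach is used rather than an exact intersection.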
Optimal Camera Placement to measure Distances Conservativly Regarding Static and Dynamic Obstacles
In modern production facilities industrial robots and humans are supposed to
interact sharing a common working area. In order to avoid collisions, the
distances between objects need to be measured conservatively which can be done
by a camera network. To estimate the acquired distance, unmodelled objects,
e.g., an interacting human, need to be modelled and distinguished from
premodelled objects like workbenches or robots by image processing such as the
background subtraction method.
The quality of such an approach depends strongly on the settings of the
camera network, that is, the positions and orientations of the individual
cameras. Of particular interest in this context is the minimization of the
distance error incurred by using the objects modelled by the background
subtraction method instead of the real objects. Here, we show how this
minimization can be formulated as an abstract optimization problem. Moreover,
we discuss various aspects of the implementation as well as reasons for the
selection of a suitable optimization method, analyze the complexity of the
proposed method and present a basic version used for extensive experiments.
Comment: 9 pages, 10 figures
Interactive Camera Network Design using a Virtual Reality Interface
Traditional literature on camera network design focuses on constructing
automated algorithms. These require problem specific input from experts in
order to produce their output. The nature of the required input is highly
unintuitive, leading to an impractical workflow for human operators. In this
work we focus on developing a virtual reality user interface allowing human
operators to manually design camera networks in an intuitive manner. From real
world practical examples we conclude that the camera networks designed using
this interface are highly competitive with, or superior to those generated by
automated algorithms, but the associated workflow is much more intuitive and
simple. The competitiveness of the human-generated camera networks is
remarkable because the underlying optimization problem is a well-known
NP-hard combinatorial problem. These results indicate that human operators can
be used in challenging geometrical combinatorial optimization problems given an
intuitive visualization of the problem.
Comment: 11 pages, 8 figures
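To give a sense of what an automated baseline for such a combinatorial problem can look like, here is a minimal sketch of a greedy heuristic. It assumes the design task is posed as maximum coverage: each candidate camera pose covers a known set of scene cells, and up to `k` cameras are chosen. The abstract does not state this formulation, and the paper's algorithms may differ. The names `greedy_camera_selection`, `candidates`, and the sample data are hypothetical.

```python
def greedy_camera_selection(candidates, k):
    """Greedily pick up to k cameras that maximise newly covered cells.

    candidates: dict mapping a camera name to the set of cell ids it covers
                (an assumed input format, not taken from the paper).
    Returns (chosen camera names in pick order, set of covered cells).
    """
    covered = set()
    chosen = []
    remaining = dict(candidates)
    for _ in range(k):
        if not remaining:
            break
        # Camera that adds the most cells not yet covered.
        best = max(remaining, key=lambda name: len(remaining[name] - covered))
        if not remaining[best] - covered:
            break  # no camera adds anything new
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered


candidates = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6, 7}}
chosen, covered = greedy_camera_selection(candidates, k=2)
print(chosen, sorted(covered))
```

A greedy pass like this is fast but only approximate, which is one reason a human designer working with a good visualization can plausibly match or beat it.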
Harnessing AI for Speech Reconstruction using Multi-view Silent Video Feed
Speechreading or lipreading is the technique of understanding and getting
phonetic features from a speaker's visual features such as movement of lips,
face, teeth and tongue. It has a wide range of multimedia applications such as
in surveillance, Internet telephony, and as an aid to a person with hearing
impairments. However, most of the work in speechreading has been limited to
text generation from silent videos. Recently, research has started venturing
into generating (audio) speech from silent video sequences but there have been
no developments thus far in dealing with divergent views and poses of a
speaker. Thus, although we have multiple camera feeds of a user's speech, we
have so far failed to use these feeds to deal with different poses. To this
end, this paper presents the world's first ever
multi-view speech reading and reconstruction system. This work encompasses the
boundaries of multimedia research by putting forth a model which leverages
silent video feeds from multiple cameras recording the same subject to generate
intelligent speech for a speaker. Initial results confirm the usefulness of
exploiting multiple camera views in building an efficient speech reading and
reconstruction system. It further shows the optimal placement of cameras which
would lead to the maximum intelligibility of speech. Next, it lays out various
innovative applications for the proposed system focusing on its potential
prodigious impact in not just the security arena but in many other multimedia
analytics problems.
Comment: 2018 ACM Multimedia Conference (MM '18), October 22--26, 2018, Seoul, Republic of Korea
Calibration Wizard: A Guidance System for Camera Calibration Based on Modelling Geometric and Corner Uncertainty
It is well known that the accuracy of a calibration depends strongly on the
choice of camera poses from which images of a calibration object are acquired.
We present a system -- Calibration Wizard -- that interactively guides a user
towards taking optimal calibration images. For each new image to be taken, the
system computes, from all previously acquired images, the pose that leads to
the globally maximum reduction of expected uncertainty on intrinsic parameters
and then guides the user towards that pose. We also show how to incorporate
uncertainty in corner point position in a novel, principled manner, for both
calibration and computation of the next best pose. Synthetic and real-world
experiments are performed to demonstrate the effectiveness of Calibration
Wizard.
Comment: Oral presentation at ICCV 201