    Towards multiple 3D bone surface identification and reconstruction using few 2D X-ray images for intraoperative applications

    This article discusses a possible method for reconstructing multiple 3D bone surfaces intraoperatively from a small number (e.g. 5) of conventional 2D X-ray images. Each bone’s edge contours in the X-ray images are identified automatically. Sparse 3D landmark points of each bone are then reconstructed automatically by pairing the 2D X-ray images. The distribution of the reconstructed landmark points on a surface is approximately optimal, covering the main characteristics of the surface. A statistical shape model, the dense point distribution model (DPDM), is then fitted to the reconstructed landmark vertices to reconstruct the full surface of each bone separately. The reconstructed surfaces can then be visualised and manipulated by surgeons or used by surgical robotic systems.
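    The abstract does not say how paired X-ray images yield a 3D landmark. One common way to do this, shown here only as an illustrative sketch and not as the authors' method, is midpoint triangulation: each image contributes a ray from the X-ray source through the landmark's detector position, and the 3D point is taken as the midpoint of the shortest segment between the two rays. The function name and the source and direction values are hypothetical.

```python
def triangulate_midpoint(o1, d1, o2, d2):
    """Midpoint of the shortest segment joining rays o1 + t1*d1 and o2 + t2*d2.

    o1, o2: hypothetical X-ray source positions (3-tuples).
    d1, d2: ray directions toward the landmark seen in each image.
    """
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    w = tuple(p - q for p, q in zip(o1, o2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)

    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; cannot triangulate")

    # Ray parameters of the two mutually closest points.
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom

    p1 = tuple(o + t1 * k for o, k in zip(o1, d1))
    p2 = tuple(o + t2 * k for o, k in zip(o2, d2))
    return tuple((x + y) / 2.0 for x, y in zip(p1, p2))


# Two sources viewing the same landmark from different positions.
landmark = triangulate_midpoint((0.0, 0.0, 0.0), (1.0, 2.0, 3.0),
                                (10.0, 0.0, 0.0), (-9.0, 2.0, 3.0))
print(landmark)
```

    With noisy contour detections the two rays generally do not intersect, which is why the midpoint of their closest approach is used rather than an exact intersection.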

    Optimal Camera Placement to measure Distances Conservativly Regarding Static and Dynamic Obstacles

    In modern production facilities, industrial robots and humans are supposed to interact while sharing a common working area. To avoid collisions, the distances between objects need to be measured conservatively, which can be done by a camera network. To estimate the acquired distance, unmodelled objects, e.g. an interacting human, need to be modelled and distinguished from premodelled objects such as workbenches or robots by image processing, for instance with the background subtraction method. The quality of such an approach depends heavily on the settings of the camera network, that is, the positions and orientations of the individual cameras. Of particular interest in this context is minimizing the distance error that arises from using the objects modelled by background subtraction instead of the real objects. Here, we show how this minimization can be formulated as an abstract optimization problem. Moreover, we discuss various aspects of the implementation as well as reasons for the selection of a suitable optimization method, analyze the complexity of the proposed method, and present a basic version used for extensive experiments. Comment: 9 pages, 10 figures

    Interactive Camera Network Design using a Virtual Reality Interface

    Traditional literature on camera network design focuses on constructing automated algorithms. These require problem-specific input from experts in order to produce their output. The nature of the required input is highly unintuitive, leading to an impractical workflow for human operators. In this work we focus on developing a virtual reality user interface that allows human operators to design camera networks manually in an intuitive manner. From real-world practical examples we conclude that the camera networks designed using this interface are highly competitive with, or superior to, those generated by automated algorithms, while the associated workflow is much more intuitive and simple. The competitiveness of the human-generated camera networks is remarkable because the underlying optimization problem is a well-known NP-hard combinatorial problem. These results indicate that human operators can be used in challenging geometrical combinatorial optimization problems, given an intuitive visualization of the problem. Comment: 11 pages, 8 figures

    Harnessing AI for Speech Reconstruction using Multi-view Silent Video Feed

    Speechreading or lipreading is the technique of understanding and extracting phonetic features from a speaker's visual features, such as the movement of the lips, face, teeth and tongue. It has a wide range of multimedia applications, such as surveillance, Internet telephony, and aids for people with hearing impairments. However, most of the work in speechreading has been limited to text generation from silent videos. Recently, research has started venturing into generating (audio) speech from silent video sequences, but there have been no developments thus far in dealing with divergent views and poses of a speaker. Thus, although multiple camera feeds of a speaker may be available, they have not been used to deal with the different poses. To this end, this paper presents the world's first multi-view speech reading and reconstruction system. This work pushes the boundaries of multimedia research by putting forth a model which leverages silent video feeds from multiple cameras recording the same subject to generate intelligent speech for a speaker. Initial results confirm the usefulness of exploiting multiple camera views in building an efficient speech reading and reconstruction system. The work further shows the optimal placement of cameras that would lead to the maximum intelligibility of speech. Finally, it lays out various innovative applications for the proposed system, focusing on its potential impact not just in the security arena but in many other multimedia analytics problems. Comment: 2018 ACM Multimedia Conference (MM '18), October 22--26, 2018, Seoul, Republic of Korea

    Calibration Wizard: A Guidance System for Camera Calibration Based on Modelling Geometric and Corner Uncertainty

    It is well known that the accuracy of a calibration depends strongly on the choice of camera poses from which images of a calibration object are acquired. We present a system -- Calibration Wizard -- that interactively guides a user towards taking optimal calibration images. For each new image to be taken, the system computes, from all previously acquired images, the pose that leads to the globally maximum reduction of expected uncertainty on the intrinsic parameters and then guides the user towards that pose. We also show how to incorporate uncertainty in corner point position in a novel, principled manner, for both calibration and computation of the next best pose. Synthetic and real-world experiments are performed to demonstrate the effectiveness of Calibration Wizard. Comment: Oral presentation at ICCV 201
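    The idea of picking the pose with the largest expected uncertainty reduction can be illustrated with a toy sketch. It is not the Calibration Wizard implementation: it assumes only two intrinsic parameters, a hand-made information matrix accumulated from earlier images, and a small discrete set of candidate poses whose Jacobian rows are invented for illustration. Uncertainty is scored as the trace of the expected covariance, which is one possible choice among several.

```python
def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("information matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def add_observation(info, jacobian_rows):
    """Return info + J^T J, the information after adding one more image."""
    out = [row[:] for row in info]
    for r in jacobian_rows:
        for i in range(2):
            for j in range(2):
                out[i][j] += r[i] * r[j]
    return out


def expected_uncertainty(info):
    """Trace of the covariance (inverse information) of the two parameters."""
    cov = invert_2x2(info)
    return cov[0][0] + cov[1][1]


def next_best_pose(info, candidates):
    """Name of the candidate pose that minimizes the expected uncertainty."""
    return min(candidates,
               key=lambda name: expected_uncertainty(
                   add_observation(info, candidates[name])))


# Hypothetical state: parameter 0 is already well constrained, parameter 1 is not.
info_so_far = [[4.0, 0.0], [0.0, 1.0]]
candidate_poses = {
    "frontal": [[1.0, 0.0]],  # mostly informs parameter 0
    "tilted": [[0.0, 1.0]],   # mostly informs parameter 1
}
print(next_best_pose(info_so_far, candidate_poses))
```

    In this toy setup the pose that informs the weakly constrained parameter is selected, mirroring the intuition that the next image should target whatever the previous images left most uncertain.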