
    A Study to Optimize Heterogeneous Resources for Open IoT

    Recently, IoT technologies have progressed and many sensors and actuators are connected to networks. Previously, IoT services were developed in a vertically integrated style, but the Open IoT concept, which achieves various IoT services by integrating horizontally separated devices and services, has now attracted attention. For the Open IoT era, we have proposed the Tacit Computing technology to discover, on demand, the devices holding the data users need and to use them dynamically, and we have implemented its elemental technologies. In this paper, we propose a three-layer optimization that reduces operation cost and improves the performance of a Tacit Computing service, so that the devices discovered by Tacit Computing can be operated as a continuous service. In the optimization process, appropriate function allocation or the offloading of specific functions is calculated on the device, network and cloud layers before full-scale operation.
    Comment: 3 pages, 1 figure, 2017 Fifth International Symposium on Computing and Networking (CANDAR2017), Nov. 2017
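
    As a rough illustration of the kind of pre-deployment calculation described above, the following sketch chooses a layer (device, network or cloud) for each service function so that total operation cost is minimized under simple capacity limits. The function names, cost figures and capacities are hypothetical placeholders, not values from the paper.

```python
# Hedged sketch: pick a processing layer (device, network edge, or cloud) for each
# service function so that total operation cost is minimized. The functions, costs
# and capacity limits are hypothetical; they only illustrate the idea of a
# three-layer allocation computed before full-scale operation.

from itertools import product

LAYERS = ("device", "network", "cloud")

# Hypothetical per-function cost of running on each layer
# (e.g. a weighted sum of processing cost and data-transfer cost).
COSTS = {
    "filter":    {"device": 1.0, "network": 2.0, "cloud": 4.0},
    "aggregate": {"device": 5.0, "network": 2.5, "cloud": 1.5},
    "analyze":   {"device": 9.0, "network": 4.0, "cloud": 2.0},
}

# Hypothetical capacity: how many functions each layer can host.
CAPACITY = {"device": 1, "network": 2, "cloud": 3}


def best_allocation(costs, capacity):
    """Exhaustively search all layer assignments and return the cheapest feasible one.
    Fine for a handful of functions; a real optimizer would use ILP or heuristics."""
    functions = list(costs)
    best, best_cost = None, float("inf")
    for assignment in product(LAYERS, repeat=len(functions)):
        load = {layer: assignment.count(layer) for layer in LAYERS}
        if any(load[layer] > capacity[layer] for layer in LAYERS):
            continue  # violates a layer's capacity
        total = sum(costs[f][layer] for f, layer in zip(functions, assignment))
        if total < best_cost:
            best, best_cost = dict(zip(functions, assignment)), total
    return best, best_cost


if __name__ == "__main__":
    allocation, cost = best_allocation(COSTS, CAPACITY)
    print(allocation, cost)
```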

    Digital representation of historical globes : methods to make 3D and pseudo-3D models of sixteenth century Mercator globes

    In this paper, the construction of digital representations of a terrestrial and a celestial globe is discussed. Virtual digital (3D) models play an important role in recent research and publications on cultural heritage. The globes discussed in this paper were made by Gerardus Mercator (1512-1594) in 1541 and 1551. Four techniques for the digital representation are discussed and analysed, all using high-resolution photographs of the globes. These photographs were taken under studio conditions in order to obtain uniform lighting and to avoid unwanted light spots; these lighting conditions are important, since the globes are covered with a highly reflective varnish. Processing these images using structure from motion, georeferencing of separate scenes and the combination of the photographs with terrestrial laser scanning data results in true 3D representations of the globes. In addition, pseudo-3D models of these globes were generated using dynamic imaging, a technique widely used for visualisations over the Internet. The four techniques and their results are compared in terms of geometric and radiometric quality, with a special focus on their usefulness for distribution and visualisation during an exhibition in honour of the five hundredth birthday of Gerardus Mercator.
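
    A minimal sketch of the georeferencing step mentioned above: mapping latitude/longitude control points read off the globe surface to 3D coordinates on a sphere, which is the basic operation behind turning flat photographs into a true 3D globe model. The radius and control points below are hypothetical placeholders.

```python
# Hedged sketch: convert graticule control points (latitude/longitude read off the
# globe surface) into Cartesian coordinates on a sphere of known radius, the basic
# georeferencing step behind texturing a digital globe model.
# The radius and control points are hypothetical.

import math

GLOBE_RADIUS_MM = 205.0  # hypothetical radius of the physical globe


def graticule_to_xyz(lat_deg, lon_deg, radius=GLOBE_RADIUS_MM):
    """Map a latitude/longitude pair on the globe surface to (x, y, z) in millimetres."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    x = radius * math.cos(lat) * math.cos(lon)
    y = radius * math.cos(lat) * math.sin(lon)
    z = radius * math.sin(lat)
    return x, y, z


# Example: a few control points identified on the photographs.
control_points = [(0.0, 0.0), (45.0, 10.0), (-30.0, 120.0)]
for lat, lon in control_points:
    print((lat, lon), "->", graticule_to_xyz(lat, lon))
```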

    Harnessing AI for Speech Reconstruction using Multi-view Silent Video Feed

    Speechreading, or lipreading, is the technique of understanding speech and extracting phonetic features from a speaker's visual cues, such as the movement of the lips, face, teeth and tongue. It has a wide range of multimedia applications, such as surveillance, Internet telephony, and aids for people with hearing impairments. However, most work in speechreading has been limited to generating text from silent videos. Recently, research has started venturing into generating (audio) speech from silent video sequences, but there have been no developments thus far in dealing with divergent views and poses of a speaker. Thus, although multiple camera feeds of a speaker's speech may be available, these feeds have not been exploited to deal with different poses. To this end, this paper presents the world's first multi-view speechreading and reconstruction system. This work extends the boundaries of multimedia research by putting forth a model that leverages silent video feeds from multiple cameras recording the same subject to generate intelligible speech for a speaker. Initial results confirm the usefulness of exploiting multiple camera views in building an efficient speechreading and reconstruction system, and further show the optimal placement of cameras that leads to the maximum intelligibility of speech. Finally, the paper lays out various innovative applications of the proposed system, focusing on its potential prodigious impact not just in the security arena but in many other multimedia analytics problems.
    Comment: 2018 ACM Multimedia Conference (MM '18), October 22--26, 2018, Seoul, Republic of Korea
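
    The sketch below shows one plausible way to fuse visual features from several camera views of the same speaker and decode them into acoustic features. It is not the authors' architecture; the layer sizes, feature dimensions and number of views are arbitrary placeholders.

```python
# Hedged sketch (not the authors' model): encode each camera view separately,
# fuse by concatenation, and decode the fused representation into acoustic
# features such as mel-spectrogram frames. Dimensions are hypothetical.

import torch
import torch.nn as nn


class MultiViewSpeechReconstructor(nn.Module):
    def __init__(self, n_views=3, visual_dim=128, hidden_dim=256, audio_dim=40):
        super().__init__()
        # One encoder per camera view (views differ in pose, so weights are not shared).
        self.view_encoders = nn.ModuleList(
            [nn.Sequential(nn.Linear(visual_dim, hidden_dim), nn.ReLU())
             for _ in range(n_views)]
        )
        # Decoder maps the fused representation to audio features.
        self.decoder = nn.Linear(n_views * hidden_dim, audio_dim)

    def forward(self, views):
        # views: list of tensors, each of shape (batch, visual_dim), one per camera.
        encoded = [enc(v) for enc, v in zip(self.view_encoders, views)]
        fused = torch.cat(encoded, dim=-1)  # simple concatenation fusion
        return self.decoder(fused)


if __name__ == "__main__":
    model = MultiViewSpeechReconstructor()
    views = [torch.randn(8, 128) for _ in range(3)]
    print(model(views).shape)  # torch.Size([8, 40])
```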

    Towards multiple 3D bone surface identification and reconstruction using few 2D X-ray images for intraoperative applications

    This article discusses a possible method of using a small number, e.g. 5, of conventional 2D X-ray images to reconstruct multiple 3D bone surfaces intraoperatively. Each bone's edge contours in the X-ray images are automatically identified. Sparse 3D landmark points of each bone are automatically reconstructed by pairing the 2D X-ray images. The reconstructed landmark point distribution on a surface is approximately optimal, covering the main characteristics of the surface. A statistical shape model, the dense point distribution model (DPDM), is then used to fit the reconstructed optimal landmark vertices and reconstruct a full surface of each bone separately. The reconstructed surfaces can then be visualised and manipulated by surgeons or used by surgical robotic systems.
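
    A minimal sketch of the general idea behind the surface-completion step: fitting a PCA-based point distribution model to sparse landmarks by least squares and then regenerating the dense surface from the fitted coefficients. The mean shape, modes of variation and landmarks below are random placeholders, not the DPDM or data used in the article.

```python
# Hedged sketch: fit the coefficients of a PCA shape model so that it agrees with a
# few observed 3D landmarks, then reconstruct the dense surface from those
# coefficients. All shapes and landmarks here are random placeholders.

import numpy as np

rng = np.random.default_rng(0)

n_vertices, n_modes = 500, 10
mean_shape = rng.normal(size=3 * n_vertices)         # flattened (x, y, z) mean surface
modes = rng.normal(size=(3 * n_vertices, n_modes))   # PCA modes of variation

# Vertices corresponding to the sparse reconstructed landmarks, and the rows of the
# flattened shape vector (x, y, z of each landmark) that they occupy.
landmark_vertices = rng.choice(n_vertices, size=20, replace=False)
rows = np.concatenate([[3 * i, 3 * i + 1, 3 * i + 2] for i in landmark_vertices])

# Noisy "observed" landmark coordinates standing in for the X-ray reconstruction.
observed = mean_shape[rows] + rng.normal(scale=0.1, size=rows.size)

# Solve for mode coefficients b such that: observed ~ mean[rows] + modes[rows] @ b
A = modes[rows]
b, *_ = np.linalg.lstsq(A, observed - mean_shape[rows], rcond=None)

# Regenerate the full dense surface from the fitted coefficients.
full_surface = (mean_shape + modes @ b).reshape(n_vertices, 3)
print(full_surface.shape)  # (500, 3)
```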

    Improving Facial Analysis and Performance Driven Animation through Disentangling Identity and Expression

    We present techniques for improving performance-driven facial animation, emotion recognition, and facial key-point or landmark prediction using learned identity-invariant representations. Established approaches to these problems can work well if sufficient examples and labels for a particular identity are available and factors of variation are highly controlled. However, labeled examples of facial expressions, emotions and key-points for new individuals are difficult and costly to obtain. In this paper we improve the ability of techniques to generalize to new and unseen individuals by explicitly modeling previously seen variations related to identity and expression. We use a weakly-supervised approach in which identity labels are used to learn the factors of variation linked to identity separately from factors related to expression. We show how probabilistic modeling of these sources of variation allows one to learn identity-invariant representations for expressions, which can then be used to identity-normalize various procedures for facial expression analysis and animation control. We also show how to extend the widely used techniques of active appearance models and constrained local models by replacing the underlying point distribution models, which are typically constructed using principal component analysis, with identity-expression factorized representations. We present a wide variety of experiments in which we consistently improve performance on emotion recognition, markerless performance-driven facial animation and facial key-point tracking.
    Comment: to appear in Image and Vision Computing Journal (IMAVIS)
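
    As a much simplified illustration of identity normalization, the sketch below subtracts each subject's mean feature vector so that what remains mostly reflects expression. It stands in for, and is far simpler than, the probabilistic identity/expression factorization described in the paper; the features and identity labels are random placeholders.

```python
# Hedged sketch: a crude form of identity normalization. Using identity labels as
# weak supervision, subtract each subject's mean feature so the residual mostly
# reflects expression. Features and labels below are random placeholders.

import numpy as np

rng = np.random.default_rng(1)
features = rng.normal(size=(100, 64))        # e.g. landmark or appearance features per frame
identities = rng.integers(0, 5, size=100)    # identity label per frame (weak supervision)


def identity_normalize(features, identities):
    """Subtract each identity's mean feature vector from that identity's frames."""
    normalized = np.empty_like(features)
    for ident in np.unique(identities):
        mask = identities == ident
        normalized[mask] = features[mask] - features[mask].mean(axis=0)
    return normalized


expression_features = identity_normalize(features, identities)
# Per-identity means of the normalized features are now (numerically) zero.
print(np.abs(expression_features[identities == 0].mean(axis=0)).max())
```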

    Operator vision aids for space teleoperation assembly and servicing

    This paper investigates concepts for visual operator aids required for effective telerobotic control. Operator visual aids, as defined here, mean any operational enhancement that improves man-machine control through the visual system. These concepts were derived as part of a study of vision issues for space teleoperation. Extensive literature on teleoperation, robotics, and human factors was surveyed to definitively specify appropriate requirements. This paper presents these visual aids in three general categories: camera/lighting functions, display enhancements, and operator cues. In the area of camera/lighting functions, concepts are discussed for: (1) automatic end effector or task tracking; (2) novel camera designs; (3) computer-generated virtual camera views; (4) computer-assisted camera/lighting placement; and (5) voice control. In the area of display aids, concepts are presented for: (1) zone displays, such as imminent collision or indexing limits; (2) predictive displays for temporal and spatial location; (3) stimulus-response reconciliation displays; (4) graphical display of depth cues such as 2-D symbolic depth, virtual views, and perspective depth; and (5) view enhancements through image processing and symbolic representations. Finally, operator visual cues (e.g., targets) that help identify size, distance, shape, orientation and location are discussed.
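
    Two of the display aids mentioned above reduce, in their simplest form, to short computations: a predictive display extrapolates the end effector's position across the communication delay, and a zone display flags an imminent collision when the predicted position comes within a threshold of an obstacle. The sketch below is only an illustration under those assumptions; all positions, velocities and thresholds are hypothetical.

```python
# Hedged sketch of two display aids in their simplest form: constant-velocity
# prediction across the teleoperation delay, and an imminent-collision zone check.
# Positions, velocities and thresholds are hypothetical.

import numpy as np


def predict_position(position, velocity, delay_s):
    """Predictive display: constant-velocity extrapolation across the round-trip delay."""
    return position + velocity * delay_s


def imminent_collision(position, obstacle, threshold=0.05):
    """Zone display check: True if the position is within `threshold` metres of the obstacle."""
    return np.linalg.norm(position - obstacle) < threshold


pos = np.array([0.40, 0.10, 0.55])     # current end-effector position (m)
vel = np.array([0.05, 0.00, -0.02])    # commanded velocity (m/s)
obstacle = np.array([0.46, 0.10, 0.52])

predicted = predict_position(pos, vel, delay_s=1.2)
print(predicted, imminent_collision(predicted, obstacle))
```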