    Colored Point Cloud Completion for a Head Using Adversarial Rendered Image Loss

    Recent advances in depth measurement and its applications have made point cloud processing increasingly important. The human head is also essential for communication, and its three-dimensional data are expected to be useful in this regard. However, a single RGB-Depth (RGBD) camera is prone to occlusion and to depth measurement failure on dark hair colors such as black hair. Point cloud completion, in which an entire point cloud is estimated and generated from a partial point cloud, has recently been studied, but existing methods learn only shape rather than completing colored point clouds. This paper therefore proposes a machine learning-based completion method for colored point clouds carrying XYZ location information and International Commission on Illumination (CIE) L*a*b* color information. The proposed method uses, as a color loss, the color difference between point clouds based on the Chamfer Distance (CD) or Earth Mover's Distance (EMD) used for shape evaluation. In addition, an adversarial loss on L*a*b*-Depth images rendered from the output point cloud improves visual quality. The experiments examined networks trained on a colored point cloud dataset created by combining two 3D datasets: hairstyles and faces. Experimental results show that using the adversarial loss with the colored point cloud renderer improves evaluation in the image domain.
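    The abstract does not spell the losses out, so the following is only a minimal NumPy sketch of one plausible reading of the CD-based color loss: the color difference is measured between the shape-based nearest neighbours that the Chamfer Distance already computes, then weighted against the positional term. The function name, the w_color weight, and the squared-L*a*b* metric are illustrative assumptions, not the paper's exact formulation.

        import numpy as np

        def chamfer_distance_with_color(p, q, w_color=1.0):
            """Symmetric Chamfer-style loss over 6-D points (XYZ + CIELAB).

            p: (N, 6) array, q: (M, 6) array; columns 0-2 are XYZ,
            columns 3-5 are L*a*b*. w_color (assumed hyperparameter)
            weights the color term against the shape term.
            """
            xyz_p, lab_p = p[:, :3], p[:, 3:]
            xyz_q, lab_q = q[:, :3], q[:, 3:]

            # Pairwise squared positional distances, shape (N, M).
            d_pos = ((xyz_p[:, None, :] - xyz_q[None, :, :]) ** 2).sum(-1)

            # Nearest neighbours *by shape* in both directions.
            nn_pq = d_pos.argmin(axis=1)   # nearest q for each p
            nn_qp = d_pos.argmin(axis=0)   # nearest p for each q

            shape_loss = d_pos.min(axis=1).mean() + d_pos.min(axis=0).mean()
            # Color difference taken along the same shape correspondences.
            color_loss = (((lab_p - lab_q[nn_pq]) ** 2).sum(-1).mean()
                          + ((lab_q - lab_p[nn_qp]) ** 2).sum(-1).mean())
            return shape_loss + w_color * color_loss

        # Toy usage on random 6-D clouds.
        rng = np.random.default_rng(0)
        print(chamfer_distance_with_color(rng.random((128, 6)),
                                          rng.random((160, 6)), w_color=0.5))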

    Real-Time Multiview Recognition of Human Gestures by Distributed Image Processing

    Since a gesture involves dynamic and complex motion, multiview observation and recognition are desirable. To represent gestures well, one needs to know, in the first place, from which views a gesture should be observed. Furthermore, as larger numbers of camera views are considered, it becomes increasingly important how the recognition results are integrated. To investigate these problems, we propose a framework under which multiview recognition is carried out, and an integration scheme by which the recognition results are integrated online and in real time. For performance evaluation, we use the ViHASi (Virtual Human Action Silhouette) public image database as a benchmark and our Japanese sign language (JSL) image database, which contains 18 kinds of hand signs. By examining the recognition rates of each gesture for each view, we found gestures that exhibit view dependency and gestures that do not. We also found that the view dependency itself can vary with the target gesture set. By integrating the recognition results of different views, our swarm-based integration provides more robust and better recognition performance than any individual fixed-view recognition agent.
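    The abstract does not detail the swarm-based scheme, so the sketch below only illustrates the general shape of online multiview integration: each fixed-view agent emits per-class scores for the current frame, and a fuser combines them with per-view reliability weights. integrate_views, the weights, and the score format are assumptions for illustration, not the paper's method.

        import numpy as np

        def integrate_views(view_scores, weights=None):
            """Fuse per-view class-score vectors into one decision.

            view_scores: list of length-C arrays, one per camera view,
            each holding scores over C gesture classes for this frame.
            weights: optional per-view reliability weights (uniform if None).
            Returns the index of the winning gesture class.
            """
            scores = np.asarray(view_scores, dtype=float)   # (V, C)
            if weights is None:
                weights = np.full(len(scores), 1.0 / len(scores))
            fused = weights @ scores                        # (C,)
            return int(np.argmax(fused))

        # Three views vote on four gesture classes; view 0 is weighted highest.
        views = [np.array([0.7, 0.1, 0.1, 0.1]),
                 np.array([0.2, 0.5, 0.2, 0.1]),
                 np.array([0.6, 0.2, 0.1, 0.1])]
        print(integrate_views(views, weights=np.array([0.5, 0.2, 0.3])))  # -> 0

    Adapting the weights online, for example from each view's recent agreement with the fused decision, is one way such an integrator could favor the views from which a given gesture is best observed.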

    On-demand learning system using 4K video source

    EI 2009: Multimedia on Mobile Devices, IS&T/SPIE Electronic Imaging, Jan 18-22, 2009, San Jose, California, United States
    There are various kinds of learning systems in the world, and quite a lot of them use video sources; those sources in turn come in many kinds, depending on the learning content and its aims. This paper describes the usability of learning systems that use a super-high-definition video source, focusing on how such a video source is handled. Furthermore, future progress and present problems are considered by proposing an on-demand learning system that uses a super-high-definition video source. Super high definition here means 4K (4096x2160 pixels).
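    To make concrete why handling such a source is a problem in its own right, here is a back-of-the-envelope sketch of the uncompressed data rate; the 30 fps frame rate and 24-bit color depth are assumed values, as the text only fixes the 4096x2160 resolution.

        # Uncompressed data rate of a 4K (4096x2160) video stream.
        width, height = 4096, 2160
        bits_per_pixel = 24   # assumed: 8 bits per RGB channel
        fps = 30              # assumed frame rate

        bits_per_second = width * height * bits_per_pixel * fps
        print(f"{bits_per_second / 1e9:.2f} Gbit/s")   # ~6.37 Gbit/s

    Even before compression, a single stream runs to several gigabits per second, which is why storage and on-demand delivery of 4K sources need dedicated handling.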