59 research outputs found

    Sign language video anonymization

    Deaf signers who wish to communicate in their native language frequently share videos on the Web. However, videos cannot preserve privacy, as is often desirable for discussion of sensitive topics, since both hands and face convey critical linguistic information and therefore cannot be obscured without degrading communication. Deaf signers have expressed interest in video anonymization that would preserve linguistic content. However, attempts to develop such technology have thus far shown limited success. We are developing a new method for such anonymization, with input from ASL signers. We modify a motion-based image animation model to generate high-resolution videos with the signer identity changed, but with preservation of linguistically significant motions and facial expressions. An asymmetric encoder-decoder structured image generator is used to generate the high-resolution target frame from the low-resolution source frame based on the optical flow and confidence map. We explicitly guide the model to attain clear generation of hands and face by using bounding boxes to improve the loss computation. FID and KID scores are used for evaluation of the realism of the generated frames. This technology shows great potential for practical applications to benefit deaf signers.
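
    The abstract above mentions using bounding boxes to improve the loss computation for hands and face but does not give the loss itself. The following is only a minimal sketch of one plausible reading, a per-pixel L1 loss that weights pixels inside the boxes more heavily. The function name `box_weighted_l1`, the `box_weight` value, and the box format `(top, left, bottom, right)` are assumptions for illustration, not the authors' implementation.

```python
def box_weighted_l1(pred, target, boxes, box_weight=5.0):
    """Weighted mean absolute error over a single-channel image.

    pred, target: 2-D lists (rows of pixel intensities) of equal size.
    boxes: list of (top, left, bottom, right); bottom/right are exclusive.
    box_weight: hypothetical weight applied to pixels inside any box;
                pixels outside all boxes get weight 1.0.
    """
    height, width = len(target), len(target[0])

    # Start with uniform weights, then raise the weight inside each box.
    weights = [[1.0] * width for _ in range(height)]
    for top, left, bottom, right in boxes:
        for r in range(max(top, 0), min(bottom, height)):
            for c in range(max(left, 0), min(right, width)):
                weights[r][c] = box_weight

    # Accumulate the weighted absolute error and normalize by total weight.
    total = 0.0
    norm = 0.0
    for r in range(height):
        for c in range(width):
            total += weights[r][c] * abs(pred[r][c] - target[r][c])
            norm += weights[r][c]
    return total / norm


if __name__ == "__main__":
    pred = [[0.0, 0.0], [0.0, 0.0]]
    target = [[1.0, 0.0], [0.0, 0.0]]
    # The single wrong pixel lies inside the (hypothetical) hand box.
    print(box_weighted_l1(pred, target, boxes=[(0, 0, 1, 1)]))
    print(box_weighted_l1(pred, target, boxes=[]))
```

    With this weighting, an error inside a hand or face box contributes more to the loss than the same error in the background, which is the general effect the abstract describes.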

    Resources for computer-based sign recognition from video, and the criticality of consistency of gloss labeling across multiple large ASL video corpora

    The WLASL purports to be “the largest video dataset for Word-Level American Sign Language (ASL) recognition.” It brings together various publicly shared video collections that could be quite valuable for sign recognition research, and it has been used extensively for such research. However, a critical problem with the accompanying annotations has heretofore not been recognized by the authors, nor by those who have exploited these data: There is no 1-1 correspondence between sign productions and gloss labels. Here we describe a large, linguistically annotated, video corpus of citation-form ASL signs shared by the ASLLRP (with 23,452 sign tokens and an online Sign Bank) in which such correspondences are enforced. We furthermore provide annotations for 19,672 of the WLASL video examples consistent with ASLLRP glossing conventions. For those wishing to use WLASL videos, this provides a set of annotations making it possible: (1) to use those data reliably for computational research; and/or (2) to combine the WLASL and ASLLRP datasets, creating a combined resource that is larger and richer than either of those datasets individually, with consistent gloss labeling for all signs. We also offer a summary of our own sign recognition research to date that exploits these data resources.
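
    The 1-1 correspondence problem described above can be illustrated with a small consistency check. This is a sketch under stated assumptions: each annotation is taken to be a `(sign_id, gloss)` pair, where `sign_id` stands for a distinct sign production; the identifiers and labels below are hypothetical placeholders and do not come from WLASL or the ASLLRP data.

```python
from collections import defaultdict


def find_gloss_inconsistencies(annotations):
    """Report violations of a 1-1 mapping between signs and gloss labels.

    annotations: iterable of (sign_id, gloss) pairs (hypothetical format).
    Returns two dicts:
      ambiguous_glosses: gloss -> sorted sign_ids, for glosses used
                         for more than one distinct sign.
      split_signs:       sign_id -> sorted glosses, for signs labeled
                         with more than one gloss.
    """
    signs_per_gloss = defaultdict(set)
    glosses_per_sign = defaultdict(set)
    for sign_id, gloss in annotations:
        signs_per_gloss[gloss].add(sign_id)
        glosses_per_sign[sign_id].add(gloss)

    ambiguous_glosses = {
        gloss: sorted(signs)
        for gloss, signs in signs_per_gloss.items()
        if len(signs) > 1
    }
    split_signs = {
        sign_id: sorted(glosses)
        for sign_id, glosses in glosses_per_sign.items()
        if len(glosses) > 1
    }
    return ambiguous_glosses, split_signs


if __name__ == "__main__":
    # Placeholder data: one gloss covers two signs, one sign has two glosses.
    toy = [
        ("sign_A", "GLOSS-1"),
        ("sign_B", "GLOSS-1"),
        ("sign_C", "GLOSS-2"),
        ("sign_C", "GLOSS-3"),
        ("sign_D", "GLOSS-4"),
    ]
    ambiguous, split = find_gloss_inconsistencies(toy)
    print(ambiguous)
    print(split)
```

    Either kind of violation makes gloss labels unreliable as class labels for recognition: an ambiguous gloss merges visually different signs into one class, and a split sign spreads one sign across several classes.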

    Physics-Based Object Pose and Shape Estimation from Multiple Views

    This paper presents a new algorithm for object pose and shape estimation from multiple views. Using a qualitative shape recovery scheme, we first segment the image into parts which belong to a vocabulary of primitives. Based on the additional constraints provided by the qualitative shapes, we extend our physics-based framework to allow object pose and shape estimation from stereo images where the two cameras have arbitrary relative orientations. We then generalize our algorithm to integrate measurements from multiple views. To recover more complex objects, we generalize the definition for the global bending deformation. We also present an algorithm for model discretization which evenly tessellates the model surface. We demonstrate the usefulness of our technique in experiments involving real images of a variety of object shapes which may be partially occluded.

    Integration of Quantitative and Qualitative Techniques for Deformable Model Fitting from Orthographic, Perspective, and Stereo Projections

    In this paper, we synthesize a new approach to 3-D object shape recovery by integrating qualitative shape recovery techniques and quantitative physics-based shape estimation techniques. Specifically, we first use qualitative shape recovery and recognition techniques to provide strong fitting constraints on physics-based deformable model recovery techniques. Secondly, we extend our previously developed technique of fitting deformable models to occluding image contours to the case of image data captured under general orthographic, perspective, and stereo projections.

    Physics-based object pose and shape estimation from multiple views

    This paper presents a new algorithm for object pose and shape estimation from multiple views. Using a qualitative shape recovery scheme, we first segment the image into parts which belong to a vocabulary of primitives. Based on the additional constraints provided by the qualitative shapes, we extend our physics-based framework to allow object pose and shape estimation from stereo images where the two cameras have arbitrary relative orientations. We then generalize our algorithm to integrate measurements from multiple views. To recover more complex objects, we generalize the definition for the global bending deformation. We also present an algorithm for model discretization which evenly tessellates the model surface. We demonstrate the usefulness of our technique in experiments involving real images of a variety of object shapes which may be partially occluded.