235 research outputs found

    Pictonaut: movie cartoonization using 3D human pose estimation and GANs

    This article describes Pictonaut, a novel method to automatically synthesise animated shots from motion picture footage. Its results are editable (backgrounds, characters, lighting, etc.) with conventional 3D software, and they have the finish of professional 2D animation. Rather than addressing the challenge solely as an image translation problem, the method takes a hybrid approach combining multi-person 3D human pose estimation and GANs. Sub-sampled video frames are processed with OpenPose and SMPLify-X to obtain the 3D pose parameters (body, hands and facial expression) of all depicted characters. The captured parameters are retargeted onto manually selected 3D models, which are cel-shaded to mimic the style of a 2D cartoon. The results for the sub-sampled frames are interpolated to generate complete and smooth motion for all the characters. The background is cartoonized with a GAN. Qualitative evaluation shows that the approach is feasible, and a small dataset of synthesised shots obtained from real movie scenes is provided. This work is partially supported by the Spanish Ministry of Science and Innovation under contract PID2019-107255GB, and by the SGR programme 2017-SGR-1414 of the Catalan Government. Peer reviewed. Postprint (published version).
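
    The interpolation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which interpolation scheme Pictonaut uses, so plain linear interpolation is an assumption here, as are the function name `interpolate_poses` and the representation of a pose as a flat list of floats. Real pose parameters are typically joint rotations, for which a rotation-aware scheme would normally be preferred.

```python
def interpolate_poses(key_poses, step):
    """Fill in the frames between poses estimated every `step` frames.

    key_poses: pose parameter vectors (lists of floats) for frames
               0, step, 2*step, ... (hypothetical representation).
    step:      sub-sampling interval in frames (assumed >= 1).
    Returns one pose vector per frame, first to last key pose.
    Linear blending is an illustrative assumption, not the paper's method.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    if not key_poses:
        return []
    frames = []
    for start, end in zip(key_poses, key_poses[1:]):
        for k in range(step):
            t = k / step  # blend weight: 0 at `start`, approaching 1 at `end`
            frames.append([(1 - t) * a + t * b for a, b in zip(start, end)])
    frames.append(list(key_poses[-1]))  # the final key pose closes the sequence
    return frames
```

    For two key poses taken four frames apart, the sketch returns five poses, with the three intermediate ones evenly blended between the two estimates.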

    Lester: rotoscope animation through video object segmentation and tracking

    This article introduces Lester, a novel method to automatically synthesise retro-style 2D animations from videos. The method approaches the challenge mainly as an object segmentation and tracking problem. Video frames are processed with the Segment Anything Model (SAM), and the resulting masks are tracked through subsequent frames with DeAOT, a hierarchical propagation method for semi-supervised video object segmentation. The geometry of the masks' contours is simplified with the Douglas-Peucker algorithm. Finally, facial traits, pixelation and a basic shadow effect can optionally be added. The results show that the method exhibits excellent temporal consistency and can correctly process videos with different poses and appearances, dynamic shots, partial shots and diverse backgrounds. The proposed method provides a simpler and more deterministic approach than video-to-video translation pipelines based on diffusion models, which suffer from temporal consistency problems and do not cope well with pixelated and schematic outputs. The method is also much more practical than techniques based on 3D human pose estimation, which require custom handcrafted 3D models and are very limited with respect to the type of scenes they can process.
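
    The contour simplification step can be sketched as follows. This is a textbook recursive Douglas-Peucker implementation for an open polyline, not code from Lester itself; the function names and the `epsilon` tolerance parameter are illustrative assumptions. A closed mask contour would first have to be split into open pieces or handled by a library routine.

```python
import math


def perpendicular_distance(p, a, b):
    """Distance from point p to the infinite line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        # a and b coincide: fall back to the point-to-point distance
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length


def douglas_peucker(points, epsilon):
    """Simplify an open polyline, dropping points that deviate from the
    simplified shape by no more than `epsilon` (hypothetical parameter)."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior point farthest from the chord first-last.
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist > epsilon:
        # Keep that point and simplify both halves recursively.
        left = douglas_peucker(points[:index + 1], epsilon)
        right = douglas_peucker(points[index:], epsilon)
        return left[:-1] + right  # avoid duplicating the split point
    # Every interior point is within tolerance: keep only the endpoints.
    return [first, last]
```

    A larger `epsilon` yields coarser, more schematic outlines, while a smaller one preserves more of the original contour.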

    Creating walk-through images from a video sequence of a dynamic scene

    A comprehensive scheme for creating walk-through images from a video sequence, generalizing the idea of tour into the picture (TIP), is presented. The scheme incorporates a new modeling scheme based on a vanishing circle identified in the video and automatic background detection from the video. It lets users experience the feel of navigating into a video sequence with their own interpretation and imagination of a given scene. The scheme covers several types of video of dynamic scenes, such as sports coverage, cartoon animation and movie films, in which objects continuously change shape and location.

    A rule-based video database system architecture

    We propose a novel architecture for a video database system incorporating both spatio-temporal and semantic (keyword, event/activity and category-based) query facilities. The originality of our approach stems from the fact that we intend to provide full support for spatio-temporal, relative object-motion and similarity-based object-trajectory queries by a rule-based system utilizing a knowledge-base, while using an object-relational database to answer semantic-based queries. Our method of extracting and modeling spatio-temporal relations is also unique in that we segment video clips into shots using spatial relationships between objects in video frames rather than applying a traditional scene detection algorithm. The technique we use is simple, yet novel and powerful in terms of effectiveness and user query satisfaction: video clips are segmented into shots whenever the current set of relations between objects changes, and the video frames where these changes occur are chosen as keyframes. The directional, topological and third-dimension relations used for shots are those of the keyframes selected to represent the shots, and this information is kept, along with the frame numbers of the keyframes, in a knowledge-base as Prolog facts. The system has a comprehensive set of inference rules to reduce the number of facts stored in the knowledge-base, because a considerable number of facts, which otherwise would have to be stored explicitly, can be derived by rules with some extra effort. © 2002 Elsevier Science Inc. All rights reserved.
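
    The shot segmentation rule described in the abstract (start a new shot whenever the set of relations between objects changes, and take the frame of the change as the keyframe) can be sketched as below. The paper stores relations as Prolog facts; this illustration uses Python instead, and the tuple encoding `(relation, object_a, object_b)` and the function name `select_keyframes` are assumptions made for the example.

```python
def select_keyframes(frame_relations):
    """Return (keyframes, shots) for a sequence of per-frame relation sets.

    frame_relations: one iterable of relation tuples per frame, e.g.
                     ("west", "car", "tree") (hypothetical encoding).
    keyframes:       indices of frames whose relation set differs from
                     that of the previous frame (frame 0 always counts).
    shots:           (start, end_exclusive) frame ranges, one per keyframe.
    """
    keyframes = []
    previous = None
    for index, relations in enumerate(frame_relations):
        current = frozenset(relations)  # order of relations is irrelevant
        if current != previous:
            keyframes.append(index)
            previous = current
    # Each shot runs from its keyframe up to the next keyframe.
    boundaries = keyframes + [len(frame_relations)]
    shots = list(zip(boundaries[:-1], boundaries[1:]))
    return keyframes, shots
```

    Only the relation sets of the keyframes would then need to be stored, since every other frame in a shot shares the relations of its keyframe.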

    Easterner, Vol. 67, No. 8, November 12, 2015

    This issue of the Easterner contains articles about an anti-sexual assault rally, a National Press Club award for dean Vickie Shields, social work senior secretary Carol Golden, a Global Studies Lecture Series speaker Bipasha Biswa, a student production of Pocatello, the Hot Dogs for Heroes charity event, the men's and women's basketball teams, the football defeat to Northern Arizona University, and the end of the soccer season.

    Hart v. Electronic Arts Inc

    USDC for the District of New Jersey

    Selected Topics in Bayesian Image/Video Processing

    In this dissertation, three problems in image deblurring, inpainting and virtual content insertion are solved in a Bayesian framework. Camera shake, motion or defocus during exposure leads to image blur. Single-image deblurring has achieved remarkable results by solving a MAP problem, but there is no perfect solution because the image prior and the estimator are inaccurate. In the first part, a new non-blind deconvolution algorithm is proposed. The image prior is represented by a Gaussian Scale Mixture (GSM) model, which is estimated from non-blurry images used as training data. Our experimental results on a total of twelve natural images show that more details are restored than with previous deblurring algorithms. In augmented reality, inserting virtual content into video streams by blending it with spatial and temporal information is a challenging problem. A generic virtual content insertion (VCI) system is introduced in the second part. To the best of my knowledge, it is the first successful system to insert content on building facades in street view video streams. Without knowing camera positions, the geometry model of a building facade is established using a combined detection and tracking strategy. Moreover, motion stabilization, dynamic registration and color harmonization contribute to the excellent augmented performance of this automatic VCI system. Coding efficiency is an important objective in video coding. In recent years, video coding standards have developed by adding new tools. However, this requires numerous modifications to complex coding systems. Therefore, it is desirable to consider alternative standard-compliant approaches that do not modify the codec structures. In the third part, an exemplar-based data pruning video compression scheme for intra frames is introduced. Data pruning is used as a pre-processing tool to remove part of the video data before it is encoded. At the decoder, the missing data is reconstructed by a sparse linear combination of similar patches. The novelty is to create a patch library that exploits the similarity of patches. The scheme achieves an average 4% bit rate reduction on some high-definition videos.

    Proof of Concept For the Use of Motion Capture Technology In Athletic Pedagogy

    Visualization has long been an important method for conveying complex information. Where information transfer using written and spoken means might amount to 200-250 words per minute, visual media can often convey information at many times this rate. This makes visualization a potentially important tool for education. Athletic instruction, in particular, can involve communication about complex human movement that is not easily conveyed with written or spoken descriptions. Video-based instruction can be problematic since video data can contain too much information, making it more difficult for a student to absorb what is cognitively necessary. The lesson is to present the learner with what is needed and no more. We present a novel use of motion capture animation as an educational tool for teaching athletic movements. The advantage of motion capture is its ability to accurately represent real human motion in a minimalist context that removes the extraneous information normally found in video. Motion capture animation displays only motion information, not additional information about the motion context. Producing an “automated coach” would be too large and difficult a problem to solve within the scope of a Master's thesis, but we can perform initial steps, including producing a useful software tool that performs data analysis on two motion datasets. We believe such a tool would be beneficial to a human coach as an analysis tool, and that the work would provide some useful understanding of the next important steps towards perhaps someday producing an automated coach.

    Virtual Reality Games for Motor Rehabilitation

    This paper presents a fuzzy-logic-based method to track user satisfaction without the need for devices to monitor users' physiological conditions. User satisfaction is the key to any product’s acceptance; computer applications and video games provide a unique opportunity to provide a tailored environment for each user to better suit their needs. We have implemented a non-adaptive fuzzy logic model of emotion, based on the emotional component of the Fuzzy Logic Adaptive Model of Emotion (FLAME) proposed by El-Nasr, to estimate player emotion in Unreal Tournament 2004. In this paper we describe the implementation of this system and present the results of one of several play tests. Our research contradicts the current literature, which suggests that physiological measurements are needed. We show that it is possible to use a software-only method to estimate user emotion.