11 research outputs found

Layered Neural Rendering for Retiming People in Video

Author: Cole Forrester
Dekel Tali
Freeman William T.
Lu Erika
Rubinstein Michael
Salesin David
Xie Weidi
Zisserman Andrew
Publication date: 01/01/2020

We present a method for retiming people in an ordinary, natural video: manipulating and editing the time in which different motions of individuals in the video occur. We can temporally align different motions, change the speed of certain actions (speeding up/slowing down, or entirely "freezing" people), or "erase" selected people from the video altogether. We achieve these effects computationally via a dedicated learning-based layered video representation, where each frame in the video is decomposed into separate RGBA layers, representing the appearance of different people in the video. A key property of our model is that it not only disentangles the direct motions of each person in the input video, but also correlates each person automatically with the scene changes they generate, e.g., shadows, reflections, and motion of loose clothing. The layers can be individually retimed and recombined into a new video, allowing us to achieve realistic, high-quality renderings of retiming effects for real-world videos depicting complex actions and involving multiple individuals, including dancing, trampoline jumping, or group running.
Comment: To appear in SIGGRAPH Asia 2020. Project webpage: https://retiming.github.io
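
The recombination step described in this abstract can be illustrated with a minimal sketch. It assumes the RGBA layers are already available and only shows standard back-to-front "over" compositing with a per-layer frame offset; it is not the paper's learned renderer, and the names `over`, `composite`, and `retime` are hypothetical helpers introduced here for illustration.

```python
from typing import List, Tuple

RGB = Tuple[float, float, float]
RGBA = Tuple[float, float, float, float]  # (r, g, b, alpha), values in [0, 1]
Frame = List[RGBA]                        # one flattened image
Video = List[Frame]                       # one layer over time


def over(front: RGBA, back: RGB) -> RGB:
    """Standard 'over' operator: blend one RGBA pixel onto an opaque pixel."""
    r, g, b, a = front
    br, bg, bb = back
    return (a * r + (1 - a) * br, a * g + (1 - a) * bg, a * b + (1 - a) * bb)


def composite(layers: List[Frame]) -> List[RGB]:
    """Composite layers back to front; layers[0] is treated as the background."""
    out = [(r, g, b) for r, g, b, _ in layers[0]]
    for layer in layers[1:]:
        out = [over(px, bg) for px, bg in zip(layer, out)]
    return out


def retime(layer_videos: List[Video], offsets: List[int], t: int) -> List[RGB]:
    """Assumed retiming rule: layer i shows its frame t + offsets[i], clamped."""
    frames = []
    for video, off in zip(layer_videos, offsets):
        idx = min(max(t + off, 0), len(video) - 1)
        frames.append(video[idx])
    return composite(frames)


# Toy example: a one-pixel background and a one-pixel "person" layer.
background = [[(0.0, 0.0, 0.0, 1.0)], [(0.0, 0.0, 0.0, 1.0)]]
person = [[(1.0, 0.0, 0.0, 1.0)], [(0.0, 1.0, 0.0, 0.5)]]

original = retime([background, person], [0, 0], t=0)
shifted = retime([background, person], [0, 1], t=0)  # person runs one frame ahead
```

In this toy setup, "erasing" a person would correspond to leaving that layer out of the list passed to `composite`, and "freezing" to always selecting the same frame index for it.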

arXiv.org e-Print Archive

Oxford University Research Archive

Models of Visual Appearance for Analyzing and Editing Images and Videos

Author: Sunkavalli Kalyan
Publication venue: 'Harvard University Botany Libraries'
Publication date: 15/08/2012

The visual appearance of an image is a complex function of factors such as scene geometry, material reflectances and textures, illumination, and the properties of the camera used to capture the image. Understanding how these factors interact to produce an image is a fundamental problem in computer vision and graphics. This dissertation examines two aspects of this problem: models of visual appearance that allow us to recover scene properties from images and videos, and tools that allow users to manipulate visual appearance in images and videos in intuitive ways. In particular, we look at these problems in three different applications. First, we propose techniques for compositing images that differ significantly in their appearance. Our framework transfers appearance between images by manipulating the different levels of a multi-scale decomposition of the image. This allows users to create realistic composites with minimal interaction in a number of different scenarios. We also discuss techniques for compositing and replacing facial performances in videos. Second, we look at the problem of creating high-quality still images from low-quality video clips. Traditional multi-image enhancement techniques accomplish this by inverting the camera’s imaging process. Our system incorporates feature weights into these image models to create results that have better resolution, noise, and blur characteristics, and summarize the activity in the video. Finally, we analyze variations in scene appearance caused by changes in lighting. We develop a model for outdoor scene appearance that allows us to recover radiometric and geometric information about the scene from images. We apply this model to a variety of visual tasks, including color-constancy, background subtraction, shadow detection, scene reconstruction, and camera geo-location. We also show that the appearance of a Lambertian scene can be modeled as a combination of distinct three-dimensional illumination subspaces, a result that leads to novel bounds on scene appearance, and a robust uncalibrated photometric stereo method.
Engineering and Applied Science

Harvard University - DASH

LEARNING TO RIG CHARACTERS

Author: Xu Zhan
Publication venue: ScholarWorks@UMass Amherst
Publication date: 08/08/2023

With the emergence of 3D virtual worlds, 3D social media, and massive online games, the need for diverse, high-quality, animation-ready characters and avatars is greater than ever. To animate characters, artists hand-craft articulation structures, such as animation skeletons and part deformers, which require a significant amount of manual and laborious interaction with 2D/3D modeling interfaces. This thesis presents deep learning methods that are able to significantly automate the process of character rigging. First, the thesis introduces RigNet, a method capable of predicting an animation skeleton for an input static 3D shape in the form of a polygon mesh. The predicted skeletons match animator expectations in joint placement and topology. RigNet also estimates surface skin weights which determine how the mesh is animated given the different skeletal poses. In contrast to prior work that fits pre-defined skeletal templates with hand-tuned objectives, RigNet is able to automatically rig diverse characters, such as humanoids, quadrupeds, toys, and birds, with varying articulation structure and geometry. RigNet is based on a deep neural architecture that directly operates on the mesh representation. The architecture is trained on a diverse dataset of rigged models that we mined online and curated. The dataset includes 2.7K polygon meshes, along with their associated skeletons and corresponding skin weights. Second, the thesis introduces MoRig, a method that automatically rigs character meshes driven by single-view point cloud streams capturing the motion of performing characters. Compared to RigNet, MoRig's rigging is motion-aware: its neural network encodes motion cues from the point clouds into compact feature representations that are informative about the articulated parts of the performing character. These motion-aware features guide the inference of an appropriate skeletal rig for the input mesh. Furthermore, MoRig is able to animate the rig according to the captured point cloud motion. MoRig can handle diverse characters with different morphologies (e.g., humanoids, quadrupeds, toy characters). It also accounts for occluded regions in the point clouds and mismatches in the part proportions between the input mesh and captured character. Third, the thesis introduces APES, a method that takes as input 2D raster images depicting a small set of poses of a character shown in a sprite sheet, and identifies articulated parts useful for rigging the character. APES uses a combination of neural network inference and integer linear programming to identify a compact set of articulated body parts, e.g. head, torso and limbs, that best reconstruct the input poses. Compared to MoRig and RigNet, which require a large collection of training models with associated skeletons and skinning weights, APES' neural architecture relies on less effortful supervision from (i) pixel correspondences readily available in existing large cartoon image datasets (e.g., Creative Flow), and (ii) a relatively small dataset of 57 cartoon characters segmented into moving parts. Finally, the thesis discusses future research directions related to combining neural rigging with 3D and 4D reconstruction of characters from point cloud data and 2D video as well as automating the process of motion synthesis for 3D characters.

ScholarWorks@UMass Amherst

Dementia in Parkinson’s Disease

Publication venue: 'IntechOpen'
Publication date: 27/07/2022

An estimated 50% to 80% of individuals with Parkinson’s disease experience Parkinson’s disease dementia (PDD). Based on the prevalence and clinical complexity of PDD, this book provides an in-depth update on topics including epidemiology, diagnosis, and treatment. Chapters discuss non-medical therapies and examine views on end-of-life issues as well. This book is a must-read for anyone interested in PDD, whether they are a patient, caregiver, or doctor.

Directory of Open Access Books (DOAB)

Water rights and related water supply issues

Publication venue: U.S. Committee on Irrigation and Drainage
Publication date: 01/10/2004

Presented during the USCID water management conference held on October 13-16, 2004 in Salt Lake City, Utah. The theme of the conference was "Water rights and related water supply issues." Includes bibliographical references. Proceedings sponsored by the U.S. Department of the Interior, Central Utah Project Completion Act Office and the U.S. Committee on Irrigation and Drainage. Consensus building as a primary tool to resolve water supply conflicts -- Administration to Colorado River allocations: the Law of the River and the Colorado River Water Delivery Agreement of 2003 -- Irrigation management in Afghanistan: the tradition of Mirabs -- Institutional reforms in irrigation sector of Pakistan: an approach towards integrated water resource management -- On-line and real-time water right allocation in Utah's Sevier River basin -- Improving equity of water distribution: the challenge for farmer organizations in Sindh, Pakistan -- Impacts from transboundary water rights violations in South Asia -- Impacts of water conservation and Endangered Species Act on large water project planning, Utah Lake Drainage Basin Water Delivery System, Bonneville Unit of the Central Utah Project -- Economic importance and environmental challenges of the Awash River basin to Ethiopia -- Accomplishing the impossible: overcoming obstacles of a combined irrigation project -- Estimating actual evapotranspiration without land use classification -- Improving water management in irrigated agriculture -- Beneficial uses of treated drainage water -- Comparative assessment of risk mitigation options for irrigated agriculture -- A multi-variable approach for the command of Canal de Provence Aix Nord Water Supply Subsystem -- Hierarchical Bayesian Analysis and Statistical Learning Theory II: water management application -- Soil moisture data collection and water supply forecasting -- Development and implementation of a farm water conservation program within the Coachella Valley Water District, California -- Concepts of ground water recharge and well augmentation in northeastern Colorado -- Water banking in Colorado: an experiment in trouble? -- Estimating conservable water in the Klamath Irrigation Project -- Socio-economic impacts of land retirement in Westlands Water District -- EPDM rubber lining system chosen to save valuable irrigation water -- A user-centered approach to develop decision support systems for estimating pumping and augmentation needs in Colorado's South Platte basin -- Utah's Tri-County Automation Project -- Using HEC-RAS to model canal systems -- Potential water and energy conservation and improved flexibility for water users in the Oasis area of the Coachella Valley Water District, California

Mountain Scholar (Digital Collections of Colorado and Wyoming)

Analysing British sign language through the lens of systemic functional linguistics

Author: Rudge Luke A.

Approaches to understanding language via Systemic Functional Linguistics (SFL) have resulted in a compendium of literature focussing on language as a ‘social semiotic.’ One such area of this literature comprises systemic functional grammars: descriptions of various languages and the way in which they create meaning. Despite the application of SFL to numerous languages and the creation of systemic functional grammars, a common thread is that of modality: SFL has been applied to numerous languages in the spoken and written modalities, but not in any detail to languages in the visual-spatial modality.
My thesis presents an initial attempt at analysing British Sign Language (BSL) through the systemic functional lens. Calling on various theories and methods found in sign linguistics and SFL, I perform an analysis on a sample of BSL clauses (N = 1,375) from three perspectives: how BSL manages exchanges of communication (the interpersonal metafunction); how BSL encodes aspects of experience and reality (the experiential metafunction); and how BSL may be organised to produce a coherent text with variance in information prominence (the textual metafunction). As a result, I present three sets of system networks based on these three metafunctions, complete with realisation statements and examples.
This thesis provides considerable impact. From an academic perspective, this is the first in-depth systemic functional description of a language in the visual-spatial modality, providing insight both into how such languages function, and how analyses of these languages may feed back into those of spoken and written languages. From a social perspective, the BSL system networks can assist language learners of any level as a point of reference in clause construction. Furthermore, intermediate and higher BSL qualifications stipulate knowledge of sign linguistics as a required component, yet these assessments are based on resources that have not been updated in nearly twenty years. As such, the products of this thesis may go towards informing future BSL assessments.

UWE Bristol Research Repository

Federal Register

Author: National Archives (U.S.)
United States. Office of the Federal Register.
Publication venue: United States. General Services Administration.
Publication date: 31/08/2011

Daily publication of the U.S. Office of the Federal Register contains rules and regulations, proposed legislation and rule changes, and other notices, including "Presidential proclamations and Executive Orders, Federal agency documents having general applicability and legal effect, documents required to be published by act of Congress, and other Federal agency documents of public interest" (p. ii). Table of Contents starts on page iii.

UNT Digital Library

Renewable resources in the Pacific : proceedings of the 12th Pacific Trade and Development Conference, held in Vancouver, Canada, 7-11 Sept. 1981

Author: English H.E.
Scott Anthony
Publication venue: IDRC, Ottawa, ON, CA
Publication date: 01/01/1982

Meeting: Pacific Trade and Development Conference, 12th, 7-11 Sept. 1981, Vancouver, B.C., C

International Development Research Centre: IDRC Digital Library

1995 BRAC Commission

Data Call - Naval Command, Control and Ocean Surveillance Center, Research Development Test Evaluation Division - San Diego, CA. Data Call #5. Box 186, L-118

UNT Digital Library

The grammar of immersion: a social semiotic study of nonfiction cinematic virtual reality

Author: Doyle Phillip
Publication date: 01/11/2023

Cinematic virtual reality (CVR) is an audio-visual form viewed in a virtual reality headset. Its novelty lies in the way it immerses its audience in highly realistic 360° visual representations. Being camera-based, CVR facilitates many of the practices of conventional filmmaking but fundamentally alters them through its lack of a rectangular frame. As such, CVR has garnered scholarly attention as a ‘frameless’ storytelling medium yet to develop its own language. The form has gained traction with producers of nonfiction who recognize CVR’s capacity to transport audiences to remote social worlds, leading to claims that equate CVR’s immersion with a social and emotional response to its filmed subjects. A strand of CVR scholarship has emerged, grounding nonfiction CVR theoretically and critiquing such deterministic claims. Broadly speaking, these parallel strands of inquiry point to a common concern with CVR’s semiotics: the meaning potential of the 360° format, and the social aspects of its use in documenting reality. Currently, however, there appears to be a lack of systematic analyses that foreground CVR’s semiotics. This study addresses this gap by using social semiotic methods to complement these threads of inquiry, subsuming them into a holistic account of CVR’s semantics. Utilizing systemic functional methods, multimodal discourse analyses were performed on nonfiction CVR texts addressing core research objectives. The first objective is the systematic description of CVR as a semiotic technology, and the configuring of discourse through its novel 360° modality. The CVR spectator is described for their role in the real-time construction of low-level meanings. Higher-level concepts further characterize CVR texts as technologically enabled, virtual sites of social discourse. The second research objective concerns clarifying the implications of CVR for nonfiction practitioners. Nonfiction discourse is conceptualized as the negotiation of semiotic autonomy, independence, and control, between viewing spectator, filmed subject, and CVR author respectively. The third objective concerns the development of an analytical approach tailored specifically for CVR. Extant systems from image, text, film, and action analyses are reflexively applied, appraised, and adapted for use in the study of CVR and new frames are presented to cater for the 360° modality. The findings show CVR to be an inherently logical, contextualizing form, where the spectator has a degree of sense-making autonomy in the construction of representational and social meanings. This semantic autonomy is found to camouflage the deeper textual constructions in what appear as ‘reality experiences’. The repercussions for the CVR producer are the indeterminacy of meanings which are ‘at risk’ in particular ways when conventional framing methods cannot be utilized, and when the spectator is given reflexive agency to make meaningful connections across the 360° image. Systemic functional analytical methods prove flexible enough to be applied to the texts, and open enough for the study to present additional systems and frames for a more fulsome approach to the analysis of CVR.

DCU Online Research Access Service