    Simultaneous camera path optimization and distraction removal for improving amateur video

    A major difference between amateur and professional video lies in the quality of camera paths. Previous work on video stabilization has considered how to improve amateur video by smoothing the camera path. In this paper, we show that additional changes to the camera path can further improve video aesthetics. Our new optimization method achieves multiple simultaneous goals: 1) stabilizing video content over short time scales; 2) ensuring simple and consistent camera paths over longer time scales; and 3) improving scene composition by automatically removing distractions, a common occurrence in amateur video. Our approach uses an L1 camera path optimization framework, extended to handle multiple constraints. Two passes of optimization are used to address both low-level and high-level constraints on the camera path. Experimental and user-study results show that our approach outputs video that is perceptually better than both the input and the results of stabilization alone.
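    As a rough illustration of the L1 path-optimization framework the abstract refers to, the sketch below smooths a 1-D camera path with cvxpy. The translation-only motion model, derivative weights, and corridor bound eps are illustrative assumptions, not the paper's actual formulation, which adds distraction-removal constraints on top of the stabilization objective.

        import numpy as np
        import cvxpy as cp

        def smooth_path_l1(c, eps=20.0, w=(10.0, 1.0, 100.0)):
            # Fit a smooth path p to a jittery path c by minimizing L1 norms
            # of its 1st/2nd/3rd derivatives (favoring static, constant-velocity,
            # and constant-acceleration segments), while keeping p within eps
            # pixels of the original path.
            p = cp.Variable(len(c))
            cost = (w[0] * cp.norm1(cp.diff(p, 1)) +
                    w[1] * cp.norm1(cp.diff(p, 2)) +
                    w[2] * cp.norm1(cp.diff(p, 3)))
            cp.Problem(cp.Minimize(cost), [cp.abs(p - c) <= eps]).solve()
            return p.value

        # Toy usage: smooth a noisy pan.
        c = np.cumsum(np.random.randn(200)) + np.linspace(0, 100, 200)
        p_smooth = smooth_path_l1(c)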

    Video Stabilisation Based on Spatial Transformer Networks

    User-Generated Content is normally recorded with mobile phones by non-professionals, which leads to a low viewing experience due to artifacts such as jitter and blur. Other jittery videos are those recorded with mounted cameras or from moving platforms. In these scenarios, Digital Video Stabilization (DVS) has been utilized to create high-quality, professional-level videos. In industry and academia there are a number of traditional and Deep Learning (DL)-based DVS systems; however, both approaches have limitations: the former struggles to extract and track features in a number of scenarios, and the latter struggles with camera path smoothing, a hard problem to define in this context. On the other hand, traditional methods have shown good performance in smoothing the camera path, whereas DL methods are effective in feature extraction, tracking, and motion parameter estimation. Hence, to the best of our knowledge, the available DVS systems struggle to stabilize videos in a wide variety of scenarios, especially with high motion and certain scene content, such as textureless areas, dark scenes, close objects, and lack of depth, amongst others. Another challenge faced by current DVS implementations is the artifacts that such systems add to the stabilized videos, degrading the viewing experience; these artifacts are mainly distortion, blur, zoom, and ghosting effects. In this thesis, we utilize the strengths of both Deep Learning and traditional methods for video stabilization. Our approach is robust to a wide variety of scene content and camera motion, and avoids adding artifacts to the stabilized video. First, we provide a dataset and evaluation framework for Deep Learning-based DVS. Then, we present our image alignment module, which contains a Spatial Transformer Network (STN). Next, we leverage this module to propose a homography-based video stabilization system. Aiming to avoid the blur and distortion caused by homographies, our next proposal is a translation-based video stabilization method, which uses Exponential Weighted Moving Averages (EWMAs) to smooth the camera path. Finally, instead of using EWMAs, we study the use of filters in our approach; we compare a number of filters and choose those with the best performance. Since a viewer's quality of experience depends not only on video stability but also on blur and distortion, we consider it a good trade-off to leave some jitter in the video while avoiding added distortion and blur. In all three cases, we show that this approach pays off, since our systems outperform the state-of-the-art proposals.
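    The EWMA smoothing the abstract names can be sketched in a few lines. The alpha value and the cumulative-translation path representation below are illustrative assumptions, not the thesis's exact parameters.

        import numpy as np

        def ewma_smooth(path, alpha=0.9):
            # Exponentially weighted moving average over a camera path.
            # path:  (N, 2) array of per-frame cumulative (dx, dy) translations.
            # alpha: smoothing factor; higher means smoother but laggier paths.
            smoothed = np.empty_like(path, dtype=float)
            smoothed[0] = path[0]
            for t in range(1, len(path)):
                smoothed[t] = alpha * smoothed[t - 1] + (1 - alpha) * path[t]
            return smoothed

        # The stabilizing warp per frame is the smoothed path minus the raw path.
        raw = np.cumsum(np.random.randn(300, 2), axis=0)
        correction = ewma_smooth(raw) - raw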

    Agent and object aware tracking and mapping methods for mobile manipulators

    The age of the intelligent machine is upon us. They exist in our factories, our warehouses, our military, our hospitals, on our roads, and on the moon. Most of these things we call robots. When placed in a controlled or known environment, such as an automotive factory or a distribution warehouse, they perform their given roles with exceptional efficiency, achieving far more than is within reach of a humble human being. Despite the remarkable success of intelligent machines in such domains, they have yet to make a full-hearted deployment into our homes. The missing link between the robots we have now and the robots that are soon to come to our houses is perception. Perception, as we mean it here, refers to a level of understanding beyond the collection and aggregation of sensory data. Much of the available sensory information is noisy and unreliable: our homes contain many reflective surfaces, repeating textures on large flat surfaces, and many disruptive moving elements, including humans. These environments change over time, with objects frequently moving within and between rooms. This idea of change in an environment is fundamental to robotic applications, as in most cases we expect robots to be effectors of such change. We can identify two particular challenges that must be solved for robots to make the jump to less structured environments: how to manage noise and disruptive elements in observational data, and how to understand the world as a set of changeable elements (objects) which move over time within a wider environment. In this thesis we look at one possible approach to solving each of these problems. For the first challenge, we use proprioception aboard a robot with an articulated arm to handle difficult and unreliable visual data caused both by the robot and by the environment. We use sensor data aboard the robot to improve the pose tracking of a visual system when the robot moves rapidly, with high jerk, or when observing a scene with little visual variation. For the second challenge, we build a model of the world on the level of rigid objects, and relocalise them both as they change location between different sequences and as they move. We use semantics, image keypoints, and 3D geometry to register and align objects between sequences, showing how their position has moved between disparate observations.
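    One way to picture the proprioception-assisted tracking described above is a confidence gate on visual pose updates. This toy sketch is not the thesis's algorithm; the function names and the inlier-ratio confidence score are assumptions chosen for illustration.

        import numpy as np

        def fused_pose(prev_pose, visual_delta, kinematic_delta,
                       visual_confidence, threshold=0.5):
            # prev_pose and the two deltas are 4x4 homogeneous transforms.
            # visual_confidence: scalar in [0, 1], e.g. the inlier ratio of
            # tracked features; below the threshold, trust the robot's
            # joint-encoder (kinematic) odometry instead of vision.
            delta = visual_delta if visual_confidence >= threshold else kinematic_delta
            return prev_pose @ delta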

    A vision-based approach for human hand tracking and gesture recognition.

    Hand gesture interfaces have become an active topic in human-computer interaction (HCI). The utilization of hand gestures in a human-computer interface enables human operators to interact with computer environments in a natural and intuitive manner. In particular, bare-hand interpretation techniques free users from the cumbersome devices typically required for communication with computers, thus offering ease and naturalness in HCI. Meanwhile, virtual assembly (VA) applies virtual reality (VR) techniques to mechanical assembly. It constructs computer tools to help product engineers plan, evaluate, optimize, and verify the assembly of mechanical systems without the need for physical objects. However, traditional devices such as keyboards and mice are no longer adequate due to their inefficiency in handling three-dimensional (3D) tasks, so special VR devices, such as data gloves, have been mandatory in VA. This thesis proposes a novel gesture-based interface for VA applications. It develops a hybrid approach that incorporates an appearance-based hand localization technique with a skin tone filter in support of gesture recognition and hand tracking in 3D space. With this interface, bare hands become a convenient substitute for special VR devices. Experimental results demonstrate the flexibility and robustness of the proposed method for HCI. Thesis (M.Sc.), University of Windsor (Canada), 2004. Adviser: Xiaobu Yuan.
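    A skin-tone filter of the kind the abstract mentions can be prototyped with OpenCV. The HSV bounds below are illustrative assumptions; in practice they must be tuned per lighting condition or adapted online, and the thesis combines the filter with appearance-based localization rather than using color alone.

        import cv2
        import numpy as np

        LOWER = np.array([0, 40, 60], dtype=np.uint8)    # illustrative HSV bounds
        UPPER = np.array([25, 160, 255], dtype=np.uint8)

        def locate_hand(frame_bgr):
            # Return the bounding box of the largest skin-colored blob, or None.
            hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
            mask = cv2.inRange(hsv, LOWER, UPPER)
            mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
            contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                           cv2.CHAIN_APPROX_SIMPLE)
            if not contours:
                return None
            hand = max(contours, key=cv2.contourArea)
            return cv2.boundingRect(hand)  # (x, y, w, h)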

    Multiplexed photography : single-exposure capture of multiple camera settings

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. By Paul Elijah Green. The space of camera settings is large and individual settings can vary dramatically from scene to scene. This thesis explores methods for capturing and manipulating multiple camera settings in a single exposure. Multiplexing multiple camera settings in a single exposure can allow post-exposure control and improve the quality of photographs taken in challenging lighting environments (e.g. low light or high motion). We first describe the design and implementation of a prototype optical system and associated algorithms to capture four images of a scene in a single exposure, each taken with a different aperture setting. Our system can be used with commercially available DSLR cameras and photographic lenses without modification to either. We demonstrate several applications of our multi-aperture camera, such as post-exposure depth of field control, synthetic refocusing, and depth-guided deconvolution. Next we describe multiplexed flash illumination to recover both flash and ambient light information as well as extract depth information in a single exposure. Traditional photographic flashes illuminate the scene with a spatially constant light beam. By adding a mask and optics to a flash, we can project a spatially varying illumination onto the scene, which allows us to spatially multiplex the flash and ambient illuminations onto the imager. We apply flash multiplexing to enable single-exposure flash/no-flash image fusion, in particular performing flash/no-flash relighting on dynamic scenes with moving objects. Finally, we propose spatio-temporal multiplexing, a novel image sensor feature that enables simultaneous capture of flash and ambient illumination. We describe two possible applications of spatio-temporal multiplexing: single-image flash/no-flash relighting and white balancing scenes containing two distinct illuminants (e.g. flash and fluorescent lighting).
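    To make the spatial-multiplexing idea concrete, here is a toy demultiplexer that splits one capture into ambient and flash-only layers given a known binary flash pattern. The mask, window size, and hole-filling scheme are simplifying assumptions for illustration, not the thesis's optics or reconstruction algorithm.

        import numpy as np

        def demultiplex(image, flash_mask, k=7):
            # image:      (H, W) grayscale capture under a spatially varying flash.
            # flash_mask: (H, W) bool, True where the pattern lit the scene
            #             (assumed known from calibration).
            # Ambient at lit pixels is filled by averaging nearby unlit pixels;
            # the flash-only layer is whatever light remains.
            ambient = image.astype(float).copy()
            unlit = ~flash_mask
            pad = k // 2
            vals = np.pad(np.where(unlit, ambient, 0.0), pad, mode="edge")
            cnts = np.pad(unlit.astype(float), pad, mode="edge")
            num = np.zeros_like(ambient)
            den = np.zeros_like(ambient)
            h, w = ambient.shape
            for dy in range(k):
                for dx in range(k):
                    num += vals[dy:dy + h, dx:dx + w]
                    den += cnts[dy:dy + h, dx:dx + w]
            ambient[flash_mask] = (num / np.maximum(den, 1.0))[flash_mask]
            flash_only = np.clip(image - ambient, 0.0, None)
            return ambient, flash_only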

    Designing to Support Workspace Awareness in Remote Collaboration using 2D Interactive Surfaces

    Increasingly distributed global workforces are leading to collaborative work among remote coworkers. The emergence of such remote collaboration is essentially supported by technology advancements in screen-based devices, ranging from tablets and laptops to large displays. However, these devices, especially personal and mobile computers, still suffer from certain limitations caused by their form factors that hinder supporting workspace awareness through non-verbal communication such as bodily gestures or gaze. This thesis thus aims to design novel interfaces and interaction techniques to improve remote coworkers' workspace awareness through such non-verbal cues using 2D interactive surfaces. The thesis starts off by exploring how visual cues support workspace awareness in facilitated brainstorming of hybrid teams of co-located and remote coworkers. Based on insights from this exploration, the thesis introduces three interfaces for mobile devices that help users maintain and convey their workspace awareness with their coworkers. The first interface is a virtual environment that allows a remote person to effectively maintain awareness of his/her co-located collaborators' activities while interacting with the shared workspace. To help a person better express hand gestures in remote collaboration using a mobile device, the second interface presents a lightweight add-on for capturing hand images on and above the device's screen and overlaying them on collaborators' devices to improve their workspace awareness. The third interface strategically leverages the entire screen space of a conventional laptop to better convey a remote person's gaze to his/her co-located collaborators. Building on top of these three interfaces, the thesis envisions an interface that supports a person using a mobile device to effectively collaborate with remote coworkers working with a large display. Together, these interfaces demonstrate the possibilities of innovating on commodity devices to offer richer non-verbal communication and better support workspace awareness in remote collaboration.

    MediaSync: Handbook on Multimedia Synchronization

    This book provides an approachable overview of the most recent advances in the fascinating field of media synchronization (mediasync), gathering contributions from the most representative and influential experts. Understanding the challenges of this field in the current multi-sensory, multi-device, and multi-protocol world is not an easy task. The book revisits the foundations of mediasync, including theoretical frameworks and models, highlights ongoing research efforts, like hybrid broadband broadcast (HBB) delivery and users' perception modeling (i.e., Quality of Experience or QoE), and paves the way for the future (e.g., towards the deployment of multi-sensory and ultra-realistic experiences). Although many advances around mediasync have been devised and deployed, this area of research is receiving renewed attention to overcome the remaining challenges in the next-generation (heterogeneous and ubiquitous) media ecosystem. Given the significant advances in this research area, its current relevance, and the multiple disciplines it involves, a reference book on mediasync has become necessary, and this book fills that gap. In particular, it addresses key aspects of, and reviews the most relevant contributions within, the mediasync research space from different perspectives. MediaSync: Handbook on Multimedia Synchronization is the perfect companion for scholars and practitioners who want to acquire strong knowledge about this research area and also approach the challenges of ensuring the best mediated experiences by providing adequate synchronization between the media elements that constitute those experiences.