
    Intent Preserving 360 Video Stabilization Using Constrained Optimization

    A system and method are disclosed that solve for rotational updates in 360° videos, removing camera shake while preserving user-intended motions. The method uses a constrained nonlinear optimization approach in quaternion space. First, optimal 3D camera rotations are computed between key frames; 3D camera rotations between consecutive frames are then computed. The first, second, and third derivatives of the resulting camera path are minimized to stabilize the camera orientation path. The computation strives to find a smooth path while also limiting its deviation from the original path, so the system keeps orientations close to the original even when, for example, the videographer takes a turn. Each frame is then warped to the stabilized path, which results in a smoother video. The rotational camera updates may be applied to the input stream at the source or added as metadata. The technology may influence standards by making rotational-update metadata a component of 360° videos. KEYWORDS: 360 degree video, camera rotation, removing camera shake, computing camera rotation
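    A minimal sketch of the derivative-minimizing smoothing step described above, assuming SciPy is available. The disclosed method optimizes in quaternion space; for brevity this sketch approximates orientations with rotation vectors, and the function name `smooth_path`, the derivative weights, and the deviation bound are illustrative assumptions rather than details from the disclosure.

```python
# Sketch: smooth a camera orientation path by minimizing its first, second,
# and third finite differences while bounding per-frame deviation from the
# original path. Weights and the deviation bound are assumed values.
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.transform import Rotation as R

def smooth_path(quats, w1=1.0, w2=10.0, w3=100.0, max_dev_rad=0.2):
    """Return stabilized orientations close to `quats` (N x 4, scalar-last)."""
    rotvecs = R.from_quat(quats).as_rotvec()   # N x 3 axis-angle path
    n = len(rotvecs)

    def cost(x):
        p = x.reshape(n, 3)
        d1 = np.diff(p, axis=0)                # velocity
        d2 = np.diff(p, n=2, axis=0)           # acceleration
        d3 = np.diff(p, n=3, axis=0)           # jerk
        return w1 * (d1**2).sum() + w2 * (d2**2).sum() + w3 * (d3**2).sum()

    # Keep each frame within a bounded angular distance of the original.
    cons = [{"type": "ineq",
             "fun": lambda x, i=i: max_dev_rad - np.linalg.norm(
                 x.reshape(n, 3)[i] - rotvecs[i])} for i in range(n)]

    res = minimize(cost, rotvecs.ravel(), constraints=cons, method="SLSQP")
    return R.from_rotvec(res.x.reshape(n, 3)).as_quat()
```

    SLSQP handles the per-frame inequality constraints directly, which mirrors the abstract's trade-off between a smooth path and limited deviation from the original one.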

    Camera Stabilization in 360° Videos and Its Impact on Cyber Sickness, Environmental Perceptions, and Psychophysiological Responses to a Simulated Nature Walk: A Single-Blinded Randomized Trial

    Immersive virtual environment (IVE) technology has emerged as a valuable tool for environmental psychology research in general, and for studies of human–nature interactions in particular. However, virtual reality is known to induce cyber sickness, which limits its application and highlights the need for scientific strategies to optimize virtual experiences. In this study, we assessed the impact of improved camera stability on cyber sickness, presence, and psychophysiological responses to a simulated nature walk. In a single-blinded trial, 50 participants were assigned to watch, using a head-mounted display, one of two 10-min 360° videos showing a first-person nature walk: one video contained small-magnitude scene oscillations associated with cameraman locomotion, while in the other video the oscillations were drastically reduced by means of an electric stabilizer and a dolly. Measurements of cyber sickness (both occurrence and severity of symptoms), perceptions of the IVE (presence and perceived environmental restorativeness), and indicators of psychophysiological responses [affect, enjoyment, and heart rate (HR)] were collected before and/or after the exposure. Compared to the low-stability (LS) condition, participants in the high-stability (HS) condition reported lower severity of cyber sickness symptoms. The delta values for pre–post changes in affect for the LS video revealed a deterioration of participants' affect profile, with a significant increase in ratings of negative affect and fatigue and a decrease in ratings of positive affect. In contrast, there were no pre–post changes in affect for the HS video. No differences were found between the HS and LS conditions with respect to presence, perceived environmental restorativeness, enjoyment, and HR. Cyber sickness was significantly correlated with all components of affect and enjoyment, but not with presence, perceived environmental restorativeness, or HR. These findings demonstrate that improved camera stability in 360° videos is crucial to reducing cyber sickness symptoms and negative affective responses in IVE users. The lack of associations between improved stability and presence, perceived environmental restorativeness, and HR suggests that other aspects of IVE technology must be taken into account in order to improve virtual experiences of nature.
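    As an illustration of the pre–post "delta" analysis described above, the sketch below computes per-participant change scores and compares them between conditions. The variable names and the choice of Welch's t-test are assumptions; the paper's exact statistical procedure is not given here.

```python
# Illustrative pre-post delta comparison between the high-stability (HS)
# and low-stability (LS) groups; test choice is an assumption, not the paper's.
import numpy as np
from scipy import stats

def compare_deltas(pre_hs, post_hs, pre_ls, post_ls):
    """Compare pre-post affect changes between HS and LS groups."""
    delta_hs = np.asarray(post_hs) - np.asarray(pre_hs)   # HS change scores
    delta_ls = np.asarray(post_ls) - np.asarray(pre_ls)   # LS change scores
    t, p = stats.ttest_ind(delta_hs, delta_ls, equal_var=False)
    return delta_hs.mean(), delta_ls.mean(), t, p
```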

    A robust and efficient video representation for action recognition

    This paper introduces a state-of-the-art video representation and applies it to efficient action recognition and detection. We first propose to improve the popular dense trajectory features by explicit camera motion estimation. More specifically, we extract feature point matches between frames using SURF descriptors and dense optical flow. The matches are used to estimate a homography with RANSAC. To improve the robustness of homography estimation, a human detector is employed to remove outlier matches on the human body, since human motion is generally not consistent with camera motion. Trajectories consistent with the homography are considered to be due to camera motion and are thus removed. We also use the homography to cancel out camera motion from the optical flow. This results in significant improvements in the motion-based HOF and MBH descriptors. We further explore the recent Fisher vector as an alternative feature encoding to the standard bag-of-words histogram, and consider different ways to include spatial layout information in these encodings. We present a large and varied set of evaluations, considering (i) classification of short basic actions on six datasets, (ii) localization of such actions in feature-length movies, and (iii) large-scale recognition of complex events. We find that our improved trajectory features significantly outperform previous dense trajectories, and that Fisher vectors are superior to bag-of-words encodings for video recognition tasks. In all three tasks, we show substantial improvements over state-of-the-art results.
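    A hedged sketch of the camera-motion estimation and flow-compensation steps, assuming OpenCV is available. Since SURF is patent-encumbered and only ships in opencv-contrib, ORB is substituted here so the sketch runs on a stock build; the human-detector masking and the trajectory bookkeeping from the paper are omitted.

```python
# Sketch: estimate a frame-to-frame homography with RANSAC, then subtract the
# camera-induced flow from dense optical flow, leaving object motion only.
import cv2
import numpy as np

def cancel_camera_motion(prev_gray, cur_gray):
    """Estimate the inter-frame homography and remove it from dense flow."""
    detector = cv2.ORB_create(2000)              # ORB stands in for SURF here
    kp1, des1 = detector.detectAndCompute(prev_gray, None)
    kp2, des2 = detector.detectAndCompute(cur_gray, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)

    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)   # robust estimate

    # Dense optical flow between the two frames.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, cur_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)

    # Flow induced by camera motion alone: where H sends each pixel.
    h, w = prev_gray.shape
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    grid = np.stack([xs, ys], axis=-1).astype(np.float32).reshape(-1, 1, 2)
    warped = cv2.perspectiveTransform(grid, H).reshape(h, w, 2)
    camera_flow = warped - np.stack([xs, ys], axis=-1)

    return flow - camera_flow    # residual flow: object motion only
```

    In the paper, matches falling inside detected human bounding boxes would be excluded before the `findHomography` call, which is what keeps articulated human motion from corrupting the camera-motion estimate.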

    The Architecture of First Amendment Free Speech


    Domain-Specific Face Synthesis for Video Face Recognition from a Single Sample Per Person

    The performance of still-to-video face recognition (FR) systems can decline significantly because faces captured in an unconstrained operational domain (OD) over multiple video cameras have a different underlying data distribution than faces captured under controlled conditions in the enrollment domain (ED) with a still camera. This is particularly true when individuals are enrolled in the system using a single reference still. To improve the robustness of these systems, it is possible to augment the reference set by generating synthetic faces based on the original still. However, without knowledge of the OD, many synthetic images must be generated to account for all possible capture conditions. FR systems may therefore require complex implementations and yield lower accuracy when training on many less relevant images. This paper introduces an algorithm for domain-specific face synthesis (DSFS) that exploits the representative intra-class variation information available from the OD. Prior to operation, a compact set of faces of unknown persons appearing in the OD is selected through clustering in the capture-condition space. The domain-specific variations of these face images are projected onto the reference stills by integrating an image-based face relighting technique inside a 3D reconstruction framework. A compact set of synthetic faces is generated that resembles the individuals of interest under the capture conditions relevant to the OD. In a particular implementation based on sparse representation classification, the synthetic faces generated with the DSFS are employed to form a cross-domain dictionary that accounts for structured sparsity. Experimental results reveal that augmenting the reference gallery set of FR systems using the proposed DSFS approach can provide a higher level of accuracy compared to state-of-the-art approaches, with only a moderate increase in computational complexity.
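    To make the classification stage concrete, here is a minimal sketch of standard sparse-representation classification (SRC) over a gallery dictionary, which the paper's cross-domain dictionary extends. The Lasso solver, the `alpha` value, and the per-class residual rule are common SRC choices assumed here, not details taken from the paper.

```python
# Sketch: sparse-representation classification. A probe face is coded sparsely
# over the dictionary columns, then assigned to the class whose atoms best
# reconstruct it (smallest per-class residual).
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(dictionary, labels, probe, alpha=0.01):
    """Classify `probe` by minimum class-wise reconstruction residual.

    dictionary: (d, n) matrix whose columns are gallery face features
    labels:     length-n array of class ids, one per column
    probe:      length-d feature vector of the query face
    """
    lasso = Lasso(alpha=alpha, max_iter=10000)
    lasso.fit(dictionary, probe)              # sparse code x: probe ~ D @ x
    x = lasso.coef_
    residuals = {}
    for c in np.unique(labels):
        xc = np.where(labels == c, x, 0.0)    # keep only class-c coefficients
        residuals[c] = np.linalg.norm(probe - dictionary @ xc)
    return min(residuals, key=residuals.get)
```

    With DSFS, the dictionary columns would include both the reference stills and the synthetic faces generated for the OD capture conditions, which is what makes the dictionary cross-domain.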