    Global motion compensated visual attention-based video watermarking

    Imperceptibility and robustness are two key but conflicting requirements of any watermarking algorithm. Low-strength watermarking yields high imperceptibility but exhibits poor robustness. High-strength watermarking schemes achieve good robustness but often suffer from embedding distortions that result in poor visual quality in the host media. This paper proposes a video watermarking algorithm that offers a fine balance between imperceptibility and robustness using a motion-compensated wavelet-based visual attention model (VAM). The proposed VAM includes both spatial and temporal cues for visual saliency: the spatial modeling uses the spatial wavelet coefficients, while the temporal modeling accounts for both local and global motion, yielding a spatiotemporal VAM for video. The model is then used to develop a video watermarking algorithm in which a two-level watermarking weighting parameter map is generated from the VAM saliency maps and data are embedded into the host frames according to the visual attentiveness of each region. By avoiding high-strength watermarking in the visually attentive regions, the resulting watermarked video achieves high perceived visual quality while preserving high robustness. The proposed VAM outperforms state-of-the-art video visual attention methods in saliency detection while offering lower computational complexity. For the same embedding distortion, the proposed visual attention-based watermarking achieves up to 39% (non-blind) and 22% (blind) improvement in robustness against H.264/AVC compression, compared with existing watermarking methodology that does not use a VAM. The proposed visual attention-based video watermarking thus attains visual quality similar to that of low-strength watermarking and robustness similar to that of high-strength watermarking.
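
    The abstract leaves the embedding rule implicit, so the sketch below (Python/NumPy) illustrates only the two-level weighting idea under stated assumptions: a pixel-domain additive spread-spectrum embedder stands in for the thesis's wavelet-domain scheme, and the strength values and mean-saliency threshold are hypothetical rather than the paper's parameters.

        import numpy as np

        def two_level_weight_map(saliency, alpha_low=0.5, alpha_high=2.0):
            # Quantise a [0, 1] saliency map into two embedding strengths:
            # weak watermark where attention is high, strong elsewhere.
            thresh = saliency.mean()  # illustrative threshold, not the thesis rule
            return np.where(saliency > thresh, alpha_low, alpha_high)

        def embed_frame(frame, bit, saliency, seed=0):
            # Additive spread-spectrum embedding of a single payload bit.
            rng = np.random.default_rng(seed)
            pattern = rng.choice([-1.0, 1.0], size=frame.shape)  # keyed PN pattern
            return frame + (1.0 if bit else -1.0) * two_level_weight_map(saliency) * pattern

        def detect_frame(marked, original, seed=0):
            # Non-blind detection: correlate the residual with the keyed pattern.
            rng = np.random.default_rng(seed)
            pattern = rng.choice([-1.0, 1.0], size=original.shape)
            return bool(np.sum((marked - original) * pattern) > 0.0)

    The essential design choice is the inversion of saliency and strength: attentive regions receive the weaker watermark, so embedding distortion concentrates where viewers are least likely to notice it.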

    Integration and Evaluation of a Video Surveillance System

    Visual surveillance systems have received increasing attention in recent years, owing to a growing need for surveillance applications. In this thesis, we present a visual surveillance system that integrates modules for motion detection, tracking, and trajectory characterization to achieve robust monitoring of moving objects in scenes under surveillance. The system operates on video sequences acquired by stationary color and infrared surveillance cameras. Motion detection is implemented using an algorithm that combines thresholding of temporal variance with background modeling. The tracking algorithm combines motion and appearance information into an appearance model and uses a particle filter framework for object tracking. The trajectory analysis module builds a model of a given normal activity using a factorization approach and uses this model to detect abnormal motion patterns. The system was tested on a large ground-truthed data set containing hundreds of color and FLIR image sequences; results of its performance evaluation on these sequences are reported in this thesis.
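
    As a rough illustration of the motion detection module, the following minimal per-pixel detector combines a running background model with temporal-variance thresholding; the exponential moving-average update and the deviation threshold k are assumptions for this sketch, since the thesis's exact update rule is not given in the abstract.

        import numpy as np

        class TemporalVarianceDetector:
            # Running per-pixel background mean and variance (EMA update assumed).
            def __init__(self, shape, lr=0.05, k=2.5):
                self.mean = np.zeros(shape)
                self.var = np.ones(shape)
                self.lr, self.k = lr, k

            def update(self, frame):
                diff = frame - self.mean
                # Pixels deviating by more than k standard deviations from the
                # background model are flagged as moving foreground.
                mask = np.abs(diff) > self.k * np.sqrt(self.var)
                self.mean += self.lr * diff
                self.var = (1.0 - self.lr) * self.var + self.lr * diff * diff
                return mask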

    Automatic human behaviour anomaly detection in surveillance video

    This thesis develops the capability to automatically evaluate and detect anomalies in human behaviour from surveillance video. We work with static monocular cameras in crowded urban surveillance scenarios, particularly airports and commercial shopping areas. Typically a person is 100 to 200 pixels high in a scene ranging from 10 to 20 metres in width and depth, populated by 5 to 40 people at any given time. Our procedure evaluates human behaviour unobtrusively to determine outlying behavioural events, flagging abnormal events to the operator. In order to achieve automatic human behaviour anomaly detection we address the challenge of interpreting behaviour within the context of the social and physical environment. We develop and evaluate a process for measuring social connectivity between individuals in a scene using motion and visual attention features. To do this we use mutual information and Euclidean distance to build a social similarity matrix which encodes the social connection strength between any two individuals. We develop a second contextual basis which segments a surveillance environment into behaviourally homogeneous subregions, representing high-traffic slow regions and queuing areas. We model the heterogeneous scene in homogeneous subgroups using both contextual elements. We bring the social context, the scene context, the motion, and the visual attention features together to demonstrate a novel human behaviour anomaly detection process which finds outlier behaviour in a short sequence of video. The method, Nearest Neighbour Ranked Outlier Clusters (NN-RCO), models behaviour as a time-independent sequence of behaviour events and can be trained in advance or applied to a single sequence. We find that in a crowded scene the application of mutual-information-based social context prevents self-justifying groups and propagates anomalies through a social network, granting a greater anomaly detection capability. Scene context uniformly improves the detection of anomalies in all the datasets we test upon. We additionally demonstrate that our work is applicable to other data domains, demonstrating it upon Automatic Identification System data in the maritime domain: our method identifies abnormal shipping behaviour, using joint motion dependency as an analogue for social connectivity, and similarly segments the shipping environment into homogeneous regions.
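
    A minimal sketch of the social similarity matrix construction, assuming per-person speed series as the motion feature, a histogram estimator for mutual information, and an exponential kernel on mean Euclidean distance; all three choices are illustrative assumptions, not the thesis's exact recipe.

        import numpy as np

        def mutual_information(x, y, bins=8):
            # Histogram estimate of MI between two 1-D feature series.
            joint, _, _ = np.histogram2d(x, y, bins=bins)
            pxy = joint / joint.sum()
            px = pxy.sum(axis=1, keepdims=True)
            py = pxy.sum(axis=0, keepdims=True)
            nz = pxy > 0
            return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

        def social_similarity(tracks, sigma=50.0):
            # tracks: list of (T, 2) position arrays, one per person, equal T.
            n = len(tracks)
            speeds = [np.linalg.norm(np.diff(t, axis=0), axis=1) for t in tracks]
            S = np.zeros((n, n))
            for i in range(n):
                for j in range(i + 1, n):
                    mi = mutual_information(speeds[i], speeds[j])
                    dist = np.linalg.norm(tracks[i] - tracks[j], axis=1).mean()
                    S[i, j] = S[j, i] = mi * np.exp(-dist / sigma)
            return S  # S[i, j] encodes social connection strength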

    Guiding object recognition: a shape model with co-activation networks

    The goal of image understanding research is to develop techniques to automatically extract meaningful information from a population of images. This abstract goal manifests itself in a variety of application domains. Video understanding is a natural extension of image understanding. Many video understanding algorithms apply static-image algorithms to successive frames to identify patterns of consistency. This consumes a significant amount of irrelevant computation and may produce erroneous results, because static algorithms are not designed to indicate corresponding pixel locations between frames. Video is more than a collection of images: it is an ordered collection of images that exhibits temporal coherence, an additional feature alongside edges, colors, and textures. Motion information provides another level of visual information that cannot be obtained from an isolated image. Leveraging motion cues prevents an algorithm from "starting fresh" at each frame by focusing the region of attention. This approach is analogous to the attentional system of the human visual system. Relying on motion information alone is insufficient due to the aperture problem, where local motion information is ambiguous in at least one direction. Consequently, motion cues provide only leading and trailing motion edges, and bottom-up approaches that use gradient or region properties to complete moving regions are limited. Object recognition facilitates higher-level processing and is an integral component of image understanding. We present a components-based object detection and localization algorithm for static images and show how this same system provides top-down segmentation for the detected object. We present a detailed analysis of the model dynamics during the localization process. This analysis shows consistent behavior in response to a variety of input, permitting model reduction and a substantial speed increase with little or no performance degradation. We present four specific enhancements to reduce false positives when instances of the target category are not present. First, a one-shot rule is used to discount coincident secondary hypotheses. Second, we demonstrate that the use of an entire shape model is inappropriate for localizing any single instance and introduce co-activation networks to represent the appropriate component relations for a particular recognition context. Third, we describe how the co-activation network can be combined with motion cues to overcome the aperture problem by providing context-specific, top-down shape information to achieve detection and segmentation in video. Finally, we present discriminating features arising from these enhancements and apply supervised learning techniques to embody the informational contribution of each approach, associating a confidence measure with each detection.
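
    Read as a graph of learned pairwise component relations, a co-activation network can be illustrated with a toy quadratic score; the matrix values and the scoring rule below are hypothetical and only show how compatible co-active components reinforce a detection hypothesis while an isolated component gains no support.

        import numpy as np

        def coactivation_score(activations, W):
            # W[i, j] = learned strength of the pairwise relation between
            # components i and j in this recognition context.
            a = np.asarray(activations, dtype=float)
            return float(a @ W @ a)

        # Toy context with three components; 0 and 1 co-occur strongly.
        W = np.array([[0.0, 0.9, 0.1],
                      [0.9, 0.0, 0.1],
                      [0.1, 0.1, 0.0]])
        print(coactivation_score([1, 1, 0], W))  # joint evidence -> 1.8
        print(coactivation_score([0, 0, 1], W))  # isolated component -> 0.0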

    Smart Assistive Technology for People with Visual Field Loss

    Visual field loss results in an inability to clearly see objects in parts of the surrounding environment, which affects the ability to identify potential hazards. In visual field loss, parts of the visual field are impaired to varying degrees, while other parts may remain healthy. This defect can be debilitating, making daily life activities very stressful. Unlike blind people, people with visual field loss retain some functional vision, and it would be beneficial to intelligently augment this vision by adding computer-generated information that increases the user's awareness of possible hazards through early notifications. This thesis introduces a smart hazard attention system that helps people with visual field impairments navigate, using smart glasses and a real-time hazard classification system. It takes the form of a novel, customised, machine-learning-based hazard classification system that can be integrated into wearable assistive technology such as smart glasses. The proposed solution provides early notifications based on (1) the visual status of the user and (2) the motion status of the detected object. The presented technology can detect multiple objects at the same time and classify them into different hazard types. The system design consists of four modules: (1) a deep-learning-based object detector that recognises static and moving objects in real time, (2) a Kalman-filter-based multi-object tracker that tracks the detected objects over time to determine their motion model, (3) a neural-network-based classifier that determines the level of danger of each hazard using motion features extracted while the object is in the user's field of vision, and (4) a feedback generation module that translates the hazard level into a smart notification to increase the user's cognitive perception using the healthy vision within the visual field. For qualitative system testing, normal and personalised impaired-vision models were implemented. The personalised impaired-vision model was created to synthesise the visual function of people with visual field defects; actual central and full-field test results were used to create a personalised model that is used in the feedback generation stage, where the visual notifications are displayed in the user's healthy visual area. The proposed solution will enhance the quality of life of people suffering from visual field loss. This non-intrusive, wearable hazard detection technology can provide an obstacle-avoidance solution and help prevent falls and collisions early, while presenting minimal information to the user.
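
    As an illustration of module (2), here is a single-object constant-velocity Kalman filter of the kind such a tracker builds per detected object; the state layout, noise levels, and the assumption that the detector supplies (x, y) centroids are illustrative choices, and the multi-object data association step is omitted.

        import numpy as np

        class ConstantVelocityKF:
            # State s = [x, y, vx, vy]; the velocity components are the motion
            # features a downstream hazard classifier could consume.
            def __init__(self, x, y, dt=1.0, q=1e-2, r=1.0):
                self.s = np.array([x, y, 0.0, 0.0])
                self.P = np.eye(4) * 10.0
                self.F = np.eye(4)
                self.F[0, 2] = self.F[1, 3] = dt        # constant-velocity model
                self.H = np.eye(4)[:2]                  # we observe position only
                self.Q, self.R = np.eye(4) * q, np.eye(2) * r

            def step(self, z):
                # Predict, then update with the detector's (x, y) measurement z.
                self.s = self.F @ self.s
                self.P = self.F @ self.P @ self.F.T + self.Q
                innov = np.asarray(z) - self.H @ self.s
                S = self.H @ self.P @ self.H.T + self.R
                K = self.P @ self.H.T @ np.linalg.inv(S)
                self.s = self.s + K @ innov
                self.P = (np.eye(4) - K @ self.H) @ self.P
                return self.s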

    Motion Conflict Detection and Resolution in Visual-Inertial Localization Algorithm

    In this dissertation, we focus on conflicts that occur due to disagreeing motions in multi-modal localization algorithms. In spite of recent achievements in robust localization by means of multi-sensor fusion, these algorithms are not applicable to all environments. This is primarily attributed to the following fundamental assumptions: (i) the environment is predominantly stationary, (ii) only ego-motion of the sensor platform exists, and (iii) multiple sensors are always in agreement with each other regarding the observed motion. Recently, studies have shown how to relax the static-environment assumption using outlier rejection techniques and dynamic object segmentation. Additionally, to handle non-ego-motion, approaches that extend the localization algorithm to multi-body tracking have been studied. However, no attention has been given to conditions in which multiple sensors contradict each other with regard to the motions observed. Vision-based localization has become an attractive approach for both indoor and outdoor applications due to the large information bandwidth provided by images and the reduced cost of cameras. To improve robustness and overcome the limitations of vision, an Inertial Measurement Unit (IMU) may be used. Even though visual-inertial localization has better accuracy and improved robustness due to the complementary nature of the camera and IMU sensors, it is still affected by disagreements in motion observations. We term such dynamic situations environments with motion conflict, because they arise when multiple different but self-consistent motions are observed by different sensors. Tightly coupled visual-inertial fusion approaches that disregard such challenging situations exhibit drift that can lead to catastrophic errors. We provide a probabilistic model for motion conflict and present a novel algorithm to detect and resolve motion conflicts. Our detection method is based on per-frame positional estimate discrepancy and per-landmark reprojection errors; motion conflicts are resolved by eliminating inconsistent IMU and landmark measurements. Finally, a Motion Conflict aware Visual Inertial Odometry (MC-VIO) algorithm that combines both detection and resolution of motion conflict was implemented. Both quantitative and qualitative evaluations of MC-VIO on visually and inertially challenging datasets were performed. Experimental results indicate that the MC-VIO algorithm reduces the absolute trajectory error by 70% and the relative pose error by 34% in scenes with motion conflict, in comparison to the reference VIO algorithm. Motion conflict detection and resolution enables the application of visual-inertial localization algorithms to real dynamic environments. This paves the way for articulated object tracking in robotics, and may also find numerous applications in active, long-term augmented reality.
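
    A minimal sketch of the detection step, assuming access to an IMU-propagated position, a vision-only position, and per-landmark reprojection errors; the threshold values are hypothetical, not the dissertation's.

        import numpy as np

        def detect_motion_conflict(p_imu, p_vision, reproj_errors,
                                   pos_tol=0.25, err_tol=4.0):
            # (i) Per-frame positional estimate discrepancy between sensors.
            frame_conflict = np.linalg.norm(np.asarray(p_imu) - np.asarray(p_vision)) > pos_tol
            # (ii) Per-landmark reprojection errors (pixels) mark inconsistent
            # landmark measurements for elimination before the next fusion update.
            bad_landmarks = np.flatnonzero(np.asarray(reproj_errors) > err_tol)
            return frame_conflict, bad_landmarks

        conflict, outliers = detect_motion_conflict(
            p_imu=[1.00, 0.0, 0.0], p_vision=[1.40, 0.0, 0.0],
            reproj_errors=[0.8, 1.2, 9.5, 0.6])
        # conflict == True; landmark 2 would be dropped from the state update.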

    Gait Analysis and Recognition for Automated Visual Surveillance

    Human motion analysis has received great attention from researchers in the last decade due to its potential use in applications such as automated visual surveillance. This field of research focuses on the perception and recognition of human activities, including people identification. We explore a new approach for walking-pedestrian detection in an unconstrained outdoor environment. The proposed algorithm is based on gait motion, as the rhythm of the footprint pattern of walking people is considered a stable and characteristic feature for the classification of moving objects. The novelty of our approach is motivated by the latest research on people identification using gait. The experimental results confirmed the robustness of our method in discriminating between single walking subjects, groups of people and vehicles, with a successful detection rate of 100%. Furthermore, the results revealed the potential of our method to extend visual surveillance systems to recognise walking people. We also propose a new approach to extract human joints (vertex positions) using a model-based method. The spatial templates describing the human gait motion are produced via gait analysis performed on manually labelled data. Elliptic Fourier Descriptors are used to represent the motion models in a parametric form, and heel-strike data are exploited to reduce the dimensionality of the parametric models. Subjects walk normal to the viewing plane, as the major gait information is available in the sagittal view. The ankle, knee and hip joints are successfully extracted with high accuracy for indoor and outdoor data. In this way, we establish a baseline analysis which can be deployed in recognition, marker-less analysis and other areas. The experimental results confirmed the robustness of the model-based approach, recognising walking subjects with a correct classification rate of 95% using purely the dynamic features derived from joint motion. This confirms the early psychological theories claiming that the discriminative features for motion perception and people recognition are embedded in gait kinematics. Finally, to quantify the intrusive nature of gait recognition, we explore the effects of different covariate factors on the performance of gait recognition, including footwear, clothing, carrying conditions and walking speed. As far as the author can determine, this is the first major study of its kind in this field to analyse the covariate factors using a model-based method.
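
    For concreteness, a textbook Kuhl-Giardina computation of Elliptic Fourier Descriptors for a closed 2-D contour is sketched below; it shows the parametric representation named in the abstract, while the thesis's heel-strike dimensionality reduction is omitted.

        import numpy as np

        def elliptic_fourier_descriptors(contour, order=10):
            # contour: (K, 2) array of distinct points on a closed curve.
            # Returns an (order, 4) array of coefficients [a_n, b_n, c_n, d_n].
            d = np.diff(np.vstack([contour, contour[:1]]), axis=0)  # close the curve
            dt = np.linalg.norm(d, axis=1)              # segment arc lengths
            t = np.concatenate([[0.0], np.cumsum(dt)])  # cumulative arc length
            T = t[-1]
            coeffs = np.zeros((order, 4))
            for n in range(1, order + 1):
                c = 2.0 * n * np.pi / T
                dcos = np.cos(c * t[1:]) - np.cos(c * t[:-1])
                dsin = np.sin(c * t[1:]) - np.sin(c * t[:-1])
                k = T / (2.0 * n**2 * np.pi**2)
                coeffs[n - 1] = k * np.array([
                    np.sum(d[:, 0] / dt * dcos), np.sum(d[:, 0] / dt * dsin),
                    np.sum(d[:, 1] / dt * dcos), np.sum(d[:, 1] / dt * dsin)])
            return coeffs

    Truncating to the first few harmonics gives a compact, normalisable shape vector, which is what makes such descriptors suitable for parametrising gait motion templates.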