
    Prioritizing Content of Interest in Multimedia Data Compression

    Image and video compression techniques make data transmission and storage in digital multimedia systems more efficient and feasible within a system's limited storage and bandwidth. Many generic image and video compression techniques, such as JPEG and H.264/AVC, have been standardized and are now widely adopted. Despite their great success, we observe that these standard compression techniques are not the best solution for data compression in special types of multimedia systems, such as microscopy videos and low-power wireless broadcast systems. In these application-specific systems, where the content of interest in the multimedia data is known and well defined, we should rethink the design of the data compression pipeline. We hypothesize that by identifying and prioritizing the content of interest in multimedia data, new compression methods can be invented that are far more effective than standard techniques. In this dissertation, a set of new data compression methods based on the idea of prioritizing the content of interest is proposed for three different kinds of multimedia systems. I will show that the key to designing efficient compression techniques in these three cases is to prioritize the content of interest in the data. The definition of the content of interest depends on the application. First, I show that for microscopy videos, the content of interest is the set of spatial regions in the video frame whose pixels do not contain only noise. Keeping the data in those regions at high quality and discarding the rest yields a novel microscopy video compression technique. Second, I show that for a Bluetooth Low Energy beacon-based system, practical multimedia data storage and transmission is possible by prioritizing the content of interest. I designed custom image compression techniques that preserve the edges in a binary image, or the foreground regions of a color image of indoor or outdoor objects. Last, I present a new indoor Bluetooth Low Energy beacon-based augmented reality system that integrates a 3D moving object compression method that prioritizes the content of interest. Doctor of Philosoph
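    The idea of keeping only the regions that are not pure noise can be illustrated with a minimal sketch. It is not the dissertation's method: the block-variance test, the function names `interest_mask` and `suppress_background`, the block size, and the `noise_var` threshold are all assumptions chosen for illustration. Blocks whose variance stays below the assumed noise level are flattened to their mean so that a downstream encoder spends few bits on them.

```python
from statistics import pvariance


def interest_mask(frame, block=4, noise_var=25.0):
    """Return a grid of booleans, one per block.

    A block is marked True (content of interest) when its pixel variance
    exceeds noise_var. Both the variance test and the threshold value are
    illustrative assumptions, not taken from the source.
    """
    rows, cols = len(frame), len(frame[0])
    mask = []
    for r in range(0, rows, block):
        mask_row = []
        for c in range(0, cols, block):
            pixels = [frame[y][x]
                      for y in range(r, min(r + block, rows))
                      for x in range(c, min(c + block, cols))]
            mask_row.append(pvariance(pixels) > noise_var)
        mask.append(mask_row)
    return mask


def suppress_background(frame, mask, block=4):
    """Copy the frame, replacing every non-interest block by its mean value."""
    rows, cols = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    for bi, mask_row in enumerate(mask):
        for bj, keep in enumerate(mask_row):
            if keep:
                continue  # content of interest is left untouched
            ys = range(bi * block, min((bi + 1) * block, rows))
            xs = range(bj * block, min((bj + 1) * block, cols))
            values = [frame[y][x] for y in ys for x in xs]
            mean = round(sum(values) / len(values))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


# Hypothetical 4x8 frame: a low-variance left block and a high-contrast right block.
frame = [[10, 12, 10, 12, 0, 200, 0, 200] for _ in range(4)]
mask = interest_mask(frame)
flattened = suppress_background(frame, mask)
```

    In this toy frame only the right-hand block is kept at full detail; the left-hand block collapses to a single value.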

    Spatiotemporal Saliency Detection: State of Art

    Saliency detection has become a prominent research subject in recent years, and many techniques have been proposed for it. This paper surveys saliency detection techniques published from 2000 to 2015 and covers almost every technique from that period. Each method is explained briefly, including its advantages and disadvantages. The techniques are compared in a table that lists the authors, paper title, year, technique, algorithm, and challenges. Acceptance rates and accuracy levels are also compared.

    Application-aware video coding architecture using camera and object motion-models

    The proliferation of video consumption, especially on mobile devices, has created a demand for efficient interactive video applications and high-level video analysis. This is particularly significant in real-time applications and resource-limited scenarios. Pixel-domain video processing is often inefficient for many of these applications because of its complexity, whereas compressed-domain processing offers fast but unreliable results. To achieve fast and effective video processing, this paper proposes a novel video encoding architecture that facilitates efficient compressed-domain processing while maintaining compliance with mainstream coding standards. This is achieved by optimizing the accuracy of the motion information embedded in the compressed video, in addition to compression efficiency. In a motion detection application, we demonstrate that the motion estimated by the proposed encoder can be used directly to extract object information, unlike the motion in conventionally coded video. The incurred rate-distortion overheads can be weighed against the reduced processing required for video analysis targeting a wide spectrum of computer vision applications.
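    As a rough illustration of extracting object information directly from motion vectors, the sketch below flags blocks whose motion differs from the dominant (camera) motion. It is an assumption-laden toy, not the paper's encoder or motion model: the median-based camera estimate, the function name `moving_blocks`, and the `threshold` value are all hypothetical.

```python
from statistics import median


def moving_blocks(motion_vectors, threshold=2.0):
    """Flag blocks that move relative to the estimated camera motion.

    motion_vectors is a 2D grid of (dx, dy) pairs, one per block.
    Camera motion is approximated by the per-component median, which is an
    illustrative simplification rather than the source's motion model.
    """
    flat = [mv for row in motion_vectors for mv in row]
    cam_dx = median(dx for dx, _ in flat)
    cam_dy = median(dy for _, dy in flat)
    return [
        [((dx - cam_dx) ** 2 + (dy - cam_dy) ** 2) ** 0.5 > threshold
         for dx, dy in row]
        for row in motion_vectors
    ]


# Hypothetical 3x3 grid: a slow pan of (1, 0) with one independently moving block.
grid = [
    [(1, 0), (1, 0), (1, 0)],
    [(1, 0), (6, 4), (1, 0)],
    [(1, 0), (1, 0), (1, 0)],
]
object_mask = moving_blocks(grid)
```

    Only the center block is flagged here, because its residual motion after removing the pan exceeds the assumed threshold. A check like this is only as reliable as the motion vectors it reads, which is why the accuracy of the encoded motion matters.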

    Cross-layer Optimized Wireless Video Surveillance

    A wireless video surveillance system contains three major components: video capture and preprocessing, video compression and transmission over wireless sensor networks (WSNs), and video analysis at the receiving end. Coordinating these components is important for improving the end-to-end video quality, especially under communication resource constraints. Cross-layer control proves to be an efficient measure for optimal system configuration. In this dissertation, we address the problem of implementing cross-layer optimization in the wireless video surveillance system. The thesis work is based on three research projects. In the first project, a single PTU (pan-tilt-unit) camera is used for video object tracking. The problem studied is how to improve the quality of the received video by jointly considering the coding and transmission processes. The cross-layer controller determines the optimal coding and transmission parameters according to the dynamic channel condition and the transmission delay. Multiple error concealment strategies are developed that exploit the special properties of the PTU camera motion. In the second project, a binocular PTU camera is adopted for video object tracking. The presented work studies a fast disparity estimation algorithm and 3D video transcoding over the WSN for real-time applications. The disparity/depth information is estimated in a coarse-to-fine manner using both local and global methods. The transcoding is coordinated by the cross-layer controller based on the channel condition and the data rate constraint, in order to achieve the best view synthesis quality. The third project addresses multi-camera motion capture in remote healthcare monitoring. The challenge is the resource allocation for multiple video sequences. The presented cross-layer design incorporates delay-sensitive, content-aware video coding and transmission, together with adaptive video coding and transmission, to ensure optimal and balanced quality for the multi-view videos. In these projects, interdisciplinary study is conducted to synergize the surveillance system under the cross-layer optimization framework. Experimental results demonstrate the efficiency of the proposed schemes. The challenges of cross-layer design in existing wireless video surveillance systems are also analyzed to inform future work. Adviser: Song C

    Deep Learning-Based Low Complexity and High Efficiency Moving Object Detection Methods

    Moving object detection (MOD) is the process of extracting dynamic foreground content, such as moving vehicles or pedestrians, from video frames while discarding the nonmoving background. It plays an essential role in the field of computer vision. Traditional methods run into difficulties in complex scenarios, such as videos with illumination changes, shadows, night scenes, and dynamic backgrounds. Deep learning methods have been actively applied to moving object detection in recent years and have demonstrated impressive results. However, many existing models achieve superior detection accuracy at the cost of high computational complexity and slow inference speed. This has hindered the development of such models for mobile and embedded vision tasks, which need to be carried out in a timely fashion on a computationally limited platform. The current research uses separable convolution in both 2D and 3D CNNs, together with our proposed multi-input multi-output strategy and two-branch structure, to devise new deep network models that significantly improve inference speed while requiring a smaller model size and fewer floating-point operations than existing deep learning models, at competitive detection accuracy. This research devised three deep neural network models, addressing the following main problems in the area of moving object detection: 1. Improving detection accuracy by extracting both spatial and temporal information: To improve detection accuracy, the proposed models adopt 3D convolution, which is better suited than 2D convolution to extracting both spatial and temporal information from video data. We also place this 3D convolution in a two-branch network that extracts both high-level global features and low-level detailed features, which can further increase the accuracy. 2. Reducing model size and computational complexity by changing the network structure: The standard 2D and 3D convolutions are further decomposed into depthwise and pointwise convolutions. While existing 3D separable CNNs all addressed other problems, such as gesture recognition, force prediction, and 3D object classification or reconstruction, our work applied them to the moving object detection task for the first time in the literature. 3. Increasing inference speed by changing the input-output relationship: We proposed a multi-input multi-output (MIMO) strategy to increase inference speed, which takes multiple frames as the network input and outputs multiple frames of detection results. Embedding MIMO in a 3D separable CNN can further increase inference speed significantly while maintaining high detection accuracy. Compared to state-of-the-art approaches, our proposed methods significantly increase the inference speed and reduce the model size, while achieving the highest detection accuracy in the scene dependent evaluation (SDE) setup and maintaining competitive detection accuracy in the scene independent evaluation (SIE) setup. The SDE setup is widely used to tune and test the model on a specific video, as the training and test sets are from the same video. The SIE setup is designed to assess the generalization capability of the model on completely unseen videos.
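    The savings from the depthwise/pointwise decomposition and from the MIMO strategy can be made concrete with a little arithmetic. The sketch below counts weights (biases ignored) for a standard 3D convolution and for its depthwise-separable counterpart, and counts forward passes for a clip. The channel sizes, kernel size, and frame counts are hypothetical values chosen for illustration, not figures from the abstract.

```python
import math


def conv3d_params(c_in, c_out, kernel=(3, 3, 3)):
    """Weights in a standard 3D convolution (biases ignored)."""
    kt, kh, kw = kernel
    return kt * kh * kw * c_in * c_out


def separable_conv3d_params(c_in, c_out, kernel=(3, 3, 3)):
    """Weights in a depthwise 3D convolution followed by a 1x1x1 pointwise one."""
    kt, kh, kw = kernel
    depthwise = kt * kh * kw * c_in  # one spatial-temporal filter per input channel
    pointwise = c_in * c_out         # 1x1x1 convolution that mixes channels
    return depthwise + pointwise


def forward_passes(num_frames, frames_per_pass):
    """Network invocations needed when each pass handles frames_per_pass frames."""
    return math.ceil(num_frames / frames_per_pass)


# Hypothetical layer: 64 input channels, 128 output channels, 3x3x3 kernel.
standard = conv3d_params(64, 128)
separable = separable_conv3d_params(64, 128)

# Hypothetical clip of 100 frames: one frame per pass versus eight per pass.
single_output_passes = forward_passes(100, 1)
mimo_passes = forward_passes(100, 8)
```

    For this assumed layer the separable form needs 9,920 weights instead of 221,184, and producing eight detection maps per pass cuts the number of passes over the clip from 100 to 13. Actual speedups depend on the network and hardware and are not implied by these counts.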