70 research outputs found

    Surveillance System with Object-Aware Video Transcoder

    Cross-layer Optimized Wireless Video Surveillance

    A wireless video surveillance system contains three major components: video capture and preprocessing, video compression and transmission over wireless sensor networks (WSNs), and video analysis at the receiving end. Coordinating these components is important for improving end-to-end video quality, especially under communication resource constraints. Cross-layer control has proven to be an effective means of optimal system configuration. In this dissertation, we address the problem of implementing cross-layer optimization in a wireless video surveillance system. The thesis work is based on three research projects. In the first project, a single PTU (pan-tilt unit) camera is used for video object tracking. The problem studied is how to improve the quality of the received video by jointly considering the coding and transmission processes. The cross-layer controller determines the optimal coding and transmission parameters according to the dynamic channel condition and the transmission delay. Multiple error concealment strategies are developed that exploit the special properties of PTU camera motion. In the second project, a binocular PTU camera is adopted for video object tracking. The presented work studies a fast disparity estimation algorithm and 3D video transcoding over the WSN for real-time applications. The disparity/depth information is estimated in a coarse-to-fine manner using both local and global methods. The transcoding is coordinated by the cross-layer controller based on the channel condition and the data rate constraint, in order to achieve the best view synthesis quality. The third project addresses multi-camera motion capture in remote healthcare monitoring. The challenge is resource allocation for multiple video sequences.
The presented cross-layer design incorporates delay-sensitive, content-aware video coding and transmission, along with adaptive video coding and transmission, to ensure optimal and balanced quality for the multi-view videos. In these projects, interdisciplinary study is conducted to synergize the surveillance system under the cross-layer optimization framework. Experimental results demonstrate the efficiency of the proposed schemes. The challenges of cross-layer design in existing wireless video surveillance systems are also analyzed to inform future work. Adviser: Song C
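The abstract above describes a controller that jointly picks coding and transmission parameters from the channel condition and a delay budget. A minimal sketch of that idea, with invented toy models for distortion and delay (none of the formulas or parameter sets are from the dissertation), might look like this:

```python
# Hypothetical sketch of a cross-layer controller: it picks a coding
# bit rate and a retransmission limit that minimize expected distortion
# while keeping expected transmission delay under a budget. The
# distortion/delay models below are illustrative placeholders, not the
# dissertation's actual equations.

from itertools import product

def expected_distortion(bitrate_kbps, retx_limit, loss_rate):
    # Toy model: source distortion falls with bit rate; channel
    # distortion falls with the number of allowed retransmissions.
    source_d = 1000.0 / bitrate_kbps
    residual_loss = loss_rate ** (retx_limit + 1)
    channel_d = 50.0 * residual_loss
    return source_d + channel_d

def expected_delay_ms(bitrate_kbps, retx_limit, capacity_kbps, loss_rate):
    # Serialization delay grows with bit rate and average transmissions.
    avg_tx = (1 - loss_rate ** (retx_limit + 1)) / (1 - loss_rate)
    return 1000.0 * bitrate_kbps / capacity_kbps * avg_tx

def choose_config(capacity_kbps, loss_rate, delay_budget_ms):
    bitrates = [200, 400, 800, 1600]
    retx_limits = [0, 1, 2, 3]
    feasible = [
        (b, r) for b, r in product(bitrates, retx_limits)
        if expected_delay_ms(b, r, capacity_kbps, loss_rate) <= delay_budget_ms
    ]
    if not feasible:
        return min(bitrates), 0  # fall back to the most conservative setting
    return min(feasible, key=lambda c: expected_distortion(*c, loss_rate))
```

The key design point mirrored here is that the coding choice (bit rate) and the link-layer choice (retransmission limit) are optimized jointly rather than layer by layer.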

    The Virtual Device: Expanding Wireless Communication Services Through Service Discovery and Session Mobility

    We present a location-based, ubiquitous service architecture based on the Session Initiation Protocol (SIP) and a service discovery protocol. It enables users to enhance the multimedia communication services available on their mobile devices by discovering other local devices and including them in their active sessions, creating a 'virtual device.' We have implemented our concept in Columbia University's multimedia environment and demonstrate its feasibility through a performance analysis.

    Dynamic Rate Control for JPEG 2000 Transcoding

    Fusion and Perspective Correction of Multiple Networked Video Sensors

    A network of adaptive processing elements has been developed that transforms and fuses video captured from multiple sensors. Unlike systems that rely on end-systems to process data, this system distributes the computation throughout the network in order to reduce overall network bandwidth. The network architecture is scalable because it uses a hierarchy of processing engines to perform signal processing. Nodes within the network can be dynamically reprogrammed in order to compose video from multiple sources, digitally transform camera perspectives, and adapt the video format to meet the needs of specific applications. A prototype has been developed using reconfigurable hardware that collects and processes real-time, streaming video of an urban environment. Multiple video cameras gather data from different perspectives and fuse that data into a unified, top-down view. The hardware exploits both the spatial and temporal parallelism of the video streams and the regular processing when applying the transforms. Reconfigurable hardware allows the functions at nodes to be reprogrammed for dynamic changes in topology. Hardware-based video processors also consume less power than high-frequency software-based solutions. Performance and scalability are compared to a distributed software-based implementation. The reconfigurable hardware design is coded in VHDL and prototyped using Washington University's Field Programmable Port Extender (FPX) platform. The transform engine circuit utilizes approximately 34 percent of the resources of a Xilinx Virtex 2000E FPGA and can be clocked at frequencies up to 48 MHz. The composition engine circuit utilizes approximately 39 percent of the resources of a Xilinx Virtex 2000E FPGA and can be clocked at frequencies up to 45 MHz.
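The perspective correction described above is conventionally done by mapping each camera's pixels onto a common ground plane with a planar homography. The following is a small software sketch of that mapping (the abstract's system does this in VHDL on FPGAs; the function and matrices here are illustrative, not taken from the prototype):

```python
# Illustrative sketch: perspective correction maps each camera's pixel
# coordinates into a common top-down plane with a 3x3 homography, after
# which overlapping views can be fused. A real system would calibrate
# one homography per camera from ground-plane correspondences.

def apply_homography(H, x, y):
    """Map a pixel (x, y) through homography H (3x3 nested lists)."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w  = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / w, yh / w

# The identity homography leaves points unchanged.
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

Because the same multiply-accumulate pattern is applied to every pixel, this transform exposes exactly the regular, spatially parallel processing that the abstract says the reconfigurable hardware exploits.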

    Toward a General Parametric Model for Assessing the Impact of Video Transcoding on Objective Video Quality

    Video transcoding can degrade an original video. Currently, there is no general model that assesses the impact of video transcoding on video quality. Such a model could play a critical role in evaluating the quality of the transcoded video, and thereby optimizing delivery of video to end-users while meeting their expectations. The main contribution of this research is the development and substantiation of a general parametric model, called the Video Transcoding Objective-quality Model (VTOM), that provides an extensible video transcoding service selection mechanism, which takes into account both the format and characteristics of the original video and the desired output, i.e., viewing format with preferred quality of service. VTOM represents a mathematical function that uses a set of media-related parameters for the original video and desired output, including codec, bit rate, frame rate, and frame size, to predict the quality of the transcoded video generated by a specific transcoding. VTOM includes four quality sub-models, each describing the impact of one of these parameters on objective video quality, as well as a weighted-product aggregation function that combines these quality sub-models with four additional error sub-models in a single function for assessing overall video quality. I compared the predicted quality results generated by VTOM with quality values generated by an existing objective-quality metric. These comparisons yielded results that showed good correlations, with low error values. VTOM helps researchers and developers of video delivery systems and applications calculate, on the fly, the degradation that video transcoding can cause, rather than evaluating it with statistical methods that consider only the desired output.
Because VTOM takes into account the quality of the input video, i.e., the video format and characteristics, and the desired quality of the output video, it can be used for dynamic video transcoding service selection and composition. A number of quality metrics were examined and used in the development of VTOM and its assessment. However, this research discovered that, to date, there are no suitable metrics in the literature for comparing two videos with different frame rates. Therefore, this dissertation defines a new metric, called the Frame Rate Metric (FRM), as part of its contributions. FRM can use any frame-based quality metric for comparing frames from the two videos. Finally, this research presents and adapts four Quality of Service (QoS)-aware video transcoding service selection algorithms. The experimental results show that these four algorithms achieve good results in terms of time complexity, success ratio, and user satisfaction rate.
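The weighted-product aggregation the abstract describes can be sketched in a few lines. The sub-model, reference bit rate, and equal weights below are invented placeholders to show the shape of the computation, not VTOM's fitted functions:

```python
# Hypothetical sketch of VTOM-style aggregation: each parameter
# (codec, bit rate, frame rate, frame size) contributes a quality
# sub-model score in [0, 1], and the scores are combined with a
# weighted product. Sub-models and weights are illustrative only.

import math

def weighted_product(scores, weights):
    assert len(scores) == len(weights)
    return math.prod(s ** w for s, w in zip(scores, weights))

def q_bitrate(bitrate_kbps, ref_kbps=4000):
    # Toy sub-model: quality saturates at a reference bit rate.
    return min(1.0, bitrate_kbps / ref_kbps)

def vtom_score(codec_q, bitrate_kbps, framerate_q, size_q,
               weights=(0.25, 0.25, 0.25, 0.25)):
    scores = [codec_q, q_bitrate(bitrate_kbps), framerate_q, size_q]
    return weighted_product(scores, weights)
```

A property worth noting about the weighted-product form: if any single sub-model score approaches zero, the overall score collapses as well, so one badly chosen parameter cannot be averaged away by the others.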

    Secure JPEG Scrambling enabling Privacy in Photo Sharing

    With the popularization of online social networks (OSNs) and smart mobile devices, photo sharing is becoming a part of people's daily life. An unprecedented number of photos are being uploaded and shared every day through online social networks or photo hosting services, such as Facebook, Twitter, Instagram, and Flickr. However, such unrestrained online photo or multimedia sharing has raised serious privacy concerns, especially after reports of citizen surveillance by governmental agencies and scandalous leakages of private photos from prominent photo sharing sites or online cloud services. Popular OSNs typically offer privacy protection solutions only in response to public demand; these solutions are therefore often rudimentary, complex to use, and provide a limited degree of control and protection. Most solutions allow users to control either who can access the shared photos or for how long they can be accessed. In contrast, in this paper, we take a structured privacy-by-design approach to the problem of online photo privacy protection. We propose a privacy-preserving photo sharing architecture based on a secure JPEG scrambling algorithm capable of protecting the privacy of multiple users involved in a photo. We demonstrate the proposed photo sharing architecture with a prototype application called ProShare that offers JPEG scrambling as the privacy protection tool for selected regions in a photo, secure access to the protected images, and secure photo sharing on Facebook.
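To give a feel for keyed, reversible scrambling of a protected region, here is a conceptual sketch: pseudo-randomly flip the signs of the region's quantized DCT coefficients using a secret key. This illustrates only the general idea; the paper's actual secure JPEG scrambling algorithm differs in detail.

```python
# Conceptual sketch of keyed scrambling: a secret key seeds a PRNG
# whose stream decides which (quantized) DCT coefficients get their
# sign flipped. Applying the same key stream again inverts the
# operation, so only key holders can recover the region.

import random

def scramble(coeffs, key):
    rng = random.Random(key)  # deterministic stream for a given key
    return [c if rng.random() < 0.5 else -c for c in coeffs]
```

Because sign flipping is its own inverse, `scramble(scramble(block, key), key)` returns the original block; without the key, the flipped signs leave the region visually garbled while the file stays a decodable JPEG.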

    Receiver-Driven Video Adaptation

    In the span of a single generation, video technology has made an incredible impact on daily life. Modern use cases for video are wildly diverse, including teleconferencing, live streaming, virtual reality, home entertainment, social networking, surveillance, body cameras, cloud gaming, and autonomous driving. As these applications continue to grow more sophisticated and heterogeneous, a single representation of video data can no longer satisfy all receivers. Instead, the initial encoding must be adapted to each receiver's unique needs. Existing adaptation strategies are fundamentally flawed, however, because they discard the video's initial representation and force the content to be re-encoded from scratch. This process is computationally expensive, does not scale well with the number of videos produced, and throws away important information embedded in the initial encoding. Therefore, a compelling need exists for the development of new strategies that can adapt video content without fully re-encoding it. To better support the unique needs of smart receivers, diverse displays, and advanced applications, general-use video systems should produce and offer receivers a more flexible compressed representation that supports top-down adaptation strategies from an original, compressed-domain ground truth. This dissertation proposes an alternate model for video adaptation that addresses these challenges. The key idea is to treat the initial compressed representation of a video as the ground truth, and allow receivers to drive adaptation by dynamically selecting which subsets of the captured data to receive. In support of this model, three strategies for top-down, receiver-driven adaptation are proposed. First, a novel, content-agnostic entropy coding technique is implemented in which symbols are selectively dropped from an input abstract symbol stream based on their estimated probability distributions to hit a target bit rate. 
Receivers are able to guide the symbol dropping process by supplying the encoder with a rate controller algorithm that fits their application needs and available bandwidth. Next, a domain-specific adaptation strategy is implemented for H.265/HEVC coded video in which the prediction data from the original source is reused directly in the adapted stream, but the residual data is recomputed as directed by the receiver. By tracking the changes made to the residual, the encoder can compensate for decoder drift to achieve near-optimal rate-distortion performance. Finally, a fully receiver-driven strategy is proposed in which the syntax elements of a pre-coded video are cataloged and exposed directly to clients through an HTTP API. Instead of requesting the entire stream at once, clients identify the exact syntax elements they wish to receive using a carefully designed query language. Although an implementation of this concept is not provided, an initial analysis shows that such a system could save bandwidth and computation when used by certain targeted applications. Doctor of Philosophy
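The symbol-dropping idea in the first strategy can be sketched simply: under entropy coding, a symbol with estimated probability p costs roughly -log2(p) bits, so a rate controller can rank symbols by cost and drop some until the stream fits a budget. The function and the drop-cheapest-first policy below are illustrative, not the dissertation's encoder:

```python
# Hedged sketch of content-agnostic symbol dropping: estimate each
# symbol's entropy-coded cost as -log2(p) and drop the cheapest (most
# probable) symbols first until the total fits the bit budget. A
# receiver-supplied rate controller could rank symbols differently.

import math

def drop_to_budget(symbols, probs, budget_bits):
    """symbols[i] has estimated probability probs[i]; return kept symbols."""
    costs = [-math.log2(p) for p in probs]
    order = sorted(range(len(symbols)), key=lambda i: costs[i])
    total = sum(costs)
    dropped = set()
    for i in order:
        if total <= budget_bits:
            break
        dropped.add(i)
        total -= costs[i]
    return [s for j, s in enumerate(symbols) if j not in dropped]
```

Dropping high-probability symbols first sacrifices the least information per bit saved, which matches the receiver-driven premise that adaptation should select subsets of the already-coded data rather than re-encode it.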