519 research outputs found
Cross-layer Optimized Wireless Video Surveillance
A wireless video surveillance system contains three major components: video capture and preprocessing, video compression and transmission over wireless sensor networks (WSNs), and video analysis at the receiving end. Coordinating these components is important for improving end-to-end video quality, especially under communication resource constraints. Cross-layer control proves to be an efficient means of achieving an optimal system configuration. In this dissertation, we address the problem of implementing cross-layer optimization in a wireless video surveillance system.
The thesis work is based on three research projects. In the first project, a single PTU (pan-tilt-unit) camera is used for video object tracking. The problem studied is how to improve the quality of the received video by jointly considering the coding and transmission process. The cross-layer controller determines the optimal coding and transmission parameters, according to the dynamic channel condition and the transmission delay. Multiple error concealment strategies are developed utilizing the special property of the PTU camera motion.
In the second project, the binocular PTU camera is adopted for video object tracking. The presented work studied the fast disparity estimation algorithm and the 3D video transcoding over the WSN for real-time applications. The disparity/depth information is estimated in a coarse-to-fine manner using both local and global methods. The transcoding is coordinated by the cross-layer controller based on the channel condition and the data rate constraint, in order to achieve the best view synthesis quality.
The third project addresses multi-camera motion capture for remote healthcare monitoring. The challenge is resource allocation across multiple video sequences. The presented cross-layer design combines delay-sensitive, content-aware video coding and transmission with adaptive video coding and transmission to ensure optimal and balanced quality across the multi-view videos.
In these projects, an interdisciplinary study is conducted to integrate the surveillance system under the cross-layer optimization framework. Experimental results demonstrate the efficiency of the proposed schemes. The challenges of cross-layer design in existing wireless video surveillance systems are also analyzed to inform future work.
Adviser: Song C
Adaptive intra refresh for robust wireless multi-view video
This thesis was submitted for the award of PhD and was awarded by Brunel University London. Mobile wireless communication technology is a fast-developing field, and new mobile communication techniques and means become available every day. In this thesis, multi-view video (MVV) is also referred to as 3D video. The transmission of 3D video signals over wireless links is shaping both the telecommunication industry and academia. However, wireless channels are prone to high levels of bit and burst errors that largely deteriorate the quality of service (QoS). Noise along the wireless transmission path can introduce distortion or cause a compressed bitstream to lose vital information. Because of prediction, errors caused by noise spread progressively to subsequent frames and across multiple views. Such an error may compel the receiver to pause momentarily and wait for the subsequent INTRA picture before it can continue decoding. Pausing the video stream affects the user's Quality of Experience (QoE). An error resilience strategy is therefore needed to protect the compressed bitstream against transmission errors. This thesis focuses on the Adaptive Intra Refresh (AIR) error resilience technique. The AIR method is developed to make the compressed 3D video more robust to channel errors. The process involves periodic injection of Intra-coded macroblocks in a cyclic pattern using the H.264/AVC standard. The algorithm takes into account the individual features of each macroblock and the feedback information sent by the decoder about the channel condition in order to generate an MVV-AIR map. MVV-AIR map generation regulates the order of packet arrival and identifies the motion activity in each macroblock. Based on the level of motion activity contained in each macroblock, the MVV-AIR map classifies macroblocks as high or low motion. A proxy MVV-AIR transcoder is used to validate the efficiency of the generated MVV-AIR map.
The MVV-AIR transcoding algorithm uses a spatial and view downscaling scheme to convert from MVV to a single view. Various experimental results indicate that the proposed error-resilient MVV-AIR transcoder technique effectively improves the quality of reconstructed 3D video in wireless networks. A comparison of the MVV-AIR transcoder algorithm with some traditional error resilience techniques demonstrates that the MVV-AIR algorithm performs better in an error-prone channel. Simulation results revealed significant improvements in both objective and subjective quality. The scheme adds no computational complexity, while the QoS and QoE requirements are still fully met. Tertiary Institution Trust Fund (TETFund) of Nigeri
Intra-Refresh Provision for WiMAX Data-Partitioned Video Streaming
Mobile, broadband wireless access is increasingly being used for video streaming. This paper is a study of the impact of intra-refresh provision upon a robust video streaming scheme intended for WiMAX. The paper demonstrates the use of intra-refresh macroblocks within inter-coded video frames as an alternative to periodic intra-refresh video frames. In fact, the proposed scheme combines intra-refresh macroblocks with data-partitioned video compression, both error resilience tools from the H.264 video codec. Redundant video packets along with adaptive channel coding are also used to protect video streams. In harsh wireless channel conditions, it is found that all the proposed measures are necessary. This is because error bursts, arising from both slow and fast fading, as well as other channel impairments, are possible. The main conclusions from a detailed analysis are that: because of the effect on packet size it is important to select a moderate quantization parameter; and because of the higher overhead from cyclic intra macroblock line update it is better to select a low percentage per frame of intra-refresh macroblocks. The proposed video streaming scheme will be applicable to other 4G wireless technologies such as LTE
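The cyclic intra macroblock line update mentioned above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `intra_refresh_rows` and `refresh_period`, and the choice of refreshing whole macroblock rows, are assumptions made for illustration. The sketch only shows how a fixed number of rows per frame is forced to intra mode in rotation, so that a lower refresh percentage per frame means a longer period before the whole picture has been refreshed.

```python
def intra_refresh_rows(frame_index, mb_rows, rows_per_frame=1):
    """Return the macroblock rows forced to intra mode in the given frame.

    Hypothetical helper: rows are refreshed in a fixed cyclic order,
    wrapping around to row 0 after the last row.
    """
    if mb_rows <= 0 or rows_per_frame <= 0:
        raise ValueError("mb_rows and rows_per_frame must be positive")
    start = (frame_index * rows_per_frame) % mb_rows
    return [(start + k) % mb_rows for k in range(min(rows_per_frame, mb_rows))]


def refresh_period(mb_rows, rows_per_frame=1):
    """Number of frames needed before every row has been refreshed once."""
    return -(-mb_rows // rows_per_frame)  # ceiling division


if __name__ == "__main__":
    # A QCIF picture (176x144) has 9 rows of 16x16 macroblocks.
    for frame in range(refresh_period(9, 2)):
        print(frame, intra_refresh_rows(frame, 9, 2))
```

Under these assumptions, refreshing one row per frame of a 9-row picture corresponds to roughly 11% intra macroblocks per frame, and two rows per frame to roughly 22%, which is the kind of per-frame percentage the abstract recommends keeping low.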
WATCHING PEOPLE: ALGORITHMS TO STUDY HUMAN MOTION AND ACTIVITIES
Human motion analysis is currently one of the most active research topics in Computer Vision, and it is receiving increasing attention from both the industrial and scientific communities.
The growing interest in human motion analysis is motivated by the increasing number of promising applications, ranging from surveillance, human–computer interaction, virtual reality to healthcare, sports, computer games and video conferencing, just to name a few.
The aim of this thesis is to give an overview of the various tasks involved in visual motion analysis of the human body and to present the issues and possible solutions related to it.
In this thesis, visual motion analysis is categorized into three major areas related to the interpretation of human motion: tracking of human motion using a virtual pan-tilt-zoom (vPTZ) camera, recognition of human motions, and segmentation of human behaviors.
In the field of human motion tracking, a virtual environment for PTZ cameras (vPTZ) is presented to overcome the mechanical limitations of PTZ cameras. The vPTZ is built on equirectangular images acquired by 360° cameras, and it allows not only the development of pedestrian tracking algorithms but also the comparison of their performance. On the basis of this virtual environment, three novel pedestrian tracking algorithms for 360° cameras were developed, two of which adopt a tracking-by-detection approach while the third adopts a Bayesian approach.
The action recognition problem is addressed by an algorithm that represents actions in terms of multinomial distributions of frequent sequential patterns of different length. Frequent sequential patterns are series of data descriptors that occur many times in the data. The proposed method learns a codebook of frequent sequential patterns by means of an apriori-like algorithm. An action is then represented with a Bag-of-Frequent-Sequential-Patterns approach.
In the last part of this thesis a methodology to semi-automatically annotate behavioral data given a small set of manually annotated data is presented. The resulting methodology is not only effective in the semi-automated annotation task but can also be used in presence of abnormal behaviors, as demonstrated empirically by testing the system on data collected from children affected by neuro-developmental disorders
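The action representation described above (an apriori-like codebook of frequent sequential patterns, followed by a Bag-of-Frequent-Sequential-Patterns histogram) can be sketched roughly as follows. This is a simplified illustration and not the thesis's algorithm: it assumes the descriptors have already been quantized into discrete symbols, it only mines contiguous patterns, and the names `mine_patterns` and `encode` are hypothetical.

```python
def count_occurrences(seq, pattern):
    """Count the positions where `pattern` occurs contiguously in `seq`."""
    n = len(pattern)
    return sum(tuple(seq[i:i + n]) == pattern for i in range(len(seq) - n + 1))


def mine_patterns(sequences, min_support, max_len):
    """Apriori-like mining of contiguous patterns (assumed simplification).

    A pattern is kept when it occurs in at least `min_support` sequences.
    Candidates of length k+1 are built only from frequent patterns of length k.
    """
    codebook = []
    level = [(s,) for s in sorted({s for seq in sequences for s in seq})]
    for _ in range(max_len):
        kept = [p for p in level
                if sum(count_occurrences(seq, p) > 0 for seq in sequences) >= min_support]
        if not kept:
            break
        codebook.extend(kept)
        # Join step: p and q overlap on all but one symbol.
        level = sorted({p + (q[-1],) for p in kept for q in kept if p[1:] == q[:-1]})
    return codebook


def encode(seq, codebook):
    """Represent a sequence as a normalized histogram over the codebook."""
    counts = [count_occurrences(seq, p) for p in codebook]
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


if __name__ == "__main__":
    training = ["abcab", "abd", "cab"]  # toy symbol sequences
    codebook = mine_patterns(training, min_support=2, max_len=3)
    print(codebook)
    print(encode("cab", codebook))
```

The normalized histogram plays the role of the multinomial distribution over patterns of different lengths; a classifier of any kind could then be trained on these vectors.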
Automatic Vehicle Detection, Tracking and Recognition of License Plate in Real Time Videos
Automatic video analysis from traffic surveillance cameras is a fast-emerging field based on computer vision techniques. It is a key technology for public safety, intelligent transport systems (ITS) and efficient traffic management. In recent years, the scope for automatic analysis of traffic activity has increased. We define video analytics as computer-vision-based surveillance algorithms and systems that extract contextual information from video. In traffic scenarios, several monitoring objectives can be supported by computer vision and pattern recognition techniques, including the detection of traffic violations (e.g., illegal turns and one-way streets) and the identification of road users (e.g., vehicles, motorbikes, and pedestrians). Currently, the most reliable approach is the recognition of number plates, i.e., automatic number plate recognition (ANPR), also known as automatic license plate recognition (ALPR), or the use of radio frequency transponders. Here, a full-featured automatic system for vehicle detection, tracking and license plate recognition is presented. This system has many applications in pattern recognition and machine vision, ranging from complex security systems to common areas and from parking admission to urban traffic control. The task is complex because of diverse effects such as fog, rain, shadows, uneven illumination, occlusion, variable distances, car velocity, the scene's angle in the frame, plate rotation, the number of vehicles in the scene and others. The main objective of this work is to show a system that solves the practical problem of car identification in real scenes. All steps of the process, from video acquisition to optical character recognition, are considered in order to achieve automatic identification of plates
Application-specific protocol architectures for wireless networks
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2000. Includes bibliographical references (p. 145-154). In recent years, advances in energy-efficient design and wireless technologies have enabled exciting new applications for wireless devices. These applications span a wide range, including real-time and streaming video and audio delivery, remote monitoring using networked microsensors, personal medical monitoring, and home networking of everyday appliances. While these applications require high performance from the network, they suffer from resource constraints that do not appear in more traditional wired computing environments. In particular, wireless spectrum is scarce, often limiting the bandwidth available to applications and making the channel error-prone, and the nodes are battery-operated, often limiting available energy. My thesis is that this harsh environment with severe resource constraints requires an application-specific protocol architecture, rather than the traditional layered approach, to obtain the best possible performance. This dissertation supports this claim using detailed case studies on microsensor networks and wireless video delivery. The first study develops LEACH (Low-Energy Adaptive Clustering Hierarchy), an architecture for remote microsensor networks that combines the ideas of energy-efficient cluster-based routing and media access together with application-specific data aggregation to achieve good performance in terms of system lifetime, latency, and application-perceived quality. This approach improves system lifetime by an order of magnitude compared to general-purpose approaches when the node energy is limited. The second study develops an unequal error protection scheme for MPEG-4 compressed video delivery that adapts the level of protection applied to portions of a packet to the degree of importance of the corresponding bits.
This approach obtains better application-perceived performance than current approaches for the same amount of transmission bandwidth. These two systems show that application-specific protocol architectures achieve the energy and latency efficiency and error robustness needed for wireless networks. by Wendi Beth Heinzelman. Ph.D
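The abstract does not say how LEACH chooses cluster heads. As an illustration only, the following Python sketch uses the randomized rotation threshold commonly cited for LEACH, T = P / (1 - P (r mod 1/P)), where P is the desired fraction of cluster heads and r is the round index. The function names, the toy node count, and the assumption that 1/P is an integer are choices made for this sketch and do not come from the dissertation.

```python
import random


def leach_threshold(p, round_index):
    """Election threshold for a node that has not yet served in this cycle.

    Assumes 1/p is (close to) an integer, so a cycle lasts round(1/p) rounds.
    """
    period = round(1 / p)
    return p / (1 - p * (round_index % period))


def simulate_cycle(num_nodes, p, seed=0):
    """Run one full cycle and return the set of cluster heads for each round."""
    rng = random.Random(seed)
    eligible = set(range(num_nodes))  # nodes that have not been head yet
    heads_per_round = []
    for r in range(round(1 / p)):
        t = leach_threshold(p, r)
        heads = {n for n in eligible if rng.random() < t}
        eligible -= heads  # a node serves at most once per cycle
        heads_per_round.append(heads)
    return heads_per_round


if __name__ == "__main__":
    for r, heads in enumerate(simulate_cycle(num_nodes=20, p=0.25)):
        print(r, sorted(heads))
```

Because the threshold reaches 1 in the last round of a cycle, every node serves as cluster head exactly once per cycle in this sketch, which spreads the energy cost of the head role evenly across the nodes.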
3D multiple description coding for error resilience over wireless networks
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Mobile communications has attracted growing interest from customers and service providers alike over the last one to two decades. Visual information is used in many application domains such as remote health care, video on demand, broadcasting and video surveillance. To enhance the visual effect of digital video content, depth perception needs to be provided along with the actual visual content. 3D video has attracted significant interest from the research community in recent years, owing to the tremendous impact it has on viewers and its enhancement of the user's quality of experience (QoE). In the near future, 3D video is likely to be used in most video applications, as it offers a greater sense of immersion and perceptual experience. When 3D video is compressed and transmitted over error-prone channels, the associated packet loss leads to visual quality degradation. When a picture is lost or corrupted so severely that the concealment result is not acceptable, the receiver typically pauses video playback and waits for the next INTRA picture to resume decoding. Error propagation caused by predictive coding may degrade the video quality severely. There are several ways to mitigate the effects of such transmission errors. One technique widely used in international video coding standards is error resilience.
The motivation behind this research work is that existing schemes for 2D colour video compression such as MPEG, JPEG and H.263 cannot be applied to 3D video content. 3D video signals contain depth as well as colour information and are bandwidth demanding, as they require the transmission of multiple high-bandwidth 3D video streams. On the other hand, the capacity of wireless channels is limited, and wireless links are prone to various types of errors caused by noise, interference, fading, handoff, error bursts and network congestion. Given the maximum bit-rate budget available to represent the 3D scene, the bit rate should be allocated optimally between texture and depth information so that rendering distortion/losses are minimised. To mitigate the effect of these errors on the perceptual 3D video quality, error-resilient video coding needs to be investigated further to offer a better quality of experience (QoE) to end users.
This research work aims at enhancing the error resilience capability of compressed 3D video, when transmitted over mobile channels, using Multiple Description Coding (MDC) in order to improve the user's quality of experience (QoE).
Furthermore, this thesis examines the sensitivity of the human visual system (HVS) when viewing 3D video scenes. The approach used in this study is subjective testing, which rates people's perception of 3D video under error-free and error-prone conditions through a carefully designed bespoke questionnaire. Petroleum Technology Development Fund (PTDF
Adaptive video delivery using semantics
The diffusion of network appliances such as cellular phones, personal digital assistants and hand-held computers has created the need to personalize the way media content is delivered to the end user. Moreover, recent devices, such as digital radio receivers with graphics displays, and new applications, such as intelligent visual surveillance, require novel forms of video analysis for content adaptation and summarization. To cope with these challenges, we propose an automatic method for the extraction of semantics from video, and we present a framework that exploits these semantics in order to provide adaptive video delivery. First, an algorithm that relies on motion information to extract multiple semantic video objects is proposed. The algorithm operates in two stages. In the first stage, a statistical change detector produces the segmentation of moving objects from the background. This process is robust with regard to camera noise and does not need manual tuning along a sequence or for different sequences. In the second stage, feedbacks between an object partition and a region partition are used to track individual objects along the frames. These interactions allow us to cope with multiple, deformable objects, occlusions, splitting, appearance and disappearance of objects, and complex motion. Subsequently, semantics are used to prioritize visual data in order to improve the performance of adaptive video delivery. The idea behind this approach is to organize the content so that a particular network or device does not inhibit the main content message. Specifically, we propose two new video adaptation strategies. The first strategy combines semantic analysis with a traditional frame-based video encoder. Background simplifications resulting from this approach do not penalize overall quality at low bitrates. The second strategy uses metadata to efficiently encode the main content message. 
The metadata-based representation of an object's shape and motion suffices to convey the meaning and action of a scene when the objects are familiar. The impact of different video adaptation strategies is then quantified with subjective experiments. We ask a panel of human observers to rate the quality of adapted video sequences on a normalized scale. From these results, we further derive an objective quality metric, the semantic peak signal-to-noise ratio (SPSNR), that accounts for different image areas and for their relevance to the observer in order to reflect the focus of attention of the human visual system. Finally, we determine the adaptation strategy that provides maximum value for the end user by maximizing the SPSNR for given client resources at the time of delivery. By combining semantic video analysis and adaptive delivery, the solution presented in this dissertation permits the distribution of video in complex media environments and supports a large variety of content-based applications
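The abstract does not give the formula for the SPSNR. One plausible reading, sketched below in Python purely as an assumption, is a PSNR computed from a weighted mean squared error in which each semantic class of pixels (for example, objects versus background) contributes according to a relevance weight. The function name `semantic_psnr`, the class labels and the weights are all hypothetical.

```python
import math


def semantic_psnr(reference, distorted, labels, weights, peak=255.0):
    """PSNR from a relevance-weighted MSE over semantic classes (assumed form).

    reference, distorted: 2D lists of pixel intensities of equal size.
    labels: 2D list giving the semantic class of each pixel.
    weights: dict mapping each class to a non-negative relevance weight.
    """
    sq_err, count = {}, {}
    for ref_row, dis_row, lab_row in zip(reference, distorted, labels):
        for r, d, lab in zip(ref_row, dis_row, lab_row):
            sq_err[lab] = sq_err.get(lab, 0.0) + (r - d) ** 2
            count[lab] = count.get(lab, 0) + 1
    total_weight = sum(weights[lab] for lab in count)
    # Per-class MSE, combined according to the relevance weights.
    wmse = sum(weights[lab] * sq_err[lab] / count[lab] for lab in count) / total_weight
    return math.inf if wmse == 0 else 10 * math.log10(peak ** 2 / wmse)


if __name__ == "__main__":
    ref = [[100, 100], [100, 100]]
    labels = [["object", "background"], ["object", "background"]]
    weights = {"object": 0.8, "background": 0.2}
    background_damaged = [[100, 110], [100, 110]]
    object_damaged = [[110, 100], [110, 100]]
    print(semantic_psnr(ref, background_damaged, labels, weights))
    print(semantic_psnr(ref, object_damaged, labels, weights))
```

With these weights, the same amount of error scores lower when it falls on the object than when it falls on the background, which matches the stated aim of reflecting the observer's focus of attention. With equal weights and equal class sizes the value reduces to the ordinary PSNR.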