507 research outputs found
Design Of Computer Vision Systems For Optimizing The Threat Detection Accuracy
This dissertation considers computer vision (CV) systems in which a central monitoring station receives and analyzes the video streams captured and delivered wirelessly by multiple cameras. It addresses how the bandwidth can be allocated to various cameras by presenting a cross-layer solution that optimizes the overall detection or recognition accuracy. The dissertation presents and develops a real CV system and subsequently provides a detailed experimental analysis of cross-layer optimization. Other unique features of the developed solution include employing the popular HTTP streaming approach, utilizing homogeneous cameras as well as heterogeneous ones with varying capabilities and limitations, and including a new algorithm for estimating the effective medium airtime. The results show that the proposed solution significantly improves the CV accuracy.
Additionally, the dissertation features an improved neural network system for object detection. The proposed system considers inherent video characteristics and employs different motion detection and clustering algorithms to focus on the areas of importance in consecutive frames, allowing the system to dynamically and efficiently distribute the detection task among multiple deployments of object detection neural networks. Our experimental results indicate that our proposed method can enhance the mAP (mean average precision), execution time, and required data transmissions to object detection networks.
Finally, as recognizing an activity provides significant automation prospects in CV systems, the dissertation presents an efficient activity-detection recurrent neural network that utilizes fast pose/limbs estimation approaches. By combining object detection with pose estimation, the domain of activity detection is shifted from a volume of RGB (Red, Green, and Blue) pixel values to a time-series of relatively small one-dimensional arrays, thereby allowing the activity detection system to take advantage of highly capable neural networks that have been trained on large GPU clusters for thousands of hours. Consequently, capable activity detection systems with considerably fewer training sets and processing hours can be built
Robust Subspace Learning: Robust PCA, Robust Subspace Tracking, and Robust Subspace Recovery
PCA is one of the most widely used dimension reduction techniques. A related
easier problem is "subspace learning" or "subspace estimation". Given
relatively clean data, both are easily solved via singular value decomposition
(SVD). The problem of subspace learning or PCA in the presence of outliers is
called robust subspace learning or robust PCA (RPCA). For long data sequences,
if one tries to use a single lower dimensional subspace to represent the data,
the required subspace dimension may end up being quite large. For such data, a
better model is to assume that it lies in a low-dimensional subspace that can
change over time, albeit gradually. The problem of tracking such data (and the
subspaces) while being robust to outliers is called robust subspace tracking
(RST). This article provides a magazine-style overview of the entire field of
robust subspace learning and tracking. In particular solutions for three
problems are discussed in detail: RPCA via sparse+low-rank matrix decomposition
(S+LR), RST via S+LR, and "robust subspace recovery (RSR)". RSR assumes that an
entire data vector is either an outlier or an inlier. The S+LR formulation
instead assumes that outliers occur on only a few data vector indices and hence
are well modeled as sparse corruptions.Comment: To appear, IEEE Signal Processing Magazine, July 201
Scale-Adaptive Video Understanding.
The recent rise of large-scale, diverse video data has urged a new era of high-level video understanding. It is increasingly critical for intelligent systems to extract semantics from videos. In this dissertation, we explore the use of supervoxel hierarchies as a type of video representation for high-level video understanding. The supervoxel hierarchies contain rich multiscale decompositions of video content, where various structures can be found at various levels. However, no single level of scale contains all the desired structures we need. It is essential to adaptively choose the scales for subsequent video analysis. Thus, we present a set of tools to manipulate scales in supervoxel hierarchies including both scale generation and scale selection methods.
In our scale generation work, we evaluate a set of seven supervoxel methods in the context of what we consider to be a good supervoxel for video representation. We address a key limitation that has traditionally prevented supervoxel scale generation on long videos. We do so by proposing an approximation framework for streaming hierarchical scale generation that is able to generate multiscale decompositions for arbitrarily-long videos using constant memory.
Subsequently, we present two scale selection methods that are able to adaptively choose the scales according to application needs. The first method flattens the entire supervoxel hierarchy into a single segmentation that overcomes the limitation induced by trivial selection of a single scale. We show that the selection can be driven by various post hoc feature criteria. The second scale selection method combines the supervoxel hierarchy with a conditional random field for the task of labeling actors and actions in videos. We formulate the scale selection problem and the video labeling problem in a joint framework. Experiments on a novel large-scale video dataset demonstrate the effectiveness of the explicit consideration of scale selection in video understanding.
Aside from the computational methods, we present a visual psychophysical study to quantify how well the actor and action semantics in high-level video understanding are retained in supervoxel hierarchies. The ultimate findings suggest that some semantics are well-retained in the supervoxel hierarchies and can be used for further video analysis.PhDComputer Science and EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/133202/1/cliangxu_1.pd
- …