
    Crowd Counting in Low-Resolution Crowded Scenes Using Region-Based Deep Convolutional Neural Networks

    © 2013 IEEE. Crowd counting and density estimation is an important and challenging problem in the visual analysis of crowds. Most existing approaches use regression on density maps to obtain the crowd count from a single image. However, these methods cannot localize individual pedestrians and therefore cannot estimate the actual distribution of pedestrians in the environment. Detection-based methods, on the other hand, detect and localize pedestrians in the scene, but their performance degrades in high-density situations. To overcome the limitations of pedestrian detectors, we propose a motion-guided filter (MGF) that exploits spatial and temporal information between consecutive video frames to recover missed detections. Our framework is based on a deep convolutional neural network (DCNN) for crowd counting in low-to-medium density videos. We employ several state-of-the-art network architectures, namely Visual Geometry Group (VGG16), Zeiler and Fergus (ZF), and VGGM, within a region-based DCNN for detecting pedestrians. After pedestrian detection, the proposed motion-guided filter is applied. We evaluate the performance of our approach on three publicly available datasets. The experimental results demonstrate the effectiveness of our approach, which significantly improves the performance of the state-of-the-art detectors.

    AN ADAPTIVE MULTIPLE-OBJECT TRACKING ARCHITECTURE FOR LONG-DURATION VIDEOS WITH VARIABLE TARGET DENSITY

    Multiple-Object Tracking (MOT) methods are used to detect targets in individual video frames, e.g., vehicles, people, and other objects, and then record each unique target’s path over time. Current state-of-the-art approaches are extremely complex because most rely on extracting and comparing visual features at every frame to track each object. These approaches are geared toward high-difficulty-tracking scenarios, e.g., crowded airports, and require expensive dedicated hardware, e.g., Graphics Processing Units. In hardware-constrained applications, researchers are turning to older, less complex MOT methods, which reveals a serious scalability issue within the state-of-the-art. Crowded environments are a niche application for MOT, i.e., there are far more residential areas than there are airports. Given complex approaches are not required for low-difficulty-tracking scenarios, i.e., video showing mainly isolated targets, there is an opportunity to utilize more efficient MOT methods for these environments. Nevertheless, little recent research has focused on developing more efficient MOT methods. This thesis describes a novel MOT method, ClusterTracker, that is built to handle variable-difficulty-tracking environments an order of magnitude faster than the state-of-the-art. It achieves this by avoiding visual features and using quadratic-complexity algorithms instead of the cubic-complexity algorithms found in other trackers. ClusterTracker performs spatial clustering on object detections from short frame sequences, treats clusters as tracklets, and then connects successive tracklets with high bounding-box overlap to form tracks. With recorded video, parallel processing can be applied to several steps of ClusterTracker. This thesis evaluates ClusterTracker’s baseline performance on several benchmark datasets, describes its intended operating environments, and identifies its weaknesses. 
Subsequent modifications patch these weaknesses while also addressing the scalability concerns of more complex MOT methods. The modified architecture uses clustering feedback to separate isolated targets from non-isolated targets and re-processes the latter with a more complex MOT method. Results show that ClusterTracker is uniquely suited to such an approach and allows complex MOT methods to be applied to the challenging tracking situations for which they are intended.
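
The abstract above describes connecting successive tracklets with high bounding-box overlap to form tracks. A minimal sketch of that linking step follows; it is not the thesis's implementation. It assumes the spatial clustering has already produced tracklets for each short frame sequence, and the names `iou` and `link_tracklets`, the box layout `(x1, y1, x2, y2)`, the greedy matching, and the `min_iou` threshold of 0.5 are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Set, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_tracklets(
    windows: List[List[List[Box]]], min_iou: float = 0.5
) -> List[List[Box]]:
    """Greedily join tracklets from successive frame windows into tracks.

    windows[w] holds the tracklets found in the w-th short frame sequence;
    each tracklet is a list of boxes in frame order. A tracklet is appended
    to the track from the previous window whose last box overlaps the
    tracklet's first box the most, provided the overlap reaches min_iou
    (hypothetical threshold). Otherwise it starts a new track.
    """
    tracks: List[List[Box]] = []
    active: List[List[Box]] = []  # tracks that were extended in the previous window
    for tracklets in windows:
        next_active: List[List[Box]] = []
        used: Set[int] = set()  # indices of active tracks already matched
        for tracklet in tracklets:
            best: Optional[int] = None
            best_iou = min_iou
            for i, track in enumerate(active):
                if i in used:
                    continue
                score = iou(track[-1], tracklet[0])
                if score >= best_iou:
                    best, best_iou = i, score
            if best is None:
                track = list(tracklet)  # no sufficient overlap: new track
                tracks.append(track)
            else:
                used.add(best)
                track = active[best]
                track.extend(tracklet)  # same list object as stored in tracks
            next_active.append(track)
        active = next_active
    return tracks


if __name__ == "__main__":
    windows = [
        [[(0, 0, 10, 10), (1, 0, 11, 10)], [(100, 100, 110, 110)]],
        [[(2, 0, 12, 10)], [(50, 50, 60, 60)]],
    ]
    for track in link_tracklets(windows):
        print(len(track), track)
```

The pairwise comparison between active tracks and new tracklets is quadratic in the number of targets per window, which is in the spirit of the quadratic-complexity claim in the abstract, though the thesis may organize the computation differently.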

    Understanding Vehicular Traffic Behavior from Video: A Survey of Unsupervised Approaches

    Recent emerging trends in automatic behavior analysis and understanding from infrastructure video are reviewed. Research has shifted away from high-resolution estimation of vehicle state and instead pushed machine learning approaches that extract meaningful patterns from aggregates in an unsupervised fashion. These patterns represent priors on observable motion, which can be used to describe a scene, to answer behavior questions such as where a vehicle is going or how many vehicles are performing the same action, and to detect abnormal events. The review focuses on two main methods for scene description: trajectory clustering and topic modeling. Example applications that use these behavioral modeling techniques are also presented, along with the most popular public datasets for behavioral analysis. Discussion and comments on future directions in the field are also provided.