Learning to Segment Moving Objects in Videos
We segment moving objects in videos by ranking spatio-temporal segment
proposals according to "moving objectness": how likely they are to contain a
moving object. In each video frame, we compute segment proposals using multiple
figure-ground segmentations on per frame motion boundaries. We rank them with a
Moving Objectness Detector trained on image and motion fields to detect moving
objects and discard over/under segmentations or background parts of the scene.
We extend the top ranked segments into spatio-temporal tubes using random
walkers on motion affinities of dense point trajectories. Our final tube
ranking consistently outperforms previous segmentation methods in the two
largest video segmentation benchmarks currently available, for any number of
proposals. Further, our per-frame moving object proposals increase the
detection rate by up to 7% over previous state-of-the-art static proposal
methods.
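The abstract above reports a detection rate "for any number of proposals". As a rough illustration of how such a figure could be computed, the sketch below ranks per-frame proposals by score and counts a ground-truth object as detected when one of the top-k proposals overlaps it sufficiently. The function names, the intersection-over-union criterion, and the 0.5 threshold are assumptions made for this example; the abstract does not state the paper's exact evaluation protocol.

```python
import numpy as np

def mask_iou(a, b):
    """Intersection over union of two binary masks of equal shape."""
    a = a.astype(bool)
    b = b.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(a, b).sum() / union)

def detection_rate(proposals, scores, gt_masks, k, iou_thresh=0.5):
    """Fraction of ground-truth masks matched by any of the k best-scored proposals.

    The IoU threshold is an assumed matching criterion, not taken from the paper.
    """
    order = np.argsort(scores)[::-1][:k]  # indices of the k highest scores
    top = [proposals[i] for i in order]
    hits = sum(any(mask_iou(p, g) >= iou_thresh for p in top) for g in gt_masks)
    return hits / len(gt_masks) if gt_masks else 0.0

# Toy data: one ground-truth object and two proposals on a 4x4 frame.
gt = np.zeros((4, 4), dtype=bool)
gt[:2, :2] = True
good = gt.copy()                      # perfect proposal, but low score
bad = np.zeros((4, 4), dtype=bool)
bad[2:, 2:] = True                    # background proposal, high score
rate_top1 = detection_rate([good, bad], [0.2, 0.9], [gt], k=1)
rate_top2 = detection_rate([good, bad], [0.2, 0.9], [gt], k=2)
```

In this toy case the badly ranked correct proposal is only found once k grows to 2, which is the kind of effect a better ranking function is meant to reduce.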
A Multi-cut Formulation for Joint Segmentation and Tracking of Multiple Objects
Recently, Minimum Cost Multicut Formulations have been proposed and proven to
be successful in both motion trajectory segmentation and multi-target tracking
scenarios. Both tasks benefit from decomposing a graphical model into an
optimal number of connected components based on attractive and repulsive
pairwise terms. The two tasks are formulated on different levels of granularity
and, accordingly, leverage mostly local information for motion segmentation and
mostly high-level information for multi-target tracking. In this paper we argue
that point trajectories and their local relationships can contribute to the
high-level task of multi-target tracking and also argue that high-level cues
from object detection and tracking are helpful to solve motion segmentation. We
propose a joint graphical model for point trajectories and object detections
whose Multicuts are solutions to motion segmentation and multi-target
tracking problems at once. Results on the FBMS59 motion segmentation benchmark
as well as on pedestrian tracking sequences from the 2D MOT 2015 benchmark
demonstrate the promise of this joint approach.
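To make the idea of decomposing a graph into an "optimal number of connected components" concrete, here is a minimal brute-force sketch of the minimum cost multicut objective on a tiny graph. It assumes the common convention that a positive edge cost is attractive (cutting it is penalized) and a negative cost is repulsive (cutting it is rewarded). The function names and the toy graph are hypothetical, and exhaustive enumeration only works for a handful of nodes; it is not the solver used in the paper.

```python
from itertools import product

def multicut_cost(labels, edge_costs):
    """Sum of the costs of all edges whose endpoints carry different labels."""
    return sum(c for (u, v), c in edge_costs.items() if labels[u] != labels[v])

def brute_force_multicut(num_nodes, edge_costs):
    """Try every node labeling and return (best cost, groups of nodes).

    Every labeling induces a valid set of cut edges, so the number of
    groups is not fixed in advance; it falls out of the optimization.
    On sparse graphs a group may contain nodes that are not connected
    to each other, which does not change the cost.
    """
    best_cost, best_labels = None, None
    for labels in product(range(num_nodes), repeat=num_nodes):
        cost = multicut_cost(labels, edge_costs)
        if best_cost is None or cost < best_cost:
            best_cost, best_labels = cost, labels
    groups = {}
    for node, lab in enumerate(best_labels):
        groups.setdefault(lab, []).append(node)
    return best_cost, sorted(groups.values())

# Toy graph: nodes 0-1 and 2-3 attract each other, the pairs repel.
edges = {(0, 1): 2.0, (1, 2): -3.0, (2, 3): 1.5, (0, 2): -1.0}
cost, components = brute_force_multicut(4, edges)
```

Here the optimum cuts both repulsive edges and keeps both attractive ones, so two components emerge without the number of clusters ever being specified.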
Click Carving: Segmenting Objects in Video with Point Clicks
We present a novel form of interactive video object segmentation where a few
clicks by the user help the system produce a full spatio-temporal segmentation
of the object of interest. Whereas conventional interactive pipelines take the
user's initialization as a starting point, we show the value in the system
taking the lead even in initialization. In particular, for a given video frame,
the system precomputes a ranked list of thousands of possible segmentation
hypotheses (also referred to as object region proposals) using image and motion
cues. Then, the user looks at the top ranked proposals, and clicks on the
object boundary to carve away erroneous ones. This process iterates (typically
2-3 times), and each time the system revises the top ranked proposal set, until
the user is satisfied with a resulting segmentation mask. Finally, the mask is
propagated across the video to produce a spatio-temporal object tube. On three
challenging datasets, we provide extensive comparisons with both existing work
and simpler alternative methods. In all, the proposed Click Carving approach
strikes an excellent balance of accuracy and human effort. It outperforms all
similarly fast methods, and is competitive with or better than those requiring 2 to
12 times the effort.
Comment: A preliminary version of the material in this document was filed as
University of Texas technical report no. UT AI16-0
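The carving step described in the abstract above, where boundary clicks eliminate inconsistent proposals, might be sketched as follows. This is a simplified illustration under stated assumptions: proposals are binary masks, a proposal survives only if its boundary passes within a fixed pixel radius of every click, and the names `mask_boundary`, `carve`, and `radius` are hypothetical. The actual system re-ranks proposals and may score click agreement differently.

```python
import numpy as np

def mask_boundary(mask):
    """Mask pixels that have at least one 4-neighbour outside the mask."""
    m = mask.astype(bool)
    padded = np.pad(m, 1, constant_values=False)
    # A pixel is interior when all four of its neighbours are in the mask.
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    return m & ~interior

def carve(proposals, clicks, radius=1.0):
    """Return indices of proposals whose boundary lies near every click.

    clicks is a list of (row, col) positions; radius is an assumed tolerance.
    """
    kept = []
    for idx, mask in enumerate(proposals):
        ys, xs = np.nonzero(mask_boundary(mask))
        if len(ys) == 0:
            continue  # an empty proposal has no boundary to match
        if all(np.min(np.hypot(ys - cy, xs - cx)) <= radius for cy, cx in clicks):
            kept.append(idx)
    return kept

# Two toy proposals on an 8x8 frame.
p0 = np.zeros((8, 8), dtype=bool)
p0[1:5, 1:5] = True
p1 = np.zeros((8, 8), dtype=bool)
p1[4:8, 4:8] = True
after_first_click = carve([p0, p1], clicks=[(1, 3)])
after_shared_click = carve([p0, p1], clicks=[(4, 4)])
```

A click on the top edge of the first square removes the second proposal, while a click on the corner the two squares share is uninformative and keeps both.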
Brief Analysis of Methods for Detecting Moving Objects Using Computer Vision
Moving object detection has drawn notable interest in many computer vision applications, and the scientific community has made numerous contributions to address its significant difficulties in practical settings. This study analyzes several moving object detection methods, divided into four groups: methods based on background modeling, methods based on frame differences, methods based on visual motion estimation, and methods based on deep learning. Detailed explanations of numerous techniques in each category are also provided.
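Of the four groups listed above, frame differencing is the simplest to illustrate. The sketch below marks a pixel as moving when its grayscale intensity changes by more than a threshold between two consecutive frames. The function name and the threshold of 25 are assumptions chosen for this example, not values from the surveyed methods.

```python
import numpy as np

def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Boolean mask of pixels whose intensity changed by more than threshold.

    Frames are 2-D grayscale arrays. They are widened to a signed type
    first so that uint8 subtraction cannot wrap around.
    """
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

# Toy frames: a single bright pixel appears in the second frame.
prev_frame = np.zeros((4, 4), dtype=np.uint8)
curr_frame = prev_frame.copy()
curr_frame[1, 2] = 200
moving = frame_difference_mask(prev_frame, curr_frame)
```

Practical pipelines usually add smoothing or morphological cleanup after thresholding, since raw differences are sensitive to noise and lighting changes.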