Efficient MRF Energy Propagation for Video Segmentation via Bilateral Filters
Segmentation of an object from a video is a challenging task in multimedia
applications. Depending on the application, automatic or interactive methods
are desired; however, regardless of the application type, efficient computation
of video object segmentation is crucial for time-critical applications;
specifically, mobile and interactive applications require near real-time
efficiencies. In this paper, we address the problem of video segmentation from
the perspective of efficiency. We initially redefine the problem of video
object segmentation as the propagation of MRF energies along the temporal
domain. For this purpose, a novel and efficient method is proposed to propagate
MRF energies throughout the frames via bilateral filters without using any
global texture, color or shape model. The recently presented bi-exponential filter
is utilized for efficiency, and a novel technique is also developed to
dynamically solve graph-cuts for varying, non-lattice graphs in a general linear
filtering scenario. These improvements are evaluated in both automatic and
interactive video segmentation scenarios. Moreover, in addition to the
efficiency, segmentation quality is also tested both quantitatively and
qualitatively. Indeed, for some challenging examples, significant time
efficiency is observed without loss of segmentation quality.
Comment: Multimedia, IEEE Transactions on (Volume:16, Issue: 5, Aug. 2014
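As a rough illustration of the filtering idea mentioned in this abstract, the following is a minimal sketch of a plain 1D joint bilateral filter in Python. It is not the paper's bi-exponential filter or its MRF propagation scheme; the function name `bilateral_filter_1d` and the parameters `sigma_s`, `sigma_r` and `radius` are assumptions chosen for illustration. It shows how an energy-like signal can be smoothed while respecting edges in a guide signal (for example, pixel intensity).

```python
import math


def bilateral_filter_1d(values, guide, sigma_s=2.0, sigma_r=0.1, radius=4):
    """Smooth `values`, weighting neighbours by distance and guide similarity.

    Hypothetical helper for illustration only: `values` could be per-pixel
    energies and `guide` the corresponding intensities.
    """
    n = len(values)
    out = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            # Spatial weight: nearby samples count more.
            w_s = math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2))
            # Range weight: samples with a similar guide value count more.
            w_r = math.exp(-((guide[i] - guide[j]) ** 2) / (2 * sigma_r ** 2))
            w = w_s * w_r
            num += w * values[j]
            den += w
        out.append(num / den)
    return out


if __name__ == "__main__":
    step = [0.0] * 5 + [1.0] * 5
    # The step edge in the guide keeps the two sides from mixing.
    print(bilateral_filter_1d(step, step))
```

Because the range weight collapses across a strong edge in the guide, values on either side of the edge are averaged almost independently, which is the property that makes this family of filters attractive for propagating labels or energies.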
Event-Based Motion Segmentation by Motion Compensation
In contrast to traditional cameras, whose pixels have a common exposure time,
event-based cameras are novel bio-inspired sensors whose pixels work
independently and asynchronously output intensity changes (called "events"),
with microsecond resolution. Since events are caused by the apparent motion of
objects, event-based cameras sample visual information based on the scene
dynamics and are, therefore, a more natural fit than traditional cameras to
acquire motion, especially at high speeds, where traditional cameras suffer
from motion blur. However, distinguishing between events caused by different
moving objects and by the camera's ego-motion is a challenging task. We present
the first per-event segmentation method for splitting a scene into
independently moving objects. Our method jointly estimates the event-object
associations (i.e., segmentation) and the motion parameters of the objects (or
the background) by maximization of an objective function, which builds upon
recent results on event-based motion-compensation. We provide a thorough
evaluation of our method on a public dataset, outperforming the
state-of-the-art by as much as 10%. We also show the first quantitative
evaluation of a segmentation algorithm for event cameras, yielding around 90%
accuracy at 4 pixels relative displacement.
Comment: When viewed in Acrobat Reader, several of the figures animate. Video:
https://youtu.be/0q6ap_OSBA
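To make the motion-compensation idea concrete, here is a minimal Python sketch of scoring a candidate velocity by the contrast of the image of warped events. It assumes a constant 2D image velocity and uses the variance of event counts as the objective; both are assumptions for illustration, and the paper's actual objective and motion models may differ. The function name `contrast_after_warp` and the `(x, y, t)` event layout are hypothetical.

```python
def contrast_after_warp(events, velocity, width, height, t_ref=0.0):
    """Warp events to time `t_ref` along `velocity` and return image variance.

    `events` is an iterable of (x, y, t) tuples; polarity is ignored here.
    A sharper (higher-variance) image suggests a better motion estimate.
    """
    vx, vy = velocity
    counts = [[0] * width for _ in range(height)]
    for x, y, t in events:
        # Undo the assumed motion so events from one edge land on one pixel.
        xw = int(round(x - vx * (t - t_ref)))
        yw = int(round(y - vy * (t - t_ref)))
        if 0 <= xw < width and 0 <= yw < height:
            counts[yw][xw] += 1
    flat = [c for row in counts for c in row]
    mean = sum(flat) / len(flat)
    return sum((c - mean) ** 2 for c in flat) / len(flat)


if __name__ == "__main__":
    # Synthetic events from a point moving 2 pixels per time unit along x.
    events = [(5 + 2 * t, 10, t) for t in range(10)]
    for v in [(0, 0), (1, 0), (2, 0)]:
        print(v, contrast_after_warp(events, v, width=32, height=20))
```

In this toy setting the true velocity gathers all events onto a single pixel and so gives the highest variance. A per-event segmentation would additionally assign each event to one of several such motion hypotheses, which this sketch does not attempt.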
Learning to Segment and Represent Motion Primitives from Driving Data for Motion Planning Applications
Developing an intelligent vehicle which can perform human-like actions
requires the ability to learn basic driving skills from a large amount of
naturalistic driving data. The algorithms become more efficient if we can
decompose complex driving tasks into motion primitives, which represent the
elementary components of driving skills. Therefore, the purpose of this paper
is to segment unlabeled trajectory data into a library of motion primitives. By
applying a probabilistic inference based on an iterative
Expectation-Maximization algorithm, our method segments the collected
trajectories while learning a set of motion primitives represented by the
dynamic movement primitives. The proposed method exploits the mutual
dependency between the segmentation and the representation of motion primitives,
together with a driving-specific initial segmentation. Using this mutual
dependency and the initial condition, this paper shows how to improve
both the segmentation and the construction of the motion primitive
library. We also evaluate the applicability of the primitive
representation method to imitation learning and motion planning algorithms. The
model is trained and validated by using the driving data collected from the
Beijing Institute of Technology intelligent vehicle platform. The results show
that the proposed approach can find the proper segmentation and establish the
motion primitive library simultaneously.
The First Passage Probability of Intracellular Particle Trafficking
The first passage probability (FPP) of trafficked intracellular particles
reaching a displacement L in a given time t, or at inverse velocity S = t/L, can
be calculated robustly from measured particle tracks, and gives a measure of
particle movement in which different types of motion, e.g. diffusion, ballistic
motion, and transient run-rest motion, can readily be distinguished in a single
graph, and compared with mathematical models. The FPP is attractive in that it
offers a means of reducing the data in the measured tracks, without making
assumptions about the mechanism of motion: for example, it does not employ
smoothing, segmentation or arbitrary thresholds to discriminate between
different types of motion in a particle track. Taking experimental data from
tracked endocytic vesicles, and calculating the FPP, we see how three molecular
treatments affect the trafficking. We show the FPP can quantify complicated
movement which is neither completely random nor completely deterministic,
making it highly applicable to trafficked particles in cell biology.
Comment: Article: 13 pages, 8 figure
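The quantity described in this abstract can be sketched with a simple empirical estimator in Python. The version below assumes 1D positions sampled at a fixed interval `dt` and measures displacement from the start of each track; the function names and this particular estimator are assumptions for illustration, and the paper's estimator may differ (for example, by using every point of a track as a starting point).

```python
def first_passage_time(track, L, dt=1.0):
    """Return the first time at which |x - x0| >= L, or None if never reached."""
    x0 = track[0]
    for i, x in enumerate(track):
        if abs(x - x0) >= L:
            return i * dt
    return None


def first_passage_probability(tracks, L, t, dt=1.0):
    """Fraction of tracks that first reach displacement L at or before time t."""
    hits = 0
    for track in tracks:
        fpt = first_passage_time(track, L, dt)
        if fpt is not None and fpt <= t:
            hits += 1
    return hits / len(tracks)


if __name__ == "__main__":
    ballistic = [0, 1, 2, 3, 4, 5]   # steady motion
    resting = [0, 0, 0, 0, 0, 0]     # no motion
    tracks = [ballistic, resting]
    print(first_passage_probability(tracks, L=3, t=3))
```

No smoothing or thresholding of the track is needed here: the only inputs are the raw positions, the target displacement L and the time t, which matches the abstract's point about avoiding assumptions on the mechanism of motion.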
Neural Models of Motion Integration, Segmentation, and Probabilistic Decision-Making
How do brain mechanisms carry out motion integration and segmentation processes that compute unambiguous global motion percepts from ambiguous local motion signals? Consider, for example, a deer running at variable speeds behind forest cover. The forest cover is an occluder that creates apertures through which fragments of the deer's motion signals are intermittently experienced. The brain coherently groups these fragments into a trackable percept of the deer in its trajectory. Form and motion processes are needed to accomplish this using feedforward and feedback interactions both within and across cortical processing streams. Cortical areas V1, V2, MT, and MST are all involved in these interactions. Figure-ground processes in the form stream through V2, such as the separation of occluding boundaries of the forest cover from the boundaries of the deer, select the motion signals which determine global object motion percepts in the motion stream through MT. Sparse, but unambiguous, feature tracking signals are amplified before they propagate across position and are integrated with far more numerous ambiguous motion signals. Figure-ground and integration processes together determine the global percept. A neural model predicts the processing stages that embody these form and motion interactions. Model concepts and data are summarized about motion grouping across apertures in response to a wide variety of displays, and probabilistic decision making in parietal cortex in response to random dot displays.
National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624
An extension of min/max flow framework
In this paper, the min/max flow scheme for image restoration is revised. The novelty consists of the
following three parts. The first is to analyze the reason of the speckle generation and then to modify the
original scheme. The second is to point out that the continued application of this scheme cannot result
in an adaptive stopping of the curvature flow. This is followed by modifications of the original scheme
through the introduction of the Gradient Vector Flow (GVF) field and the zero-crossing detector, so as
to control the smoothing effect. Our experimental results with image restoration show that the proposed
schemes can reach a steady state solution while preserving the essential structures of objects. The third is
to extend the min/max flow scheme to deal with the boundary leaking problem, which is indeed an
intrinsic shortcoming of the familiar geodesic active contour model. The min/max flow framework
provides us with an effective way to approximate the optimal solution. From an implementation point of
view, this extended scheme makes the speed function simpler and more flexible. The experimental
results of segmentation and region tracking show that the boundary leaking problem can be effectively
suppressed.
Adaptive Temporal Encoding Network for Video Instance-level Human Parsing
Beyond the existing single-person and multiple-person human parsing tasks in
static images, this paper makes the first attempt to investigate a more
realistic video instance-level human parsing that simultaneously segments out
each person instance and parses each instance into more fine-grained parts
(e.g., head, leg, dress). We introduce a novel Adaptive Temporal Encoding
Network (ATEN) that alternately performs temporal encoding among key frames
and flow-guided feature propagation from other consecutive frames between two
key frames. Specifically, ATEN first incorporates a Parsing-RCNN to produce the
instance-level parsing result for each key frame, which integrates both the
global human parsing and instance-level human segmentation into a unified
model. To balance between accuracy and efficiency, the flow-guided feature
propagation is used to directly parse consecutive frames according to their
identified temporal consistency with key frames. On the other hand, ATEN
leverages convolutional gated recurrent units (convGRU) to exploit temporal
changes over a series of key frames, which are further used to facilitate the
frame-level instance-level parsing. By alternately performing direct feature
propagation between consistent frames and temporal encoding among key
frames, our ATEN achieves a good balance between frame-level accuracy and time
efficiency, which is a common crucial problem in video object segmentation
research. To demonstrate the superiority of our ATEN, extensive experiments are
conducted on the most popular video segmentation benchmark (DAVIS) and a newly
collected Video Instance-level Parsing (VIP) dataset, which is the first video
instance-level human parsing dataset, comprising 404 sequences and over 20k
frames with instance-level and pixel-wise annotations.
Comment: To appear in ACM MM 2018. Code link:
https://github.com/HCPLab-SYSU/ATEN. Dataset link: http://sysu-hcp.net/li