69,443 research outputs found

    Motion Segmentation at Any Speed

    Efficient MRF Energy Propagation for Video Segmentation via Bilateral Filters

    Segmenting an object from a video is a challenging task in multimedia applications. Depending on the application, automatic or interactive methods are desired; regardless of the application type, however, efficient computation of video object segmentation is crucial for time-critical applications. In particular, mobile and interactive applications require near real-time performance. In this paper, we address the problem of video segmentation from the perspective of efficiency. We first redefine video object segmentation as the propagation of MRF energies along the temporal domain. For this purpose, a novel and efficient method is proposed to propagate MRF energies across frames via bilateral filters without using any global texture, color or shape model. The recently presented bi-exponential filter is used for efficiency, and a novel technique is developed to dynamically solve graph-cuts for varying, non-lattice graphs in a general linear filtering scenario. These improvements are evaluated in both automatic and interactive video segmentation scenarios. In addition to efficiency, segmentation quality is tested both quantitatively and qualitatively. For some challenging examples, significant time savings are observed without loss of segmentation quality. Comment: Multimedia, IEEE Transactions on (Volume:16, Issue: 5, Aug. 2014
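
As a rough illustration of the edge-aware propagation idea in this abstract, the sketch below applies a brute-force joint bilateral filter to a 1D array of per-pixel energies, using a separate guide signal (for example, intensities) to decide which neighbours may contribute. This is not the paper's method: the bi-exponential filter and the graph-cut solver are not reproduced, and the function name `joint_bilateral_1d` and its parameters `sigma_s`, `sigma_r`, and `radius` are assumptions made for this example.

```python
import math

def joint_bilateral_1d(values, guide, sigma_s=2.0, sigma_r=0.1, radius=4):
    """Smooth `values` with weights from spatial distance and guide similarity.

    Hypothetical helper for illustration only; a brute-force bilateral
    filter, not the bi-exponential filter named in the abstract.
    """
    n = len(values)
    out = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            # Spatial weight: nearby samples count more.
            w_s = math.exp(-((i - j) ** 2) / (2.0 * sigma_s ** 2))
            # Range weight: samples with a similar guide value count more.
            w_r = math.exp(-((guide[i] - guide[j]) ** 2) / (2.0 * sigma_r ** 2))
            num += w_s * w_r * values[j]
            den += w_s * w_r
        out.append(num / den)  # den >= 1 because the j == i term has weight 1
    return out

# A step edge in the guide keeps energies from leaking across it.
guide = [0.0] * 5 + [1.0] * 5
energies = [1.0] * 5 + [0.0] * 5
smoothed = joint_bilateral_1d(energies, guide)
```

With a small `sigma_r`, samples on opposite sides of the guide edge receive almost no weight, so the energies on each side stay essentially unchanged.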

    Event-Based Motion Segmentation by Motion Compensation

    In contrast to traditional cameras, whose pixels have a common exposure time, event-based cameras are novel bio-inspired sensors whose pixels work independently and asynchronously output intensity changes (called "events") with microsecond resolution. Since events are caused by the apparent motion of objects, event-based cameras sample visual information based on the scene dynamics and are therefore a more natural fit than traditional cameras for acquiring motion, especially at high speeds, where traditional cameras suffer from motion blur. However, distinguishing between events caused by different moving objects and by the camera's ego-motion is a challenging task. We present the first per-event segmentation method for splitting a scene into independently moving objects. Our method jointly estimates the event-object associations (i.e., segmentation) and the motion parameters of the objects (or the background) by maximization of an objective function, which builds upon recent results on event-based motion compensation. We provide a thorough evaluation of our method on a public dataset, outperforming the state of the art by as much as 10%. We also show the first quantitative evaluation of a segmentation algorithm for event cameras, yielding around 90% accuracy at 4 pixels relative displacement. Comment: When viewed in Acrobat Reader, several of the figures animate. Video: https://youtu.be/0q6ap_OSBA
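
The motion-compensation building block mentioned in this abstract can be sketched for a single motion as follows: events are warped back to a reference time under a candidate velocity, accumulated into an image, and scored by the variance of that image, so the velocity that best aligns the events gives the sharpest (highest-variance) result. This is only a single-object sketch under stated assumptions; the joint estimation of event-object associations described in the abstract is not implemented, and the names `warped_image_variance` and `best_velocity`, the constant-velocity model, and the grid search over candidates are all choices made for this example.

```python
from collections import Counter

def warped_image_variance(events, vx, vy, width, height):
    """Variance of the event-count image after warping events to time 0.

    `events` is a list of (x, y, t) tuples; a constant image-plane
    velocity (vx, vy) is an assumption of this sketch.
    """
    counts = Counter()
    for x, y, t in events:
        px = round(x - vx * t)
        py = round(y - vy * t)
        if 0 <= px < width and 0 <= py < height:
            counts[(px, py)] += 1
    n_pixels = width * height
    mean = sum(counts.values()) / n_pixels
    # Pixels that received events, plus the remaining pixels whose count is 0.
    sq = sum((c - mean) ** 2 for c in counts.values())
    sq += (n_pixels - len(counts)) * mean ** 2
    return sq / n_pixels

def best_velocity(events, candidates, width, height):
    """Pick the candidate velocity whose warped image has the highest variance."""
    return max(
        candidates,
        key=lambda v: warped_image_variance(events, v[0], v[1], width, height),
    )

# A point moving 2 pixels per time unit along x, starting at (1, 5).
events = [(1 + 2 * t, 5, t) for t in range(5)]
velocity = best_velocity(events, [(0, 0), (1, 0), (2, 0), (3, 0)], 20, 10)
```

In this toy case the candidate `(2, 0)` maps every event onto the same pixel, which is why it wins; a practical implementation would optimize over continuous motion parameters rather than a fixed candidate list.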

    Learning to Segment and Represent Motion Primitives from Driving Data for Motion Planning Applications

    Developing an intelligent vehicle that can perform human-like actions requires the ability to learn basic driving skills from a large amount of naturalistic driving data. The algorithms become more efficient if the complex driving tasks can be decomposed into motion primitives, which represent the elementary components of driving skills. Therefore, the purpose of this paper is to segment unlabeled trajectory data into a library of motion primitives. By applying probabilistic inference based on an iterative Expectation-Maximization algorithm, our method segments the collected trajectories while learning a set of motion primitives represented by dynamic movement primitives. The proposed method exploits the mutual dependency between the segmentation and the representation of motion primitives, together with a driving-specific initial segmentation. Using this mutual dependency and the initial condition, this paper shows how the performance of both the segmentation and the construction of the motion primitive library can be improved. We also evaluate the applicability of the primitive representation method to imitation learning and motion planning algorithms. The model is trained and validated using driving data collected from the Beijing Institute of Technology intelligent vehicle platform. The results show that the proposed approach can find a proper segmentation and establish the motion primitive library simultaneously.

    The First Passage Probability of Intracellular Particle Trafficking

    The first passage probability (FPP) of trafficked intracellular particles reaching a displacement L in a given time t or inverse velocity S = t/L can be calculated robustly from measured particle tracks. It gives a measure of particle movement in which different types of motion, e.g. diffusion, ballistic motion, and transient run-rest motion, can readily be distinguished in a single graph and compared with mathematical models. The FPP is attractive in that it offers a means of reducing the data in the measured tracks without making assumptions about the mechanism of motion: for example, it does not employ smoothing, segmentation or arbitrary thresholds to discriminate between different types of motion in a particle track. Taking experimental data from tracked endocytic vesicles and calculating the FPP, we see how three molecular treatments affect the trafficking. We show that the FPP can quantify complicated movement which is neither completely random nor completely deterministic, making it highly applicable to trafficked particles in cell biology. Comment: Article: 13 pages, 8 figure
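
One way to estimate such a quantity from a single 2D track is sketched below: every frame is treated as a possible starting point, and the estimate is the fraction of starting points from which the particle first reaches displacement L within a given number of frames. The abstract does not spell out the estimator, so the cumulative definition, the use of every frame as a start, the exclusion of starts too close to the end of the track, and the function names are all assumptions of this example.

```python
import math

def first_passage_steps(track, L):
    """For each start frame, count frames until displacement first reaches L.

    `track` is a list of (x, y) positions at equal time intervals.
    Returns None for starts from which L is never reached.
    """
    steps = []
    for i, (x0, y0) in enumerate(track):
        hit = None
        for j in range(i + 1, len(track)):
            x, y = track[j]
            if math.hypot(x - x0, y - y0) >= L:
                hit = j - i
                break
        steps.append(hit)
    return steps

def first_passage_probability(track, L, n_steps):
    """Fraction of eligible starts that reach displacement L within n_steps frames.

    Only starts with at least n_steps later frames are counted, so that
    the end of the track does not bias the estimate downward.
    """
    steps = first_passage_steps(track, L)
    eligible = steps[: max(0, len(track) - n_steps)]
    if not eligible:
        return None
    reached = sum(1 for s in eligible if s is not None and s <= n_steps)
    return reached / len(eligible)

# Ballistic motion at one unit per frame reaches L = 3 in exactly 3 frames.
ballistic = [(float(k), 0.0) for k in range(11)]
p_short = first_passage_probability(ballistic, 3.0, 2)
p_enough = first_passage_probability(ballistic, 3.0, 3)
```

For purely ballistic motion the estimate jumps from 0 to 1 at a single time, whereas diffusive or run-rest tracks would give a gradual rise, which is the kind of contrast the abstract describes.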

    Neural Models of Motion Integration, Segmentation, and Probabilistic Decision-Making

    What brain mechanisms carry out motion integration and segmentation processes that compute unambiguous global motion percepts from ambiguous local motion signals? Consider, for example, a deer running at variable speeds behind forest cover. The forest cover is an occluder that creates apertures through which fragments of the deer's motion signals are intermittently experienced. The brain coherently groups these fragments into a trackable percept of the deer in its trajectory. Form and motion processes are needed to accomplish this using feedforward and feedback interactions both within and across cortical processing streams. All the cortical areas V1, V2, MT, and MST are involved in these interactions. Figure-ground processes in the form stream through V2, such as the separation of occluding boundaries of the forest cover from the boundaries of the deer, select the motion signals which determine global object motion percepts in the motion stream through MT. Sparse, but unambiguous, feature tracking signals are amplified before they propagate across position and are integrated with far more numerous ambiguous motion signals. Figure-ground and integration processes together determine the global percept. A neural model predicts the processing stages that embody these form and motion interactions. Model concepts and data are summarized about motion grouping across apertures in response to a wide variety of displays, and probabilistic decision making in parietal cortex in response to random dot displays. National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624

    An extension of min/max flow framework

    In this paper, the min/max flow scheme for image restoration is revised. The novelty consists of the following three parts. The first is to analyze the reason of the speckle generation and then to modify the original scheme. The second is to point out that the continued application of this scheme cannot result in an adaptive stopping of the curvature flow. This is followed by modifications of the original scheme through the introduction of the Gradient Vector Flow (GVF) field and the zero-crossing detector, so as to control the smoothing effect. Our experimental results with image restoration show that the proposed schemes can reach a steady state solution while preserving the essential structures of objects. The third is to extend the min/max flow scheme to deal with the boundary leaking problem, which is indeed an intrinsic shortcoming of the familiar geodesic active contour model. The min/max flow framework provides us with an effective way to approximate the optimal solution. From an implementation point of view, this extended scheme makes the speed function simpler and more flexible. The experimental results of segmentation and region tracking show that the boundary leaking problem can be effectively suppressed.

    Adaptive Temporal Encoding Network for Video Instance-level Human Parsing

    Beyond the existing single-person and multiple-person human parsing tasks in static images, this paper makes the first attempt to investigate a more realistic video instance-level human parsing task that simultaneously segments out each person instance and parses each instance into more fine-grained parts (e.g., head, leg, dress). We introduce a novel Adaptive Temporal Encoding Network (ATEN) that alternately performs temporal encoding among key frames and flow-guided feature propagation from other consecutive frames between two key frames. Specifically, ATEN first incorporates a Parsing-RCNN to produce the instance-level parsing result for each key frame, which integrates both the global human parsing and instance-level human segmentation into a unified model. To balance accuracy and efficiency, the flow-guided feature propagation is used to directly parse consecutive frames according to their identified temporal consistency with key frames. On the other hand, ATEN leverages convolutional gated recurrent units (convGRU) to exploit temporal changes over a series of key frames, which are further used to facilitate the frame-level instance-level parsing. By alternately performing direct feature propagation between consistent frames and temporal encoding among key frames, our ATEN achieves a good balance between frame-level accuracy and time efficiency, which is a common and crucial problem in video object segmentation research. To demonstrate the superiority of our ATEN, extensive experiments are conducted on the most popular video segmentation benchmark (DAVIS) and a newly collected Video Instance-level Parsing (VIP) dataset, which is the first video instance-level human parsing dataset, comprising 404 sequences and over 20k frames with instance-level and pixel-wise annotations. Comment: To appear in ACM MM 2018. Code link: https://github.com/HCPLab-SYSU/ATEN. Dataset link: http://sysu-hcp.net/li