Region-based spatial and temporal image segmentation
This work discusses region-based representations for image and video sequence segmentation. It presents effective image segmentation techniques and demonstrates how these techniques may be integrated into algorithms that solve some of the motion segmentation problems. The region-based representation offers a way to perform a first level of abstraction and to reduce the number of elements to process with respect to the classical pixel-based representation.
Motion segmentation is a fundamental technique for the analysis and understanding of image sequences of real scenes. Motion segmentation 'describes' the sequence as sets of pixels moving coherently across the sequence with associated motions. This description is essential to the identification of the objects in the scene and to a more efficient manipulation of video sequences.
This thesis presents a hybrid framework that combines spatial and motion information to segment moving objects in image sequences according to their motion. We formulate the problem as graph labelling over a region moving graph whose nodes correspond to coherently moving atomic regions. This is a flexible high-level representation that individualizes independently moving objects. Starting from an over-segmentation of the image, the objects are formed by merging neighbouring regions based on their mutual spatial and temporal similarity, with the emphasis on motion information. The final segmentation is obtained by a spectral-based graph cuts approach.
The initial phase of the moving object segmentation reduces image noise by anisotropic bilateral filtering without destroying the topological structure of the objects. An initial spatial partition into a set of homogeneous regions is obtained by the watershed transform. The motion vector of each region is estimated by a variational approach. Next, a region moving graph is constructed from a combination of normalized similarities between regions, which consider the mean intensity of the regions, the gradient magnitude between regions, and the motion information of the regions. The motion similarity measure among regions is based on human perceptual characteristics. Finally, a spectral-based graph cut approach clusters and labels each moving region.
The motion segmentation approach is based on a static image segmentation method proposed by the author of this dissertation. The main idea is to use atomic regions to guide a segmentation using the intensity and the gradient information through a similarity graph-based approach. This method produces simpler, less over-segmented results and compares favourably with state-of-the-art methods. To evaluate the segmentation results, a new evaluation metric is proposed that takes into account the way humans perceive visual information.
By incorporating spatial and motion information simultaneously in a region-based framework, we obtain visually meaningful segmentation results. Experimental results are given for different image sequences, with or without camera motion, and for still images. In the latter case, a comparison with state-of-the-art approaches is made.
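As a rough illustration of the region-graph idea in the abstract above, the following sketch builds an affinity matrix between adjacent regions from mean intensity and motion, then splits the graph in two with a spectral (normalized-cut style) bipartition. It is not the thesis's implementation: the Gaussian similarity, the parameters `sigma_i` and `sigma_m`, and the function names are assumptions, and the gradient-magnitude and perceptual terms mentioned in the abstract are omitted.

```python
import numpy as np

def region_affinity(mean_intensity, motion, adjacency, sigma_i=10.0, sigma_m=1.0):
    """Affinity between adjacent regions (assumed Gaussian similarity)."""
    mean_intensity = np.asarray(mean_intensity, dtype=float)
    motion = np.asarray(motion, dtype=float)
    # Squared differences in mean intensity and in motion vectors, for all pairs.
    d_int = (mean_intensity[:, None] - mean_intensity[None, :]) ** 2
    d_mot = np.sum((motion[:, None, :] - motion[None, :, :]) ** 2, axis=-1)
    w = np.exp(-d_int / (2 * sigma_i**2)) * np.exp(-d_mot / (2 * sigma_m**2))
    # Keep only edges between spatially neighbouring regions.
    return w * np.asarray(adjacency, dtype=float)

def spectral_bipartition(w):
    """Two-way split from the second eigenvector of the normalized Laplacian."""
    d = w.sum(axis=1)  # node degrees; assumes every region has a neighbour
    d_inv_sqrt = 1.0 / np.sqrt(d)
    lap = np.eye(len(w)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    fiedler = d_inv_sqrt * vecs[:, 1]
    return (fiedler > 0).astype(int)

# Four regions in a chain; regions 0-1 are static, regions 2-3 move right.
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]])
w = region_affinity([100, 105, 102, 98],
                    [[0, 0], [0, 0], [3, 0], [3, 0]],
                    adjacency)
labels = spectral_bipartition(w)
```

In this toy case the intensities are similar everywhere, so the split is driven by motion, which mirrors the abstract's emphasis on motion information. Which group receives label 0 or 1 depends on the sign of the eigenvector and is arbitrary.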
Real-time moving object segmentation in H.264 compressed domain based on approximate reasoning
This paper presents a real-time segmentation algorithm to obtain moving objects from the H.264 compressed domain. The proposed segmentation works with very little information and is based on two features of H.264 compressed video: the motion vectors associated with the macroblocks, and the decision modes. The algorithm uses fuzzy logic and makes it possible to describe the position, velocity and size of the detected regions in a comprehensive way, so the proposed approach works with low-level information but manages highly comprehensive linguistic concepts. The performance of the algorithm is improved by a dynamic design of the fuzzy sets that avoids merge and split problems. Experimental results for several traffic scenes demonstrate the real-time performance and show encouraging results in diverse situations.
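To make the link between low-level motion vectors and linguistic concepts concrete, here is a minimal sketch of fuzzy membership for the speed of a macroblock. Everything specific in it is an assumption for illustration: the labels `slow`, `medium` and `fast`, the trapezoid breakpoints, and the unit (pixels per frame) are hypothetical and not taken from the paper, and the dynamic design of fuzzy sets mentioned in the abstract is not modelled.

```python
import math

def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear ramps between."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

# Hypothetical fuzzy sets over speed, in pixels per frame.
SPEED_SETS = {
    "slow": (-1.0, 0.0, 2.0, 4.0),
    "medium": (2.0, 4.0, 8.0, 10.0),
    "fast": (8.0, 10.0, math.inf, math.inf),
}

def describe_speed(motion_vector):
    """Return the best-matching label and all membership degrees."""
    speed = math.hypot(*motion_vector)
    memberships = {label: trapezoid(speed, *params)
                   for label, params in SPEED_SETS.items()}
    return max(memberships, key=memberships.get), memberships

label, memberships = describe_speed((3, 4))  # speed is 5 pixels per frame
```

A speed that falls on a ramp belongs partly to two sets at once, for example a speed of 3 has degree 0.5 in both `slow` and `medium`. That overlap is what lets a fuzzy description change smoothly instead of flipping at a hard threshold.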
Unsupervised Myocardial Segmentation for Cardiac BOLD
A fully automated 2-D+time myocardial segmentation framework is proposed for cardiac magnetic resonance (CMR)
blood-oxygen-level-dependent (BOLD) data sets. Ischemia detection with CINE BOLD CMR relies on spatio-temporal patterns in myocardial
intensity, but these patterns also trouble supervised segmentation methods, the de facto standard for myocardial segmentation in cine MRI.
Segmentation errors severely undermine the accurate extraction of these patterns. In this paper, we build a joint motion and appearance method
that relies on dictionary learning to find a suitable subspace. Our method is based on variational pre-processing and spatial regularization using
Markov random fields, to further improve performance. The superiority of the proposed segmentation technique is demonstrated on a data set
containing cardiac phase resolved BOLD MR and standard CINE MR image sequences acquired in baseline and ischemic conditions across ten
canine subjects. Our unsupervised approach outperforms even supervised state-of-the-art segmentation techniques by at least 10% when using
Dice to measure accuracy on BOLD data and performs on par for standard CINE MR. Furthermore, a novel segmental analysis method attuned
to BOLD time series is utilized to demonstrate the effectiveness of the proposed method in preserving key BOLD patterns.
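The Dice score used above to measure accuracy compares a predicted mask A with a reference mask B as 2|A ∩ B| / (|A| + |B|). The sketch below computes it for binary masks with NumPy. The function name and the convention of returning 1.0 when both masks are empty are assumptions, not details from the paper.

```python
import numpy as np

def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        # Both masks empty: treated here as perfect agreement (a convention).
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / total

pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
score = dice_coefficient(pred, truth)  # 2 * 2 / (3 + 3) = 0.666...
```

The score ranges from 0 (no overlap) to 1 (identical masks).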
Lucid Data Dreaming for Video Object Segmentation
Convolutional networks reach top quality in pixel-level video object
segmentation but require a large amount of training data (1k~100k) to deliver
such results. We propose a new training strategy which achieves
state-of-the-art results across three evaluation datasets while using 20x~1000x
less annotated data than competing methods. Our approach is suitable for both
single and multiple object segmentation. Instead of using large training sets
hoping to generalize across domains, we generate in-domain training data using
the provided annotation on the first frame of each video to synthesize ("lucid
dream") plausible future video frames. In-domain per-video training data allows
us to train high quality appearance- and motion-based models, as well as tune
the post-processing stage. This approach makes it possible to reach competitive results
even when training from only a single annotated frame, without ImageNet
pre-training. Our results indicate that using a larger training set is not
automatically better, and that for the video object segmentation task a smaller
training set that is closer to the target domain is more effective. This
changes the mindset regarding how many training samples and general
"objectness" knowledge are required for the video object segmentation task. Comment: Accepted in International Journal of Computer Vision (IJCV)
Robust Temporally Coherent Laplacian Protrusion Segmentation of 3D Articulated Bodies
In motion analysis and understanding it is important to be able to fit a
suitable model or structure to the temporal series of observed data, in order
to describe motion patterns in a compact way, and to discriminate between them.
In an unsupervised context, i.e., no prior model of the moving object(s) is
available, such a structure has to be learned from the data in a bottom-up
fashion. In recent times, volumetric approaches in which the motion is captured
from a number of cameras and a voxel-set representation of the body is built
from the camera views, have gained ground due to attractive features such as
inherent view-invariance and robustness to occlusions. Automatic, unsupervised
segmentation of moving bodies along entire sequences, in a temporally-coherent
and robust way, has the potential to provide a means of constructing a
bottom-up model of the moving body, and track motion cues that may be later
exploited for motion classification. Spectral methods such as locally linear
embedding (LLE) can be useful in this context, as they preserve "protrusions",
i.e., high-curvature regions of the 3D volume, of articulated shapes, while
improving their separation in a lower dimensional space, making them in this
way easier to cluster. In this paper we therefore propose a spectral approach
to unsupervised and temporally-coherent body-protrusion segmentation along time
sequences. Volumetric shapes are clustered in an embedding space, clusters are
propagated in time to ensure coherence, and merged or split to accommodate
changes in the body's topology. Experiments on both synthetic and real
sequences of dense voxel-set data are shown. This supports the ability of the
proposed method to cluster body-parts consistently over time in a totally
unsupervised fashion, its robustness to sampling density and shape quality, and
its potential for bottom-up model construction. Comment: 31 pages, 26 figures
Segmentation of the left ventricle of the heart in 3-D+t MRI data using an optimized nonrigid temporal model
Modern medical imaging modalities provide large amounts of information in both the spatial and temporal domains, and incorporating this information in a coherent algorithmic framework is a significant challenge. In this paper, we present a novel and intuitive approach to combine 3-D spatial and temporal (3-D + time) magnetic resonance imaging (MRI) data in an integrated segmentation algorithm to extract the myocardium of the left ventricle. A novel level-set segmentation process is developed that simultaneously delineates and tracks the boundaries of the left ventricle muscle. By encoding prior knowledge about cardiac temporal evolution in a parametric framework, an expectation-maximization algorithm optimally tracks the myocardial deformation over the cardiac cycle. The expectation step deforms the level-set function while the maximization step updates the prior temporal model parameters to perform the segmentation in a nonrigid sense.