7,680 research outputs found
Deep learning cardiac motion analysis for human survival prediction
Motion analysis is used in computer vision to understand the behaviour of
moving objects in sequences of images. Optimising the interpretation of dynamic
biological systems requires accurate and precise motion tracking as well as
efficient representations of high-dimensional motion trajectories so that these
can be used for prediction tasks. Here we use image sequences of the heart,
acquired using cardiac magnetic resonance imaging, to create time-resolved
three-dimensional segmentations using a fully convolutional network trained on
anatomical shape priors. This dense motion model formed the input to a
supervised denoising autoencoder (4Dsurvival), which is a hybrid network
consisting of an autoencoder that learns a task-specific latent code
representation trained on observed outcome data, yielding a latent
representation optimised for survival prediction. To handle right-censored
survival outcomes, our network used a Cox partial likelihood loss function. In
a study of 302 patients the predictive accuracy (quantified by Harrell's
C-index) was significantly higher (p < .0001) for our model C=0.73 (95 CI:
0.68 - 0.78) than the human benchmark of C=0.59 (95 CI: 0.53 - 0.65). This
work demonstrates how a complex computer vision task using high-dimensional
medical image data can efficiently predict human survival
Online Mutual Foreground Segmentation for Multispectral Stereo Videos
The segmentation of video sequences into foreground and background regions is
a low-level process commonly used in video content analysis and smart
surveillance applications. Using a multispectral camera setup can improve this
process by providing more diverse data to help identify objects despite adverse
imaging conditions. The registration of several data sources is however not
trivial if the appearance of objects produced by each sensor differs
substantially. This problem is further complicated when parallax effects cannot
be ignored when using close-range stereo pairs. In this work, we present a new
method to simultaneously tackle multispectral segmentation and stereo
registration. Using an iterative procedure, we estimate the labeling result for
one problem using the provisional result of the other. Our approach is based on
the alternating minimization of two energy functions that are linked through
the use of dynamic priors. We rely on the integration of shape and appearance
cues to find proper multispectral correspondences, and to properly segment
objects in low contrast regions. We also formulate our model as a frame
processing pipeline using higher order terms to improve the temporal coherence
of our results. Our method is evaluated under different configurations on
multiple multispectral datasets, and our implementation is available online.Comment: Preprint accepted for publication in IJCV (December 2018
Component-based Segmentation of words from handwritten Arabic text
Efficient preprocessing is very essential for automatic recognition of handwritten documents. In this paper, techniques on segmenting words in handwritten Arabic text are presented. Firstly, connected components (ccs) are extracted, and distances among different components are analyzed. The statistical distribution of this distance is then obtained to determine an optimal threshold for words segmentation. Meanwhile, an improved projection based method is also employed for baseline detection. The proposed method has been successfully tested on IFN/ENIT database consisting of 26459 Arabic words handwritten by 411 different writers, and the results were promising and very encouraging in more accurate detection of the baseline and segmentation of words for further recognition
Continuous Action Recognition Based on Sequence Alignment
Continuous action recognition is more challenging than isolated recognition
because classification and segmentation must be simultaneously carried out. We
build on the well known dynamic time warping (DTW) framework and devise a novel
visual alignment technique, namely dynamic frame warping (DFW), which performs
isolated recognition based on per-frame representation of videos, and on
aligning a test sequence with a model sequence. Moreover, we propose two
extensions which enable to perform recognition concomitant with segmentation,
namely one-pass DFW and two-pass DFW. These two methods have their roots in the
domain of continuous recognition of speech and, to the best of our knowledge,
their extension to continuous visual action recognition has been overlooked. We
test and illustrate the proposed techniques with a recently released dataset
(RAVEL) and with two public-domain datasets widely used in action recognition
(Hollywood-1 and Hollywood-2). We also compare the performances of the proposed
isolated and continuous recognition algorithms with several recently published
methods
- …