Ball-Scale Based Hierarchical Multi-Object Recognition in 3D Medical Images
This paper investigates, using prior shape models and the concept of ball
scale (b-scale), ways of automatically recognizing objects in 3D images without
performing elaborate searches or optimization. That is, the goal is to place
the model in a single shot close to the right pose (position, orientation, and
scale) in a given image so that the model boundaries fall in the close vicinity
of object boundaries in the image. This is achieved via the following set of
key ideas: (a) A semi-automatic way of constructing a multi-object shape model
assembly. (b) A novel strategy of encoding, via b-scale, the pose relationship
between objects in the training images and their intensity patterns captured in
b-scale images. (c) A hierarchical mechanism of positioning the model, in a
one-shot way, in a given image from a knowledge of the learnt pose relationship
and the b-scale image of the given image to be segmented. The evaluation
results on a set of 20 routine clinical abdominal female and male CT data sets
indicate the following: (1) Incorporating a large number of objects improves
the recognition accuracy dramatically. (2) The recognition algorithm can be
thought of as a hierarchical framework in which quick replacement of the model
assembly is defined as coarse recognition and delineation itself is regarded as
the finest recognition. (3) Scale yields useful information about the relationship
between the model assembly and any given image such that the recognition
results in a placement of the model close to the actual pose without doing any
elaborate searches or optimization. (4) Effective object recognition can make
delineation more accurate.
Comment: This paper was published and presented in SPIE Medical Imaging 201
Gaussian Process Morphable Models
Statistical shape models (SSMs) represent a class of shapes as a normal
distribution of point variations, whose parameters are estimated from example
shapes. Principal component analysis (PCA) is applied to obtain a
low-dimensional representation of the shape variation in terms of the leading
principal components. In this paper, we propose a generalization of SSMs,
called Gaussian Process Morphable Models (GPMMs). We model the shape variations
with a Gaussian process, which we represent using the leading components of its
Karhunen-Loève expansion. To compute the expansion, we make use of an
approximation scheme based on the Nyström method. The resulting model can be
seen as a continuous analogue of an SSM. However, while for SSMs the shape
variation is restricted to the span of the example data, with GPMMs we can
define the shape variation using any Gaussian process. For example, we can
build shape models that correspond to classical spline models, and thus do not
require any example data. Furthermore, Gaussian processes make it possible to
combine different models. For example, an SSM can be extended with a spline
model, to obtain a model that incorporates learned shape characteristics, but
is flexible enough to explain shapes that cannot be represented by the SSM. We
introduce a simple algorithm for fitting a GPMM to a surface or image. This
results in a non-rigid registration approach, whose regularization properties
are defined by a GPMM. We show how we can obtain different registration
schemes, including methods for multi-scale, spatially-varying or hybrid
registration, by constructing an appropriate GPMM. As our approach strictly
separates modelling from the fitting process, this is all achieved without
changes to the fitting algorithm. We show the applicability and versatility of
GPMMs on a clinical use case, where the goal is the model-based segmentation of
3D forearm images
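
The low-rank representation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a scalar-valued deformation on a 1D set of points, a Gaussian kernel, and randomly chosen landmark points for the Nyström step. All function names (`gaussian_kernel`, `nystrom_basis`, `sample_deformation`) and parameters are hypothetical.

```python
import numpy as np


def gaussian_kernel(x, y, sigma=0.3, scale=1.0):
    """Gaussian covariance between point sets x (n, d) and y (m, d)."""
    sq_dist = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return scale ** 2 * np.exp(-sq_dist / (2.0 * sigma ** 2))


def nystrom_basis(points, kernel, num_landmarks, rank, rng):
    """Approximate the leading Karhunen-Loeve components on `points`.

    Returns an (n, rank) matrix whose i-th column is sqrt(lambda_i) * phi_i
    evaluated at every point, so that basis @ basis.T approximates the
    full covariance matrix kernel(points, points).
    """
    # Assumption: landmarks are a uniform random subset of the points.
    idx = rng.choice(len(points), size=num_landmarks, replace=False)
    landmarks = points[idx]

    # Eigen-decomposition of the small landmark covariance matrix.
    evals, evecs = np.linalg.eigh(kernel(landmarks, landmarks))
    order = np.argsort(evals)[::-1][:rank]  # keep the leading components
    evals, evecs = evals[order], evecs[:, order]

    # Nystrom extension of the eigenvectors to all points.
    return kernel(points, landmarks) @ evecs / np.sqrt(evals)


def sample_deformation(basis, rng):
    """Draw one random deformation: sum_i alpha_i * sqrt(lambda_i) * phi_i."""
    alpha = rng.standard_normal(basis.shape[1])
    return basis @ alpha


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = np.linspace(0.0, 1.0, 50).reshape(-1, 1)
    basis = nystrom_basis(points, gaussian_kernel, num_landmarks=20, rank=6, rng=rng)
    print(sample_deformation(basis, rng).shape)
```

Because the model enters only through `basis`, swapping in a different kernel changes the prior without touching the sampling code, which mirrors the separation of modelling from fitting that the abstract emphasizes.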
Structured Knowledge Representation for Image Retrieval
We propose a structured approach to the problem of retrieval of images by
content and present a description logic that has been devised for the semantic
indexing and retrieval of images containing complex objects. As other
approaches do, we start from low-level features extracted with image analysis
to detect and characterize regions in an image. However, in contrast with
feature-based approaches, we provide a syntax to describe segmented regions as
basic objects and complex objects as compositions of basic ones. Then we
introduce a companion extensional semantics for defining reasoning services,
such as retrieval, classification, and subsumption. These services can be used
for both exact and approximate matching, using similarity measures. Using our
logical approach as a formal specification, we implemented a complete
client-server image retrieval system, which allows a user to pose both queries
by sketch and queries by example. A set of experiments has been carried out on
a testbed of images to assess the retrieval capabilities of the system in
comparison with expert users' rankings. Results are presented adopting a
well-established measure of quality borrowed from textual information
retrieval
Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on
visual data recorded from a single video camera. We propose an organizing
framework that highlights the evolution of the area, with techniques
moving from heavily constrained motion capture scenarios towards more
challenging, realistic, "in the wild" videos. The proposed organization is
based on the representation used as input for the recognition task, emphasizing
the hypotheses assumed and thus the constraints imposed on the type of video
that each technique is able to address. Making the hypotheses and
constraints explicit makes the framework particularly useful for selecting a
method for a given application. Another advantage of the proposed organization
is that it allows the newest approaches to be categorized seamlessly alongside
traditional ones, while providing an insightful perspective on the evolution of
the action recognition task up to now. That perspective is the basis for the
discussion at the end of the paper, where we also present the main open issues
in the area.
Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4
table
STV-based Video Feature Processing for Action Recognition
Compared with still-image processing, video features can provide rich and intuitive information about dynamic events that occur over a period of time, such as human actions, crowd behaviours, and other changes in subject patterns. Although substantial progress has been made in image processing over the last decade, with successful applications in face matching and object recognition, video-based event detection remains one of the most difficult challenges in computer vision research because of its complex continuous or discrete input signals, arbitrary dynamic feature definitions, and often ambiguous analytical methods. This paper proposes a 3D shape-matching method based on Spatio-Temporal Volumes (STV) and region intersection (RI) to facilitate the definition and recognition of human actions recorded in videos. The distinctive characteristics and performance gain of the approach stem from a coefficient factor-boosted 3D region intersection and matching mechanism developed in this research. The paper also reports an investigation into techniques for efficient STV data filtering that reduce the number of voxels (volumetric pixels) to be processed in each operational cycle of the implemented system. The encouraging features and the improvements in operational performance registered in the experiments are discussed at the end
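
The idea of comparing actions through the overlap of spatio-temporal volumes can be sketched as follows. This is a simplified illustration only: it assumes each action is given as a sequence of equally sized binary silhouette frames, and it scores overlap with a plain intersection-over-union ratio. The coefficient factor-boosted mechanism mentioned in the abstract is not specified there and is not reproduced here; the function names `build_stv` and `region_intersection_score` are hypothetical.

```python
import numpy as np


def build_stv(silhouettes):
    """Stack per-frame binary silhouettes (H, W) into a (T, H, W) boolean volume."""
    return np.stack([np.asarray(frame, dtype=bool) for frame in silhouettes], axis=0)


def region_intersection_score(stv_a, stv_b):
    """Ratio of voxels shared by both volumes to voxels occupied by either one."""
    if stv_a.shape != stv_b.shape:
        raise ValueError("volumes must have the same (T, H, W) shape")
    intersection = np.logical_and(stv_a, stv_b).sum()
    union = np.logical_or(stv_a, stv_b).sum()
    # Two empty volumes share nothing; define their score as 0.
    return float(intersection) / float(union) if union else 0.0


if __name__ == "__main__":
    frame_a = np.zeros((4, 4), dtype=bool)
    frame_a[:, 0:2] = True  # silhouette occupies columns 0 and 1
    frame_b = np.zeros((4, 4), dtype=bool)
    frame_b[:, 1:3] = True  # silhouette occupies columns 1 and 2

    query = build_stv([frame_a] * 3)
    template = build_stv([frame_b] * 3)
    print(region_intersection_score(query, template))
```

In a sketch like this, filtering the volumes before comparison (for example, cropping to a bounding box) would shrink the arrays passed to `region_intersection_score`, which is the kind of voxel reduction the abstract refers to.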