Retinal Vessel Segmentation Using the 2-D Morlet Wavelet and Supervised Classification
We present a method for automated segmentation of the vasculature in retinal
images. The method produces segmentations by classifying each image pixel as
vessel or non-vessel, based on the pixel's feature vector. Feature vectors are
composed of the pixel's intensity and continuous two-dimensional Morlet wavelet
transform responses taken at multiple scales. The Morlet wavelet is capable of
tuning to specific frequencies, thus allowing noise filtering and vessel
enhancement in a single step. We use a Bayesian classifier with
class-conditional probability density functions (likelihoods) described as
Gaussian mixtures, yielding fast classification while still allowing complex
decision surfaces, and compare its performance with that of the linear
minimum squared error classifier. The probability distributions are estimated
from
a training set of labeled pixels obtained from manual segmentations. The
method's performance is evaluated on the publicly available DRIVE and STARE
databases of manually labeled non-mydriatic images. On the DRIVE database, it
achieves an area under the receiver operating characteristic (ROC) curve of
0.9598, slightly superior to that reported for the method of Staal et al.
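A minimal sketch of the pipeline this abstract describes, assuming skimage's Gabor filter as a stand-in for the 2-D Morlet wavelet (the two are closely related) and scikit-learn's GaussianMixture for the class-conditional likelihoods; `img` (a grayscale retinal image) and `y` (manual vessel labels) are hypothetical inputs:

```python
import numpy as np
from skimage.filters import gabor
from sklearn.mixture import GaussianMixture

def pixel_features(img, scales=(2, 4, 8)):
    """Stack pixel intensity with the strongest wavelet response per scale."""
    feats = [img]
    for s in scales:
        mags = []
        for t in np.linspace(0, np.pi, 6, endpoint=False):
            re, im = gabor(img, frequency=1.0 / s, theta=t)
            mags.append(np.sqrt(re ** 2 + im ** 2))  # modulus of the complex response
        feats.append(np.max(mags, axis=0))           # max over orientations
    return np.stack(feats, axis=-1).reshape(-1, len(scales) + 1)

def fit_bayes_gmm(X, y, k=5):
    """Class-conditional Gaussian mixtures + priors -> Bayesian pixel classifier."""
    gmms = {c: GaussianMixture(n_components=k).fit(X[y == c]) for c in (0, 1)}
    priors = {c: np.mean(y == c) for c in (0, 1)}
    def predict(Xnew):
        log_post = np.stack(
            [gmms[c].score_samples(Xnew) + np.log(priors[c]) for c in (0, 1)],
            axis=1)
        return np.argmax(log_post, axis=1)  # 1 = vessel, 0 = non-vessel
    return predict
```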
Online learning and fusion of orientation appearance models for robust rigid object tracking
We introduce a robust framework for learning and fusing orientation appearance models based on both texture and depth information for rigid object tracking. Our framework fuses data obtained from a standard visual camera with dense depth maps obtained by low-cost consumer depth cameras such as the Kinect. To combine these two completely different modalities, we propose to use features that do not depend on the data representation: angles. More specifically, our framework combines image gradient orientations, as extracted from intensity images, with the directions of surface normals computed from dense depth fields. We propose to capture the correlations between the obtained orientation appearance models using a fusion approach motivated by the original Active Appearance Models (AAMs). To incorporate these features in a learning framework, we use a robust kernel based on the Euler representation of angles, which does not require off-line training and can be efficiently implemented online. This kernel enables us to cope with gross measurement errors and missing data, as well as with other typical problems such as illumination changes and occlusions; the robustness of learning from orientation appearance models is demonstrated both theoretically and experimentally. By combining the proposed models with a particle filter, the framework was used to perform 2D-plus-3D rigid object tracking, achieving robust performance in very difficult tracking scenarios, including extreme pose variations.
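The Euler representation underlying such a kernel admits a compact illustration: each orientation field is mapped onto the unit circle, and two fields are compared through the real part of a complex inner product, i.e. a mean of cosines of angle differences. The sketch below shows only this correlation-of-angles idea; variable names are illustrative, and the full kernel and online learning machinery are not reproduced.

```python
import numpy as np

def euler_embed(angles):
    """Map angles (radians) to unit-norm complex features z = exp(i * angle)."""
    z = np.exp(1j * angles.ravel())
    return z / np.sqrt(z.size)

def orientation_similarity(angles_a, angles_b):
    """Re<z_a, z_b> = mean cos(a - b): near 1 for matching fields, near 0 for
    uncorrelated ones, so gross outliers (occlusions, missing depth) average out."""
    return np.real(np.vdot(euler_embed(angles_a), euler_embed(angles_b)))
```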
Robust Control and Fault Detection Filter Design for Aircraft Pitch Axis
This paper presents a robust control and fault detection filter
design for the linearized longitudinal dynamics of the F-16
aircraft. The control design is based on the μ-synthesis method,
which guarantees the robust performance requirements and takes the
structured uncertainty into consideration. For the F-16 aircraft,
it is assumed that an elevator failure and a sensor failure occur
during system operation. To ensure the safety of the aircraft
control system, a fault detection and isolation (FDI) filter is
designed. The fault detection filter design, based on a geometric
approach, relies on (C,A)-invariant subspaces, which make it
possible to decouple different types of failure. Typically, the
FDI filter is designed for the open-loop model and then applied in
the closed loop. In this paper, the FDI filter designed for the
aircraft control system is analyzed in the closed-loop system.
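As an illustration of the residual idea on which such an FDI filter builds, here is a minimal discrete-time observer-based residual generator; the geometric decoupling via (C,A)-invariant subspaces is not reproduced, and the matrices A, B, C and gain L are hypothetical placeholders for a linearized pitch-axis model.

```python
import numpy as np

def residual_sequence(A, B, C, L, u, y, x0=None):
    """Luenberger observer: xhat_{k+1} = A xhat_k + B u_k + L (y_k - C xhat_k).
    The residual r = y - C xhat stays near zero for the fault-free model and
    deviates when an actuator (elevator) or sensor fault enters the loop."""
    n = A.shape[0]
    xhat = np.zeros(n) if x0 is None else x0.copy()
    residuals = []
    for uk, yk in zip(u, y):                 # iterate over the input/output stream
        r = yk - C @ xhat
        residuals.append(r)
        xhat = A @ xhat + B @ uk + L @ r     # correct the estimate with the residual
    return np.array(residuals)

# A fault is flagged when ||r_k|| exceeds a threshold calibrated on fault-free data.
```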
Signal Subspace Processing in the Beam Space of a True Time Delay Beamformer Bank
A number of techniques for Radio Frequency (RF) source location for wide bandwidth signals have been described that utilize coherent signal subspace processing, but these often suffer from limitations such as the requirement for a preliminary source location estimate, the need to apply the technique iteratively, or high computational expense. This dissertation examines a method that performs subspace processing of the data from a bank of true time delay beamformers. The spatial diversity of the beamformer bank alleviates the need for a preliminary estimate while simultaneously reducing the dimensionality of subsequent signal subspace processing, resulting in computational efficiency. The pointing direction of the true time delay beams is independent of frequency, which results in a mapping from element space to beam space that is wide bandwidth in nature. This dissertation reviews previous methods, introduces the present method, presents simulation results that demonstrate the assertions, analyzes performance in relation to the Cramér-Rao Lower Bound (CRLB) at various noise levels, and discusses computational efficiency. One limitation of the method is that in practice it may be appropriate only for systems that can tolerate a limited field of view. Electronic Intelligence is one such application: it calls for high resolution of very wide bandwidth, closely spaced sources and often does not require a wide field of view. In relation to system applications, this dissertation also discusses practical employment of the method in terms of antenna elements, arrays, platforms, engagement geometries, and other parameters. The true time delay beam space method is shown through modeling and simulation to be capable of resolving closely spaced, very wideband sources over a relevant field of view in a single algorithmic pass, requiring no coarse preliminary estimation and exhibiting computational expense lower than that of many previous wideband coherent integration techniques.
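A compact sketch of the beam-space idea under simplifying assumptions (uniform linear array, ideal decomposition into narrowband frequency bins): true time delay steering keeps each beam's pointing direction frequency-invariant, so snapshots from all bins can be pooled into a single low-dimensional beam-space covariance before standard MUSIC is applied. The array geometry, bin handling, and beam grid below are illustrative choices, not the dissertation's parameters.

```python
import numpy as np

c, d, M = 3e8, 0.05, 16                      # wave speed, element spacing, elements
beams = np.deg2rad(np.linspace(-20, 20, 9))  # limited field of view, 9 TTD beams

def steering(f, theta):
    """Element-space response at frequency f for direction theta (ULA)."""
    tau = np.arange(M) * d * np.sin(theta) / c
    return np.exp(-2j * np.pi * f * tau)

def beamspace_covariance(snapshots):
    """snapshots: dict {f: (M, K) array of narrowband snapshots at frequency f}.
    TTD beamforming maps every bin into the same 9-dim beam space, so the
    per-bin covariances can simply be averaged."""
    R = np.zeros((len(beams), len(beams)), dtype=complex)
    for f, X in snapshots.items():
        T = np.stack([steering(f, b) for b in beams])   # (9, M) TTD weights
        Y = T.conj() @ X                                # beam-space snapshots
        R += Y @ Y.conj().T / X.shape[1]
    return R / len(snapshots)

def music_spectrum(R, grid, f_ref, n_src):
    """Standard MUSIC on the pooled beam-space covariance."""
    _, V = np.linalg.eigh(R)                            # ascending eigenvalues
    En = V[:, : len(beams) - n_src]                     # noise subspace
    T_ref = np.stack([steering(f_ref, b) for b in beams]).conj()
    spec = []
    for theta in grid:
        a = T_ref @ steering(f_ref, theta)              # beam-space steering vector
        spec.append(1.0 / np.linalg.norm(En.conj().T @ a) ** 2)
    return np.array(spec)
```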
Model-driven and Data-driven Approaches for some Object Recognition Problems
Recognizing objects from images and videos has been a long-standing problem in computer vision. The recent surge in the prevalence of visual cameras has given rise to two main challenges: (i) it is important to understand different sources of object variation in more unconstrained scenarios, and (ii) rather than describing an object in isolation, efficient learning methods for modeling object-scene 'contextual' relations are required to resolve visual ambiguities.
This dissertation addresses some aspects of these challenges and consists of two parts. The first part of the work focuses on obtaining object descriptors that are largely preserved across certain sources of variation, by utilizing models for image formation and local image features. Given a single instance of an object, we investigate the following three problems. (i) Representing a 2D projection of a 3D non-planar shape invariantly to articulations, when there are no self-occlusions: we propose an articulation-invariant distance that is preserved across piecewise-affine transformations of a non-rigid object's 'parts' under a weak perspective imaging model, and then obtain a shape-context-like descriptor to perform recognition. (ii) Understanding the space of 'arbitrary' blurred images of an object: by representing an unknown blur kernel of a known maximum size using a complete set of orthonormal basis functions spanning that space, we show that the subspaces resulting from convolving a clean object image and its blurred versions with these basis functions are equal under some assumptions. We then view these invariant subspaces as points on a Grassmann manifold, and use statistical tools that account for the underlying non-Euclidean nature of this space to perform recognition across blur. (iii) Analyzing the robustness of local feature descriptors to different illumination conditions: we perform an empirical study of these descriptors for the problem of face recognition under lighting change, and show that the direction of the image gradient largely preserves object properties across varying lighting conditions.
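For problem (ii), the Grassmann-manifold comparison can be sketched as follows: each object is represented by an orthonormal basis for its blur-invariant subspace, and two subspaces are compared through their principal angles. How the basis is built from the orthonormal blur basis functions is abstracted away here; `vectors` is a hypothetical stack of feature columns.

```python
import numpy as np

def orthonormal_basis(vectors):
    """Orthonormalize stacked feature vectors (columns) via thin QR."""
    Q, _ = np.linalg.qr(vectors)
    return Q

def grassmann_distance(U, V):
    """Geodesic distance on the Grassmannian from principal angles:
    the cosines of the angles are the singular values of U^T V."""
    s = np.linalg.svd(U.T @ V, compute_uv=False)
    theta = np.arccos(np.clip(s, -1.0, 1.0))
    return np.linalg.norm(theta)

# Nearest-neighbour recognition compares a probe subspace to each gallery
# subspace with grassmann_distance; under the stated assumptions the result
# is invariant to the unknown blur.
```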
The second part of the dissertation utilizes the information conveyed by large quantities of data to learn contextual information shared by an object (or an entity) with its surroundings. (i) We first consider a supervised two-class problem of detecting lane markings from road video sequences, where we learn relevant feature-level contextual information through a machine learning algorithm based on boosting. We then focus on unsupervised object classification scenarios where (ii) we perform clustering using maximum-margin principles, by deriving some basic properties on the affinity of 'a pair of points' belonging to the same cluster using the information conveyed by 'all' points in the system, and (iii) we consider correspondence-free adaptation of statistical classifiers across domain-shifting transformations, by generating meaningful 'intermediate domains' that incrementally convey potential information about the domain change.
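A hedged sketch of how such 'intermediate domains' can be generated under one standard recipe: fit a subspace (e.g. by PCA) to the source and the target domain, then sample points along the Grassmann geodesic connecting the two. The log/exp-map computation below is a common numerical route; the inputs are orthonormal basis matrices, assumed non-orthogonal to each other so the inverse exists.

```python
import numpy as np

def grassmann_geodesic(U_src, U_tgt, t):
    """Subspace at fraction t of the geodesic from span(U_src) to span(U_tgt)."""
    X = U_tgt @ np.linalg.inv(U_src.T @ U_tgt)   # align target basis to source frame
    L = X - U_src @ (U_src.T @ X)                # component orthogonal to source span
    Q, S, Vt = np.linalg.svd(L, full_matrices=False)
    theta = np.arctan(S)                         # principal angles between subspaces
    return (U_src @ Vt.T) @ np.diag(np.cos(t * theta)) @ Vt \
        + Q @ np.diag(np.sin(t * theta)) @ Vt

# intermediate = [grassmann_geodesic(U_src, U_tgt, t) for t in (0.25, 0.5, 0.75)]
# Projecting data onto these bases yields features that shift incrementally
# from the source domain toward the target domain.
```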
Projection Based Models for High Dimensional Data
In recent years, many machine learning applications have arisen which deal with the
problem of finding patterns in high dimensional data. Principal component analysis
(PCA) has become ubiquitous in this setting. PCA performs dimensionality reduction
by estimating latent factors which minimise the reconstruction error between
the original data and its low-dimensional projection. We initially consider a situation
where influential observations exist within the dataset that have a large,
adverse effect on the estimated PCA model. We propose a measure of “predictive
influence” to detect these points based on the contribution of each point to the
leave-one-out reconstruction error of the model using an analytic PRedicted REsidual
Sum of Squares (PRESS) statistic. We then develop a robust alternative to PCA
that minimises the predictive reconstruction error in the presence of
influential observations and outliers.
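To make the definition concrete, here is a naive leave-one-out computation of the predictive influence; the thesis derives an analytic PRESS statistic that avoids the explicit refitting loop shown below.

```python
import numpy as np
from sklearn.decomposition import PCA

def predictive_influence(X, n_components=2):
    """Per-point leave-one-out reconstruction error; large values flag
    influential observations."""
    n = X.shape[0]
    press = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        pca = PCA(n_components=n_components).fit(X[mask])   # refit without point i
        x_hat = pca.inverse_transform(pca.transform(X[i:i + 1]))
        press[i] = np.sum((X[i] - x_hat) ** 2)              # predicted residual
    return press
```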
In some applications there may be unobserved clusters in the data, for which
fitting PCA models to subsets of the data would provide a better fit. This is known
as the subspace clustering problem. We develop a novel algorithm for subspace
clustering which iteratively fits PCA models to subsets of the data and assigns observations
to clusters based on their predictive influence on the reconstruction error.
We study the convergence of the algorithm and compare its performance to a number
of subspace clustering methods on simulated data and in real applications from
computer vision involving clustering object trajectories in video sequences and images
of faces.
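A minimal sketch of the alternating scheme just described, using plain reconstruction error as a stand-in for the predictive-influence criterion developed in the thesis:

```python
import numpy as np
from sklearn.decomposition import PCA

def subspace_clustering(X, k=3, n_components=2, n_iter=20, seed=0):
    """Iteratively fit one PCA model per cluster and reassign each point to the
    cluster whose model reconstructs it best (assumes no cluster empties out)."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(k, size=X.shape[0])        # random initial assignment
    for _ in range(n_iter):
        models = [PCA(n_components=n_components).fit(X[labels == j])
                  for j in range(k)]
        errors = np.stack(
            [np.sum((X - m.inverse_transform(m.transform(X))) ** 2, axis=1)
             for m in models], axis=1)
        new_labels = errors.argmin(axis=1)           # best-reconstructing model wins
        if np.array_equal(new_labels, labels):
            break                                    # converged
        labels = new_labels
    return labels
```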
We extend our predictive clustering framework to a setting where two high-dimensional
views of data have been obtained. Often, only either clustering or predictive modelling is performed between the views. Instead, we aim to recover
clusters which are maximally predictive between the views. In this setting, two-block
partial least squares (TB-PLS) is a useful model. TB-PLS performs dimensionality
reduction in both views by estimating latent factors that are highly predictive. We
fit TB-PLS models to subsets of data and assign points to clusters based on their
predictive influence under each model which is evaluated using a PRESS statistic.
We compare our method to state-of-the-art algorithms in real webpage and
document clustering applications and find that our approach to predictive clustering
yields superior results.
Finally, we propose a method for dynamically tracking multivariate data streams
based on PLS. Our method learns a linear regression function from multivariate
input and output streaming data in an incremental fashion while also performing
dimensionality reduction and variable selection. Moreover, the recursive regression
model is able to adapt to sudden changes in the data generating mechanism and also
identifies the number of latent factors. We apply our method to the enhanced index
tracking problem in computational finance.
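As a hedged illustration of the adaptive ingredient in this last contribution, the sketch below uses plain recursive least squares with a forgetting factor rather than the thesis's recursive PLS: the forgetting factor discounts old samples so the regression can track sudden changes in the data-generating mechanism, though it performs no dimensionality reduction or variable selection.

```python
import numpy as np

class RecursiveRegression:
    """Recursive least squares with exponential forgetting for streaming data."""
    def __init__(self, n_inputs, forgetting=0.99, delta=1e3):
        self.P = delta * np.eye(n_inputs)   # inverse input-covariance estimate
        self.w = np.zeros(n_inputs)         # regression coefficients
        self.lam = forgetting               # lam < 1 discounts old samples

    def update(self, x, y):
        """One RLS step on a new (input, output) pair from the stream."""
        Px = self.P @ x
        k = Px / (self.lam + x @ Px)        # gain vector
        self.w += k * (y - x @ self.w)      # correct with the prediction error
        self.P = (self.P - np.outer(k, Px)) / self.lam
        return x @ self.w                   # prediction with updated coefficients
```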
Sparse Modeling for Image and Vision Processing
In recent years, a large amount of multi-disciplinary research has been
conducted on sparse models and their applications. In statistics and machine
learning, the sparsity principle is used to perform model selection---that is,
automatically selecting a simple model among a large collection of them. In
signal processing, sparse coding consists of representing data with linear
combinations of a few dictionary elements. Subsequently, the corresponding
tools have been widely adopted by several scientific communities such as
neuroscience, bioinformatics, or computer vision. The goal of this monograph is
to offer a self-contained view of sparse modeling for visual recognition and
image processing. More specifically, we focus on applications where the
dictionary is learned and adapted to data, yielding a compact representation
that has been successful in various contexts.
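A minimal sketch of the dictionary-learning setting the monograph surveys, using scikit-learn: learn a dictionary adapted to image patches and encode each patch as a sparse combination of a few atoms. Patch extraction, sizes, and sparsity level are illustrative.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

def learn_sparse_codes(patches, n_atoms=100, sparsity=5):
    """patches: (n_samples, patch_dim) array of vectorized image patches."""
    dl = MiniBatchDictionaryLearning(n_components=n_atoms,
                                     transform_algorithm='omp',
                                     transform_n_nonzero_coefs=sparsity)
    codes = dl.fit(patches).transform(patches)   # sparse coefficients per patch
    return dl.components_, codes                 # learned dictionary D and codes
```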