220 research outputs found
Detection of image structures using the Fisher information and the Rao metric
In many detection problems, the structures to be detected are parameterized by the points of a parameter space. If the conditional probability density function for the measurements is known, then detection can be achieved by sampling the parameter space at a finite number of points and checking each point to see if the corresponding structure is supported by the data. The number of samples and the distances between neighboring samples are calculated using the Rao metric on the parameter space. The Rao metric is obtained from the Fisher information which is, in turn, obtained from the conditional probability density function. An upper bound is obtained for the probability of a false detection. The calculations are simplified in the low noise case by making an asymptotic approximation to the Fisher information. An application to line detection is described. Expressions are obtained for the asymptotic approximation to the Fisher information, the volume of the parameter space, and the number of samples. The time complexity for line detection is estimated. An experimental comparison is made with a Hough transform-based method for detecting lines
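For orientation, the standard definitions behind the abstract above can be written as follows. The notation is assumed here and is not taken from the paper: $p(x \mid \theta)$ is the conditional density of the measurements, $J$ the Fisher information, and $n$ the dimension of the parameter space $\Theta$.

```latex
J_{ij}(\theta) = \mathbb{E}_{x \mid \theta}\!\left[ \frac{\partial \ln p(x \mid \theta)}{\partial \theta_i} \, \frac{\partial \ln p(x \mid \theta)}{\partial \theta_j} \right], \qquad
ds^2 = \sum_{i,j=1}^{n} J_{ij}(\theta)\, d\theta_i\, d\theta_j, \qquad
V(\Theta) = \int_{\Theta} \sqrt{\det J(\theta)}\; d\theta .
```

Here $ds^2$ is the squared length element of the Rao metric and $V(\Theta)$ is the volume of the parameter space under that metric, the quantity from which the number of samples is estimated.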
The Fisher-Rao metric for projective transformations of the line
A conditional probability density function is defined for measurements arising from a projective transformation of the line. The conditional density is a member of a parameterised family of densities in which the parameter takes values in the three dimensional manifold of projective transformations of the line. The Fisher information of the family defines on the manifold a Riemannian metric known as the Fisher-Rao metric. The Fisher-Rao metric has an approximation which is accurate if the variance of the measurement errors is small. It is shown that the manifold of parameter values has a finite volume under the approximating metric.
These results are the basis of a simple algorithm for detecting those projective transformations of the line which are compatible with a given set of measurements. The algorithm searches a finite list of representative parameter values for those values compatible with the measurements. Experiments with the algorithm suggest that it can detect a projective transformation of the line even when the correspondences between the components of the measurements in the domain and the range of the projective transformation are unknown
A Fisher-Rao metric for paracatadioptric images of lines
In a central paracatadioptric imaging system a perspective camera takes an image of a scene reflected in a paraboloidal mirror. A 360° field of view is obtained, but the image is severely distorted. In particular, straight lines in the scene project to circles in the image. These distortions make it difficult to detect projected lines using standard image processing algorithms. The distortions are removed using a Fisher-Rao metric which is defined on the space of projected lines in the paracatadioptric image. The space of projected lines is divided into subsets such that on each subset the Fisher-Rao metric is closely approximated by the Euclidean metric. Each subset is sampled at the vertices of a square grid and values are assigned to the sampled points using an adaptation of the trace transform. The result is a set of digital images to which standard image processing algorithms can be applied.
The effectiveness of this approach to line detection is illustrated using two algorithms, both of which are based on the Sobel edge operator. The task of line detection is reduced to the task of finding isolated peaks in a Sobel image. An experimental comparison is made between these two algorithms and a third algorithm, taken from the literature and based on the Hough transform
Application of the Fisher-Rao metric to ellipse detection
The parameter space for the ellipses in a two dimensional image is a five dimensional manifold, where each point of the manifold corresponds to an ellipse in the image. The parameter space becomes a Riemannian manifold under a Fisher-Rao metric, which is derived from a Gaussian model for the blurring of ellipses in the image. Two points in the parameter space are close together under the Fisher-Rao metric if the corresponding ellipses are close together in the image. The Fisher-Rao metric is accurately approximated by a simpler metric under the assumption that the blurring is small compared with the sizes of the ellipses under consideration. It is shown that the parameter space for the ellipses in the image has a finite volume under the approximation to the Fisher-Rao metric. As a consequence, the parameter space can be replaced, for the purpose of ellipse detection, by a finite set of points sampled from it. An efficient algorithm for sampling the parameter space is described. The algorithm uses the fact that the approximating metric is flat, and therefore locally Euclidean, on each three dimensional family of ellipses with a fixed orientation and a fixed eccentricity. Once the sample points have been obtained, ellipses are detected in a given image by checking each sample point in turn to see if the corresponding ellipse is supported by the nearby image pixel values. The resulting algorithm for ellipse detection is implemented. A multiresolution version of the algorithm is also implemented. The experimental results suggest that ellipses can be reliably detected in a given low resolution image and that the number of false detections can be reduced using the multiresolution algorithm
Non-parametric hidden conditional random fields for action classification
The Conditional Random Field (CRF), a structured prediction method, combines probabilistic graphical models and discriminative classification techniques in order to predict class labels in sequence recognition problems. Its extension, the Hidden Conditional Random Field (HCRF), uses hidden state variables in order to capture intermediate structures. The number of hidden states in an HCRF must be specified a priori, yet this number is often not known in advance. A non-parametric extension to the HCRF, with the number of hidden states automatically inferred from data, is proposed here. This is a significant advantage over the classical HCRF since it avoids ad hoc model selection procedures. Further, the training and inference procedure is fully Bayesian, eliminating the overfitting problem associated with frequentist methods. In particular, our construction is based on scale mixtures of Gaussians as priors over the HCRF parameters and makes use of the Hierarchical Dirichlet Process (HDP) and the Laplace distribution. The proposed inference procedure uses elliptical slice sampling, a Markov Chain Monte Carlo (MCMC) method, in order to sample optimal and sparse posterior HCRF parameters. The technique is applied to classifying human actions that occur in depth image sequences, a challenging computer vision problem. Experiments with real-world video datasets confirm the efficacy of our classification approach
Action classification using a discriminative multilevel HDP-HMM
We classify human actions occurring in depth image sequences using features based on skeletal joint positions. The action classes are represented by a multi-level Hierarchical Dirichlet Process – Hidden Markov Model (HDP-HMM). The non-parametric HDP-HMM allows the inference of hidden states automatically from training data. The model parameters of each class are formulated as transformations from a shared base distribution, thus promoting the use of unlabelled examples during training and borrowing information across action classes. Further, the parameters are learnt in a discriminative way. We use a normalized gamma process representation of HDP and margin based likelihood functions for this purpose. We sample parameters from the complex posterior distribution induced by our discriminative likelihood function using elliptical slice sampling. Experiments with two different datasets show that action class models learnt using our technique produce good classification results
On plane-based camera calibration: a general algorithm, singularities, applications
We present a general algorithm for plane-based calibration that can deal with arbitrary numbers of views and calibration planes. The algorithm can simultaneously calibrate different views from a camera with variable intrinsic parameters and it is easy to incorporate known values of intrinsic parameters. For some minimal cases, we describe all singularities, naming the parameters that cannot be estimated. Experimental results are shown that exhibit the singularities while revealing good performance in non-singular conditions. Several applications of plane-based 3D geometry inference are discussed as well
Generalized visual information analysis via tensorial algebra
High order data is modeled using matrices whose entries are numerical arrays of a fixed size. These arrays, called t-scalars, form a commutative ring under the convolution product. Matrices with elements in the ring of t-scalars are referred to as t-matrices. The t-matrices can be scaled, added and multiplied in the usual way. There are t-matrix generalizations of positive matrices, orthogonal matrices and Hermitian symmetric matrices. With the t-matrix model, it is possible to generalize many well known matrix algorithms. In particular, the t-matrices are used to generalize the SVD (Singular Value Decomposition), HOSVD (High Order SVD), PCA (Principal Component Analysis), 2DPCA (two Dimensional PCA) and GCA (Grassmannian Component Analysis). The generalized t-matrix algorithms, namely TSVD, THOSVD, TPCA, T2DPCA and TGCA, are applied to low-rank approximation, reconstruction and supervised classification of images. Experiments show that the t-matrix algorithms compare favourably with standard matrix algorithms
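A minimal sketch of the t-scalar arithmetic described above, assuming the convolution product is circular (the abstract says only "convolution product") and computing it with the FFT. The function names `t_multiply`, `t_identity` and `t_matmul`, and the storage layout `(rows, cols, *t_shape)` for t-matrices, are hypothetical choices made for this illustration and are not taken from the paper.

```python
import numpy as np


def t_multiply(a, b):
    """Product of two t-scalars: circular convolution (assumed), via the FFT."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("t-scalars must share the same fixed shape")
    return np.real(np.fft.ifftn(np.fft.fftn(a) * np.fft.fftn(b)))


def t_identity(shape):
    """Multiplicative identity: 1 at the origin, 0 elsewhere. `shape` is a tuple."""
    e = np.zeros(shape)
    e[(0,) * len(shape)] = 1.0
    return e


def t_matmul(A, B):
    """Multiply t-matrices stored as arrays of shape (rows, cols, *t_shape).

    In the Fourier domain the convolution product becomes an elementwise
    product, so the t-matrix product reduces to one ordinary matrix product
    per frequency.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    axes = tuple(range(2, A.ndim))
    FA = np.fft.fftn(A, axes=axes)
    FB = np.fft.fftn(B, axes=axes)
    FC = np.einsum("ik...,kj...->ij...", FA, FB)
    return np.real(np.fft.ifftn(FC, axes=axes))


if __name__ == "__main__":
    a = np.array([1.0, 2.0, 3.0])
    b = np.array([0.0, 1.0, 0.0])  # a circular shift by one position
    print(t_multiply(a, b))
    print(np.allclose(t_multiply(a, b), t_multiply(b, a)))  # commutativity
```

Under these assumptions a 1-by-1 t-matrix product agrees with `t_multiply`, and multiplying by `t_identity` leaves a t-scalar unchanged.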
A system for learning statistical motion patterns
Analysis of motion patterns is an effective approach for anomaly detection and behavior prediction. Current approaches for the analysis of motion patterns depend on known scenes, where objects move in predefined ways. It is highly desirable to automatically construct object motion patterns which reflect the knowledge of the scene. In this paper, we present a system for automatically learning motion patterns for anomaly detection and behavior prediction based on a proposed algorithm for robustly tracking multiple objects. In the tracking algorithm, foreground pixels are clustered using a fast accurate fuzzy k-means algorithm. Growing and prediction of the cluster centroids of foreground pixels ensure that each cluster centroid is associated with a moving object in the scene. In the algorithm for learning motion patterns, trajectories are clustered hierarchically using spatial and temporal information and then each motion pattern is represented with a chain of Gaussian distributions. Based on the learned statistical motion patterns, statistical methods are used to detect anomalies and predict behaviors. Our system is tested using image sequences acquired, respectively, from a crowded real traffic scene and a model traffic scene. Experimental results show the robustness of the tracking algorithm, the efficiency of the algorithm for learning motion patterns, and the encouraging performance of algorithms for anomaly detection and behavior prediction
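To make the clustering step concrete, here is a minimal sketch of the textbook fuzzy c-means iteration (also called fuzzy k-means) applied to point coordinates. It is not the "fast accurate" variant of the paper, whose details the abstract does not give. The function names `memberships_for` and `fuzzy_c_means`, the fuzzifier `m=2.0`, and the stopping rule are all assumptions made for illustration.

```python
import numpy as np


def memberships_for(points, centroids, m):
    """Soft memberships: each row sums to 1 across the clusters."""
    dist = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    dist = np.maximum(dist, 1e-12)  # avoid division by zero at a centroid
    inv = dist ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6,
                  init=None, seed=0):
    """Standard fuzzy c-means; returns (centroids, memberships)."""
    points = np.asarray(points, dtype=float)
    if init is None:
        # Default initialisation: distinct data points chosen at random.
        rng = np.random.default_rng(seed)
        idx = rng.choice(len(points), size=n_clusters, replace=False)
        centroids = points[idx].copy()
    else:
        centroids = np.asarray(init, dtype=float).copy()
    for _ in range(n_iter):
        weights = memberships_for(points, centroids, m) ** m
        # Each centroid is the weighted mean of all points.
        updated = (weights.T @ points) / weights.sum(axis=0)[:, None]
        shift = np.linalg.norm(updated - centroids)
        centroids = updated
        if shift < tol:
            break
    return centroids, memberships_for(points, centroids, m)


if __name__ == "__main__":
    # Toy "foreground pixel" coordinates forming two separated groups.
    pixels = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    centroids, u = fuzzy_c_means(pixels, 2, init=[[0, 0], [10, 10]])
    print(centroids)
    print(u.argmax(axis=1))
```

In a tracker of the kind described, the centroids from one frame would plausibly seed the `init` argument for the next frame, which is one way to read the "growing and prediction of the cluster centroids" step; that reading is an assumption.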
A case against epipolar geometry
We discuss briefly a number of areas where epipolar geometry is currently central in carrying out visual tasks. In contrast we demonstrate configurations for which 3D projective invariants can be computed from perspective stereo pairs, but epipolar geometry (and full projective structure) cannot. We catalogue a number of these configurations which generally involve isotropies under the 3D projective group, and investigate the connection with camera calibration. Examples are given of the invariants recovered from real images. We also indicate other areas where a strong reliance on epipolar geometry should be avoided, in particular for image transfer