    Probabilistic methods for pose-invariant recognition in computer vision

    This thesis is concerned with two central themes in computer vision: the properties of oriented quadrature filters, and methods for implementing rotation invariance in an object matching and recognition system. Objects are modeled as combinations of local features, and human faces are used as the reference object class. The topics covered include optimal design of filter banks for feature detection and object recognition, modeling of pose effects in filter responses, and the construction of probability-based pose-invariant object matching and recognition systems employing oriented filters. Gabor filters have been derived as information-theoretically optimal bandpass filters, simultaneously maximizing the localization capability in the space and spatial-frequency domains. Steerable oriented filters have been developed as a tool for reducing the amount of computation required in rotation-invariant systems. In this work, the framework of steerable filters is applied to Gabor-type filters, and novel analytical derivations of the required steering equations are presented. Gabor filters and some related filters are experimentally shown to be approximately steerable with low steering error, given suitable filter shape parameters. The effects of filter shape parameters on feature localization and object recognition are also studied using a complete feature matching system. A novel approach for modeling the pose variation of features due to depth rotations is introduced. Instead of manifold learning methods, the use of synthetic data makes it possible to apply simpler regression modeling methods. The use of synthetic data in learning the pose models for local features is a central contribution of the work. The object matching methods considered in the work are based on probabilistic reasoning. The required object likelihood functions are constructed using feature similarity measures, and random sampling methods are applied for finding the modes of high probability in the likelihood functions. The Population Monte Carlo algorithm is shown to successfully solve pose estimation problems in which simple Metropolis and Gibbs sampling methods give unsatisfactory performance.

    Deformable kernels for early vision

    Early vision algorithms often have a first stage of linear filtering that 'extracts' from the image information at multiple scales of resolution and multiple orientations. A common difficulty in the design and implementation of such schemes is that one feels compelled to discretize coarsely the space of scales and orientations in order to reduce computation and storage costs. A technique is presented that allows: 1) computing the best approximation of a given family using linear combinations of a small number of 'basis' functions; and 2) describing all finite-dimensional families, i.e., the families of filters for which a finite-dimensional representation is possible with no error. The technique is based on singular value decomposition and may be applied to generating filters in arbitrary dimensions and subject to arbitrary deformations. The relevant functional analysis results are reviewed and precise conditions for the decomposition to be feasible are stated. Experimental results are presented that demonstrate the applicability of the technique to generating multi-orientation, multi-scale 2D edge-detection kernels. The implementation issues are also discussed.

    3D Steerable Wavelets in Practice

    Biologically inspired feature extraction for rotation and scale tolerant pattern analysis

    Biologically motivated information processing has been an important area of scientific research for decades. The central topic addressed in this dissertation is the utilization of lateral inhibition and, more generally, linear networks with recurrent connectivity, along with complex-log conformal mapping, in machine-based implementations of information encoding, feature extraction and pattern recognition. The reasoning behind, and a method for, spatially uniform implementation of an inhibitory/excitatory network model in the framework of the non-uniform log-polar transform is presented. For the space-invariant connectivity model characterized by a Toeplitz-block-Toeplitz matrix, the overall network response is obtained without matrix inverse operations, provided that the connection matrix generating function is bounded by unity. It is shown that for a network whose inter-neuron connection function is expandable in a Fourier series in polar angle, the overall network response is steerable. The decorrelating/whitening characteristics of networks with lateral inhibition are used to develop space-invariant pre-whitening kernels specialized for a specific category of input signals. These filters have an extremely small memory footprint and are successfully utilized to improve the performance of adaptive neural whitening algorithms. Finally, a method for feature extraction based on a localized Independent Component Analysis (ICA) transform in the log-polar domain, aided by the previously developed pre-whitening filters, is implemented. Since the output codes produced by ICA are very sparse, a small number of non-zero coefficients was sufficient to encode input data and obtain reliable pattern recognition performance.

    Complex Wavelet Bases, Steerability, and the Marr-Like Pyramid

    Our aim in this paper is to tighten the link between wavelets, some classical image-processing operators, and David Marr's theory of early vision. The cornerstone of our approach is a new complex wavelet basis that behaves like a smoothed version of the Gradient-Laplace operator. Starting from first principles, we show that a single-generator wavelet can be defined analytically and that it yields a semi-orthogonal complex basis of L_2(R^2), irrespective of the dilation matrix used. We also provide an efficient FFT-based filterbank implementation. We then propose a slightly redundant version of the transform that is nearly translation-invariant and that is optimized for better steerability (Gaussian-like smoothing kernel). We call it the Marr-like wavelet pyramid because it essentially replicates the processing steps in Marr's theory of early vision. We use it to derive a primal wavelet sketch, which is a compact description of the image by a multiscale, subsampled edge map. Finally, we provide an efficient iterative algorithm for the reconstruction of an image from its primal wavelet sketch.

    A Panorama on Multiscale Geometric Representations, Intertwining Spatial, Directional and Frequency Selectivity

    The richness of natural images makes the quest for optimal representations in image processing and computer vision challenging. The latter observation has not prevented the design of image representations, which trade off between efficiency and complexity, while achieving accurate rendering of smooth regions as well as reproducing faithful contours and textures. The most recent ones, proposed in the past decade, share a hybrid heritage highlighting the multiscale and oriented nature of edges and patterns in images. This paper presents a panorama of the aforementioned literature on decompositions in multiscale, multi-orientation bases or dictionaries. They typically exhibit redundancy to improve sparsity in the transformed domain and sometimes its invariance with respect to simple geometric deformations (translation, rotation). Oriented multiscale dictionaries extend traditional wavelet processing and may offer rotation invariance. Highly redundant dictionaries require specific algorithms to simplify the search for an efficient (sparse) representation. We also discuss the extension of multiscale geometric decompositions to non-Euclidean domains such as the sphere or arbitrary meshed surfaces. The etymology of panorama suggests an overview, based on a choice of partially overlapping "pictures". We hope that this paper will contribute to the appreciation and apprehension of a stream of current research directions in image understanding. Comment: 65 pages, 33 figures, 303 references

    Standardised convolutional filtering for radiomics

    The Image Biomarker Standardisation Initiative (IBSI) aims to improve reproducibility of radiomics studies by standardising the computational process of extracting image biomarkers (features) from images. We have previously established reference values for 169 commonly used features, created a standard radiomics image processing scheme, and developed reporting guidelines for radiomic studies. However, several aspects are not standardised. Here we present a preliminary version of a reference manual on the use of convolutional image filters in radiomics. Filters, such as wavelets or Laplacian of Gaussian filters, play an important part in emphasising specific image characteristics such as edges and blobs. Features derived from filter response maps have been found to be poorly reproducible. This reference manual forms the basis of ongoing work on standardising convolutional filters in radiomics, and will be updated as this work progresses. Comment: 62 pages. For additional information see https://theibsi.github.io