A Lie group approach to steerable filters
Freeman and Adelson (1991) recently published an approach for steering the orientation of filters using Fourier decompositions with respect to the angular coordinate of a polar representation. Simoncelli et al. (1992) generalized this method to steer parameters other than orientation. In this paper we formulate the problem of steerability using the Lie group that performs the deformation of the filters. Within the presented theoretical framework we discuss in particular the following points: (1) the possible scope and (2) the optimality of steerability by Fourier decompositions, (3) approximate steerability using a limited number of basis functions, and (4) the nature of the singularity that occurs when steering the scale
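As a small illustration of the orientation steering that this abstract builds on, the sketch below uses the textbook case of a first-derivative-of-Gaussian filter, which can be synthesized at any angle from two basis filters. It is not the Lie group formulation of the paper; the function names (`gauss_dx`, `gauss_dy`, `steered`, `rotated`, `max_steering_error`) and the grid parameters are assumptions chosen for illustration.

```python
import math

def gauss_dx(x, y, sigma=1.0):
    """x-derivative of an unnormalized isotropic 2D Gaussian (basis filter at 0 degrees)."""
    return -x / sigma**2 * math.exp(-(x * x + y * y) / (2 * sigma**2))

def gauss_dy(x, y, sigma=1.0):
    """y-derivative of the same Gaussian (basis filter at 90 degrees)."""
    return -y / sigma**2 * math.exp(-(x * x + y * y) / (2 * sigma**2))

def steered(x, y, theta, sigma=1.0):
    """Filter at orientation theta, synthesized as a linear combination of the two basis filters."""
    return (math.cos(theta) * gauss_dx(x, y, sigma)
            + math.sin(theta) * gauss_dy(x, y, sigma))

def rotated(x, y, theta, sigma=1.0):
    """Reference: gauss_dx evaluated directly in coordinates rotated by theta."""
    xr = math.cos(theta) * x + math.sin(theta) * y
    yr = -math.sin(theta) * x + math.cos(theta) * y
    return gauss_dx(xr, yr, sigma)

def max_steering_error(theta, n=21, extent=3.0):
    """Largest absolute difference between steered and rotated filters on an n x n grid."""
    grid = [-extent + 2 * extent * i / (n - 1) for i in range(n)]
    return max(abs(steered(x, y, theta) - rotated(x, y, theta))
               for x in grid for y in grid)

# The error should be at the level of floating-point rounding.
print(max_steering_error(0.7))
```

For this filter the steering is exact, so two basis functions suffice; point (3) of the abstract concerns filters where a limited number of basis functions gives only an approximation.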
Dynamic Steerable Blocks in Deep Residual Networks
Filters in convolutional networks are typically parameterized in a pixel
basis that does not take prior knowledge about the visual world into account.
We investigate the generalized notion of frames designed with image properties
in mind, as alternatives to this parametrization. We show that frame-based
ResNets and DenseNets can improve performance on CIFAR-10+ consistently, while
having additional pleasant properties like steerability. By exploiting these
transformation properties explicitly, we arrive at dynamic steerable blocks.
They are an extension of residual blocks that are able to seamlessly transform
filters under pre-defined transformations, conditioned on the input at training
and inference time. Dynamic steerable blocks learn the degree of invariance
from data and locally adapt filters, allowing them to apply a different
geometrical variant of the same filter to each location of the feature map.
When evaluated on the Berkeley Segmentation contour detection dataset, our
approach outperforms all competing approaches that do not utilize pre-training.
Our results highlight the benefits of image-based regularization to deep
networks.
Rotationally Invariant Image Representation for Viewing Direction Classification in Cryo-EM
We introduce a new rotationally invariant viewing angle classification method
for identifying, among a large number of Cryo-EM projection images, similar
views without prior knowledge of the molecule. Our rotationally invariant
features are based on the bispectrum. Each image is denoised and compressed
using steerable principal component analysis (PCA) such that rotating an image
is equivalent to phase shifting the expansion coefficients. Thus we are able to
extend the theory of bispectrum of 1D periodic signals to 2D images. The
randomized PCA algorithm is then used to efficiently reduce the dimensionality
of the bispectrum coefficients, enabling fast computation of the similarity
between any pair of images. The nearest neighbors provide an initial
classification of similar viewing angles. In this way, rotational alignment is
only performed for images with their nearest neighbors. The initial nearest
neighbor classification and alignment are further improved by a new
classification method called vector diffusion maps. Our pipeline for viewing
angle classification and alignment is experimentally shown to be faster and
more accurate than reference-free alignment with rotationally invariant K-means
clustering, MSA/MRA 2D classification, and their modern approximations.
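The abstract relies on the fact that a rotation becomes a phase shift of the expansion coefficients, which is what makes the bispectrum invariant. The sketch below shows only the 1D periodic-signal analogue the abstract mentions: a cyclic shift changes the phases of the Fourier coefficients but leaves the bispectrum B(k1, k2) = F(k1) F(k2) conj(F(k1 + k2)) unchanged. It does not reproduce the paper's steerable PCA pipeline; the helper names and the sample signal are assumptions for illustration.

```python
import cmath

def dft(signal):
    """Plain discrete Fourier transform of a list of numbers."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def bispectrum(signal):
    """B[k1][k2] = F(k1) * F(k2) * conj(F(k1 + k2)), with indices taken modulo n."""
    f = dft(signal)
    n = len(f)
    return [[f[k1] * f[k2] * f[(k1 + k2) % n].conjugate() for k2 in range(n)]
            for k1 in range(n)]

def cyclic_shift(signal, s):
    """Shift a periodic signal by s samples (the 1D analogue of an in-plane rotation)."""
    s %= len(signal)
    return signal[-s:] + signal[:-s]

def max_abs_diff(a, b):
    """Largest entrywise absolute difference between two square complex arrays."""
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

signal = [1.0, 3.0, -2.0, 0.5, 4.0, -1.0, 2.5, 0.0]
shifted = cyclic_shift(signal, 3)

# Fourier coefficients change under the shift, the bispectrum does not (up to rounding).
coeff_change = max(abs(x - y) for x, y in zip(dft(signal), dft(shifted)))
bispec_change = max_abs_diff(bispectrum(signal), bispectrum(shifted))
print(coeff_change, bispec_change)
```

Unlike the power spectrum, the bispectrum keeps phase information, so two signals can be compared without first aligning them.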
Data symmetries and Learning in fully connected neural networks
How symmetries in the data constrain the learned weights of modern deep networks is still an open problem. In this work we study the simple case of fully connected shallow non-linear neural networks and consider two types of symmetries: full dataset symmetries, where the dataset X is mapped into itself by any transformation g, i.e. gX = X, and single data point symmetries, where gx = x for x ∈ X. We prove and experimentally confirm that symmetries in the data are directly inherited at the level of the network's learned weights, and we relate these findings to the common practice of data augmentation in modern machine learning. Finally, we show how symmetry constraints have a profound impact on the spectrum of the learned weights, an aspect of the so-called implicit bias of the network.
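The two symmetry types defined in this abstract can be told apart with a short check. The sketch below is only an illustration of the definitions gX = X and gx = x; the transformation `mirror` (coordinate reversal), the toy datasets, and the function names are assumptions, and nothing here reproduces the paper's results on learned weights.

```python
def mirror(x):
    """Example transformation g: reverse the coordinates of a data point."""
    return tuple(reversed(x))

def is_dataset_symmetric(dataset, g):
    """Full dataset symmetry: g maps the dataset onto itself as a set (gX = X)."""
    points = set(dataset)
    return {g(x) for x in points} == points

def is_pointwise_symmetric(dataset, g):
    """Single data point symmetry: g leaves every point unchanged (gx = x)."""
    return all(g(x) == x for x in dataset)

# Points are swapped with each other by g, so only the dataset as a whole is symmetric.
swapped_pairs = [(1, 2, 3), (3, 2, 1)]
# Each point is a palindrome, so every point is individually fixed by g.
palindromes = [(1, 2, 1), (0, 5, 0)]

print(is_dataset_symmetric(swapped_pairs, mirror), is_pointwise_symmetric(swapped_pairs, mirror))
print(is_dataset_symmetric(palindromes, mirror), is_pointwise_symmetric(palindromes, mirror))
```

Point-wise symmetry implies dataset symmetry but not the reverse, which is why the abstract treats the two cases separately.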
Probabilistic methods for pose-invariant recognition in computer vision
This thesis is concerned with two central themes in computer vision: the properties of oriented quadrature filters and methods for implementing rotation invariance in an object matching and recognition system. Objects are modeled as combinations of local features, and human faces are used as the reference object class. The topics covered include the optimal design of filter banks for feature detection and object recognition, the modeling of pose effects in filter responses, and the construction of probability-based pose-invariant object matching and recognition systems employing oriented filters.
Gabor filters have been derived as information-theoretically optimal bandpass filters, simultaneously maximizing the localization capability in space and spatial-frequency domains. Steerable oriented filters have been developed as a tool for reducing the amount of computation required in rotation invariant systems. In this work, the framework of steerable filters is applied to Gabor-type filters and novel analytical derivations for the required steering equations for them are presented. Gabor filters and some related filters are experimentally shown to be approximately steerable with low steering error, given suitable filter shape parameters. The effects of filter shape parameters in feature localization and object recognition are also studied using a complete feature matching system.
A novel approach for modeling the pose variation of features due to depth rotations is introduced. Because the models are learned from synthetic data, simpler regression modeling methods can be applied instead of manifold learning methods. The use of synthetic data in learning the pose models for local features is a central contribution of the work.
The object matching methods considered in the work are based on probabilistic reasoning. The required object likelihood functions are constructed using feature similarity measures, and random sampling methods are applied for finding the modes of high probability in the likelihood probability distribution functions. The Population Monte Carlo algorithm is shown to solve successfully pose estimation problems in which simple Metropolis and Gibbs sampling methods give unsatisfactory performance.