Symmetric image registration with directly calculated inverse deformation field
This paper presents a novel technique for symmetric deformable image registration based on a new method for fast and accurate direct inversion of a large-motion deformation field. The proposed registration algorithm maintains a one-to-one mapping between the registered images by symmetrically warping them to each other and by enforcing the inverse consistency criterion at each iteration. This makes the final estimates of the forward and backward deformation fields anatomically plausible. The method has been quantitatively validated on magnetic resonance data of the pelvic area, demonstrating its applicability to adaptive prostate radiotherapy. The experiments demonstrate improved robustness in terms of inverse consistency error when compared to previously proposed methods for symmetric image registration.
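The inverse consistency criterion above can be illustrated with a minimal sketch: compose the forward and backward displacement fields and measure the mean residual. This is an assumed, simplified 2D implementation with nearest-neighbour sampling, not the paper's own inversion algorithm.

```python
import numpy as np

def inverse_consistency_error(fwd, bwd):
    """Mean inverse-consistency residual of a forward/backward displacement
    field pair, each of shape (2, H, W) (row/col displacements in pixels).
    Nearest-neighbour sampling keeps the sketch dependency-free."""
    H, W = fwd.shape[1:]
    ys, xs = np.mgrid[0:H, 0:W].astype(float)
    # Follow the forward field, then look up the backward field there.
    y2, x2 = ys + fwd[0], xs + fwd[1]
    yi = np.clip(np.rint(y2), 0, H - 1).astype(int)
    xi = np.clip(np.rint(x2), 0, W - 1).astype(int)
    by, bx = bwd[0][yi, xi], bwd[1][yi, xi]
    # A perfect inverse maps every point back onto itself.
    res = np.hypot(y2 + by - ys, x2 + bx - xs)
    return res.mean()
```

For a field pair that are exact inverses of each other (e.g. opposite constant translations), the error is zero; a large residual indicates the backward field does not undo the forward one.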
Structure-aware image denoising, super-resolution, and enhancement methods
Denoising, super-resolution and structure enhancement are classical image processing applications. The motive behind their existence is to aid our visual analysis of raw digital images. Despite tremendous progress in these fields, certain difficult problems are still open to research. For example, denoising and super-resolution techniques which possess all of the following properties are very scarce: they must preserve critical structures like corners, be robust to the type of noise distribution, avoid undesirable artefacts, and also be fast. The area of structure enhancement also has an unresolved issue: very little effort has been put into designing models that can tackle anisotropic deformations in the image acquisition process. In this thesis, we design novel methods in the form of partial differential equations, patch-based approaches and variational models to overcome the aforementioned obstacles. In most cases, our methods outperform the existing approaches in both quality and speed, despite being applicable to a broader range of practical situations.
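As an illustration of the structure-preserving PDE family the thesis builds on, here is a minimal sketch of classical Perona-Malik diffusion (a well-known prior method, not one of the thesis's own models); `kappa` and the explicit time step are assumed tuning parameters.

```python
import numpy as np

def perona_malik(img, n_iter=20, kappa=1.0, dt=0.2):
    """Edge-preserving PDE denoising: isotropic smoothing where the image
    is flat, suppressed diffusion across strong gradients (edges, corners).
    Explicit finite-difference scheme with periodic boundaries."""
    u = img.astype(float).copy()
    for _ in range(n_iter):
        # Finite differences toward the four grid neighbours.
        dn = np.roll(u, -1, 0) - u
        ds = np.roll(u, 1, 0) - u
        de = np.roll(u, -1, 1) - u
        dw = np.roll(u, 1, 1) - u
        # Diffusivity decays at strong edges, so structure survives.
        g = lambda d: np.exp(-(d / kappa) ** 2)
        u += dt * (g(dn) * dn + g(ds) * ds + g(de) * de + g(dw) * dw)
    return u
```

A constant image is a fixed point of the scheme, while additive noise is progressively averaged out; dt below 0.25 keeps the explicit update stable.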
Deep Convolutional Network on Point Clouds for 3D Scene Understanding
As one of the most popular data types, the point cloud is widely used in various applications, including computer vision, computer graphics and robotics. The capability to directly measure 3D point clouds is invaluable in those applications, as depth information can remove many of the segmentation ambiguities in 2D images. Unlike images, which are represented on regular dense grids, 3D point clouds are irregular and unordered, hence applying convolution on them can be difficult. To address this problem, we extend the dynamic filter to a new convolution operation, named PointConv. PointConv can be applied on point clouds to build deep convolutional networks. We treat convolution kernels as nonlinear functions of the local coordinates of 3D points, comprised of weight and density functions. With respect to a given point, the weight functions are learned with multi-layer perceptron networks, and the density functions through kernel density estimation. The most important contribution of this work is a novel reformulation proposed for efficiently computing the weight functions, which allowed us to dramatically scale up the network and significantly improve its performance. The learned convolution kernel can be used to compute translation-invariant and permutation-invariant convolution on any point set in 3D space.
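The weight-and-density construction described above can be sketched roughly as follows. The tiny MLP, the all-points neighbourhood and the Gaussian density estimate are simplifying assumptions of this sketch, not the paper's efficient reformulation.

```python
import numpy as np

def mlp(x, W1, b1, W2, b2):
    # Tiny two-layer perceptron: relative 3D offsets -> per-channel weights.
    return np.maximum(x @ W1 + b1, 0.0) @ W2 + b2

def pointconv(points, feats, W1, b1, W2, b2, bandwidth=0.5):
    """PointConv-style layer on a toy scale: kernel weights are an MLP of
    local coordinates, and features are reweighted by the inverse of a
    Gaussian kernel density estimate to compensate for uneven sampling."""
    n, c = feats.shape
    # Kernel density estimate at every point (includes the point itself).
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    density = np.exp(-d2 / (2 * bandwidth**2)).mean(1)
    out = np.empty((n, c))
    for i in range(n):
        rel = points - points[i]          # local coordinates around point i
        w = mlp(rel, W1, b1, W2, b2)      # (n, c) learned kernel weights
        out[i] = ((w * feats) / density[:, None]).sum(0)
    return out
```

Because the weights depend only on relative coordinates and the sum is order-free, the operation is translation- and permutation-invariant, as the abstract states.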
The proposed PointConv has opened doors to new 3D-centric approaches to scene understanding. We show how we can adapt and apply PointConv to an important perception problem in robotics: 3D scene flow estimation. We propose a novel end-to-end deep scene flow model, called PointPWC-Net, that directly processes 3D point cloud scenes with large motions in a coarse-to-fine fashion. Flow computed at the coarse level is upsampled and warped to a finer level, enabling the algorithm to accommodate large motion without a prohibitive search space. We introduce novel cost volume, upsampling, and warping layers to efficiently handle 3D point cloud data. Unlike traditional cost volumes that require exhaustively computing all the cost values on a high-dimensional grid, our point-based formulation discretizes the cost volume onto the input 3D points, and a PointConv operation efficiently computes convolutions on the cost volume.
Finally, inspired by the recent development of Transformers, we introduce PointConvFormer, a novel building block for point-cloud-based deep neural network architectures. PointConvFormer combines ideas from point convolution, where filter weights are based only on relative position, and Transformers, where the attention computation takes the features into account. In our proposed new operation, the feature difference between points in a neighborhood serves as an indicator to re-weight the convolutional weights. Hence, we preserve some of the translation-invariance of the convolution operation while taking attention into account to choose the relevant points for convolution. We also explore multi-head mechanisms. To validate the effectiveness of PointConvFormer, we experiment on both semantic segmentation and scene flow estimation tasks on point clouds with multiple datasets, including ScanNet, SemanticKitti, FlyingThings3D and KITTI. Our results show that PointConvFormer substantially outperforms classic convolutions, regular transformers, and voxelized sparse convolution approaches with smaller, more computationally efficient networks.
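The feature-difference re-weighting idea can be reduced to a toy sketch: positional kernel weights are scaled by an attention score computed from feature differences. `Wp` and `Wa` stand in for learned parameters and are assumptions of this sketch, not the paper's architecture.

```python
import numpy as np

def pointconvformer_weights(rel_pos, feat_diff, Wp, Wa):
    """Toy PointConvFormer-style kernel: positional weights (a function of
    relative coordinates, here a single tanh layer) are modulated by a
    sigmoid attention score derived from feature differences between the
    centre point and each neighbour."""
    w_pos = np.tanh(rel_pos @ Wp)                     # (k, c) positional weights
    score = 1.0 / (1.0 + np.exp(-(feat_diff @ Wa)))   # (k, 1) attention in (0, 1)
    return w_pos * score                              # attended kernel weights
```

With zero attention parameters the score is uniformly 0.5 and the operation degenerates to a plain (scaled) point convolution, which makes explicit how the attention term selects relevant neighbours on top of the position-only kernel.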
Integration of Static and Self-motion-Based Depth Cues for Efficient Reaching and Locomotor Actions
The common approach to estimating the distance of an object in computer vision and robotics is to use stereo vision. Stereopsis, however, provides good estimates only within near space and is thus more suitable for reaching actions. In order to successfully plan and execute an action in far space, other depth cues must be taken into account. Self-body movements, such as head and eye movements or locomotion, can provide rich depth information. This paper proposes a model for the integration of static and self-motion-based depth cues for a humanoid robot. Our results show that self-motion-based visual cues improve the accuracy of distance perception and, combined with other depth cues, provide the robot with a robust distance estimator suitable for both reaching and walking actions.
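One standard way to combine independent depth cues, plausibly related to (though not necessarily identical to) the paper's integration model, is inverse-variance weighting of Gaussian estimates:

```python
import numpy as np

def fuse_depth_cues(estimates, variances):
    """Maximum-likelihood fusion of independent Gaussian depth cues
    (e.g. stereo and motion parallax): inverse-variance weighted average.
    Returns the fused distance and its (always smaller) fused variance."""
    w = 1.0 / np.asarray(variances, float)
    z = (w * np.asarray(estimates, float)).sum() / w.sum()
    var = 1.0 / w.sum()
    return z, var
```

The fused variance is never larger than that of the best single cue, which is the statistical sense in which adding self-motion cues "improves the accuracy of distance perception".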
Bootstrap Based Surface Reconstruction
Surface reconstruction is one of the main research areas in computer graphics. The goal is to find the best surface representation of the boundary of a real object. The typical input of a surface reconstruction algorithm is a point cloud, possibly obtained by a laser 3D scanner. The raw data from the scanner is usually noisy and contains outliers. Apart from creating models of high visual quality, assuring that a model is as faithful as possible to the original object is also one of the main aims of surface reconstruction.
Most surface reconstruction algorithms proposed in the literature assess the reconstructed models either by visual inspection or, in cases where subjective manual input is not possible, by measuring the training error of the model. However, the training error systematically underestimates the test error and encourages overfitting.
In this thesis, we provide a method for quantitative assessment in surface reconstruction. We adapt a model averaging method from statistics, the bootstrap, to our context. Bootstrapping is a resampling procedure that provides estimates of statistical parameters. In surface fitting, we obtain error estimates which detect errors caused by noise or poor fitting. We also define the bootstrap method in the context of normal estimation, obtaining variance and error estimates which we use as a quality measure of normal estimates. As applications, we provide a smoothing algorithm for point clouds and a normal smoothing algorithm that can handle feature areas. We also develop a feature detection algorithm.
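The bootstrap idea described above can be sketched generically: resample the point set with replacement, refit, and read off the spread of the fitted quantity. The `fit` callback here is an assumed placeholder for any surface or normal fitting routine, not the thesis's specific estimator.

```python
import numpy as np

def bootstrap_std(points, fit, n_boot=200, seed=0):
    """Bootstrap estimate of the variability of a fitted quantity: draw
    n_boot resamples of the (n, d) point array with replacement, refit
    each, and return the standard deviation across resamples. A large
    spread flags noise-sensitive or poorly fitted regions."""
    rng = np.random.default_rng(seed)
    n = len(points)
    stats = np.array([fit(points[rng.integers(0, n, n)]) for _ in range(n_boot)])
    return stats.std(axis=0)
```

Unlike the training error, this spread is a resampling-based proxy for estimator variance, so it does not reward overfitting the one observed point set.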
Log-Euclidean Bag of Words for Human Action Recognition
Representing videos by densely extracted local space-time features has recently become a popular approach for analysing actions. In this paper, we tackle the problem of categorising human actions by devising Bag of Words (BoW) models based on covariance matrices of spatio-temporal features, with the features formed from histograms of optical flow. Since covariance matrices form a special type of Riemannian manifold, the space of Symmetric Positive Definite (SPD) matrices, non-Euclidean geometry should be taken into account while discriminating between covariance matrices. To this end, we propose to embed SPD manifolds into Euclidean spaces via a diffeomorphism and extend the BoW approach to its Riemannian version. The proposed BoW approach takes into account the manifold geometry of SPD matrices during the generation of the codebook and histograms. Experiments on challenging human action datasets show that the proposed method obtains notable improvements in discrimination accuracy in comparison to several state-of-the-art methods.
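The log-Euclidean embedding referred to above (mapping SPD matrices to a flat Euclidean space via the matrix logarithm) can be sketched as follows; the sqrt(2) off-diagonal scaling is the usual Frobenius-norm-preserving vectorisation, assumed here rather than taken from the paper.

```python
import numpy as np

def log_euclidean_embed(spd):
    """Map an SPD matrix to a Euclidean vector: take the matrix logarithm
    via eigendecomposition (valid because SPD eigenvalues are positive),
    then vectorise the symmetric result as diagonal plus sqrt(2)-scaled
    upper triangle, so Euclidean distance matches the log-Euclidean metric."""
    vals, vecs = np.linalg.eigh(spd)
    log_m = (vecs * np.log(vals)) @ vecs.T      # V diag(log lambda) V^T
    iu = np.triu_indices(spd.shape[0], k=1)
    off = np.sqrt(2.0) * log_m[iu]
    return np.concatenate([np.diag(log_m), off])
```

After this embedding, standard Euclidean machinery (k-means codebooks, histograms) applies directly, which is what makes a Riemannian BoW pipeline tractable.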
Human and Group Activity Recognition from Video Sequences
A good solution to human activity recognition enables the creation of a wide variety of useful applications, such as visual surveillance, vision-based Human-Computer-Interaction (HCI) and gesture recognition.
In this thesis, a graph based approach to human activity recognition is proposed which models spatio-temporal features as contextual space-time graphs.
In this method, spatio-temporal gradient cuboids are extracted at significant regions of activity,
and feature graphs (gradient, space-time, local neighbours, immediate neighbours) are constructed using the similarity matrix.
The Laplacian representation of the graph is utilised to reduce the computational complexity and to allow the use of traditional statistical classifiers.
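The Laplacian representation mentioned above can be sketched with the standard symmetric normalised Laplacian of a similarity matrix; whether the thesis uses exactly this normalisation is an assumption of the sketch.

```python
import numpy as np

def normalized_laplacian(S):
    """Symmetric normalised Laplacian L = I - D^{-1/2} S D^{-1/2} of a
    similarity matrix S. Its eigenvalues lie in [0, 2] and give a compact,
    fixed-size graph descriptor usable with standard statistical classifiers."""
    d = S.sum(1)
    d_inv_sqrt = 1.0 / np.sqrt(np.where(d > 0, d, 1.0))  # guard isolated nodes
    return np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
```

Classifying the Laplacian spectrum instead of the raw graph is one common way to obtain the complexity reduction the abstract describes.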
A second methodology is proposed to detect and localise abnormal activities in crowded scenes.
This approach has two stages: training and testing. During the training stage, specific human activities are identified and characterised
by employing modelling of medium-term movement flow through streaklines. Each streakline is formed by multiple optical flow vectors that represent
and track locally the movement in the scene. A dictionary of activities is recorded for a given scene during the training stage. During the testing stage,
the consistency of each observed activity with those from the dictionary is verified using the Kullback-Leibler (KL) divergence.
The anomaly detection performance of the proposed methodology is compared to the state of the art, producing state-of-the-art results for localising anomalous activities.
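The dictionary-consistency test can be sketched with a plain discrete KL divergence over normalised histograms; the `threshold` value and the histogram representation of an activity are assumptions of this sketch, not values from the thesis.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-10):
    """Discrete KL divergence D(p || q) between two histograms
    (e.g. streakline direction histograms); eps avoids log(0)."""
    p = np.asarray(p, float) + eps
    q = np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float((p * np.log(p / q)).sum())

def is_anomalous(observed, dictionary, threshold):
    """Flag an observed activity whose best match in the trained activity
    dictionary still exceeds the divergence threshold."""
    return min(kl_divergence(observed, d) for d in dictionary) > threshold
```

An activity matching any dictionary entry yields a near-zero divergence and passes; one unlike every recorded activity exceeds the threshold and is flagged.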
Finally, we propose an automatic group activity recognition approach by modelling the interdependencies of
group activity features over time. We propose to model the group
interdependencies in both motion and location spaces. These spaces are extended to time-space and time-movement spaces and modelled
using Kernel Density Estimation (KDE).
The proposed methodology improves recognition performance over state-of-the-art results on group activity datasets.
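The KDE modelling of time-space and time-movement features can be sketched with a plain Gaussian kernel density estimator; the isotropic bandwidth is an assumed simplification of whatever kernel the thesis actually uses.

```python
import numpy as np

def kde_log_likelihood(train, query, bandwidth=0.5):
    """Gaussian KDE log-likelihood of query feature vectors under a model
    fit to training vectors (e.g. time-location samples of a group
    activity); higher values mean the query resembles the training data."""
    train, query = np.atleast_2d(train), np.atleast_2d(query)
    n, d = train.shape
    d2 = ((query[:, None, :] - train[None, :, :]) ** 2).sum(-1)
    norm = (2 * np.pi * bandwidth**2) ** (d / 2)
    dens = np.exp(-d2 / (2 * bandwidth**2)).sum(1) / (n * norm)
    return np.log(dens)
```

Recognition then amounts to scoring an observed group's features against the KDE of each activity class and picking the most likely one.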