338 research outputs found

    Improving accuracy and efficiency of mutual information for multi-modal retinal image registration using adaptive probability density estimation

    Mutual information (MI) is a popular similarity measure for performing image registration between different modalities. MI makes a statistical comparison between two images by computing the entropy from the probability distribution of the data. Therefore, to obtain an accurate registration it is important to have an accurate estimate of the true underlying probability distribution. Within the statistics literature, many methods have been proposed for finding the 'optimal' probability density, with the aim of improving the estimate by means of optimal histogram bin size selection. This raises the common question of how many bins should actually be used when constructing a histogram. There is no definitive answer to this. The question has received little attention in the MI literature, yet it is critical to the effectiveness of the algorithm. The purpose of this paper is to highlight this fundamental element of the MI algorithm. We present a comprehensive study that introduces methods from the statistics literature and incorporates them into image registration. We demonstrate this work for registration of multi-modal retinal images: colour fundus photographs and scanning laser ophthalmoscope images. The registration of these modalities offers significant enhancement to early glaucoma detection; however, traditional registration techniques fail to perform sufficiently well. We find that adaptive probability density estimation strongly affects registration accuracy and runtime, improving over traditional binning techniques. © 2013 Elsevier Ltd
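
The dependence of MI on the histogram bin count described in the abstract above can be illustrated with a minimal sketch. It assumes the Freedman-Diaconis rule as one example of a bin-selection method from the statistics literature; the abstract does not say which rules the paper actually evaluates. The function names, the crude index-based quartiles, and the flattened-list image representation are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def freedman_diaconis_bins(values):
    """Bin count from the Freedman-Diaconis rule (width = 2 * IQR / n^(1/3)).

    Assumption: quartiles are taken by simple index lookup, which is crude
    but enough for a sketch.
    """
    xs = sorted(values)
    n = len(xs)
    iqr = xs[(3 * n) // 4] - xs[n // 4]
    span = xs[-1] - xs[0]
    if iqr <= 0 or span <= 0:
        return 1  # degenerate data: a single bin
    width = 2.0 * iqr / n ** (1.0 / 3.0)
    return max(1, math.ceil(span / width))


def bin_index(value, lo, hi, bins):
    """Map a value in [lo, hi] to an equal-width bin index in [0, bins - 1]."""
    if hi <= lo:
        return 0
    return min(bins - 1, int((value - lo) / (hi - lo) * bins))


def mutual_information(a, b, bins_a=None, bins_b=None):
    """MI (in nats) between two equally sized intensity lists.

    If no bin counts are given, each image gets its own data-driven count.
    """
    if not a or len(a) != len(b):
        raise ValueError("inputs must be non-empty and of equal length")
    bins_a = bins_a or freedman_diaconis_bins(a)
    bins_b = bins_b or freedman_diaconis_bins(b)
    lo_a, hi_a, lo_b, hi_b = min(a), max(a), min(b), max(b)

    # Joint histogram over (bin of a, bin of b) for co-located pixels.
    joint = Counter(
        (bin_index(x, lo_a, hi_a, bins_a), bin_index(y, lo_b, hi_b, bins_b))
        for x, y in zip(a, b)
    )
    # Marginal histograms obtained by summing the joint counts.
    marg_a, marg_b = Counter(), Counter()
    for (i, j), count in joint.items():
        marg_a[i] += count
        marg_b[j] += count

    n = len(a)
    # MI = sum p(i, j) * log(p(i, j) / (p(i) * p(j))), written with raw counts.
    return sum(
        (count / n) * math.log(count * n / (marg_a[i] * marg_b[j]))
        for (i, j), count in joint.items()
    )
```

Passing explicit `bins_a` and `bins_b` values (for example a fixed 256) makes it possible to compare a traditional fixed binning against the data-driven count on the same image pair.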

    Enhanced phase congruency feature-based image registration for multimodal remote sensing imagery

    Multimodal image registration is an essential image processing task in remote sensing. It searches for the optimal alignment between images of the same scene captured by different sensors, to provide better visualization and more informative images. Manual image registration is tedious and labour-intensive, so developing automated image registration is crucial to providing a faster and more reliable solution. However, image registration faces many challenges arising from the nature of remote sensing images, the environment, and the technical shortcomings of current methods. These cause three issues: intensive processing requirements, local intensity variation, and rotational distortion. Since not all image details are significant, relying on salient features is more efficient in terms of processing power. Thus, a feature-based registration method was adopted to avoid intensive processing. The proposed method resolves the rotational distortion issue by using Oriented FAST and Rotated BRIEF (ORB) to produce rotation-invariant features. However, since ORB is not intensity invariant, it cannot support multimodal data. To overcome the intensity variation issue, Phase Congruency (PC) was integrated with ORB to introduce ORB-PC feature extraction, which generates features invariant to rotational distortion and local intensity variation. However, the solution is not complete, since the ORB-PC matching rate is below expectation. Enhanced ORB-PC was proposed to solve the matching issue by modifying the feature descriptor. While better feature matches were achieved, the high number of outliers in multimodal data makes common outlier removal methods unsuccessful. Therefore, Normalized Barycentric Coordinate System (NBCS) outlier removal was used to find precise matches even with a high number of outliers. Experiments were conducted to verify the registration qualitatively and quantitatively. The qualitative experiment shows that the proposed method has a broader and better feature distribution, while the quantitative evaluation indicates an 18% improvement in registration accuracy compared to related works.

    Learning deep embeddings by learning to rank

    We study the problem of embedding high-dimensional visual data into low-dimensional vector representations. This is an important component in many computer vision applications involving nearest neighbor retrieval, as embedding techniques not only perform dimensionality reduction, but can also capture task-specific semantic similarities. In this thesis, we use deep neural networks to learn vector embeddings, and develop a gradient-based optimization framework that is capable of optimizing ranking-based retrieval performance metrics, such as the widely used Average Precision (AP) and Normalized Discounted Cumulative Gain (NDCG). Our framework is applied in three applications. First, we study Supervised Hashing, which is concerned with learning compact binary vector embeddings for fast retrieval, and propose two novel solutions. The first solution optimizes Mutual Information as a surrogate ranking objective, while the other directly optimizes AP and NDCG, based on the discovery of their closed-form expressions for discrete Hamming distances. These optimization problems are NP-hard, therefore we derive their continuous relaxations to enable gradient-based optimization with neural networks. Our solutions establish the state-of-the-art on several image retrieval benchmarks. Next, we learn deep neural networks to extract Local Feature Descriptors from image patches. Local features are used universally in low-level computer vision tasks that involve sparse feature matching, such as image registration and 3D reconstruction, and their matching is a nearest neighbor retrieval problem. We leverage our AP optimization technique to learn both binary and real-valued descriptors for local image patches. Compared to competing approaches, our solution eliminates complex heuristics, and performs more accurately in the tasks of patch verification, patch retrieval, and image matching. 
Lastly, we tackle Deep Metric Learning, the general problem of learning real-valued vector embeddings using deep neural networks. We propose a learning to rank solution through optimizing a novel quantization-based approximation of AP. For downstream tasks such as retrieval and clustering, we demonstrate promising results on standard benchmarks, especially in the few-shot learning scenario, where the number of labeled examples per class is limited.
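
For orientation, the Average Precision metric that the thesis above optimizes can be computed for a single query over binary codes as in the following sketch. This is the standard, non-differentiable evaluation metric, not the thesis's continuous relaxation or its closed-form expression for Hamming distances. The function names are hypothetical, codes are assumed to be stored as Python integers, and ties in Hamming distance are broken by database order, which is an assumption of this sketch.

```python
def hamming(a, b):
    """Hamming distance between two binary codes stored as integers."""
    return bin(a ^ b).count("1")


def average_precision(query, database, relevant):
    """AP of the ranking of `database` by Hamming distance to `query`.

    `relevant` is the set of database indices that are true neighbours.
    Ties are broken by database order (sorted() is stable).
    """
    order = sorted(range(len(database)), key=lambda i: hamming(query, database[i]))
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if idx in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / hits if hits else 0.0
```

With the query `0b0000` and the database `[0b0001, 0b1111, 0b0000, 0b0111]`, the ranking by distance is indices 2, 0, 3, 1. If the relevant items are indices 0 and 2, they occupy the top two ranks and AP is 1.0; if they are indices 1 and 2, AP is (1/1 + 2/4) / 2 = 0.75.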

    Use of Multicomponent Non-Rigid Registration to Improve Alignment of Serial Oncological PET/CT Studies

    Non-rigid registration of serial head and neck FDG PET/CT images from a combined scanner can be problematic. Registration techniques typically rely on similarity measures calculated from voxel intensity values; CT-CT registration is superior to PET-PET registration due to the higher quality of anatomical information present in CT. However, when metal artefacts from dental fillings are present in a pair of CT images, a non-rigid registration will incorrectly attempt to register the two artefacts together, since they are strong features compared to the features that represent the actual anatomy. This leads to localised registration errors in the deformation field in the vicinity of the artefacts. Our objective was to develop a registration technique that overcomes these limitations by using combined information from both modalities. To study the effect of artefacts on registration, metal artefacts were simulated with one CT image rotated by a small angle in the sagittal plane. Image pairs containing these simulated artefacts were then registered to evaluate the resulting errors. To improve the registration in the vicinity of the artefacts, intensity information from the PET images was incorporated using several techniques. A well-established B-splines based non-rigid registration code was reworked to allow multicomponent registration. A similarity measure was employed with four possible weighted components, relating to the ways in which the CT and PET information can be combined to drive the registration of a pair of these dual-valued images. Several registration methods based on this multicomponent similarity measure were implemented with the goal of effectively registering the images containing the simulated artefacts. A method was also developed to swap control point displacements from the PET-derived transformation in the vicinity of the artefact. This method yielded the best result on the simulated images and was evaluated on images where actual dental artefacts were present.

    Smart video surveillance of pedestrians : fixed, aerial, and multi-camera methods

    Crowd analysis from video footage is an active research topic in the field of computer vision. Crowds can be analysed using different approaches, depending on their characteristics. Furthermore, analysis can be performed on footage obtained from different sources. Fixed CCTV cameras can be used, as well as cameras mounted on moving vehicles. To begin, a literature review is provided, in which research works in the fields of crowd analysis, object and people tracking, occlusion handling, multi-view and sensor fusion, and multi-target tracking are analysed and compared, and their advantages and limitations highlighted. Following that, the three contributions of this thesis are presented: in a first study, crowds are classified based on various cues (e.g. density, entropy), so that the best approaches to further analyse behaviour can be selected; then, some of the challenges of individual target tracking from aerial video footage are tackled; finally, a study on the analysis of groups of people from multiple cameras is proposed. The analysis covers the movements of people and objects in the scene. The idea is to track as many people as possible within the crowd, to obtain knowledge from their movements as a group, and to classify different types of scenes. An additional contribution of this thesis is two novel datasets: the first to test the proposed aerial video analysis methods, and the second to validate the third study, that is, with groups of people performing different actions recorded from multiple overlapping cameras.

    Modelling of non-destructive testing systems for electronic modules based on information theory

    The paper defines general principles, based on information theory, for selecting the types and nature of information sources for non-destructive testing systems for electronic modules. It is established that a combined optical and X-ray inspection system makes it possible to detect the largest number of defect types.

    Digital watermark technology in security applications

    With the rising emphasis on security and the number of fraud-related crimes around the world, authorities are looking for new technologies to tighten the security of identity. Among many modern electronic technologies, digital watermarking has unique advantages for enhancing document authenticity. At the current stage of development, digital watermarking technologies are not as mature as other competing technologies for supporting identity authentication systems. This work presents improvements in the performance of two classes of digital watermarking techniques and investigates the issue of watermark synchronisation. Optimal performance can be obtained if the spreading sequences are designed to be orthogonal to the cover vector. In this thesis, two classes of orthogonalisation methods that generate binary sequences quasi-orthogonal to the cover vector are presented. One method, namely "Sorting and Cancelling", generates sequences that have a high level of orthogonality to the cover vector. The Hadamard matrix based orthogonalisation method, namely "Hadamard Matrix Search", is able to realise overlapped embedding, so the watermarking capacity and image fidelity can be improved compared to using short watermark sequences. The results are compared with traditional pseudo-randomly generated binary sequences. The advantages of both classes of orthogonalisation methods are significant. Another watermarking method introduced in the thesis is based on writing-on-dirty-paper theory. The method is presented with biorthogonal codes, which have the best robustness. The advantages and trade-offs of using biorthogonal codes with this watermark coding method are analysed comprehensively. Comparisons between orthogonal and non-orthogonal codes used in this watermarking method are also made. It is found that fidelity and robustness are contradictory and that it is not possible to optimise them simultaneously. Comparisons are also made between all proposed methods. The comparisons focus on three major performance criteria: fidelity, capacity and robustness. From two different viewpoints, the conclusions are not the same. From the fidelity-centric viewpoint, the dirty-paper coding method using biorthogonal codes has a very strong advantage in preserving image fidelity, and its capacity advantage is also significant. However, from the power ratio point of view, the orthogonalisation methods demonstrate a significant advantage in capacity and robustness. The conclusions are contradictory, but together they summarise the performance produced by different design considerations. Watermark synchronisation is first provided by high-contrast frames around the watermarked image. Edge detection filters are used to detect the high-contrast borders of the captured image. By scanning the pixels from the border to the centre, the locations of detected edges are stored. The optimal linear regression algorithm is used to estimate the watermarked image frames. Estimation of the regression function provides the rotation angle as the slope of the rotated frames. The scaling is corrected by re-sampling the upright image to the original size. A theoretically studied method that is able to synchronise the captured image to sub-pixel accuracy is also presented. By using invariant transforms and the "symmetric phase only matched filter", the captured image can be corrected accurately to its original geometric size. The method uses repeating watermarks to form an array in the spatial domain of the watermarked image; the locations of the array's elements reveal rotation, translation and scaling information through two filtering processes.
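
The frame-based synchronisation step in the abstract above, where a regression line through detected edge locations gives the rotation angle, can be sketched as follows. This is a minimal illustration using ordinary least squares on a nominally horizontal frame edge; the abstract does not specify the exact regression formulation, and the function names and the (x, y) point representation are hypothetical.

```python
import math


def fit_line(points):
    """Ordinary least-squares fit y = slope * x + intercept.

    `points` is a list of (x, y) edge locations along one frame border.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all x values are equal; the edge is vertical")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rotation_angle_degrees(points):
    """Rotation of a nominally horizontal edge, taken from the fitted slope."""
    slope, _ = fit_line(points)
    return math.degrees(math.atan(slope))
```

For edge points sampled along a border tilted by 5 degrees, `rotation_angle_degrees` returns a value close to 5, which could then be used to rotate the captured image upright before re-sampling it to the original size.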