
    A survey of visual preprocessing and shape representation techniques

    Many recent theories and methods proposed for visual preprocessing and shape representation are summarized. The survey brings together research from the fields of biology, psychology, computer science, electrical engineering, and, most recently, neural networks. It was motivated by the need to preprocess images for a sparse distributed memory (SDM), but the techniques presented may also prove useful for applying other associative memories to visual pattern recognition. The material of this survey is divided into three sections: an overview of biological visual processing; methods of preprocessing (extracting parts of shape, texture, motion, and depth); and shape representation and recognition (form invariance, primitives and structural descriptions, and theories of attention).

    Point-based metric and topological localisation between lidar and overhead imagery

    In this paper, we present a method for localising a ground lidar using overhead imagery only. Public overhead imagery, such as Google satellite images, is a readily available resource that can serve as a map proxy for robot localisation, relaxing the requirement for a prior mapping traversal as in traditional approaches. While prior work has focused on metric localisation between range sensors and overhead imagery, our method is the first to learn both place recognition and metric localisation of a ground lidar using overhead imagery, and it also outperforms prior methods on metric localisation with large initial pose offsets. To bridge the drastic domain gap between lidar data and overhead imagery, our method learns to transform an overhead image into a collection of 2D points, emulating the point cloud that would be scanned by a lidar sensor situated near the centre of the overhead image. Once both modalities are expressed as point sets, point-based machine learning methods for localisation are applied.
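
    As a rough illustration of the point-set formulation described above (not the paper's learned pipeline), the sketch below substitutes a trivial edge-thresholding stand-in for the learned image-to-points generator and a plain 2D ICP step for the learned matcher; all function names, parameters, and file names are illustrative assumptions.

        # Hypothetical sketch: register a 2D lidar point set to points derived from an
        # overhead image. The paper learns the image-to-points transform and the
        # matching; here both are replaced by simple stand-ins (thresholding + ICP).
        import numpy as np
        from scipy.spatial import cKDTree

        def image_to_points(image, threshold=0.5):
            """Stand-in for the learned generator: treat bright pixels as 2D points."""
            ys, xs = np.nonzero(image > threshold)
            return np.stack([xs, ys], axis=1).astype(float)

        def icp_2d(source, target, iterations=50):
            """Estimate a rigid 2D transform aligning `source` to `target` via ICP."""
            R, t = np.eye(2), np.zeros(2)
            tree = cKDTree(target)
            src = source.copy()
            for _ in range(iterations):
                _, idx = tree.query(src)                  # nearest-neighbour matches
                matched = target[idx]
                mu_s, mu_t = src.mean(0), matched.mean(0)
                U, _, Vt = np.linalg.svd((src - mu_s).T @ (matched - mu_t))
                R_step = Vt.T @ U.T
                if np.linalg.det(R_step) < 0:             # keep a proper rotation
                    Vt[-1] *= -1
                    R_step = Vt.T @ U.T
                t_step = mu_t - R_step @ mu_s
                src = src @ R_step.T + t_step
                R, t = R_step @ R, R_step @ t + t_step    # accumulate the pose
            return R, t

        # Usage (hypothetical data files):
        # overhead = np.load("tile.npy"); scan = np.load("scan_xy.npy")
        # R, t = icp_2d(scan, image_to_points(overhead))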

    Automatic Alignment of 3D Multi-Sensor Point Clouds

    Automatic 3D point cloud alignment is a major research topic in photogrammetry, computer vision and computer graphics. In this research, two keypoint feature matching approaches are developed and proposed for the automatic alignment of 3D point clouds that have been acquired from different sensor platforms and lie in different 3D conformal coordinate systems. The first proposed approach is based on 3D keypoint feature matching. First, surface curvature information is utilized for scale-invariant 3D keypoint extraction. Adaptive non-maxima suppression (ANMS) is then applied to retain the most distinct and well-distributed set of keypoints. Afterwards, every keypoint is characterized by a scale-, rotation- and translation-invariant 3D surface descriptor, called the radial geodesic distance-slope histogram. Similar keypoint descriptors on the source and target datasets are then matched using bipartite graph matching, followed by a modified RANSAC for outlier removal. The second proposed method is based on 2D keypoint matching performed on height map images of the 3D point clouds. Height map images are generated by projecting the 3D point clouds onto a planimetric plane. Afterwards, a multi-scale wavelet 2D keypoint detector with ANMS is proposed to extract keypoints on the height maps. Then, a scale-, rotation- and translation-invariant 2D descriptor, referred to as the Gabor, Log-Polar-Rapid Transform descriptor, is computed for all keypoints. Finally, source and target height map keypoint correspondences are determined using bi-directional nearest neighbour matching, together with the modified RANSAC for outlier removal. Each method is assessed on multi-sensor, urban and non-urban 3D point cloud datasets. Results show that, unlike the 3D-based method, the height map-based approach is able to align source and target datasets with differences in point density, point distribution and missing point data. Findings also show that the 3D-based method obtains lower transformation errors and a greater number of correspondences when the source and target have similar point characteristics. The 3D-based approach attained absolute mean alignment differences in the range of 0.23 m to 2.81 m, whereas the height map approach had a range from 0.17 m to 1.21 m. These differences are within the proximity tolerances of the data and are suitable for the subsequent application of fine co-registration approaches.
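
    The adaptive non-maxima suppression step named above is a standard component of keypoint pipelines; the following is a minimal sketch of ANMS on scored keypoints (2D or 3D), with illustrative parameter names that are not taken from the thesis.

        # Minimal ANMS sketch: keep a well-distributed subset of scored keypoints by
        # retaining those with the largest suppression radii (distance to the nearest
        # sufficiently stronger keypoint).
        import numpy as np

        def anms(points, scores, num_keep, robust_factor=0.9):
            """points: (N, D) keypoint coordinates; scores: (N,) detector responses."""
            order = np.argsort(-scores)          # strongest first
            pts, sc = points[order], scores[order]
            radii = np.full(len(pts), np.inf)    # the strongest point is never suppressed
            for i in range(1, len(pts)):
                # distance to every already-processed keypoint that is sufficiently stronger
                stronger = sc[:i] * robust_factor > sc[i]
                if np.any(stronger):
                    d = np.linalg.norm(pts[:i][stronger] - pts[i], axis=1)
                    radii[i] = d.min()
            keep = np.argsort(-radii)[:num_keep]
            return order[keep]                   # indices into the original arrays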

    Machine Learning and Pattern Recognition Methods for Remote Sensing Image Registration and Fusion

    In the last decade, the remote sensing world has evolved dramatically. New types of sensors, each collecting data with possibly different modalities, have been designed, developed, and deployed. Moreover, new missions have been planned and launched, aimed not only at collecting data on the Earth's surface, but also at acquiring planetary data in support of the study of the whole Solar System. Such a variety of technologies highlights the need for automatic methods able to effectively exploit all the available information. In recent years, much effort has been put into the design and development of advanced data fusion methods able to extract and make use of all the information available from as many complementary information sources as possible. The goal of this thesis is to present novel machine learning and pattern recognition methodologies designed to support the exploitation of diverse sources of information, such as multisensor, multimodal, or multiresolution imagery. In this context, image registration plays a major role, as it allows bringing two or more digital images into precise alignment for analysis and comparison. Here, image registration is tackled using both feature-based and area-based strategies. In the former case, the features of interest are extracted using a stochastic geometry model based on marked point processes, while, in the latter case, information-theoretic functionals and the domain adaptation capabilities of generative adversarial networks are exploited. In addition, multisensor image registration is applied in a large-scale scenario by introducing a tiling-based strategy aimed at minimizing the computational burden, which is usually heavy in the multisensor case due to the need for information-theoretic similarity measures. Moreover, automatic change detection with multiresolution and multimodality imagery is addressed via a novel Markovian framework based on a linear mixture model and on an ad hoc multimodal energy function minimized using graph cuts or belief propagation methods. The statistics of the data at the various spatial scales are modelled through appropriate generalized Gaussian distributions and by iteratively estimating a set of virtual images, at the finest resolution, representing the data that would have been collected if all the sensors worked at that resolution. All these methodologies have been experimentally evaluated on different datasets, with particular focus on the trade-off between achievable performance and the demands in terms of computational resources. Moreover, the methods are compared with state-of-the-art solutions and analyzed in terms of future developments, giving insights into possible future lines of research in this field.
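
    As a toy illustration of the area-based, information-theoretic side of this work (not the thesis code), the sketch below scores candidate integer translations between two images by mutual information; the exhaustive search and all names are assumptions made for clarity.

        # Toy area-based registration: exhaustively search integer shifts and keep the
        # one maximising mutual information between the reference and shifted image.
        import numpy as np

        def mutual_information(a, b, bins=32):
            """Mutual information between two equally shaped grayscale images."""
            hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
            pxy = hist / hist.sum()
            px = pxy.sum(axis=1, keepdims=True)
            py = pxy.sum(axis=0, keepdims=True)
            nz = pxy > 0
            return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

        def best_shift(reference, moving, max_shift=10):
            """Return the (dy, dx) translation of `moving` that maximises MI."""
            best, best_score = (0, 0), -np.inf
            for dy in range(-max_shift, max_shift + 1):
                for dx in range(-max_shift, max_shift + 1):
                    shifted = np.roll(np.roll(moving, dy, axis=0), dx, axis=1)
                    score = mutual_information(reference, shifted)
                    if score > best_score:
                        best, best_score = (dy, dx), score
            return best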

    Cortical Dynamics of Navigation and Steering in Natural Scenes: Motion-Based Object Segmentation, Heading, and Obstacle Avoidance

    Visually guided navigation through a cluttered natural scene is a challenging problem that animals and humans accomplish with ease. The ViSTARS neural model proposes how primates use motion information to segment objects and determine heading for purposes of goal approach and obstacle avoidance in response to video inputs from real and virtual environments. The model produces trajectories similar to those of human navigators. It does so by predicting how computationally complementary processes in cortical areas MT-/MSTv and MT+/MSTd compute object motion for tracking and self-motion for navigation, respectively. The model retina responds to transients in the input stream. Model V1 generates a local speed and direction estimate. This local motion estimate is ambiguous due to the neural aperture problem. Model MT+ interacts with MSTd via an attentive feedback loop to compute accurate heading estimates in MSTd that quantitatively simulate properties of human heading estimation data. Model MT- interacts with MSTv via an attentive feedback loop to compute accurate estimates of the speed, direction and position of moving objects. This object information is combined with heading information to produce steering decisions wherein goals behave like attractors and obstacles behave like repellers. These steering decisions lead to navigational trajectories that closely match human performance.

    National Science Foundation (SBE-0354378, BCS-0235398); Office of Naval Research (N00014-01-1-0624); National Geospatial Intelligence Agency (NMA201-01-1-2016)
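
    The heading computation in the model is neural, but the underlying geometric idea can be sketched compactly: for pure observer translation, the optical-flow field radiates from a focus of expansion that marks the heading. The least-squares estimate below is an illustrative stand-in, not the ViSTARS circuitry.

        # Geometric sketch: estimate heading as the focus of expansion (FOE) of a
        # translational optical-flow field. Each flow vector should be parallel to
        # (position - FOE), so its 2D cross product with that offset is zero,
        # giving one linear equation per pixel.
        import numpy as np

        def focus_of_expansion(xs, ys, vx, vy):
            """xs, ys: pixel positions; vx, vy: flow components at those positions."""
            A = np.stack([vy, -vx], axis=1)
            b = xs * vy - ys * vx
            foe, *_ = np.linalg.lstsq(A, b, rcond=None)
            return foe  # (x, y) heading point in image coordinates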

    Relating Multimodal Imagery Data in 3D

    This research develops and improves the fundamental mathematical approaches and techniques required to relate imagery and imagery-derived multimodal products in 3D. Image registration, in a 2D sense, will always be limited by the 3D effects of the viewing geometry on the target. Effects such as occlusion, parallax, shadowing, and terrain/building elevation can therefore often be mitigated with even a modest amount of 3D target modeling. Additionally, the imaged scene may appear radically different depending on the sensed modality of interest; this is evident from the differences in visible, infrared, polarimetric, and radar imagery of the same site. This thesis develops a 'model-centric' approach to relating multimodal imagery in a 3D environment. By correctly modeling a site of interest, both geometrically and physically, it is possible to remove or mitigate some of the most difficult challenges associated with multimodal image registration. To accomplish this, the mathematical framework necessary to relate imagery to geometric models is thoroughly examined. Since geometric models may need to be generated to apply this 'model-centric' approach, this research develops methods to derive 3D models from imagery and LIDAR data. Of critical note is the implementation of complementary techniques for relating multimodal imagery that utilize the geometric model in concert with physics-based modeling to simulate scene appearance under diverse imaging scenarios. Finally, the often neglected final phase of mapping localized image registration results back to the world coordinate system model for final data archival is addressed. In short, once a target site is properly modeled, both geometrically and physically, it is possible to orient the 3D model to the same viewing perspective as a captured image to enable proper registration. If done accurately, the synthetic model's physical appearance can simulate the imaged modality of interest while simultaneously removing the 3D ambiguity between the model and the captured image. Once registered, the captured image can then be archived as a texture map on the geometric site model. In this way, the 3D information that was lost when the image was acquired can be regained and properly related to other datasets for data fusion and analysis.
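
    Orienting a 3D site model to a captured image's viewpoint ultimately reduces to projecting model geometry through the image's camera model. The pinhole-projection sketch below is a generic illustration under assumed camera parameters, not the thesis implementation.

        # Illustrative pinhole projection: world-to-camera transform followed by
        # perspective division and intrinsic scaling. Parameter names are assumptions.
        import numpy as np

        def project_points(points_world, R, t, fx, fy, cx, cy):
            """Project Nx3 world points into pixel coordinates.

            R, t   : world-to-camera rotation (3x3) and translation (3,)
            fx, fy : focal lengths in pixels; cx, cy : principal point
            """
            cam = points_world @ R.T + t          # world -> camera frame
            z = cam[:, 2]
            u = fx * cam[:, 0] / z + cx
            v = fy * cam[:, 1] / z + cy
            return np.stack([u, v], axis=1), z    # pixel coords and depth (for z-buffering)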

    A transformation-aware perceptual image metric

    Predicting human visual perception has several applications, such as compression, rendering, editing and retargeting. Current approaches, however, ignore the fact that the human visual system compensates for geometric transformations, e.g., we see that an image and a rotated copy of it are identical. Instead, they report a large, false-positive difference. At the same time, if the transformations become too strong or too spatially incoherent, comparing two images indeed becomes increasingly difficult. Between these two extremes, we propose a system to quantify the effect of transformations, not only on the perception of image differences, but also on saliency. To this end, we first fit local homographies to a given optical flow field and then convert this field into a field of elementary transformations such as translation, rotation, scaling, and perspective. We conduct a perceptual experiment quantifying the increase in difficulty when compensating for elementary transformations. Transformation entropy is proposed as a novel measure of complexity in a flow field. This representation is then used for applications such as the comparison of non-aligned images, where transformations cause threshold elevation, and the detection of salient transformations.
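
    A rough sketch of the two ingredients described above follows, with simplifications: an affine (rather than a full homography) is fitted to each local flow patch and decomposed into elementary parts, and the proposed transformation entropy is approximated here by the Shannon entropy of a per-patch parameter histogram; all function names are illustrative.

        # Sketch: fit an affine model to a flow patch, split it into elementary
        # transformations, and measure the spread of per-patch parameters.
        import numpy as np

        def fit_affine(xs, ys, us, vs):
            """Least-squares affine mapping (x, y) -> (x + u, y + v) for one patch."""
            A = np.stack([xs, ys, np.ones_like(xs)], axis=1)
            px, *_ = np.linalg.lstsq(A, xs + us, rcond=None)
            py, *_ = np.linalg.lstsq(A, ys + vs, rcond=None)
            M = np.array([px[:2], py[:2]])        # 2x2 linear part
            t = np.array([px[2], py[2]])          # translation
            return M, t

        def decompose(M):
            """Split the linear part into a rotation angle and scale factors
            via a polar-style decomposition (SVD)."""
            U, s, Vt = np.linalg.svd(M)
            Rot = U @ Vt
            angle = np.arctan2(Rot[1, 0], Rot[0, 0])
            return angle, s                       # rotation angle, singular-value scales

        def entropy(values, bins=16):
            """Shannon entropy (bits) of a histogram of per-patch parameters."""
            p, _ = np.histogram(values, bins=bins)
            p = p[p > 0] / p.sum()
            return float(-np.sum(p * np.log2(p)))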

    Theoretical Engineering and Satellite Comlink of a PTVD-SHAM System

    This paper focuses on a super-helical memory system's design, 'Engineering, Architectural and Satellite Communications', as a theoretical approach to an invention-model to 'store time-data'. The current release entails three concepts: 1) an in-depth theoretical-physics engineering of the chip, including 2) its architectural concept based on VLSI methods, and 3) the time-data versus data-time algorithm. The 'Parallel Time Varying & Data Super-helical Access Memory' (PTVD-SHAM) possesses a waterfall effect in its architecture, dealing with the process of voltage output switching into diverse logic and quantum states described as 'Boolean logic' and 'image-logic', respectively. Quantum dot computational methods are explained by utilizing coiled carbon nanotubes (CCNTs) and CNT field effect transistors (CNFETs) in the chip's architecture. Quantum confinement, categorized quantum well substrates, and B-field flux involvements are discussed in theory. Multi-access of coherent sequences of 'qubit addressing' in any magnitude is gained as pre-defined; here, e.g., the 'big O notation' is asymptotically confined into singularity while possessing a magnitude of 'infinity' for the orientation of array displacement. A Gaussian curvature k (k < 0) is debated with the aim of specifying the 2D electron gas characteristics and a data storage system for defining short and long time cycles for different CCNT diameters, where the space-time continuum is folded by chance for the particle. Precise pre/post data timing for, e.g., seismic waves before an earthquake mantle-reach event occurrence, including time-varying self-clocking devices in diverse geographic locations for radar systems, is illustrated in the subsections of the paper. The theoretical fabrication process and electromigration between the chip's components are discussed as well.

    Comment: 50 pages, 10 figures (3 multi-figures), 2 tables. v1: 1 postulate entailing hypothetical ideas, design and model on future technological advances of PTVD-SHAM. The results of the previous paper [arXiv:0707.1151v6] are extended in order to prove some introductory conjectures in theoretical engineering advanced to architectural analysis.

    Hybrid And Hierarchical Image Registration Techniques

    A large number of image registration techniques have been developed for various types of sensors and applications, with the aim of improving accuracy, computational complexity, generality, and robustness. They can be broadly classified into two categories: intensity-based and feature-based methods. The primary drawback of intensity-based approaches is that they may fail unless the two images are misaligned by no more than a moderate difference in scale, rotation, and translation. In addition, intensity-based methods lack robustness in the presence of non-spatial distortions due to different imaging conditions between images. In this dissertation, image registration is formulated as a two-stage hybrid approach combining an initial matching and a final matching in a coarse-to-fine manner. In the proposed hybrid framework, the initial matching algorithm is applied at the coarsest scale of the images, where approximate transformation parameters can first be estimated. Subsequently, a robust gradient-based estimation algorithm is incorporated into the proposed hybrid approach using a multi-resolution scheme. Several novel and effective initial matching algorithms are proposed for the first stage. The variations in intensity characteristics between images may be large and non-uniform because of non-spatial distortions. Therefore, in order to effectively incorporate gradient-based robust estimation into the proposed framework, a fundamental question must be addressed: what is a good image representation to work with for gradient-based robust estimation under non-spatial distortions? With the initial matching algorithms applied at the highest level of decomposition, the proposed hybrid approach exhibits a superior range of convergence. The gradient-based algorithms in the second stage yield a robust solution that precisely registers images with sub-pixel accuracy. A hierarchical iterative search further enhances the convergence range and rate. Simulation results demonstrate that the proposed techniques provide significant benefits to the performance of image registration.
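
    The two-stage, coarse-to-fine structure can be sketched schematically as a pyramid loop with a gradient-based (Lucas-Kanade-style) refinement of a pure-translation model at each level; the dissertation's actual initial matching algorithms and robust estimators are not reproduced here, and all function names are illustrative.

        # Schematic coarse-to-fine registration of a pure translation. The warp is an
        # integer roll for simplicity; a real implementation would interpolate and
        # would seed the coarsest level with an initial matching stage.
        import numpy as np

        def downsample(img):
            return img[::2, ::2]

        def gradient_step(ref, mov, t):
            """One Gauss-Newton update of the translation t = (row, col)."""
            shifted = np.roll(np.roll(mov, int(round(t[0])), axis=0),
                              int(round(t[1])), axis=1)
            gy, gx = np.gradient(shifted)
            err = (shifted - ref).ravel()
            J = np.stack([gy.ravel(), gx.ravel()], axis=1)
            delta, *_ = np.linalg.lstsq(J, err, rcond=None)
            return t + delta

        def register(ref, mov, levels=4, iters_per_level=5):
            pyramid = [(ref, mov)]
            for _ in range(levels - 1):
                r, m = pyramid[-1]
                pyramid.append((downsample(r), downsample(m)))
            t = np.zeros(2)                       # an initial matching stage would seed this
            for i, (r, m) in enumerate(reversed(pyramid)):   # coarsest -> finest
                if i > 0:
                    t = t * 2                     # translation doubles at each finer level
                for _ in range(iters_per_level):
                    t = gradient_step(r, m, t)
            return t                              # (row, col) shift aligning mov to ref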