1,086 research outputs found

    Methods for multi-spectral image fusion: identifying stable and repeatable information across the visible and infrared spectra

    Get PDF
    Fusion of images captured from different viewpoints is a well-known challenge in computer vision with many established approaches and applications; however, if the observations are captured by sensors also separated by wavelength, this challenge is compounded significantly. This dissertation presents an investigation into the fusion of visible and thermal image information from two front-facing sensors mounted side-by-side. The primary focus of this work is the development of methods that enable us to map and overlay multi-spectral information; the goal is to establish a combined image in which each pixel contains both colour and thermal information. Pixel-level fusion of these distinct modalities is approached using computational stereo methods; the focus is on the viewpoint alignment and correspondence search/matching stages of processing. Frequency domain analysis is performed using a method called phase congruency. An extensive investigation of this method is carried out with two major objectives: to identify predictable relationships between the elements extracted from each modality, and to establish a stable representation of the common information captured by both sensors. Phase congruency is shown to be a stable edge detector and repeatable spatial similarity measure for multi-spectral information; this result forms the basis for the methods developed in the subsequent chapters of this work. The feasibility of automatic alignment with sparse feature-correspondence methods is investigated. It is found that conventional methods fail to match inter-spectrum correspondences, motivating the development of an edge orientation histogram (EOH) descriptor which incorporates elements of the phase congruency process. A cost function, which incorporates the outputs of the phase congruency process and the mutual information similarity measure, is developed for computational stereo correspondence matching. An evaluation of the proposed cost function shows it to be an effective similarity measure for multi-spectral information

    Learning and Searching Methods for Robust, Real-Time Visual Odometry.

    Full text link
    Accurate position estimation provides a critical foundation for mobile robot perception and control. While well-studied, it remains difficult to provide timely, precise, and robust position estimates for applications that operate in uncontrolled environments, such as robotic exploration and autonomous driving. Continuous, high-rate egomotion estimation is possible using cameras and Visual Odometry (VO), which tracks the movement of sparse scene content known as image keypoints or features. However, high update rates, often 30~Hz or greater, leave little computation time per frame, while variability in scene content stresses robustness. Due to these challenges, implementing an accurate and robust visual odometry system remains difficult. This thesis investigates fundamental improvements throughout all stages of a visual odometry system, and has three primary contributions: The first contribution is a machine learning method for feature detector design. This method considers end-to-end motion estimation accuracy during learning. Consequently, accuracy and robustness are improved across multiple challenging datasets in comparison to state of the art alternatives. The second contribution is a proposed feature descriptor, TailoredBRIEF, that builds upon recent advances in the field in fast, low-memory descriptor extraction and matching. TailoredBRIEF is an in-situ descriptor learning method that improves feature matching accuracy by efficiently customizing descriptor structures on a per-feature basis. Further, a common asymmetry in vision system design between reference and query images is described and exploited, enabling approaches that would otherwise exceed runtime constraints. The final contribution is a new algorithm for visual motion estimation: Perspective Alignment Search~(PAS). Many vision systems depend on the unique appearance of features during matching, despite a large quantity of non-unique features in otherwise barren environments. A search-based method, PAS, is proposed to employ features that lack unique appearance through descriptorless matching. This method simplifies visual odometry pipelines, defining one method that subsumes feature matching, outlier rejection, and motion estimation. Throughout this work, evaluations of the proposed methods and systems are carried out on ground-truth datasets, often generated with custom experimental platforms in challenging environments. Particular focus is placed on preserving runtimes compatible with real-time operation, as is necessary for deployment in the field.PhDComputer Science and EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/113365/1/chardson_1.pd

    Sub-pixel Registration In Computational Imaging And Applications To Enhancement Of Maxillofacial Ct Data

    Get PDF
    In computational imaging, data acquired by sampling the same scene or object at different times or from different orientations result in images in different coordinate systems. Registration is a crucial step in order to be able to compare, integrate and fuse the data obtained from different measurements. Tomography is the method of imaging a single plane or slice of an object. A Computed Tomography (CT) scan, also known as a CAT scan (Computed Axial Tomography scan), is a Helical Tomography, which traditionally produces a 2D image of the structures in a thin section of the body. It uses X-ray, which is ionizing radiation. Although the actual dose is typically low, repeated scans should be limited. In dentistry, implant dentistry in specific, there is a need for 3D visualization of internal anatomy. The internal visualization is mainly based on CT scanning technologies. The most important technological advancement which dramatically enhanced the clinician\u27s ability to diagnose, treat, and plan dental implants has been the CT scan. Advanced 3D modeling and visualization techniques permit highly refined and accurate assessment of the CT scan data. However, in addition to imperfections of the instrument and the imaging process, it is not uncommon to encounter other unwanted artifacts in the form of bright regions, flares and erroneous pixels due to dental bridges, metal braces, etc. Currently, removing and cleaning up the data from acquisition backscattering imperfections and unwanted artifacts is performed manually, which is as good as the experience level of the technician. On the other hand the process is error prone, since the editing process needs to be performed image by image. We address some of these issues by proposing novel registration methods and using stonecast models of patient\u27s dental imprint as reference ground truth data. Stone-cast models were originally used by dentists to make complete or partial dentures. The CT scan of such stone-cast models can be used to automatically guide the cleaning of patients\u27 CT scans from defects or unwanted artifacts, and also as an automatic segmentation system for the outliers of the CT scan data without use of stone-cast models. Segmented data is subsequently used to clean the data from artifacts using a new proposed 3D inpainting approach

    A Framework for Tumor Localization in Robot-Assisted Minimally Invasive Surgery

    Get PDF
    Manual palpation of tissue is frequently used in open surgery, e.g., for localization of tumors and buried vessels and for tissue characterization. The overall objective of this work is to explore how tissue palpation can be performed in Robot-Assisted Minimally Invasive Surgery (RAMIS) using laparoscopic instruments conventionally used in RAMIS. This thesis presents a framework where a surgical tool is moved teleoperatively in a manner analogous to the repetitive pressing motion of a finger during manual palpation. We interpret the changes in parameters due to this motion such as the applied force and the resulting indentation depth to accurately determine the variation in tissue stiffness. This approach requires the sensorization of the laparoscopic tool for force sensing. In our work, we have used a da Vinci needle driver which has been sensorized in our lab at CSTAR for force sensing using Fiber Bragg Grating (FBG). A computer vision algorithm has been developed for 3D surgical tool-tip tracking using the da Vinci \u27s stereo endoscope. This enables us to measure changes in surface indentation resulting from pressing the needle driver on the tissue. The proposed palpation framework is based on the hypothesis that the indentation depth is inversely proportional to the tissue stiffness when a constant pressing force is applied. This was validated in a telemanipulated setup using the da Vinci surgical system with a phantom in which artificial tumors were embedded to represent areas of different stiffnesses. The region with high stiffness representing tumor and region with low stiffness representing healthy tissue showed an average indentation depth change of 5.19 mm and 10.09 mm respectively while maintaining a maximum force of 8N during robot-assisted palpation. These indentation depth variations were then distinguished using the k-means clustering algorithm to classify groups of low and high stiffnesses. The results were presented in a colour-coded map. The unique feature of this framework is its use of a conventional laparoscopic tool and minimal re-design of the existing da Vinci surgical setup. Additional work includes a vision-based algorithm for tracking the motion of the tissue surface such as that of the lung resulting from respiratory and cardiac motion. The extracted motion information was analyzed to characterize the lung tissue stiffness based on the lateral strain variations as the surface inflates and deflates

    3D Face Recognition

    Get PDF
    • …
    corecore