
    Weighted and filtered mutual information: A Metric for the automated creation of panoramas from views of complex scenes

    This work contributes a novel approach to image registration and panorama creation: the algorithm forgoes any scene knowledge, requiring only modest scene overlap and sufficient entropy within each overlapping view. The weighted and filtered mutual information (WFMI) algorithm has been developed for multiple stationary, color, surveillance video camera views and relies on color gradients for feature correspondence. It is a novel extension of well-established maximization of mutual information (MMI) algorithms. Where MMI algorithms are typically applied to high-altitude photography and medical imaging (scenes with relatively simple shapes and affine relationships between views), the WFMI algorithm has been designed for scenes with occluded objects and significant parallax variation between non-affine-related views. Despite these typically non-affine surveillance scenarios, searching the affine space for a homography is a practical assumption that provides computational efficiency and accurate results, even with complex scene views. The WFMI algorithm registers affine views perfectly, performs exceptionally well with near-affine-related views, and for complex scene views (well beyond affine constraints) provides an accurate estimate of the overlap regions between the views. The algorithm uses simple calculations (vector-field color gradients, Laplacian filtering, and feature histograms) to generate the WFMI metric and recover the optimal affine relationship. It differs from typical MMI algorithms and modern registration algorithms in that it avoids almost all a priori knowledge and precomputation while still providing an accurate or useful estimate for realistic scenes. With mutual information weighting and the Laplacian filtering operation, the WFMI algorithm overcomes the failures of typical MMI algorithms in scenes where complex or occluded shapes do not produce sufficiently large peaks in the mutual information maps to determine the overlap region. The work has so far been applied to individual video frames, and future work could readily extend the algorithm to use motion information or temporal frame registration to handle scenes with smaller overlap regions, lower entropy, or even more significant parallax and occlusion variation between views.
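    At the core of any MMI-style metric is the mutual information computed from a joint histogram of two overlapping feature images. The following minimal Python sketch illustrates that computation and a brute-force search over candidate overlap widths; the feature images, bin count, and search strategy are generic assumptions for illustration, not the WFMI implementation itself.

        import numpy as np

        def mutual_information(a, b, bins=32):
            # Mutual information of two equally sized feature images
            # (e.g. color-gradient magnitudes of the overlapping regions).
            joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
            pxy = joint / joint.sum()                # joint probability
            px = pxy.sum(axis=1, keepdims=True)      # marginal of a
            py = pxy.sum(axis=0, keepdims=True)      # marginal of b
            nz = pxy > 0                             # avoid log(0)
            return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

        def best_horizontal_overlap(left, right, min_overlap=40):
            # Brute-force search for the horizontal overlap width that maximises
            # the mutual information between the two views.
            best_w, best_mi = None, -np.inf
            for w in range(min_overlap, min(left.shape[1], right.shape[1])):
                mi = mutual_information(left[:, -w:], right[:, :w])
                if mi > best_mi:
                    best_w, best_mi = w, mi
            return best_w, best_mi

    In the WFMI setting, this scalar score would be weighted and filtered and evaluated over candidate affine relationships rather than a single horizontal shift.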

    Content-preserving image stitching with piecewise rectangular boundary constraints

    This paper proposes an approach to content-preserving image stitching with regular boundary constraints, which aims to stitch multiple images into a panoramic image with a piecewise rectangular boundary. Existing methods treat image stitching and rectangling as two separate steps, which may yield suboptimal results because the stitching process is unaware of the subsequent warping needed for rectangling. We address these limitations by formulating image stitching with regular boundaries as a unified optimization. Starting from the initial stitching results produced by traditional warping-based optimization, we obtain the irregular boundary from the warped meshes by polygon Boolean operations, which robustly handle arbitrary mesh compositions. By analyzing the irregular boundary, we construct a piecewise rectangular boundary. Based on this, we further incorporate line and regular-boundary preservation constraints into the image stitching framework and conduct iterative optimization to obtain an optimal piecewise rectangular boundary. We can thus make the boundary of the stitching results as close as possible to a rectangle while reducing unwanted distortions. We further extend our method to video stitching by integrating temporal coherence into the optimization. Experiments show that our method efficiently produces visually pleasing panoramas with regular boundaries and unnoticeable distortions.
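    The irregular composite outline described above comes from polygon Boolean operations on the warped mesh cells. As a rough illustration of that step only, assuming the Shapely library and that each warped mesh cell is available as a quad, one could compute the outer boundary as follows (function and variable names are illustrative, not the authors' code).

        from shapely.geometry import Polygon
        from shapely.ops import unary_union

        def composite_boundary(warped_quads):
            # warped_quads: iterable of 4-point quads [(x, y), ...] taken from the
            # warped meshes of all input images.
            cells = [Polygon(q) for q in warped_quads]
            union = unary_union([c for c in cells if c.is_valid])  # Boolean union of all cells
            if union.geom_type == "MultiPolygon":
                # keep the dominant component if the composition is disconnected
                union = max(union.geoms, key=lambda g: g.area)
            return list(union.exterior.coords)  # irregular outer boundary

    The returned vertex list is what a rectangling step would then approximate with a piecewise rectangular boundary.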

    MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images

    We introduce a method to convert stereo 360° (omnidirectional stereo) imagery into a layered, multi-sphere image representation for six-degree-of-freedom (6DoF) rendering. Stereo 360° imagery can be captured from multi-camera systems for virtual reality (VR), but it lacks motion parallax and correct-in-all-directions disparity cues; together, these shortcomings can quickly lead to VR sickness when viewing content. One solution is to generate a format suitable for 6DoF rendering, such as by estimating depth, but this raises the question of how to handle disoccluded regions in dynamic scenes. Our approach is to simultaneously learn depth and disocclusions via a multi-sphere image representation, which can be rendered with correct 6DoF disparity and motion parallax in VR. This significantly improves comfort for the viewer, and the representation can be inferred and rendered in real time on modern GPU hardware. Together, these advances move VR video toward being a more comfortable immersive medium. (25 pages, 13 figures; published at the European Conference on Computer Vision, ECCV 2020; project page: http://visual.cs.brown.edu/matryodshk)
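    A multi-sphere image is rendered much like a multi-plane image: the concentric RGBA layers are alpha-composited from back to front along each viewing ray. The sketch below, a generic illustration rather than the paper's renderer, shows only that compositing step on per-layer RGBA textures already resampled for the current viewpoint.

        import numpy as np

        def composite_msi(layers_rgba):
            # layers_rgba: array of shape (L, H, W, 4), ordered from the farthest
            # sphere to the nearest, with RGBA values in [0, 1].
            out = np.zeros(layers_rgba.shape[1:3] + (3,), dtype=np.float32)
            for rgba in layers_rgba:                     # far to near
                rgb, alpha = rgba[..., :3], rgba[..., 3:4]
                out = rgb * alpha + out * (1.0 - alpha)  # standard 'over' operator
            return out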

    Accurate Calibration Scheme for a Multi-Camera Mobile Mapping System

    Mobile mapping systems (MMS) are increasingly used for many photogrammetric and computer vision applications, encouraged especially by their fast and accurate geospatial data generation. The accuracy of point positioning in an MMS depends mainly on the quality of calibration, the accuracy of sensor synchronization, the accuracy of georeferencing, and the stability of the geometric configuration of space intersections. In this study, we focus on multi-camera calibration (interior and relative orientation parameter estimation) and MMS calibration (mounting parameter estimation). The objective was to develop a practical scheme for rigorous and accurate system calibration of a photogrammetric mapping station equipped with a multi-projective camera (MPC) and a global navigation satellite system (GNSS) and inertial measurement unit (IMU) for direct georeferencing. The proposed technique comprises two steps. First, the interior orientation parameters of each individual camera in the MPC and the relative orientation parameters of each camera with respect to the first camera are estimated. Second, the offset and misalignment between the MPC and the GNSS/IMU are estimated. The global accuracy of the proposed method was assessed using independent check points. A correspondence map for a panorama is introduced that provides metric information. Our results highlight that the proposed calibration scheme reaches centimeter-level global accuracy for 3D point positioning. This level of global accuracy demonstrates the feasibility of the proposed technique and its potential for accurate mapping purposes.
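    Direct georeferencing with the estimated mounting parameters follows the standard lever-arm/boresight relation, mapping a point from the camera frame through the IMU body frame into the mapping frame. The sketch below simply writes out that relation; the inputs are illustrative placeholders rather than values from this study.

        import numpy as np

        def camera_point_to_mapping_frame(x_cam, R_boresight, lever_arm, R_body_to_map, t_body):
            # Standard direct-georeferencing chain:
            #   X_map = t_body + R_body_to_map @ (lever_arm + R_boresight @ x_cam)
            # x_cam         : 3-vector in the camera frame (e.g. a scaled image ray)
            # R_boresight   : camera-to-IMU-body rotation (mounting misalignment)
            # lever_arm     : camera offset in the IMU body frame (mounting offset)
            # R_body_to_map : IMU attitude at exposure time
            # t_body        : GNSS/IMU-derived body position in the mapping frame
            return t_body + R_body_to_map @ (lever_arm + R_boresight @ np.asarray(x_cam, float))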

    Skyline matching: absolute localisation for planetary exploration rovers

    Skyline matching is a technique for absolute localisation in autonomous long-range exploration. Absolute localisation becomes crucial for planetary exploration, both to recalibrate position during long traverses and to estimate position with no a priori information. In this project, a skyline matching algorithm is proposed, implemented and evaluated using real acquisitions and simulated data. The method compares the skyline extracted from rover images with one derived from orbital data. The results are promising, but intensive testing on more real data is needed to further characterize the algorithm.
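    In its simplest form, matching a rover-image skyline against one rendered from orbital elevation data reduces to sliding an azimuth-indexed elevation profile along a full 360-degree reference profile and scoring the agreement at each shift. The sketch below shows such a brute-force heading search; profile extraction and rendering are assumed to happen elsewhere, and the scoring choice is illustrative.

        import numpy as np

        def best_heading(rover_skyline, rendered_skyline):
            # rover_skyline    : 1-D array of skyline elevation angles from the rover images
            # rendered_skyline : 1-D array over 360 degrees of azimuth, same angular resolution
            n = len(rendered_skyline)
            best_shift, best_score = 0, np.inf
            for shift in range(n):
                candidate = np.roll(rendered_skyline, -shift)[: len(rover_skyline)]
                score = np.mean((candidate - rover_skyline) ** 2)  # mean squared difference
                if score < best_score:
                    best_shift, best_score = shift, score
            return best_shift, best_score  # best shift in samples gives the heading estimate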

    Development, Implementation and Pre-clinical Evaluation of Medical Image Computing Tools in Support of Computer-aided Diagnosis: Respiratory, Orthopedic and Cardiac Applications

    Over the last decade, image processing tools have become crucial components of all clinical and research efforts involving medical imaging and associated applications. The imaging data available to radiologists continue to increase their workload, raising the need for efficient identification and visualization of the image data required for clinical assessment. Computer-aided diagnosis (CAD) in medical imaging has evolved in response to the need for techniques that can assist radiologists to increase throughput while reducing human error and bias without compromising the outcome of screening, diagnosis or disease assessment. More intelligent, yet simple, consistent and less time-consuming methods will become more widespread, reducing user variability while also revealing information in a clearer, more visual way. Several routine image processing approaches, including localization, segmentation, registration, and fusion, are critical for enhancing and enabling the development of CAD techniques. However, changes in clinical workflow require significant adjustments and re-training and, despite the efforts of the academic research community to develop state-of-the-art algorithms and high-performance techniques, their footprint often hampers their clinical use. Currently, the main challenge is not the lack of tools and techniques for medical image processing, analysis, and computing, but rather the lack of clinically feasible solutions that leverage the tools and techniques already developed, together with a demonstration of the potential clinical impact of such tools. Recently, more and more effort has been dedicated to devising new algorithms for localization, segmentation or registration, while their intended clinical use and actual utility are dwarfed by scientific, algorithmic and developmental novelty that results only in incremental improvements over existing algorithms. In this thesis, we propose and demonstrate the implementation and evaluation of several methodological guidelines that ensure the development of image processing tools --- localization, segmentation and registration --- and illustrate their use across several medical imaging modalities --- X-ray, computed tomography, ultrasound and magnetic resonance imaging --- and several clinical applications: (1) lung CT image registration in support of the assessment of pulmonary nodule growth rate and disease progression from thoracic CT images; (2) automated reconstruction of standing X-ray panoramas from multi-sector X-ray images for assessment of long-limb mechanical axis and knee misalignment; and (3) left and right ventricle localization, segmentation, reconstruction and ejection-fraction measurement from cine cardiac MRI or multi-plane trans-esophageal ultrasound images for cardiac function assessment. When devising and evaluating the developed tools, we use clinical patient data to illustrate the inherent clinical challenges associated with highly variable imaging data that need to be addressed before potential pre-clinical validation and implementation. In an effort to provide plausible solutions to the selected applications, the proposed methodological guidelines ensure the development of image processing tools that achieve sufficiently reliable solutions, which not only have the potential to address the clinical needs but are sufficiently streamlined to be translated into eventual clinical tools given proper implementation. The guidelines are:
    G1: Reduce the number of degrees of freedom (DOF) of the designed tool, for example by avoiding inefficient non-rigid image registration methods. This guideline addresses the risk of artificial deformation during registration and aims at reducing complexity and the number of degrees of freedom.
    G2: Use shape-based features to represent the image content most efficiently, for instance by using edges instead of, or in addition to, intensities and motion where useful. Edges capture the most useful information in the image and can be used to identify the most important image features; as a result, this guideline ensures more robust performance when key image information is missing.
    G3: Implement the method efficiently. This guideline focuses on the minimum number of steps required and on avoiding the recalculation of terms that only need to be computed once in an iterative process; an efficient implementation leads to reduced computational effort and improved performance.
    G4: Commence the workflow with an optimized initialization and gradually converge toward the final acceptable result. This guideline aims to ensure reasonable outcomes in a consistent way and avoids convergence to local minima while gradually ensuring convergence to the global minimum solution.
    These guidelines lead to the development of interactive, semi-automated or fully automated approaches that still enable the clinicians to perform final refinements while reducing overall inter- and intra-observer variability, reducing ambiguity, increasing accuracy and precision, and having the potential to yield mechanisms that aid in providing an overall more consistent diagnosis in a timely fashion.
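    As a concrete illustration of G1 (preferring a rigid, low-DOF transform over non-rigid registration) and G4 (an optimized initialization followed by coarse-to-fine convergence), a minimal rigid CT-to-CT registration along these lines could be set up with SimpleITK as sketched below. The file names and parameter values are placeholders, and this is a generic sketch rather than the thesis implementation.

        import SimpleITK as sitk

        fixed = sitk.ReadImage("fixed_ct.nii.gz", sitk.sitkFloat32)
        moving = sitk.ReadImage("moving_ct.nii.gz", sitk.sitkFloat32)

        # G4: start from a geometry-based initialization of a rigid (6-DOF) transform (G1)
        initial = sitk.CenteredTransformInitializer(
            fixed, moving, sitk.Euler3DTransform(),
            sitk.CenteredTransformInitializerFilter.GEOMETRY)

        reg = sitk.ImageRegistrationMethod()
        reg.SetMetricAsMattesMutualInformation(numberOfHistogramBins=32)
        reg.SetOptimizerAsRegularStepGradientDescent(
            learningRate=1.0, minStep=1e-4, numberOfIterations=200)
        reg.SetOptimizerScalesFromPhysicalShift()
        reg.SetInitialTransform(initial, inPlace=False)
        reg.SetInterpolator(sitk.sitkLinear)

        # G4: multi-resolution schedule converging gradually toward the final result
        reg.SetShrinkFactorsPerLevel([4, 2, 1])
        reg.SetSmoothingSigmasPerLevel([2.0, 1.0, 0.0])
        reg.SmoothingSigmasAreSpecifiedInPhysicalUnitsOn()

        final_transform = reg.Execute(fixed, moving)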

    Multi-Projective Camera-Calibration, Modeling, and Integration in Mobile-Mapping Systems

    Optical systems are vital parts of most modern systems such as mobile mapping systems, autonomous cars, unmanned aerial vehicles (UAV), and game consoles. Multi-camera systems (MCS) are commonly employed for precise mapping, including aerial and close-range applications. In the first part of this thesis, a simple and practical calibration model and a calibration scheme for multi-projective cameras (MPC) are presented. The calibration scheme is enabled by implementing a camera test field, equipped with customized coded targets, as FGI's camera calibration room. The first hypothesis was that a test field is necessary to calibrate an MPC. Two commercially available MPCs with 6 and 36 cameras were successfully calibrated in FGI's calibration room. The calibration results suggest that the proposed model estimates the parameters of the MPCs with high geometric accuracy and reveals the internal structure of the MPCs. In the second part, the applicability of an MPC calibrated by the proposed approach was investigated in a mobile mapping system (MMS). The second hypothesis was that a system calibration is necessary to achieve high geometric accuracy in a multi-camera MMS. The MPC model was updated to include mounting parameters with respect to the GNSS and IMU, and a system calibration scheme for an MMS was proposed. The results showed that the proposed system calibration approach produced accurate results through direct georeferencing of multi-images in an MMS, and geometric assessments suggested that centimeter-level accuracy is achievable with the proposed approach. A novel correspondence map is demonstrated for MPCs that helps to create metric panoramas. In the third part, the problem of real-time trajectory estimation of a UAV equipped with a projective camera was studied. The main objective of this part was to address real-time monocular simultaneous localization and mapping (SLAM) for a UAV. An angular framework was discussed to address the gimbal-lock singularity. The results suggest that the proposed solution is an effective and rigorous monocular SLAM approach for aerial cases where the object is near-planar. In the last part, the problem of tree-species classification by a UAV equipped with hyperspectral and RGB cameras was studied. The objective of this study was to investigate different aspects of precise tree-species classification by employing state-of-the-art methods. A 3D convolutional neural network (3D-CNN) and a multi-layer perceptron (MLP) were proposed and compared. Both classifiers were highly successful in their tasks, while the 3D-CNN was superior in performance. The classification results were the most accurate published in comparison with other works.
    Optical imaging devices play a central role in modern machine-vision systems such as autonomous cars, unmanned aerial vehicles (UAV) and game consoles. Such applications typically rely on multi-camera systems. The first part of the thesis develops a simple and practical mathematical model and calibration method for multi-camera systems. Coded targets are artificial images that can be printed, for example, on A4 sheets of paper and measured automatically with computer algorithms. The mathematical model is determined using a three-dimensional camera calibration room in which the developed coded targets are installed. Two commercial multi-camera systems, consisting of 6 and 36 individual cameras, were successfully calibrated with the proposed method. The results showed that the method produced accurate estimates of the geometric parameters of the multi-camera systems and that the estimated parameters corresponded well to the internal structure of the cameras. The second part of the work investigated the use of a multi-camera system, calibrated with the proposed method, for measurement in a mobile mapping system (MMS). The goal was to develop and study mapping measurements of high geometric accuracy. The multi-camera model was extended with parameters related to the positioning and attitude sensors (GNSS/IMU) of the navigation equipment, and a system calibration method was proposed for the mobile mapping system. Centimetre-level accuracy was achieved in direct georeferencing measurements with the calibrated system. A correspondence map for multi-images was also presented, enabling the creation of metric panoramas from the images of the multi-camera system. The third part studied real-time estimation of the trajectory of a UAV using a single-camera method. The main objective was to develop a real-time monocular simultaneous localization and mapping (SLAM) method. A matching method based on multi-resolution image pyramids and propagating rectangular regions was proposed; the approach reduced the matching cost while keeping the matching accuracy unchanged. A new angular system was implemented to handle the gimbal-lock situation. The results showed that the proposed solution was efficient and accurate in situations where the object is nearly planar, and a performance evaluation showed that the developed method met the time and accuracy targets set for real-time UAV trajectory estimation. The last part of the work studied tree-species classification using a UAV system equipped with hyperspectral and RGB cameras. The goal was to investigate the use of new machine-learning methods for accurate tree-species classification and, in addition, to compare the performance of hyperspectral and RGB data. A 3D convolutional neural network (3D-CNN) and a multi-layer perceptron (MLP) were compared. Both classifiers achieved good classification accuracy, but the 3D-CNN produced the most accurate results; the achieved accuracy was better than previously published results on comparable data. The combination of hyperspectral and RGB data gave the best accuracy, but the RGB camera alone also produced accurate results and is an affordable and effective data source for many classification applications.
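    In the calibration model summarized above, every camera of the MPC is described by its relative orientation with respect to the first camera, so the pose of camera i in the mapping frame is obtained by composing the reference camera's exterior orientation with the calibrated relative transform. A small sketch of that composition follows; the rotation and translation inputs are illustrative placeholders.

        import numpy as np

        def compose_camera_pose(R_ref, t_ref, R_rel, t_rel):
            # R_ref, t_ref : exterior orientation of the reference (first) camera,
            #                mapping camera-1 coordinates into the mapping frame
            # R_rel, t_rel : calibrated relative orientation of camera i with respect
            #                to camera 1 (camera-i coordinates into camera-1 coordinates)
            R_i = R_ref @ R_rel              # rotation: camera-i frame -> mapping frame
            t_i = R_ref @ t_rel + t_ref      # projection centre of camera i in the mapping frame
            return R_i, t_i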

    Image-Based Rendering Of Real Environments For Virtual Reality


    A Computer-Aided Training (CAT) System for Short Track Speed Skating

    Short track speed skating has become popular all over the world, and demand for computer-aided training (CAT) systems is growing as a result. However, existing commercial systems for sports are highly dependent on expensive equipment and complicated hardware calibration. This dissertation presents a novel CAT system for tracking multiple skaters in short track skating competitions. To address the associated challenges, we utilize global rink information to compensate for camera motion and obtain the global spatial positions of the skaters, apply a Random Forest to fuse multiple cues and predict the blobs belonging to each skater, and finally develop a silhouette- and edge-based template matching and blob-growing method to allocate each blob to the corresponding skater. The proposed multiple-skater tracking algorithm integrates multi-cue fusion, dynamic appearance modeling, and machine learning into an efficient and robust CAT system. The effectiveness and robustness of the proposed method are demonstrated through experiments.
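    The cue-fusion step described above, in which a Random Forest predicts the blobs for each skater from several per-pixel cues, can be illustrated generically with scikit-learn as below; the cue choice, labels, and threshold are assumptions for illustration, not the authors' configuration.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        def train_cue_fusion(cue_stacks, masks, n_trees=100):
            # cue_stacks: list of (H, W, C) arrays, one per training frame, C cues per pixel
            #             (e.g. color, motion, and edge-strength channels)
            # masks:      list of (H, W) boolean arrays marking skater pixels
            X = np.concatenate([s.reshape(-1, s.shape[-1]) for s in cue_stacks])
            y = np.concatenate([m.ravel() for m in masks])
            clf = RandomForestClassifier(n_estimators=n_trees, n_jobs=-1)
            clf.fit(X, y)
            return clf

        def predict_blob_map(clf, cue_stack, threshold=0.5):
            # Per-pixel skater probability, thresholded into candidate blobs for
            # the subsequent template matching and blob-growing stage.
            prob = clf.predict_proba(cue_stack.reshape(-1, cue_stack.shape[-1]))[:, 1]
            return prob.reshape(cue_stack.shape[:2]) >= threshold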