
    Real-time quality visualization of medical models on commodity and mobile devices

    Get PDF
    This thesis concerns the specific field of visualization of medical models using commodity and mobile devices. Mechanisms for medical image acquisition such as MRI, CT, and micro-CT scanners are continuously evolving, to the point of producing volume datasets of large resolutions (> 512^3). As these datasets grow in resolution, their treatment and visualization become more and more expensive due to their computational requirements. For this reason, special techniques such as data pre-processing (filtering, construction of multi-resolution structures, etc.) and sophisticated algorithms have to be introduced at different points of the visualization pipeline to achieve the best visual quality without compromising performance. The problem of managing big datasets stems from the fact that computational resources are limited. Not long ago, the only physicians who rendered volumes were radiologists. Nowadays, the outcome of a diagnosis is the data itself, and medical doctors need to render it on commodity PCs (even patients may want to render the data, and the DVDs they receive commonly include DICOM viewer software). Furthermore, with the increasing use of technology in daily clinical tasks, small devices such as mobile phones and tablets can fit the needs of medical doctors in some specific areas. Visualizing diagnostic images of patients becomes more challenging on these devices than on desktop computers, as they generally have more restrictive hardware specifications. The goal of this Ph.D. thesis is the real-time, quality visualization of medium to large medical volume datasets (resolutions >= 512^3 voxels) on mobile phones and commodity devices. To address this problem, we use multi-resolution techniques that downsample the full-resolution datasets to produce coarser representations which are easier to handle. We have focused our efforts on the application of volume visualization in clinical practice, so we have a particular interest in creating solutions that require short pre-processing times that quickly provide specialists with the data outcome, maximize the preservation of features and the visual quality of the final images, achieve high frame rates that allow interactive visualization, and make efficient use of computational resources. The contributions achieved during this thesis comprise improvements in several stages of the visualization pipeline. The techniques we propose are located in the stages of multi-resolution generation, transfer function design, and the GPU ray casting algorithm itself.
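
    The abstract above does not include code; as a rough illustration of the kind of downsampling used to build a multi-resolution representation, the NumPy sketch below constructs a pyramid of a volume by repeated 2x2x2 averaging. The function name, the averaging filter, and the level count are illustrative assumptions, not the specific method used in the thesis.

```python
import numpy as np

def build_volume_pyramid(volume, levels=3):
    """Build a multi-resolution pyramid by repeated 2x2x2 average downsampling.

    `volume` is a 3D NumPy array whose dimensions are assumed to be
    divisible by 2**levels (pad the volume beforehand if they are not).
    """
    pyramid = [volume.astype(np.float32)]
    for _ in range(levels):
        d, h, w = pyramid[-1].shape
        # Group voxels into 2x2x2 bricks and average each brick into one voxel.
        coarser = pyramid[-1].reshape(d // 2, 2, h // 2, 2, w // 2, 2).mean(axis=(1, 3, 5))
        pyramid.append(coarser)
    return pyramid

# Small synthetic volume standing in for a full-resolution scan (e.g. 512^3).
full = np.random.rand(128, 128, 128).astype(np.float32)
for level, vol in enumerate(build_volume_pyramid(full, levels=3)):
    print(f"level {level}: {vol.shape}")
```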

    Vid2Curve: Simultaneous Camera Motion Estimation and Thin Structure Reconstruction from an RGB Video

    Get PDF
    Thin structures, such as wire-frame sculptures, fences, cables, power lines, and tree branches, are common in the real world. It is extremely challenging to acquire their 3D digital models using traditional image-based or depth-based reconstruction methods because thin structures often lack distinct point features and have severe self-occlusion. We propose the first approach that simultaneously estimates camera motion and reconstructs the geometry of complex 3D thin structures in high quality from a color video captured by a handheld camera. Specifically, we present a new curve-based approach to estimate accurate camera poses by establishing correspondences between featureless thin objects in the foreground in consecutive video frames, without requiring visual texture in the background scene to lock on. Enabled by this effective curve-based camera pose estimation strategy, we develop an iterative optimization method with tailored measures on geometry and topology as well as self-occlusion handling for reconstructing 3D thin structures. Extensive validations on a variety of thin structures show that our method achieves accurate camera pose estimation and faithful reconstruction of 3D thin structures with complex shape and topology at a level that has not been attained by other existing reconstruction methods. Comment: Accepted by SIGGRAPH 2020.
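
    The paper's optimization is substantially more elaborate; the sketch below only illustrates the core idea of curve-based pose estimation: refine a camera pose so that projected 3D curve samples land on the detected 2D curve pixels, using nearest-point distances as residuals. The pinhole model, the Huber loss, and all function names are assumptions for illustration, not the authors' implementation.

```python
import numpy as np
import cv2
from scipy.optimize import least_squares
from scipy.spatial import cKDTree

def refine_pose(curve_points_3d, curve_pixels_2d, rvec0, tvec0, K):
    """Refine (rvec, tvec) so that projected 3D curve samples fall onto the
    detected 2D thin-structure pixels of the current frame."""
    tree = cKDTree(curve_pixels_2d)  # detected curve pixels, shape (M, 2)

    def residuals(x):
        rvec, tvec = x[:3], x[3:]
        proj, _ = cv2.projectPoints(curve_points_3d, rvec, tvec, K, None)
        # Distance from each projected curve sample to its nearest detected pixel.
        dists, _ = tree.query(proj.reshape(-1, 2))
        return dists

    x0 = np.hstack([np.ravel(rvec0), np.ravel(tvec0)])
    # Huber loss down-weights outlier distances (illustrative choice).
    result = least_squares(residuals, x0, loss="huber", f_scale=2.0)
    return result.x[:3], result.x[3:]
```

    In the actual method, pose refinement is interleaved with curve reconstruction, topology handling, and self-occlusion reasoning, all of which this sketch omits.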

    Progressive Structure from Motion

    Full text link
    Structure from Motion, the sparse 3D reconstruction of a scene from individual photos, is a long-studied topic in computer vision. Yet none of the existing reconstruction pipelines fully addresses a progressive scenario in which images only become available during the reconstruction process and intermediate results are delivered to the user. Incremental pipelines are capable of growing a 3D model but often get stuck in local minima due to wrong (binding) decisions taken on the basis of incomplete information. Global pipelines, on the other hand, need access to the complete viewgraph and are not capable of delivering intermediate results. In this paper we propose a new reconstruction pipeline that works in a progressive manner rather than in a batch-processing scheme. The pipeline is able to recover from failed reconstructions in early stages, avoids taking binding decisions, delivers progressive output, and yet maintains the capabilities of existing pipelines. We demonstrate and evaluate our method on diverse challenging public and dedicated datasets, including those with highly symmetric structures, and compare to the state of the art. Comment: Accepted to ECCV 2018.
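
    As a purely illustrative sketch of the progressive idea (not the paper's pipeline), the loop below registers images as they arrive, defers images that cannot yet be registered confidently instead of taking a binding decision, and yields an intermediate model after every pass. `try_register` and `extend_model` are hypothetical callbacks.

```python
from collections import deque

def progressive_reconstruction(image_stream, try_register, extend_model, model=None):
    """Toy progressive loop: images arrive over time, uncertain registrations
    are deferred and retried later, and intermediate models are yielded."""
    pending = deque()
    for image in image_stream:
        pending.append(image)
        deferred = deque()
        while pending:
            img = pending.popleft()
            pose = try_register(model, img)   # None if not confidently registrable yet
            if pose is None:
                deferred.append(img)          # defer the (binding) decision, retry later
            else:
                model = extend_model(model, img, pose)
        pending = deferred
        yield model                           # intermediate result delivered to the user
```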

    NOVEL DENSE STEREO ALGORITHMS FOR HIGH-QUALITY DEPTH ESTIMATION FROM IMAGES

    Get PDF
    This dissertation addresses the problem of inferring scene depth information from a collection of calibrated images taken from different viewpoints via stereo matching. Although it has been heavily investigated for decades, depth from stereo remains a long-standing challenge and popular research topic for several reasons. First of all, in order to be of practical use for many real-time applications such as autonomous driving, accurate depth estimation in real time is of great importance and one of the core challenges in stereo. Second, for applications such as 3D reconstruction and view synthesis, high-quality depth estimation is crucial to achieve photo-realistic results. However, due to matching ambiguities, accurate dense depth estimates are difficult to achieve. Last but not least, most stereo algorithms rely on the identification of corresponding points among images and only work effectively when scenes are Lambertian; for non-Lambertian surfaces, the brightness constancy assumption is no longer valid. This dissertation contributes three novel stereo algorithms motivated by the specific requirements and limitations imposed by different applications. In addressing high-speed depth estimation from images, we present a stereo algorithm that achieves high-quality results while maintaining real-time performance. We introduce an adaptive aggregation step in a dynamic-programming framework: matching costs are aggregated in the vertical direction using a computationally expensive weighting scheme based on color and distance proximity, and we utilize the vector processing capability and parallelism of commodity graphics hardware to speed up this process by over two orders of magnitude. In addressing high-accuracy depth estimation, we present a stereo model that makes use of constraints from points with known depths, the Ground Control Points (GCPs) as they are referred to in the stereo literature. Our formulation explicitly models the influence of GCPs in a Markov Random Field, and a novel regularization prior is naturally integrated into a global inference framework in a principled way using Bayes' rule. Our probabilistic framework allows GCPs to be obtained from various modalities and provides a natural way to integrate information from various sensors. In addressing non-Lambertian reflectance, we introduce a new invariant for stereo correspondence which allows completely arbitrary scene reflectance (bidirectional reflectance distribution functions, BRDFs). This invariant can be used to formulate a rank constraint on stereo matching when the scene is observed under several lighting configurations in which only the lighting intensity varies.
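
    As a rough NumPy sketch of the vertical adaptive aggregation described above (in the spirit of adaptive support weights, restricted to the vertical direction), the function below re-weights matching costs along image columns by color similarity and spatial distance. The parameter values and the GPU implementation details from the dissertation are not reproduced; all names here are illustrative.

```python
import numpy as np

def aggregate_costs_vertically(cost_volume, image, radius=8,
                               gamma_color=10.0, gamma_dist=8.0):
    """Aggregate matching costs along image columns with adaptive support
    weights based on color similarity and spatial distance.

    cost_volume: (H, W, D) per-pixel, per-disparity matching costs
    image:       (H, W, 3) reference image used to compute the weights
    """
    H, W, _ = cost_volume.shape
    img = image.astype(np.float32)
    aggregated = np.zeros_like(cost_volume, dtype=np.float32)
    weight_sum = np.zeros((H, W), dtype=np.float32)
    for dy in range(-radius, radius + 1):
        ys = np.clip(np.arange(H) + dy, 0, H - 1)         # clamped neighbour rows
        color_dist = np.linalg.norm(img - img[ys], axis=2)
        # Weight falls off with both color difference and vertical distance.
        w = np.exp(-color_dist / gamma_color - abs(dy) / gamma_dist)
        aggregated += w[..., None] * cost_volume[ys]
        weight_sum += w
    return aggregated / weight_sum[..., None]
```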

    An empirical assessment of real-time progressive stereo reconstruction

    Get PDF
    3D reconstruction from images, the problem of recovering depth from images, is one of the most well-studied problems within computer vision, in part because it is academically interesting, but also because of the significant growth in the use of 3D models. This growth can be attributed to the development of augmented reality, 3D printing, and indoor mapping. Progressive stereo reconstruction is the sequential application of stereo reconstructions to reconstruct a scene. To achieve a reliable progressive stereo reconstruction, a combination of best-practice algorithms needs to be used. The purpose of this research is to determine the combination of best-practice algorithms that leads to the most accurate and efficient progressive stereo reconstruction, i.e. the best-practice combination. In order to obtain a similarity reconstruction, the intrinsic parameters of the camera need to be known. If they are not known, they are determined by capturing ten images of a checkerboard with a known calibration pattern from different angles and using the moving-plane algorithm. Thereafter, in order to perform a near real-time reconstruction, frames are acquired and reconstructed simultaneously. For the first pair of frames, keypoints are detected and matched using a best-practice keypoint detection and matching algorithm. The motion of the camera between the frames is then determined by decomposing the essential matrix, which is derived from the fundamental matrix, which is in turn determined using a best-practice ego-motion estimation algorithm. Finally, the keypoints are reconstructed using a best-practice reconstruction algorithm. For sequential frames, each frame is paired with the previous frame and keypoints are therefore only detected in the sequential frame. They are detected, matched, and reconstructed in the same fashion as the first pair of frames; however, to ensure that the reconstructed points are in the same scale as the points reconstructed from the first pair of frames, the motion of the camera between the frames is estimated from 3D-2D correspondences using a best-practice algorithm. If the purpose of progressive reconstruction is visualization, the best-practice keypoint detection algorithm was found to be Speeded Up Robust Features (SURF), as it results in more reconstructed points than the Scale-Invariant Feature Transform (SIFT). SIFT is, however, more computationally efficient and thus better suited if the number of reconstructed points does not matter, for example if the purpose of progressive reconstruction is camera tracking. For all purposes, the best-practice matching algorithm was found to be optical flow, as it is the most efficient, and the best-practice ego-motion estimation algorithm was found to be the 5-point algorithm, as it is robust to points located on planes. This research is significant because the effects of the key steps of progressive reconstruction, and the choices made at each step, on the accuracy and efficiency of the reconstruction as a whole have never been studied. As a result, progressive stereo reconstruction can now be performed in near real time on a mobile device without compromising the accuracy of the reconstruction.
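
    As a hedged OpenCV sketch of the per-pair step described above (keypoint detection, optical-flow matching, 5-point essential-matrix estimation, pose recovery, and triangulation), the function below reconstructs sparse points from two frames of a calibrated camera. Parameter values are placeholders, SIFT stands in for the detector, and the 3D-2D scale propagation used for later frames is omitted.

```python
import cv2
import numpy as np

def reconstruct_pair(frame1, frame2, K):
    """Sparse reconstruction from two frames of a calibrated camera: detect
    keypoints, track them with pyramidal optical flow, estimate the essential
    matrix with the 5-point algorithm, recover the relative pose, and
    triangulate the inlier correspondences."""
    gray1 = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
    gray2 = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)

    # Keypoint detection (SIFT here; SURF is optional/patent-encumbered in OpenCV builds).
    sift = cv2.SIFT_create()
    kp1 = sift.detect(gray1, None)
    pts1 = np.float32([k.pt for k in kp1]).reshape(-1, 1, 2)

    # Matching via pyramidal Lucas-Kanade optical flow.
    pts2, status, _ = cv2.calcOpticalFlowPyrLK(gray1, gray2, pts1, None)
    good = status.ravel() == 1
    pts1, pts2 = pts1[good], pts2[good]

    # Ego-motion: essential matrix via the 5-point algorithm + RANSAC.
    E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                      prob=0.999, threshold=1.0)
    _, R, t, pose_mask = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)

    # Triangulate inlier correspondences (first camera at the origin).
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    m = pose_mask.ravel() > 0
    X_h = cv2.triangulatePoints(P1, P2,
                                pts1[m].reshape(-1, 2).T,
                                pts2[m].reshape(-1, 2).T)
    return (X_h[:3] / X_h[3]).T, R, t
```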