20 research outputs found

    Evaluation of 3D gradient filters for estimation of the surface orientation in CTC

    The extraction of gradient information from 3D surfaces plays an important role in many applications, including 3D graphics and medical imaging. The 3D gradient information is extracted by filtering the input data with high-pass filters that are typically implemented using 3×3×3 masks. Since these filters extract the gradient information from a small neighbourhood, the estimated gradient is very sensitive to image noise. The development of a 3D gradient operator that is robust to image noise is particularly important, since medical datasets are characterised by a relatively low signal-to-noise ratio. The aim of this paper is to detail the implementation of an optimised 3D gradient operator that is applied to sample the local curvature of the colon wall in CT data, and its influence on the overall performance of our CAD-CTC method. The developed 3D gradient operator has been applied to extract the local curvature of the colon wall in a large number of CT datasets captured at different radiation doses, and the experimental results are presented and discussed.
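    The 3×3×3 filtering step described above can be sketched with standard Sobel masks; this is a generic illustration, not the paper's optimised operator, and the `surface_normals` helper and test volume are assumptions.

```python
import numpy as np
from scipy import ndimage

def surface_normals(volume):
    """Per-voxel unit gradient vectors from 3x3x3 Sobel masks (z, y, x order)."""
    gz = ndimage.sobel(volume, axis=0, mode="nearest")
    gy = ndimage.sobel(volume, axis=1, mode="nearest")
    gx = ndimage.sobel(volume, axis=2, mode="nearest")
    g = np.stack([gz, gy, gx], axis=-1)
    mag = np.linalg.norm(g, axis=-1, keepdims=True)
    return g / np.maximum(mag, 1e-12)  # normalise to unit surface orientation

# A linear ramp f(z, y, x) = z + 2y has gradient direction (1, 2, 0) / sqrt(5).
vol = np.fromfunction(lambda z, y, x: z + 2.0 * y, (8, 8, 8))
n = surface_normals(vol)
```

    Larger smoothing masks (the robustness idea the paper pursues) trade noise sensitivity for spatial resolution.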

    One Object at a Time: Accurate and Robust Structure From Motion for Robots

    A gaze-fixating robot perceives the distance to the fixated object and the relative positions of surrounding objects immediately, accurately, and robustly. We show how fixation, the act of looking at one object while moving, exploits regularities in the geometry of 3D space to obtain this information. These regularities introduce rotation-translation couplings that are not commonly used in structure from motion. To validate, we use a Franka Emika Robot with an RGB camera. We (a) find that the error in the distance estimate is less than 5 mm at a distance of 15 cm, and (b) show how relative position can be used to find obstacles under challenging scenarios. We combine accurate distance estimates and obstacle information into a reactive robot behavior that is able to pick up objects of unknown size while impeded by unforeseen obstacles. Project page: https://oxidification.com/p/one-object-at-a-time/ . Accepted at the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

    Riesz pyramids for fast phase-based video magnification

    We present a new compact image pyramid representation, the Riesz pyramid, that can be used for real-time phase-based motion magnification. Our new representation is less overcomplete than even the smallest two-orientation, octave-bandwidth complex steerable pyramid, and can be implemented using compact, efficient linear filters in the spatial domain. Motion-magnified videos produced with this new representation are of comparable quality to those produced with the complex steerable pyramid. When used with phase-based video magnification, the Riesz pyramid phase-shifts image features along only their dominant orientation rather than every orientation like the complex steerable pyramid. (Supported by Quanta Computer, Shell Research, the National Science Foundation (CGV-1111415 and Graduate Research Fellowship Grant 1122374), a Microsoft Research PhD Fellowship, and the MIT Department of Mathematics.)
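    The approximate Riesz transform behind this representation can be sketched with a pair of small spatial derivative filters; the paper's actual filter taps differ, so `riesz_band`, the 3-tap kernel, and the test signal below are illustrative assumptions of the quadrature-pair idea.

```python
import numpy as np
from scipy import ndimage

def riesz_band(band):
    """Approximate Riesz pair (r1, r2) of one band-pass level, plus
    local amplitude, phase, and dominant orientation."""
    taps = [0.5, 0.0, -0.5]                       # assumed 3-tap derivative
    r1 = ndimage.convolve1d(band, taps, axis=1, mode="nearest")  # horizontal
    r2 = ndimage.convolve1d(band, taps, axis=0, mode="nearest")  # vertical
    amplitude = np.sqrt(band**2 + r1**2 + r2**2)
    phase = np.arctan2(np.sqrt(r1**2 + r2**2), band)
    orientation = np.arctan2(r2, r1)
    return amplitude, phase, orientation

# A purely horizontal sinusoid: the dominant orientation stays horizontal.
band = np.sin(0.5 * np.arange(32))[None, :].repeat(16, axis=0)
amplitude, phase, orientation = riesz_band(band)
```

    Magnification then amplifies temporal variations of `phase` along `orientation` only, which is what makes the representation cheaper than a full steerable pyramid.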

    Colour Constrained 4D Flow


    Removing outliers from the Lucas-Kanade method with a weighted median filter

    Master's thesis in Automation and Signal Processing. Optical flow is defined as the pattern of apparent motion of objects, surfaces, and edges in a visual scene. The technique is used in motion detection and segmentation, video compression, and robot navigation. The Lucas-Kanade method uses information from the image structure to compose a gradient-based solution that estimates velocities, i.e. movement in the X- and Y-directions of a scene. The goal is to obtain accurate pixel motion from an image sequence. The objective of this thesis is to add a post-processing step, a weighted median filter, to a well-known optical flow method, the Lucas-Kanade. The purpose of the weighted median filter is to remove outliers: vectors that are lost due to illumination changes and partial occlusions. The median filter replaces velocities that are under-represented in their neighbourhoods. A moving object has corners, not just edges, and these vectors have to be preserved; a weighted median filter is introduced to ensure that such under-represented vectors are preserved. Error is measured through angular and endpoint error, describing the accuracy of the vector field. Both the iterative and the hierarchical LK method have been studied. The iterative estimation struggles less with single errors; because of this, the weighted median filter did not improve the iterative LK method. The hierarchical estimation is improved by the weighted median filter, which reduced the average angular and endpoint error.
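    The pipeline the thesis describes, dense LK flow followed by a median-based outlier filter, can be sketched as below. A plain (unweighted) median stands in for the thesis's weighted median, and the window sizes and threshold are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def lucas_kanade(im1, im2, win=5):
    """Dense single-level LK: solve the 2x2 normal equations per pixel."""
    ix = ndimage.sobel(im1, axis=1, mode="nearest") / 8.0
    iy = ndimage.sobel(im1, axis=0, mode="nearest") / 8.0
    it = im2 - im1
    s = lambda a: ndimage.uniform_filter(a, size=win, mode="nearest")
    sxx, syy, sxy = s(ix * ix), s(iy * iy), s(ix * iy)
    sxt, syt = s(ix * it), s(iy * it)
    det = sxx * syy - sxy * sxy
    det = np.where(np.abs(det) < 1e-9, np.inf, det)  # flat regions -> zero flow
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v

def median_postfilter(u, v, size=5):
    """Outlier-removal pass; a plain median stands in for the weighted median."""
    return (ndimage.median_filter(u, size=size),
            ndimage.median_filter(v, size=size))

# Identical frames must yield zero flow; a lone spike is removed by the filter.
frame = np.random.default_rng(0).random((24, 24))
u, v = lucas_kanade(frame, frame)
```

    A weighted median would additionally weight each neighbour by, e.g., intensity or occlusion cues so that corner vectors survive the filtering.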

    Implementation and Qualitative/ Quantitative Comparison and Evaluation of Range Flow and Scene Flow

    This thesis investigates the use of 3D scene flow and 3D range flow as a means of computing 3D observer (sensor or camera) motion. We implemented and evaluated the scene flow algorithm presented in "Stereoscopic Scene Flow Computation for 3D Motion Understanding" by Wedel et al. (2010). We modified (adding pyramidal image processing with warping) and re-implemented the range flow algorithm presented in "Quantitative Regularized Range Flow" by Spies et al. (2000). Both are two-frame algorithms using a pyramid. The results of these scene and range flow algorithms were quantitatively and qualitatively compared on synthetic and real car-driving stereo sequences.

    Angular variation as a monocular cue for spatial perception

    Monocular cues are spatial sensory inputs picked up exclusively from one eye. They are mostly static features that provide depth information, and they are extensively used in graphic art to create realistic representations of a scene. Since the spatial information contained in these cues is picked up from the retinal image, a link between it and the theory of direct perception can reasonably be assumed. According to this theory, the spatial information of an environment is directly contained in the optic array; this assumption makes it possible to model visual perception processes through computational approaches. In this thesis, angular variation is considered as a monocular cue, and the concept of direct perception is adopted by a computer vision approach that treats it as a suitable principle from which innovative techniques for computing spatial information can be developed. The spatial information expected from this monocular cue is the position and orientation of an object with respect to the observer, which in computer vision is a well-known field of research called 2D-3D pose estimation. In this thesis, the attempt to establish angular variation as a monocular cue, and thus to achieve a computational approach to direct perception, is carried out through the development of a set of pose estimation methods. Starting from conventional strategies for the pose estimation problem, a first approach imposes constraint equations to relate object and image features. In this sense, two algorithms based on a simple line-rotation motion analysis were developed. These algorithms successfully provide pose information; however, they depend strongly on scene data conditions. To overcome this limitation, a second approach, inspired by the biological processes performed by the human visual system, was developed.
    It is based on the proper content of the image and defines a computational approach to direct perception. The set of developed algorithms analyses the visual properties provided by angular variations. The aim is to gather valuable data from which spatial information can be obtained and used to emulate a visual perception process by establishing a 2D-3D metric relation. Since this relation is considered fundamental to visual-motor coordination, and consequently essential for interacting with the environment, a significant cognitive effect is produced by applying the developed computational approach in technology-mediated environments. In this work, this cognitive effect is demonstrated by an experimental study in which a number of participants were asked to complete an action-perception task. The main purpose of the study was to analyse visually guided behaviour in teleoperation and the cognitive effect caused by the addition of 3D information. The results showed a significant influence of the 3D aid on skill improvement, together with an enhanced sense of presence.

    Multi-Frame-Rate Optical Flow with Reduced Memory Bandwidth

    Doctoral dissertation, Department of Electrical and Computer Engineering, Seoul National University, February 2015. Advisor: Suhwan Kim. With the recent rapid development of high-frame-rate cameras, 4K 1000 FPS cameras are already on the market and mobile phones support 1080p at 240 FPS. The increase in camera frame rate has significant implications for optical flow, because as the frame rate rises, the motion between frames shrinks. Various algorithms have been used to correct inaccurate optical flow for large motions, but the resulting increase in computation, or the processing time added by algorithmic dependencies, constrains real-time operation. As the camera frame rate rises, all motions shrink in inverse proportion, so high-frame-rate cameras open the way to accurate optical flow with simple algorithms. This thesis proposes a multi-frame-rate and multi-scale optical flow algorithm for accurate real-time optical flow. Built around a high-frame-rate camera, the algorithm requires no iterative calculation and is therefore well suited to a hardware implementation of real-time optical flow. The multi-frame-rate algorithm computes optical flow at several frame rates and uses the relationships between them to obtain results for fast as well as slow motion, extending the range of measurable motion. It reduces the growth of the system's computational load with frame rate from the O(n) of previous work to O(log n), greatly relaxing the constraint imposed by system performance. The multi-scale algorithm provides full flow density for high-frame-rate systems. This thesis also proposes spatial and temporal bandwidth-reduction algorithms to address the growth of external memory access bandwidth with frame rate: reordering the computations of the conventional LK optical flow algorithm and introducing an iterative sub-sampling scheme, a temporal Gaussian tail cut, and frame reuse, which together reduce the external memory access bandwidth of a high-frame-rate system. Finally, for the hardware implementation of the multi-scale structure, instead of the conventional convolution approach that uses m multipliers for an m×m window, a method is proposed that implements the m×m multiplication with only two multipliers regardless of window size. Based on this, hardware architectures for the multi-frame-rate and multi-scale algorithms are proposed, and the operation of the proposed architecture is verified through an FPGA implementation of single-level LK optical flow.
    Through these steps, the work progresses from the proposal of the multi-frame-rate and multi-scale optical flow algorithm for an accurate real-time optical flow system through to the verification of its architecture.
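    The per-pixel frame-rate selection described above can be sketched as follows; the power-of-two strides, the magnitude threshold, and the function name are illustrative assumptions, not the thesis's exact algorithm.

```python
import numpy as np

def select_frame_rate(flows, min_disp=0.5):
    """flows: (u, v) pairs at temporal strides 1, 2, 4, ... (fine to coarse).
    Keep, per pixel, the finest stride whose displacement is measurable
    (>= min_disp pixels), and normalise it to a per-frame velocity."""
    h, w = flows[0][0].shape
    u_out, v_out = np.zeros((h, w)), np.zeros((h, w))
    chosen = np.zeros((h, w), dtype=bool)
    for level, (u, v) in enumerate(flows):
        stride = 2 ** level
        take = (~chosen) & (np.hypot(u, v) >= min_disp)
        if level == len(flows) - 1:
            take = ~chosen  # coarsest stride accepts whatever remains
        u_out[take] = u[take] / stride
        v_out[take] = v[take] / stride
        chosen |= take
    return u_out, v_out

# One fast pixel is resolved at stride 1; slow pixels fall back to stride 4.
z = np.zeros((4, 4))
fine = np.full((4, 4), 0.1); fine[0, 0] = 0.9
flows = [(fine, z), (np.full((4, 4), 0.2), z), (np.full((4, 4), 0.8), z)]
u, v = select_frame_rate(flows)
```

    Because each coarser level only doubles the stride, the work grows logarithmically with the number of covered frame rates, which is the O(log n) behaviour the abstract claims.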

    Automatic colonic polyp detection using curvature analysis for standard and low dose CT data

    Colon cancer is the second leading cause of cancer-related deaths in developed nations. Early detection and removal of colorectal polyps via screening is the most effective way to reduce colorectal cancer (CRC) mortality. Computed Tomography Colonography (CTC), or Virtual Colonoscopy (VC), is a rapidly evolving non-invasive technique, and the medical community views this procedure as an alternative to standard colonoscopy for the detection of colonic polyps. In CTC, the first step for 3D visualisation of the colon structure and automatic polyp detection is the segmentation of the colon lumen. This segmentation is far from trivial, as in practice many datasets are collapsed due to incorrect patient preparation or blockages caused by residual water and materials left in the colon. In this thesis a robust multi-stage technique for automatic segmentation of the colon is proposed that maximally uses the anatomical model of a generic colon. In this regard, the colon is reconstructed using volume-by-length analysis, orientation, length, end points, geometrical position in the volumetric data, and the gradient of the centreline of each candidate air region detected in the CT data. The proposed method was validated using a total of 151 standard-dose (100mAs) and 13 low-dose (13mAs-40mAs) datasets; the collapsed-colon surface detection was always higher than 95%, with an average of 1.58% extra-colonic surface inclusion. The second major step of automated CTC is the identification of colorectal polyps. In this thesis a robust method for polyp detection based on surface curvature analysis has been developed and evaluated. The convexity of the segmented colon surface is sampled using surface normal intersection, the Hough transform, a 3D histogram, a Gaussian distribution, a convexity constraint, and 3D region growing. 
    For each polyp candidate surface, morphological and statistical features are extracted and the candidate surface is classified as a polyp or fold structure using a Feature Normalised Nearest Neighbourhood classifier. The devised polyp detection scheme entails a low computational overhead (typically 3.60 minutes per dataset) and shows 100% sensitivity for polyps larger than 10mm, 92% sensitivity for polyps in the range 5 to 10mm, and 64.28% sensitivity for polyps smaller than 5mm. The developed technique returns on average 4.01 false positives per dataset. Patient exposure to ionising radiation is the major concern in using CTC as a mass-screening technique for colonic polyp detection. A reduction of the radiation dose increases the level of noise during the acquisition process, and as a result the quality of the CT data is degraded. To fully investigate the effect of low-dose radiation on the performance of automated polyp detection, a phantom was developed and scanned at different radiation doses. The phantom polyps have realistic shapes (sessile, pedunculated, and flat) and sizes (3 to 20mm), and were designed to closely approximate the real polyps encountered in clinical CT data. Automatic polyp detection shows 100% sensitivity for polyps larger than 10mm and 95% sensitivity for polyps in the range 5 to 10mm. The developed method was applied to CT data acquired at radiation doses between 13 and 40mAs, and the experimental results indicate that robust polyp detection can be obtained even at radiation doses as low as 13mAs.
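    The classification step names a feature-normalised nearest-neighbour decision; a minimal sketch of that idea, per-feature z-score normalisation followed by a 1-NN vote, is below. The feature vectors and labels are invented for illustration.

```python
import numpy as np

def fnn_classify(train_x, train_y, query):
    """1-NN after per-feature z-score normalisation (feature-normalised NN)."""
    mu = train_x.mean(axis=0)
    sd = train_x.std(axis=0) + 1e-12       # avoid division by zero
    dist = np.linalg.norm((train_x - mu) / sd - (query - mu) / sd, axis=1)
    return train_y[np.argmin(dist)]

# Invented morphological/statistical feature vectors for the two classes.
feats = np.array([[0.2, 1.0], [0.3, 1.1], [2.0, 8.0], [2.2, 7.5]])
labels = np.array(["fold", "fold", "polyp", "polyp"])
pred = fnn_classify(feats, labels, np.array([2.1, 7.8]))
```

    Normalising each feature first keeps large-magnitude features (e.g. surface area) from dominating the distance over small-magnitude ones (e.g. curvature statistics).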