Evaluation of 3D gradient filters for estimation of the surface orientation in CTC
The extraction of gradient information from 3D surfaces plays an important role in many applications, including 3D graphics and medical imaging. The 3D gradient information is extracted by filtering the input data with high-pass filters that are typically implemented using 3×3×3 masks. Since these filters extract the gradient information in a small neighborhood, the estimated gradient is very sensitive to image noise. The development of a 3D gradient operator that is robust to image noise is particularly important, since medical datasets are characterized by a relatively low signal-to-noise ratio. The aim of this paper is to detail the implementation of an optimized 3D gradient operator that is applied to sample the local curvature of the colon wall in CT data, and to evaluate its influence on the overall performance of our CAD-CTC method. The developed 3D gradient operator has been applied to extract the local curvature of the colon wall in a large number of CT datasets captured at different radiation doses, and the experimental results are presented and discussed.
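Such 3×3×3 gradient masks are commonly built as separable, Sobel-like filters: a central-difference kernel along the derivative axis combined with smoothing along the other two. The sketch below shows that generic construction (an illustration of the idea, not the paper's optimized operator):

```python
import numpy as np

def sobel_gradient_3d(vol):
    """Estimate the 3D gradient with separable 3x3x3 Sobel-like masks.

    Each component applies a derivative kernel [-1, 0, 1] along one axis
    and a smoothing kernel [1, 2, 1] along the other two — the standard
    way such masks are assembled (a generic sketch, not the paper's
    noise-optimized operator)."""
    d = np.array([-1.0, 0.0, 1.0])   # central-difference part
    s = np.array([1.0, 2.0, 1.0])    # smoothing part

    def filt1d(a, k, axis):
        # 1D convolution applied independently along one axis
        return np.apply_along_axis(
            lambda m: np.convolve(m, k, mode='same'), axis, a)

    grads = []
    for axis in range(3):
        g = vol.astype(float)
        for ax in range(3):
            g = filt1d(g, d if ax == axis else s, ax)
        grads.append(g)
    return grads  # gradient components ordered by axis

# A linear ramp along axis 0 has a constant interior gradient there
vol = np.tile(np.arange(8.0)[:, None, None], (1, 8, 8))
gz, gy, gx = sobel_gradient_3d(vol)
```

The smoothing factor of the mask (here 4 per smoothed axis) is what trades noise sensitivity against locality, which is the tension the paper addresses.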
One Object at a Time: Accurate and Robust Structure From Motion for Robots
A gaze-fixating robot perceives distance to the fixated object and relative
positions of surrounding objects immediately, accurately, and robustly. We show
how fixation, which is the act of looking at one object while moving, exploits
regularities in the geometry of 3D space to obtain this information. These
regularities introduce rotation-translation couplings that are not commonly
used in structure from motion. To validate, we use a Franka Emika Robot with an
RGB camera. We (a) find that the error in the distance estimate is less than
5 mm at a distance of 15 cm, and (b) show how relative position can be used
to find obstacles in challenging scenarios. We combine accurate distance
estimates and obstacle information into a reactive robot behavior that picks
up objects of unknown size while impeded by unforeseen obstacles. Project
page: https://oxidification.com/p/one-object-at-a-time/ . Accepted at the
2022 IEEE/RSJ International Conference on Intelligent Robots and Systems
(IROS).
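The rotation-translation coupling behind fixation can be illustrated with a toy calculation (the numbers and the pinhole setup are illustrative assumptions, not the paper's experiment): a camera that translates sideways while keeping an object centred must rotate, and the required rotation encodes the distance.

```python
import numpy as np

# A camera translating sideways by dx while fixating an object at
# distance d must rotate by dtheta = atan(dx / d) to keep it centred.
# Inverting that coupling recovers the distance.
d_true = 0.15                      # object 15 cm away (illustrative)
dx = 0.01                          # 1 cm lateral camera step (illustrative)
dtheta = np.arctan2(dx, d_true)    # fixation rotation that re-centres it

d_exact = dx / np.tan(dtheta)      # exact inversion of the coupling
d_small = dx / dtheta              # small-angle approximation
```

Even the small-angle estimate lands well within a few tenths of a millimetre here, consistent in spirit with the millimetre-scale accuracy the abstract reports.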
Riesz pyramids for fast phase-based video magnification
We present a new compact image pyramid representation, the Riesz pyramid, that can be used for real-time phase-based motion magnification. Our new representation is less overcomplete than even the smallest two-orientation, octave-bandwidth complex steerable pyramid, and can be implemented using compact, efficient linear filters in the spatial domain. Motion-magnified videos produced with this new representation are of comparable quality to those produced with the complex steerable pyramid. When used with phase-based video magnification, the Riesz pyramid phase-shifts image features along only their dominant orientation, rather than along every orientation as the complex steerable pyramid does.
Funding: Quanta Computer; Shell Research; National Science Foundation (U.S.) (CGV-1111415); Microsoft Research (PhD Fellowship); Massachusetts Institute of Technology, Department of Mathematics; National Science Foundation (U.S.) Graduate Research Fellowship (Grant 1122374).
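As a sketch of the idea, the Riesz pair of a band-passed image can be approximated with a three-tap derivative filter applied along each axis; local amplitude and phase then follow from the quaternionic combination. This is an illustrative approximation, not the paper's exact filter design:

```python
import numpy as np

def riesz_pair(band):
    """Approximate Riesz transform of a band-passed image using the
    three-tap filter [0.5, 0, -0.5] along each axis (an illustrative
    approximation of the compact spatial-domain filters)."""
    k = np.array([0.5, 0.0, -0.5])
    r1 = np.apply_along_axis(lambda m: np.convolve(m, k, 'same'), 1, band)
    r2 = np.apply_along_axis(lambda m: np.convolve(m, k, 'same'), 0, band)
    return r1, r2

x = np.arange(32)
band = np.cos(0.5 * x)[None, :] * np.ones((32, 1))  # horizontal wave
r1, r2 = riesz_pair(band)

# local amplitude and phase; phase shifts along the dominant
# orientation are what the magnification manipulates
amplitude = np.sqrt(band**2 + r1**2 + r2**2)
phase = np.arctan2(np.hypot(r1, r2), band)
```

For this purely horizontal pattern the vertical Riesz component vanishes in the interior, which is the sense in which the representation isolates the dominant orientation.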
Removing outliers from the Lucas-Kanade method with a weighted median filter
Master's thesis in Automation and Signal Processing
Optical flow is defined as the apparent motion of brightness patterns of objects, surfaces, and edges in a visual scene. The technique is used in motion detection and segmentation, video compression, and robot navigation.
The Lucas-Kanade method uses information from the image structure to compose a gradient-based solution that estimates velocities, i.e., motion in the X- and Y-directions of a scene. The goal is to obtain accurate pixel motion from an image sequence.
The objective of this thesis is to add a post-processing step, a weighted median filter, to a well-known optical flow method, Lucas-Kanade. The purpose of the weighted median filter is to remove outliers: vectors that are lost due to illumination changes and partial occlusions.
The median filter replaces velocities that are under-represented in their neighbourhoods. A moving object has corners, not just edges, and these vectors have to be preserved. A weighted median filter is introduced to ensure that the under-represented vectors are preserved. Error is measured through angular and endpoint error, describing the accuracy of the vector field.
Both the iterative and the hierarchical LK methods have been studied. The iterative estimation struggles less with single errors; because of this, the weighted median filter did not improve the iterative LK method. The hierarchical estimation is improved by the weighted median filter, which reduced the average angular and endpoint error.
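A weighted median over each vector's neighbourhood can be sketched as follows. This is a minimal illustration with an assumed centre-weighted 3×3 mask, not the thesis's exact weighting scheme:

```python
import numpy as np

def weighted_median(values, weights):
    """Value at which the cumulative weight first reaches half the total."""
    order = np.argsort(values)
    v, w = values[order], weights[order]
    c = np.cumsum(w)
    return v[np.searchsorted(c, 0.5 * c[-1])]

def wmedian_flow(u, weights):
    """Replace each interior flow component by the weighted median of its
    3x3 neighbourhood — a sketch of the outlier-removal post-processing
    step (the mask below is an assumed example)."""
    out = u.copy()
    H, W = u.shape
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            patch = u[i - 1:i + 2, j - 1:j + 2].ravel()
            out[i, j] = weighted_median(patch, weights.ravel())
    return out

w = np.array([[1, 1, 1], [1, 2, 1], [1, 1, 1]], float)  # centre-weighted
u = np.full((5, 5), 0.5)     # one flow component of a smooth field
u[2, 2] = 10.0               # an outlier vector (occlusion/illumination)
u_filtered = wmedian_flow(u, w)
```

The weighting is what lets legitimate but under-represented vectors (e.g. at corners) survive where a plain median would vote them out; here the lone outlier still loses because its weight is small relative to the neighbourhood total.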
Implementation and Qualitative/Quantitative Comparison and Evaluation of Range Flow and Scene Flow
This thesis investigates the use of 3D scene flow and 3D range flow as a means of computing 3D observer (sensor or camera) motion. We implemented and evaluated the scene flow algorithm presented in "Stereoscopic Scene Flow Computation for 3D Motion Understanding" by Wedel et al. (2010). We modified (adding pyramidal image processing with warping) and re-implemented the range flow algorithm presented in "Quantitative Regularized Range Flow" by Spies et al. (2000). Both are 2-frame algorithms using a pyramid. The results of the scene and range flow algorithms were quantitatively and qualitatively compared on synthetic and real car-driving stereo sequences.
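The core of range flow estimation in the style of Spies et al. is the range flow motion constraint Zx·U + Zy·V − W + Zt = 0, solved in least squares over a patch. A minimal single-patch sketch, without the regularization, warping, or pyramid of the actual algorithms:

```python
import numpy as np

def range_flow_patch(Zx, Zy, Zt):
    """Least-squares (U, V, W) from the range flow motion constraint
    Zx*U + Zy*V - W + Zt = 0 stacked over one patch (a single-patch
    sketch in the spirit of Spies et al., not their full method)."""
    A = np.stack([Zx.ravel(), Zy.ravel(), -np.ones(Zx.size)], axis=1)
    sol, *_ = np.linalg.lstsq(A, -Zt.ravel(), rcond=None)
    return sol  # (U, V, W)

# synthetic patch: random depth gradients and a known 3D motion
rng = np.random.default_rng(0)
Zx = rng.normal(size=(7, 7))
Zy = rng.normal(size=(7, 7))
true_uvw = np.array([1.0, 2.0, 0.3])
Zt = -(Zx * true_uvw[0] + Zy * true_uvw[1] - true_uvw[2])

uvw = range_flow_patch(Zx, Zy, Zt)
```

As with Lucas-Kanade, the patch must contain enough depth-gradient variation for the 3×3 normal system to be well conditioned; flat regions suffer the same aperture problem.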
Angular variation as a monocular cue for spatial perception
Monocular cues are spatial sensory inputs picked up from one eye alone. They are mostly static features that provide depth information and are extensively used in graphic art to create realistic representations of a scene. Since the spatial information contained in these cues is picked up from the retinal image, a link between them and the theory of direct perception can reasonably be assumed. According to this theory, the spatial information of an environment is directly contained in the optic array. This assumption makes it possible to model visual perception processes through computational approaches.
In this thesis, angular variation is considered as a monocular cue, and the concept of direct perception is adopted by a computer vision approach that treats it as a suitable principle from which innovative techniques for calculating spatial information can be developed.
The spatial information expected from this monocular cue is the position and orientation of an object with respect to the observer, which in computer vision is a well-known field of research called 2D-3D pose estimation. In this thesis, the attempt to establish angular variation as a monocular cue, and thus to achieve a computational approach to direct perception, is carried out through the development of a set of pose estimation methods. Starting from conventional strategies for solving the pose estimation problem, a first approach imposes constraint equations to relate object and image features. In this context, two algorithms based on the analysis of a simple line rotation motion were developed. These algorithms successfully provide pose information; however, they depend strongly on scene conditions. To overcome this limitation, a second approach, inspired by the biological processes performed by the human visual system, was developed. It is based on the content of the image itself and defines a computational approach to direct perception.
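The cue itself, that a line's projected angle varies with its 3D rotation, can be illustrated with a minimal pinhole projection. All numbers and the rotation setup below are illustrative assumptions, not the thesis's algorithms:

```python
import numpy as np

def project(p, f=1.0):
    """Pinhole projection of a 3D point (camera at the origin, focal f)."""
    return f * p[:2] / p[2]

def image_angle(a3, b3):
    """Orientation of the projected line segment in the image plane."""
    d = project(b3) - project(a3)
    return np.arctan2(d[1], d[0])

def segment(theta, c=np.array([0.0, 0.0, 5.0])):
    # a unit segment through c whose direction rotates about the y axis
    direction = np.array([np.cos(theta), 0.3, np.sin(theta)])
    return c - direction, c + direction

ang0 = image_angle(*segment(0.0))   # image angle before rotation
ang1 = image_angle(*segment(0.5))   # image angle after a 0.5 rad rotation
```

The out-of-plane rotation changes the segment's image angle even though the camera never sees depth directly; that measurable angular variation is the monocular information the pose estimation methods exploit.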
The set of developed algorithms analyzes the visual properties provided by angular variations. The aim is to gather data from which spatial information can be obtained and used to emulate a visual perception process by establishing a 2D-3D metric relation. Since this relation is considered fundamental to visual-motor coordination, and consequently essential for interacting with the environment, a significant cognitive effect is produced when the developed computational approach is applied in technology-mediated environments. In this work, this cognitive effect is demonstrated by an experimental study in which a number of participants were asked to complete an action-perception task. The main purpose of the study was to analyze visually guided behaviour in teleoperation and the cognitive effect caused by the addition of 3D information. The results showed a significant influence of the 3D aid on skill improvement and an enhanced sense of presence.
Multi-Frame-Rate Optical Flow with Reduced Memory Bandwidth
Doctoral dissertation, Department of Electrical and Computer Engineering, Seoul National University Graduate School, February 2015. Suhwan Kim. With the recent rapid development of high-frame-rate cameras, 4K 1000FPS cameras are already on the market and mobile phones support 1080P at 240FPS. The increase in camera frame rate has significant implications for the implementation of optical flow, because inter-frame motion shrinks as the frame rate rises. Various algorithms have been used to cope with inaccurate optical flow under large motion, but the resulting increase in computation, or the longer run times caused by algorithmic dependencies, constrains real-time operation. As the camera frame rate increases, all motion becomes proportionally smaller, so high-frame-rate cameras open the way to accurate optical flow with simple algorithms.
This thesis proposes a multi-frame-rate and multi-scale optical flow algorithm for accurate real-time optical flow. Designed for hardware implementation, the algorithm requires no iterative calculation. The multi-frame-rate algorithm computes optical flow at several frame rates and exploits the relations between them to extend the measurable motion range, producing results for fast as well as slow motion. It reduces the growth of system computation with frame rate from the O(n) of previous work to O(log n), greatly relaxing the system-performance constraint. The multi-scale algorithm provides full-density flow for high-frame-rate systems.
To address the growth of external memory access bandwidth with frame rate, the thesis also proposes spatial and temporal bandwidth reduction algorithms: reordering the computation of the conventional LK optical flow algorithm, an iterative sub-sampling scheme, a temporal Gaussian tail cut, and frame reuse.
Finally, for the hardware implementation of the multi-scale structure, a method is proposed that performs the multiplication for an m×m window with only two multipliers, regardless of window size, instead of the conventional convolution approach using m multipliers. Based on this, hardware architectures for the multi-frame-rate and multi-scale algorithms are proposed, and the operation of the proposed architecture is verified through an FPGA implementation of single-level LK optical flow. Altogether, the work covers the multi-frame-rate and multi-scale optical flow system from algorithm proposal to architecture verification.
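The per-pixel frame-rate selection can be sketched as choosing, from flows computed at several frame strides, the largest stride whose displacement magnitude still falls where single-level LK is reliable. This is a toy stand-in for the thesis's magnitude-based selection; the threshold value is an assumption:

```python
import numpy as np

def select_stride(mags_by_stride, hi=1.0):
    """Per-pixel frame-rate (stride) selection: prefer the largest frame
    stride whose displacement magnitude stays below `hi` pixels, where
    single-level LK remains reliable. A toy stand-in for the thesis's
    magnitude-based selection; `hi` is an assumed threshold."""
    strides = sorted(mags_by_stride)              # ascending
    shape = mags_by_stride[strides[0]].shape
    choice = np.full(shape, strides[0])           # fall back to smallest
    for s in strides[1:]:                         # larger strides override
        ok = mags_by_stride[s] <= hi
        choice[ok] = s
    return choice

# motion of 0.3 px per base frame → displacement scales with the stride
mags = {s: np.full((4, 4), 0.3 * s) for s in (1, 2, 4)}
stride_map = select_stride(mags)
```

Fast motion thus stays at high frame rates (small strides, small displacements), while slow motion is measured over longer temporal baselines, which is the intuition behind combining multiple frame rates.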
Automatic colonic polyp detection using curvature analysis for standard and low dose CT data
Colon cancer is the second leading cause of cancer-related deaths in the developed nations. Early detection and removal of colorectal polyps via screening is the most effective way to reduce colorectal cancer (CRC) mortality. Computed Tomography Colonography (CTC), or Virtual Colonoscopy (VC), is a rapidly evolving non-invasive technique, and the medical community views this procedure as an alternative to standard colonoscopy for the detection of colonic polyps. In CTC, the first step towards 3D visualization of the colon structure and automatic polyp detection is the segmentation of the colon lumen. This segmentation is far from a trivial task, as in practice many datasets are collapsed due to incorrect patient preparation or blockages caused by residual water/materials left in the colon. In this thesis a robust multi-stage technique for automatic segmentation of the colon is proposed that maximally uses the anatomical model of a generic colon. In this regard, the colon is reconstructed using volume-by-length analysis, orientation, length, end points, geometrical position in the volumetric data, and the gradient of the centreline of each candidate air region detected in the CT data. The proposed method was validated using a total of 151 standard-dose (100mAs) and 13 low-dose (13mAs-40mAs) datasets; the collapsed colon surface detection was always higher than 95%, with an average of 1.58% extra-colonic surface inclusion.
The second major step of automated CTC is the identification of colorectal polyps. In this thesis a robust method for polyp detection based on surface curvature analysis has been developed and evaluated. The convexity of the segmented colon surface is sampled using surface normal intersection, the Hough transform, a 3D histogram, a Gaussian distribution, a convexity constraint, and 3D region growing. For each polyp candidate surface, morphological and statistical features are extracted and the candidate surface is classified as a polyp/fold structure using a Feature Normalized Nearest Neighbourhood classifier. The devised polyp detection scheme entails a low computational overhead (typically 3.60 minutes per dataset) and shows 100% sensitivity for polyps larger than 10mm, 92% sensitivity for polyps in the range 5 to 10mm, and 64.28% sensitivity for polyps smaller than 5mm. The developed technique returns on average 4.01 false positives per dataset.
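The surface-normal intersection idea can be illustrated with a simple voting scheme: the inward normals of a convex cap converge, so accumulating votes along them produces a peak near the cap's centre. The code below is a sketch on synthetic data, not the thesis's full pipeline:

```python
import numpy as np

def normal_vote(points, normals, shape, depth):
    """Accumulate votes along inward surface normals; convex blobs such
    as polyps make the normals converge, so vote peaks mark candidate
    centres (a sketch of normal-intersection voting)."""
    acc = np.zeros(shape, int)
    for p, n in zip(points, normals):
        for t in range(1, depth + 1):
            q = np.round(p + t * n).astype(int)
            if np.all(q >= 0) and np.all(q < np.array(shape)):
                acc[tuple(q)] += 1
    return acc

# synthetic "polyp": voxels on a radius-5 sphere with inward normals
rng = np.random.default_rng(1)
dirs = rng.normal(size=(200, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
centre = np.array([10.0, 10.0, 10.0])
points = centre + 5.0 * dirs
normals = -dirs

votes = normal_vote(points, normals, (21, 21, 21), depth=5)
peak = np.unravel_index(votes.argmax(), votes.shape)
```

Flat colon wall and elongated folds spread their votes out instead of concentrating them, which is one way convex polyp candidates can be separated from fold structures before feature extraction.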
Patient exposure to ionising radiation is the major concern in using CTC as a mass screening technique for colonic polyp detection. A reduction of the radiation dose increases the level of noise during the acquisition process, and as a result the quality of the CT data is degraded. To fully investigate the effect of low-dose radiation on the performance of automated polyp detection, a phantom has been developed and scanned using different radiation doses. The phantom polyps have realistic shapes (sessile, pedunculated, and flat) and sizes (3 to 20mm) and were designed to closely approximate the real polyps encountered in clinical CT data. Automatic polyp detection shows 100% sensitivity for polyps larger than 10mm and 95% sensitivity for polyps in the range 5 to 10mm. The developed method was applied to CT data acquired at radiation doses between 13 and 40mAs, and the experimental results indicate that robust polyp detection can be obtained even at radiation doses as low as 13mAs.