
    Study of Computational Image Matching Techniques: Improving Our View of Biomedical Image Data

    Image matching techniques have proven essential in many fields of science and engineering, with many new methods and applications introduced over the years. In this PhD thesis, several computational image matching methods are introduced and investigated for improving the analysis of various biomedical image data. These improvements include the use of matching techniques for enhancing the visualization of cross-sectional imaging modalities such as Computed Tomography (CT) and Magnetic Resonance Imaging (MRI), denoising of retinal Optical Coherence Tomography (OCT), and high-quality 3D reconstruction of surfaces from Scanning Electron Microscope (SEM) images. This work substantially improves the interpretation of image data, with far-reaching consequences for basic science research. The thesis starts with a general statement of the image matching problem, followed by an overview of the topics covered. It then introduces and investigates several applications of image matching/registration in biomedical image processing: a) registration-based slice interpolation, b) fast mesh-based deformable image registration, and c) simultaneous rigid registration and Robust Principal Component Analysis (RPCA) for speckle-noise reduction of retinal OCT images. Moving towards a different notion of image matching/correspondence, the problem of view synthesis and 3D reconstruction is considered next, with a focus on 3D reconstruction of microscopic samples from 2D images captured by SEM. Starting from sparse feature-based matching, an extensive analysis is provided of several well-known feature detector/descriptor techniques, namely ORB, BRIEF, SURF and SIFT, for the problem of multi-view 3D reconstruction. This chapter contains qualitative and quantitative comparisons that reveal the shortcomings of sparse feature-based techniques. A novel framework using sparse-dense matching/correspondence is then introduced for high-quality 3D reconstruction from SEM images. As will be shown, the proposed framework yields better reconstructions than state-of-the-art sparse feature-based techniques. Even though the proposed framework produces satisfactory results, there is room for improvement, particularly when dealing with more complex microscopic samples and with large displacements between corresponding points in the micrographs. Therefore, building on this framework, a new approach is proposed for high-quality 3D reconstruction of microscopic samples. While the two proposed techniques perform comparably on simpler samples, the new technique produces more faithful reconstructions of highly complex samples. The thesis concludes with a summary and pointers to future research directions using both multi-view and photometric techniques for 3D reconstruction of SEM images.
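
    The sparse feature-based stage of such a pipeline can be illustrated with a short, hedged sketch using OpenCV; the image file names and parameters below are placeholders, and the thesis's own implementation may differ in detector settings and matching strategy.

```python
# Minimal sketch of sparse feature-based matching between two micrographs,
# in the spirit of the detector/descriptor comparison described above.
# File names and parameters are illustrative, not taken from the thesis.
import cv2

img1 = cv2.imread("sem_view_1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("sem_view_2.png", cv2.IMREAD_GRAYSCALE)

# ORB is used here; SIFT, SURF or BRIEF could be swapped in for the comparison.
detector = cv2.ORB_create(nfeatures=5000)
kp1, des1 = detector.detectAndCompute(img1, None)
kp2, des2 = detector.detectAndCompute(img2, None)

# Brute-force Hamming matching with Lowe's ratio test to keep distinctive matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
matches = matcher.knnMatch(des1, des2, k=2)
good = []
for pair in matches:
    if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
        good.append(pair[0])

print(f"{len(good)} putative correspondences available for 3D reconstruction")
```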

    Disambiguating Multi–Modal Scene Representations Using Perceptual Grouping Constraints

    In its early stages, the visual system suffers from considerable ambiguity and noise that severely limit the performance of early vision algorithms. This article presents feedback mechanisms between early visual processes, such as perceptual grouping, stereopsis and depth reconstruction, that allow the system to reduce this ambiguity and improve the early representation of visual information. In the first part, the article proposes a local perceptual grouping algorithm that, in addition to commonly used geometric information, makes use of a novel multi-modal measure between local edge/line features. The grouping information is then used to: 1) disambiguate stereopsis by enforcing that stereo matches preserve groups; and 2) correct the reconstruction error due to image pixel sampling using linear interpolation over the groups. The integration of mutual feedback between early vision processes is shown to considerably reduce ambiguity and noise without the need for global constraints.
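
    As a toy illustration (not the authors' algorithm) of how grouping constraints can disambiguate stereopsis, the sketch below rejects matches whose disparity disagrees with the consensus of their perceptual group; the group labels and threshold are made up for the example.

```python
# Toy sketch: stereo matches should preserve perceptual groups, so matches whose
# left-image features belong to the same group but map to widely different
# disparities are treated as ambiguous and discarded.
from collections import defaultdict

def filter_matches_by_group(matches, group_of, max_spread=2.0):
    """matches: list of (left_feature_id, disparity); group_of: feature_id -> group_id."""
    by_group = defaultdict(list)
    for fid, disp in matches:
        by_group[group_of[fid]].append((fid, disp))

    kept = []
    for group, members in by_group.items():
        disparities = sorted(d for _, d in members)
        median = disparities[len(disparities) // 2]
        # Keep only matches consistent with the group's dominant disparity.
        kept.extend((fid, d) for fid, d in members if abs(d - median) <= max_spread)
    return kept

# Example: feature 3 violates the group consensus and is rejected.
matches = [(1, 10.1), (2, 10.4), (3, 25.0), (4, 9.8)]
group_of = {1: "edgeA", 2: "edgeA", 3: "edgeA", 4: "edgeA"}
print(filter_matches_by_group(matches, group_of))
```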

    Automatic visual detection of human behavior: a review from 2000 to 2014

    Driven by advances in information technology (e.g., digital video cameras, ubiquitous sensors), the automatic detection of human behaviors from video has recently become an active research topic. In this paper, we perform a systematic literature review of this topic covering the period from 2000 to 2014, based on a selection of 193 papers retrieved from six major scientific publishers. The selected papers were classified into three main subjects: detection techniques, datasets and applications. The detection techniques were divided into four categories (initialization, tracking, pose estimation and recognition). The list of datasets includes eight examples (e.g., Hollywood action). Finally, several application areas were identified, including human detection, abnormal activity detection, action recognition, player modeling and pedestrian detection. Our analysis provides a road map to guide future research on designing automatic visual human behavior detection systems. This work is funded by the Portuguese Foundation for Science and Technology (FCT - Fundação para a Ciência e a Tecnologia) under research Grant SFRH/BD/84939/2012.

    Enhancing low-level features with mid-level cues

    Local features have become an essential tool in visual recognition. Much of the progress in computer vision over the past decade has built on simple, local representations such as SIFT or HOG. SIFT in particular shifted the paradigm in feature representation. Subsequent works have often focused on improving either computational efficiency or invariance properties. This thesis belongs to the latter group. Invariance is a particularly relevant aspect if we intend to work with dense features. The traditional approach to sparse matching is to rely on stable interest points, such as corners, where scale and orientation can be reliably estimated, enforcing invariance; dense features, in contrast, must be computed at arbitrary points. Dense features have been shown to outperform sparse matching techniques in many recognition problems, and form the bulk of our work. In this thesis we present strategies to enhance low-level, local features with mid-level, global cues. We devise techniques to construct better features, and use them to handle complex ambiguities, occlusions and background changes. To deal with ambiguities, we explore the use of motion to enforce temporal consistency with optical flow priors. We also introduce a novel technique to exploit segmentation cues, and use it to extract features invariant to background variability. For this, we downplay image measurements most likely to belong to a region different from that where the descriptor is computed. In both cases we follow the same strategy: we incorporate mid-level, "big picture" information into the construction of local features, and proceed to use them in the same manner as we would the baseline features. We apply these techniques to different feature representations, including SIFT and HOG, and use them to address canonical vision problems such as stereo and object detection, demonstrating that the introduction of global cues yields consistent improvements. We prioritize solutions that are simple, general, and efficient. Our main contributions are as follows: (a) an approach to dense stereo reconstruction with spatiotemporal features, which unlike existing works remains applicable to wide baselines; (b) a technique to exploit segmentation cues to construct dense descriptors invariant to background variability, such as occlusions or background motion; and (c) a technique to integrate bottom-up segmentation with recognition efficiently, amenable to sliding-window detectors.
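
    The idea of downplaying image measurements that likely belong to a different region can be sketched, purely illustratively, as soft-masking gradient magnitudes with a segmentation mask before histogramming; this is not the thesis's implementation, and the patch, mask and bin count below are arbitrary.

```python
# Illustrative sketch: weight gradient magnitudes by a segmentation mask before
# building a HOG-style orientation histogram, so measurements from pixels that
# likely belong to a different region contribute less to the descriptor.
import numpy as np

def masked_gradient_histogram(gray, region_mask, n_bins=9):
    """gray: 2D float array; region_mask: 2D array in [0, 1], 1 inside the region."""
    gy, gx = np.gradient(gray.astype(np.float64))
    magnitude = np.hypot(gx, gy) * region_mask          # downweight "foreign" pixels
    orientation = np.mod(np.arctan2(gy, gx), np.pi)     # unsigned gradient orientation

    bins = np.floor(orientation / np.pi * n_bins).astype(int) % n_bins
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=n_bins)
    return hist / (np.linalg.norm(hist) + 1e-12)

# Example with a random patch and a mask that suppresses the right half.
patch = np.random.rand(16, 16)
mask = np.ones((16, 16))
mask[:, 8:] = 0.1
print(masked_gradient_histogram(patch, mask))
```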

    Automatic Spatiotemporal Analysis of Cardiac Image Series

    Cardiovascular disease continues to be the leading cause of death in North America. In adult and, alarmingly, ever younger populations, the so-called obesity epidemic, largely driven by lifestyle factors that include poor diet, lack of exercise and smoking, incurs enormous stresses on the healthcare system. The primary cause of serious morbidity and mortality for these patients is atherosclerosis, the build-up of plaque inside high-pressure vessels such as the coronary arteries. These lesions can lead to ischemic disease and may progress to blood flow blockage or thrombosis, often with infarction or other severe consequences. Besides the stenosis-related outcomes, the arterial walls of plaque-ridden regions show increased stiffness, which may worsen patient prognosis. In pediatric populations, the most prevalent acquired cardiovascular pathology is Kawasaki disease. This acute vasculitis may affect the structural integrity of coronary artery walls and progress to aneurysmal lesions, which can hinder blood flow hemodynamics, lead to inadequate downstream perfusion, and activate thrombus formation, all of which may lead to a precarious prognosis. Diagnosis of these two prominent coronary artery diseases is traditionally performed using fluoroscopic angiography. Several hundred serial x-ray projections are acquired during selective arterial infusion of a radiodense contrast agent, which reveals the vessels' luminal area and possible pathological lesions. The acquired series contain highly dynamic information on voluntary and involuntary patient movement: respiration, organ displacement and heartbeat, for example. Current clinical analysis is largely limited to a single angiographic image on which geometrical measures are performed manually or semi-automatically by a radiological technician. Although widely used around the world and generally considered the gold-standard diagnostic tool for many vascular diseases, this imaging modality's two-dimensional nature limits the geometric characterization of pathological regions. Indeed, the 3D structure of stenotic or aneurysmal lesions may not be fully appreciated in 2D because their observable features depend on the angular configuration of the imaging gantry. Furthermore, the presence of lesions in the coronary arteries may not reflect the true health of the myocardium, as natural compensatory mechanisms may obviate the need for further intervention. In light of this, cardiac magnetic resonance perfusion imaging is gaining increasing attention and clinical adoption, as it offers a direct assessment of myocardial tissue viability following infarction or suspected coronary artery disease. This modality is, however, plagued by motion similar to that present in fluoroscopic imaging, which forces clinicians into laborious manual intervention to align anatomical structures in sequential perfusion frames, thus hindering automation.
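
    The frame-alignment problem mentioned above can be approached generically, for example with OpenCV's ECC-based rigid registration; the sketch below is not the method developed in the thesis, and the file pattern, motion model and convergence settings are assumptions.

```python
# Illustrative sketch of rigid alignment of sequential perfusion frames to a
# reference frame using OpenCV's ECC maximisation (a generic approach, not the
# thesis method). File names are placeholders; assumes OpenCV 4.x.
import glob
import cv2
import numpy as np

def align_to_reference(reference, moving):
    """Estimate a Euclidean (rotation + translation) warp aligning `moving` to `reference`."""
    warp = np.eye(2, 3, dtype=np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 200, 1e-6)
    _, warp = cv2.findTransformECC(reference, moving, warp,
                                   cv2.MOTION_EUCLIDEAN, criteria, None, 5)
    h, w = reference.shape
    return cv2.warpAffine(moving, warp, (w, h),
                          flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)

frames = [cv2.imread(p, cv2.IMREAD_GRAYSCALE).astype(np.float32)
          for p in sorted(glob.glob("perfusion_*.png"))]
aligned = [frames[0]] + [align_to_reference(frames[0], f) for f in frames[1:]]
```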

    Video foreground extraction for mobile camera platforms

    Foreground object detection is a fundamental task in computer vision with many applications in areas such as object tracking, event identification, and behavior analysis. Most conventional foreground object detection methods work only in stable illumination environments using fixed cameras. In real-world applications, however, the algorithm often needs to operate under challenging conditions: drastic lighting changes, object shape complexity, moving cameras, low frame capture rates, and low resolution images. This thesis presents four novel approaches for foreground object detection on real-world datasets using cameras deployed on moving vehicles. The first problem addresses passenger detection and tracking for public transport buses, investigating the challenges of changing illumination conditions and low frame capture rates. Our approach integrates a stable SIFT (Scale Invariant Feature Transform) background seat modelling method with a human shape model into a weighted Bayesian framework to detect passengers. To deal with the problem of tracking multiple targets, we employ the Reversible Jump Markov Chain Monte Carlo tracking algorithm. Using an SVM classifier, appearance transformation models capture changes in the appearance of foreground objects across two consecutive frames under low frame rate conditions. In the second problem, we present a system for pedestrian detection in scenes captured by a mobile bus surveillance system. It integrates scene localization, foreground-background separation, and pedestrian detection modules into a unified detection framework. The scene localization module performs a two-stage clustering of the video data. In the first stage, SIFT homography is applied to cluster frames in terms of their structural similarity, and the second stage further clusters these aligned frames according to consistency in illumination. This produces clusters of images that are consistent in viewpoint and lighting. A kernel density estimation (KDE) technique for colour and gradient is then used to construct background models for each image cluster, which are further used to detect candidate foreground pixels. Finally, pedestrians are detected using a hierarchical template matching approach. In addition to the second problem, we present three direct pedestrian detection methods that extend the HOG (Histogram of Oriented Gradients) technique (Dalal and Triggs, 2005) and provide a comparative evaluation of these approaches. The three approaches are: a) a new histogram feature formed by the weighted sum of both the gradient magnitude and the filter responses from a set of elongated Gaussian filters (Leung and Malik, 2001) corresponding to the quantised orientation, which we refer to as the Histogram of Oriented Gradient Banks (HOGB) approach; b) the codebook-based HOG feature with a branch-and-bound (efficient subwindow search) algorithm (Lampert et al., 2008); and c) the codebook-based HOGB approach. In the third problem, a unified framework that combines 3D and 2D background modelling is proposed to detect scene changes using a camera mounted on a moving vehicle. The 3D scene is first reconstructed from a set of videos taken at different times. The 3D background modelling identifies inconsistent scene structures as foreground objects. For the 2D approach, foreground objects are detected using the spatio-temporal MRF algorithm. Finally, the 3D and 2D results are combined using morphological operations. The significance of this research is that it provides basic frameworks for automatic large-scale mobile surveillance applications and facilitates many higher-level applications such as object tracking and behaviour analysis.
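
    As a hedged baseline illustration of the HOG detector referenced above (Dalal and Triggs, 2005), OpenCV ships a default people detector; the video path below is a placeholder, and this baseline does not include the HOGB or codebook extensions proposed in the thesis.

```python
# Minimal sketch of HOG-based pedestrian detection with OpenCV's default
# people detector, used here only as a baseline illustration.
import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

cap = cv2.VideoCapture("bus_camera.avi")  # placeholder path
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Multi-scale sliding-window detection; returns bounding boxes and scores.
    boxes, weights = hog.detectMultiScale(frame, winStride=(8, 8), scale=1.05)
    for (x, y, w, h) in boxes:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("pedestrians", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```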

    Optical flow estimation using steered-L1 norm

    Motion is a very important cue for understanding the visual environment. In image processing, motion estimation involves computing the displacements of image points across an image sequence. In this context, dense optical flow estimation is concerned with the computation of per-pixel displacements in a sequence of images, and it has therefore been used widely in image processing and computer vision. A great deal of research has been dedicated to enabling accurate and fast motion computation in image sequences. Despite recent advances, there is still room for improvement, and optical flow algorithms still suffer from several issues, such as motion discontinuities, occlusion handling, and robustness to illumination changes. This thesis investigates the topic of optical flow and its applications; it addresses several issues in the computation of dense optical flow and proposes solutions. Specifically, the thesis is divided into two main parts, each addressing one main area of interest in optical flow. In the first part, image registration using optical flow is investigated. Both local and global optical flow methods have been used for image registration. An image registration approach based on an improved version of the combined local-global method of optical flow computation is proposed. A bilateral filter is used in this optical flow method to improve its edge-preserving performance. It is shown that image registration via this method gives more robust results than the local and global optical flow methods previously investigated. The second part of this thesis encompasses the main contribution of this research, an improved total variation L1 norm. A smoothness term is used in the optical flow energy function to regularise it. The L1 norm is a plausible choice for such a term because of its edge-preserving performance; however, this term is isotropic and hence decreases the penalisation near motion boundaries in all directions. The proposed improved L1 smoothness term (termed here the steered-L1 norm) gives similar performance across motion boundaries but improves the penalisation along such boundaries.
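
    For reference, a schematic form of the classical TV-L1 optical flow energy and of an anisotropic ("steered") smoothness term is given below; the notation (images I_0 and I_1, flow u = (u_1, u_2), weight lambda, directional weights w_par and w_perp, boundary tangent t and normal n) is assumed here and may differ from the exact formulation in the thesis.

```latex
% Schematic TV-L1 optical flow energy: L1 data term plus L1 smoothness term.
E(\mathbf{u}) \;=\; \int_\Omega
    \lambda\,\bigl|I_1(\mathbf{x}+\mathbf{u}(\mathbf{x})) - I_0(\mathbf{x})\bigr|
    \;+\; \bigl(\lVert \nabla u_1 \rVert_1 + \lVert \nabla u_2 \rVert_1\bigr)\,
    \mathrm{d}\mathbf{x}

% A "steered" (anisotropic) variant weights the penalty differently along the
% boundary tangent t and normal n directions, so smoothness can be enforced
% along motion boundaries while being relaxed across them.
R(\mathbf{u}) \;=\; \sum_{k=1}^{2} \int_\Omega
    w_{\parallel}\,\bigl|\mathbf{t}^\top \nabla u_k\bigr|
    \;+\; w_{\perp}\,\bigl|\mathbf{n}^\top \nabla u_k\bigr|\,
    \mathrm{d}\mathbf{x}
```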