    Multi-scale lines and edges in V1 and beyond: brightness, object categorization and recognition, and consciousness

    In this paper we present an improved model for line and edge detection in cortical area V1. This model is based on responses of simple and complex cells, and it is multi-scale with no free parameters. We illustrate the use of the multi-scale line/edge representation in different processes: visual reconstruction or brightness perception, automatic scale selection and object segregation. A two-level object categorization scenario is tested in which pre-categorization is based on coarse scales only and final categorization on coarse plus fine scales. We also present a multi-scale object and face recognition model. Processing schemes are discussed in the framework of a complete cortical architecture. The fact that brightness perception and object recognition may be based on the same symbolic image representation is an indication that the entire (visual) cortex is involved in consciousness.

    Doctor of Philosophy

    Congenital heart defects are classes of birth defects that affect the structure and function of the heart. These defects are attributed to the abnormal or incomplete development of a fetal heart during the first few weeks following conception. The overall detection rate of congenital heart defects during routine prenatal examination is low. This is attributed to the insufficient number of trained personnel in many local health centers where many cases of congenital heart defects go undetected. This dissertation presents a system that identifies congenital heart defects, with the aim of increasing their detection rate and improving pregnancy outcomes. The system was developed and its performance assessed in identifying the presence of ventricular defects (congenital heart defects that affect the size of the ventricles) using four-dimensional fetal echocardiographic images. The designed system consists of three components: 1) a fetal heart location estimation component, 2) a fetal heart chamber segmentation component, and 3) a detection component that detects congenital heart defects from the segmented chambers. The location estimation component is used to isolate a fetal heart in any four-dimensional fetal echocardiographic image. It uses a hybrid region of interest extraction method that is robust to the speckle noise degradation inherent in all ultrasound images. The location estimation method's performance was analyzed on 130 four-dimensional fetal echocardiographic images by comparison with manually identified fetal heart regions of interest. The location estimation method showed good agreement with the manually identified standard on four quantitative indexes: the Jaccard index, the Sørensen-Dice index, the sensitivity index and the specificity index. The average values of these indexes were 80.70%, 89.19%, 91.04%, and 99.17%, respectively. 
The fetal heart chamber segmentation component uses velocity vector field estimates computed on frames contained in a four-dimensional image to identify the fetal heart chambers. The velocity vector fields are computed using a histogram-based optical flow technique, which is formulated on local image characteristics to reduce the effect of speckle noise and nonuniform echogenicity on the velocity vector field estimates. Features based on the velocity vector field estimates, voxel brightness/intensity values, and voxel Cartesian coordinate positions were extracted and used with the kernel k-means algorithm to identify the individual chambers. The segmentation method's performance was evaluated on 130 images from 31 patients by comparing the segmentation results with manually identified fetal heart chambers. Evaluation was based on the Sørensen-Dice index, the absolute volume difference and the Hausdorff distance, with per-patient average values of 69.92%, 22.08%, and 2.82 mm, respectively. The detection component uses the volumes of the identified fetal heart chambers to flag the possible occurrence of hypoplastic left heart syndrome, a type of congenital heart defect. An empirical volume threshold defined on the relative ratio of adjacent fetal heart chamber volumes obtained manually is used in the detection process. The performance of the detection procedure was assessed by comparison with a set of images with confirmed diagnosis of hypoplastic left heart syndrome and a control group of normal fetal hearts. Of the 130 images considered, 18 of 20 (90%) fetal hearts were correctly detected as having hypoplastic left heart syndrome and 84 of 110 (76.36%) fetal hearts were correctly detected as normal in the control group. The results show that the detection system performs better than the overall detection rate for congenital heart defects, which is reported to be between 30% and 60%.
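The four agreement indexes named in this abstract can be illustrated with a minimal sketch. It assumes the automatic and manual regions are available as binary masks of equal shape; the function names `overlap_metrics` and `percent` are hypothetical and are not taken from the dissertation, which does not describe its implementation here.

```python
import numpy as np


def overlap_metrics(pred, truth):
    """Agreement between a predicted and a reference binary mask.

    Both inputs are array-likes of equal shape; nonzero means "inside the region".
    Returns NaN for an index whose denominator is zero.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    if pred.shape != truth.shape:
        raise ValueError("masks must have the same shape")

    tp = np.count_nonzero(pred & truth)    # voxels in both regions
    fp = np.count_nonzero(pred & ~truth)   # predicted only
    fn = np.count_nonzero(~pred & truth)   # reference only
    tn = np.count_nonzero(~pred & ~truth)  # in neither region

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "jaccard": ratio(tp, tp + fp + fn),
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


def percent(correct, total):
    """Fraction of correctly classified cases, as a percentage with two decimals."""
    return round(100 * correct / total, 2)
```

Applied to the counts quoted in the abstract, `percent(18, 20)` gives 90.0 and `percent(84, 110)` gives 76.36, which agrees with the reported detection figures.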

    Multispectral Image Road Extraction Based Upon Automated Map Conflation

    Road network extraction from remotely sensed imagery enables many important and diverse applications such as vehicle tracking, drone navigation, and intelligent transportation studies. There are, however, a number of challenges to road detection from an image. Road pavement material, width, direction, and topology vary across a scene. Complete or partial occlusions caused by nearby buildings, trees, and the shadows cast by them, make maintaining road connectivity difficult. The problems posed by occlusions are exacerbated with the increasing use of oblique imagery from aerial and satellite platforms. Further, common objects such as rooftops and parking lots are made of materials similar or identical to road pavements. This problem of common materials is a classic case of a single land cover material existing for different land use scenarios. This work addresses these problems in road extraction from geo-referenced imagery by leveraging the OpenStreetMap digital road map to guide image-based road extraction. This crowd-sourced map has the advantage of worldwide coverage that is constantly updated. The derived road vectors follow only roads and so can serve to guide image-based road extraction with minimal confusion from occlusions and changes in road material. On the other hand, the vector road map has no information on road widths, and misalignments between the vector map and the geo-referenced image are small but nonsystematic. Properly correcting misalignment between two geospatial datasets, also known as map conflation, is an essential step. A generic framework requiring minimal human intervention is described for multispectral image road extraction and automatic road map conflation. The approach relies on two road features generated from the image: a binary road mask and a corresponding curvilinear image. A method for generating the binary road mask from the image by applying a spectral measure is presented. 
The spectral measure, called anisotropy-tunable distance (ATD), differs from conventional measures and is created to account for both changes of spectral direction and spectral magnitude in a unified fashion. The ATD measure is particularly suitable for differentiating urban targets such as roads and building rooftops. The curvilinear image provides estimates of the width and orientation of potential road segments. Road vectors derived from OpenStreetMap are then conflated to image road features by applying junction matching and intermediate point matching, followed by refinement with mean-shift clustering and morphological processing to produce a road mask with piecewise width estimates. The proposed approach is tested on a set of challenging, large, and diverse image data sets and the performance accuracy is assessed. The method is effective for road detection and road width estimation, even in challenging scenarios where extensive occlusion occurs.

    Faraday Rotation of Extended Emission as a Probe of the Large-Scale Galactic Magnetic Field

    The Galactic magnetic field is an integral constituent of the interstellar medium (ISM), and knowledge of its structure is crucial to understanding Galactic dynamics. The Rotation Measures (RM) of extragalactic (EG) sources have been the basis of comprehensive Galactic magnetic field models. Polarised extended emission (XE) is also seen along lines of sight through the Galactic disk, and also displays the effects of Faraday rotation. Our aim is to investigate and understand the relationship between EG and XE RMs near the Galactic plane, and to determine how the XE RMs, a hitherto unused resource, can be used as a probe of the large-scale Galactic magnetic field. We used polarisation data from the Canadian Galactic Plane Survey (CGPS), observed near 1420 MHz with the Dominion Radio Astrophysical Observatory (DRAO) Synthesis Telescope. We calculated RMs from a linear fit to the polarisation angles as a function of wavelength squared in four frequency channels, for both the EG sources and the XE. Across the CGPS area, $55^{\circ} < \ell < 193^{\circ}$, $-3^{\circ} < b < 5^{\circ}$, the RMs of the XE closely track the RMs of the EG sources, with XE RMs about half the value of EG-source RMs. The exceptions are places where large local HII complexes heavily depolarise more distant emission. We conclude that there is valuable information in the XE RM dataset. The factor of 2 between the two types of RM values is close to that expected from a Burn slab model of the ISM. This result indicates that, at least in the outer Galaxy, the EG and XE sources are likely probing similar depths, and that the Faraday rotating medium and the synchrotron emitting medium have similar variation with galactocentric distance. Comment: Accepted to Galaxies, March 22, 201
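The RM calculation described in this abstract, a linear fit of polarisation angle against wavelength squared, can be sketched as follows. This is only an illustration under stated assumptions: the function name `fit_rotation_measure` is hypothetical, the four channel frequencies in the demo are made-up values near 1420 MHz rather than the actual CGPS channels, and the sketch ignores the n·π ambiguity of polarisation angles, so it assumes the angles are already unwrapped.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def fit_rotation_measure(freqs_hz, angles_rad):
    """Least-squares fit of chi = chi0 + RM * lambda^2.

    freqs_hz   : channel centre frequencies in Hz
    angles_rad : polarisation angles in radians (assumed unwrapped)
    Returns (RM in rad/m^2, chi0 in rad).
    """
    lam2 = (C / np.asarray(freqs_hz, dtype=float)) ** 2  # wavelength squared, m^2
    rm, chi0 = np.polyfit(lam2, np.asarray(angles_rad, dtype=float), 1)
    return rm, chi0


if __name__ == "__main__":
    # Hypothetical channel frequencies, chosen only for illustration.
    freqs = np.array([1407e6, 1414e6, 1427e6, 1434e6])
    true_rm, true_chi0 = -100.0, 0.3
    angles = true_chi0 + true_rm * (C / freqs) ** 2  # noise-free synthetic angles
    print(fit_rotation_measure(freqs, angles))
```

With only four closely spaced channels the span in wavelength squared is small, so in practice the fitted RM is sensitive to noise in the angles; the noise-free demo simply confirms that the fit recovers the slope it was given.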

    Combinatorial Solutions for Shape Optimization in Computer Vision

    This thesis aims at solving so-called shape optimization problems, i.e. problems where the shape of some real-world entity is sought, by applying combinatorial algorithms. I present several advances in this field, all of them based on energy minimization. The addressed problems will become more intricate in the course of the thesis, starting from problems that are solved globally, then turning to problems where so far no global solutions are known. The first two chapters treat segmentation problems where the considered grouping criterion is directly derived from the image data. That is, the respective data terms do not involve any parameters to estimate. These problems will be solved globally. The first of these chapters treats the problem of unsupervised image segmentation where apart from the image there is no other user input. Here I will focus on a contour-based method and show how to integrate curvature regularity into a ratio-based optimization framework. The arising optimization problem is reduced to optimizing over the cycles in a product graph. This problem can be solved globally in polynomial, effectively linear time. As a consequence, the method does not depend on initialization and translational invariance is achieved. This is joint work with Daniel Cremers and Simon Masnou. I will then proceed to the integration of shape knowledge into the framework, while keeping translational invariance. This problem is again reduced to cycle-finding in a product graph. Being based on the alignment of shape points, the method actually uses a more sophisticated shape measure than most local approaches and still provides global optima. It readily extends to tracking problems and allows some of them to be solved in real time. I will present an extension to highly deformable shape models which can be included in the global optimization framework. This method simultaneously allows a shape to be decomposed into a set of deformable parts, based only on the input images. 
This is joint work with Daniel Cremers. In the second part segmentation is combined with so-called correspondence problems, i.e. the underlying grouping criterion is now based on correspondences that have to be inferred simultaneously. That is, in addition to inferring the shapes of objects, one now also tries to put into correspondence the points in several images. The arising problems become more intricate and are no longer optimized globally. This part is divided into two chapters. The first chapter treats the topic of real-time motion segmentation, where objects are identified based on the observation that their points in the video move coherently. Rather than pre-estimating motion, a single energy functional is minimized via alternating optimization. The main novelty lies in the real-time capability, which is achieved by exploiting a fast combinatorial segmentation algorithm. The results are furthermore improved by employing a probabilistic data term. This is joint work with Daniel Cremers. The final chapter presents a method for high resolution motion layer decomposition and was developed in collaboration with Daniel Cremers and Thomas Pock. Layer decomposition methods support the notion of a scene model, which makes it possible to model occlusion and enforce temporal consistency. The contributions are twofold: from a practical point of view the proposed method allows fine-detailed layer images to be recovered by minimizing a single energy. This is achieved by integrating a super-resolution method into the layer decomposition framework. From a theoretical viewpoint the proposed method introduces layer-based regularity terms as well as a graph cut-based scheme to solve for the layer domains. The latter is combined with powerful continuous convex optimization techniques into an alternating minimization scheme. 
Lastly, I want to mention that a significant part of this thesis is devoted to the recent trend of exploiting parallel architectures, in particular graphics cards: many combinatorial algorithms are easily parallelized. In Chapter 3 we will see a case where the standard algorithm is hard to parallelize, but easy for the respective problem instances.

    Brain MRI Segmentation using Template-Based Training and Visual Perception Augmentation

    Deep learning models usually require sufficient training data to achieve high accuracy, but obtaining labeled data can be time-consuming and labor-intensive. Here we introduce a template-based training method to train a 3D U-Net model from scratch using only one population-averaged brain MRI template and its associated segmentation label. The process incorporated visual perception augmentation to enhance the model's robustness in handling diverse image inputs and to mitigate overfitting. Leveraging this approach, we trained 3D U-Net models for mouse, rat, marmoset, rhesus, and human brain MRI to perform segmentation tasks such as skull-stripping, brain segmentation, and tissue probability mapping. This tool effectively addresses the limited availability of training data and holds significant potential for expanding deep learning applications in image analysis, providing researchers with a unified solution to train deep neural networks with only one image sample.