2,118 research outputs found

    A Neural Model of How the Brain Computes Heading from Optic Flow in Realistic Scenes

    Animals avoid obstacles and approach goals in novel cluttered environments using visual information, notably optic flow, to compute heading, or direction of travel, with respect to objects in the environment. We present a neural model of how heading is computed that describes interactions among neurons in several visual areas of the primate magnocellular pathway, from retina through V1, MT+, and MSTd. The model produces outputs that are qualitatively and quantitatively similar to human heading estimation data in response to complex natural scenes. The model estimates heading to within 1.5° in random-dot or photo-realistically rendered scenes and within 3° in video streams from driving in real-world environments. Simulated rotations of less than 1° per second do not affect model performance, but faster simulated rotation rates degrade performance, as in humans. The model is part of a larger navigational system that identifies and tracks objects while navigating in cluttered environments.
    Funding: National Science Foundation (SBE-0354378, BCS-0235398); Office of Naval Research (N00014-01-1-0624); National Geospatial-Intelligence Agency (NMA201-01-1-2016).
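The model above is neural, but the geometric core of the task can be illustrated with a much simpler baseline: under pure observer translation, every optic-flow vector points radially away from the focus of expansion (FOE), which coincides with the heading point, so the FOE can be recovered by least squares. The scene layout, flow magnitudes, and random seed below are illustrative assumptions, not part of the model described in the abstract.

```python
import numpy as np

# Toy geometry: under pure forward translation, the flow vector at image point p
# is proportional to (p - FOE) / depth, so every vector points away from the
# focus of expansion (FOE), which marks the heading point on the image plane.
rng = np.random.default_rng(0)
foe = np.array([0.2, -0.1])                      # hypothetical true heading point
pts = rng.uniform(-1.0, 1.0, size=(200, 2))      # sampled image locations
depth = rng.uniform(1.0, 5.0, size=200)          # arbitrary scene depths
flow = (pts - foe) / depth[:, None]              # radial flow field

# Each vector v at point p satisfies v_y*(x - fx) - v_x*(y - fy) = 0,
# giving one linear equation per point in the unknown FOE (fx, fy).
A = np.stack([flow[:, 1], -flow[:, 0]], axis=1)
b = flow[:, 1] * pts[:, 0] - flow[:, 0] * pts[:, 1]
est, *_ = np.linalg.lstsq(A, b, rcond=None)      # recovered heading point
```

With observer rotation added, the flow field is no longer radial and this simple estimator breaks down, which mirrors the rotation sensitivity reported in the abstract.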

    Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation

    This paper proposes a new hybrid architecture that consists of a deep Convolutional Network and a Markov Random Field. We show how this architecture is successfully applied to the challenging problem of articulated human pose estimation in monocular images. The architecture can exploit structural domain constraints, such as geometric relationships between body joint locations. We show that joint training of these two model paradigms improves performance and allows us to significantly outperform existing state-of-the-art techniques.
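As a rough illustration of the idea (not the paper's actual architecture or training procedure), the spatial-model step can be viewed as combining a part detector's unary heatmap with a message from a neighbouring joint, formed by convolving that joint's heatmap with a displacement prior. All sizes, joint positions, and the isotropic prior below are invented for this sketch.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Normalized 2D Gaussian, standing in for a learned displacement prior."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def conv2d_same(img, kernel):
    """Naive same-size 2D correlation (no dependencies beyond NumPy)."""
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

# Hypothetical unary heatmaps (in the paper these come from the ConvNet).
H = np.zeros((32, 32)); H[10, 10] = 1.0                    # shoulder evidence
E = np.zeros((32, 32)); E[18, 12] = 0.6; E[5, 25] = 0.9    # elbow: true hit + stronger false positive
prior = gaussian_kernel(17, 3.0)                           # shoulder->elbow displacement prior

# Message passing: wherever shoulders are likely, nearby elbow locations
# become more plausible; the distant false positive receives no support.
msg = conv2d_same(H, prior)
refined = E * (msg + 1e-6)
best = tuple(int(i) for i in np.unravel_index(np.argmax(refined), refined.shape))
```

The false positive at (5, 25) has the higher unary score, but the spatial message suppresses it in favour of the anatomically plausible candidate at (18, 12).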

    Quantitative Techniques for PET/CT: A Clinical Assessment of the Impact of PSF and TOF

    Tomographic reconstruction has been a challenge for many imaging applications, and it is particularly problematic for count-limited modalities such as Positron Emission Tomography (PET). Recent advances in PET, including the incorporation of time-of-flight (TOF) information and modeling of the variation of the point response across the imaging field (PSF), have resulted in significant improvements in image quality. While the effects of these techniques have been characterized with simulations and mathematical modeling, there has been relatively little work investigating their potential impact in the clinical setting. The objective of this work is to quantify the impact of these techniques in the context of realistic lesion detection and localization tasks in a clinical environment. Mathematical observers are used first to identify optimal reconstruction parameters and then to evaluate the performance of the reconstructions. The reconstruction algorithms are then evaluated for various patient sizes and imaging conditions. The findings for the mathematical observers are compared to, and validated by, the performance of three experienced nuclear medicine physicians completing the same task.
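One widely used mathematical observer for exactly this kind of lesion-detection task is the channelized Hotelling observer (CHO); the sketch below shows the basic computation on synthetic data. The lesion profile, white-noise background, and crude radial channels are assumptions for illustration only, not the observer model or data used in this work.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 16                                            # image side length (pixels)
yy, xx = np.mgrid[0:n, 0:n]
# Hypothetical lesion: a small Gaussian bump at the image centre.
signal = 1.5 * np.exp(-((xx - n // 2)**2 + (yy - n // 2)**2) / 4.0)

def sample(with_signal, trials=500):
    """Signal-absent / signal-present images with white noise (a crude stand-in
    for reconstructed PET noise, which is in reality correlated)."""
    g = rng.normal(0.0, 1.0, size=(trials, n * n))
    if with_signal:
        g += signal.ravel()
    return g

# Simplified frequency-like channels: three radial bands around the lesion site.
r = np.hypot(xx - n // 2, yy - n // 2).ravel()
channels = np.stack(
    [((lo <= r) & (r < hi)) for lo, hi in [(0, 2), (2, 4), (4, 8)]]
).astype(float).T                                 # (pixels, channels)

v0 = sample(False) @ channels                     # channel outputs, no lesion
v1 = sample(True) @ channels                      # channel outputs, with lesion
S = 0.5 * (np.cov(v0.T) + np.cov(v1.T))           # pooled channel covariance
dv = v1.mean(axis=0) - v0.mean(axis=0)            # mean channel-output difference
w = np.linalg.solve(S, dv)                        # Hotelling template
snr = float(np.sqrt(dv @ w))                      # detectability: sqrt(dv' S^-1 dv)
```

Higher SNR means the observer separates lesion-present from lesion-absent images more reliably; comparing SNR across reconstruction settings is how such observers rank PSF/TOF options before human readers are involved.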

    Breast Tomosynthesis: Aspects on detection and perception of simulated lesions

    The aim of this thesis was to investigate aspects of the detectability of simulated lesions (microcalcifications and masses) in digital mammography (DM) and breast tomosynthesis (BT). Perception in BT image volumes was also investigated by evaluating certain reading conditions. The first study concerned the effect of system noise on the detection of masses and microcalcification clusters in DM images using a free-response task. System noise has an impact on image quality and is related to the dose level. It was found to have a substantial impact on the detection of microcalcification clusters, whereas masses were relatively unaffected. The effect of superimposed tissue in DM is the major limitation hampering the detection of masses. BT is a three-dimensional technique that reduces the effect of superimposed tissue. In the following two studies, visibility was quantified for both imaging modalities in terms of the required contrast at a fixed detection performance (92% correct decisions). Contrast-detail plots for lesions with sizes of 0.2, 1, 3, 8 and 25 mm were generated. The first study involved only the in-plane BT slice in which the lesion centre appeared. The second study repeated the same procedure in BT image volumes for 3D-distributed microcalcification clusters and 8 mm masses at two dose levels. Both studies showed that BT needs substantially less contrast than DM for lesions above 1 mm. Furthermore, the contrast threshold increased as the lesion size increased for both modalities. This is in accordance with the reduced effect of superimposed tissue in BT. For 0.2 mm lesions, substantially more contrast was needed. At equal dose, DM was better than BT for 0.2 mm lesions and microcalcification clusters. Doubling the dose substantially improved detection in BT. Thus, system noise has a substantial impact on detection.
The final study evaluated reading conditions for BT image volumes. Four viewing procedures were assessed: free scroll browsing alone, or combined with an initial cine loop at frame rates of 9, 14 or 25 fps. The volumes were viewed on a wide-screen monitor placed in a vertical or horizontal position. A free-response task and eye tracking were used to record detection performance, analysis time, visual attention and search strategies. Reading conditions improved for horizontally aligned BT image volumes when using free scroll browsing alone or combined with a cine loop at the fastest frame rate.

    3D Object Representations for Recognition.

    Object recognition from images is a longstanding and challenging problem in computer vision. The main challenge is that the appearance of objects in images is affected by a number of factors, such as illumination, scale, camera viewpoint, intra-class variability, occlusion, truncation, and so on. How to handle all these factors in object recognition is still an open problem. In this dissertation, I present my efforts in building 3D object representations for object recognition. Compared to 2D appearance-based object representations, 3D object representations can capture the 3D nature of objects and better handle viewpoint variation, occlusion and truncation in object recognition. I introduce three new 3D object representations: the 3D aspect part representation, the 3D aspectlet representation and the 3D voxel pattern representation. These representations are built to handle different challenging factors in object recognition. The 3D aspect part representation is able to capture the appearance change of object categories due to viewpoint transformation. The 3D aspectlet representation and the 3D voxel pattern representation are designed to handle occlusions between objects in addition to viewpoint change. Based on these representations, we propose new object recognition methods and conduct experiments on benchmark datasets to verify the advantages of our methods. Furthermore, we introduce PASCAL3D+, a new large-scale dataset for 3D object recognition built by aligning objects in images with 3D CAD models. We also propose two novel methods to tackle object co-detection and multiview object tracking using our 3D aspect part representation, and a novel Convolutional Neural Network-based approach for object detection using our 3D voxel pattern representation. In order to track multiple objects in videos, we introduce a new online multi-object tracking framework based on Markov Decision Processes.
Lastly, I conclude the dissertation and discuss future steps for 3D object recognition.
PhD, Electrical Engineering: Systems, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/120836/1/yuxiang_1.pd
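The Markov Decision Process view of online tracking can be caricatured as a per-track state machine. In the dissertation the transitions are decided by learned policies over appearance and motion features; the toy sketch below instead uses a fixed missed-detection counter, and all state names and thresholds are illustrative assumptions.

```python
class Track:
    """Toy per-object track life-cycle: tracked -> lost -> inactive.

    A real MDP tracker learns *when* to make these transitions; here a
    simple counter of consecutive missed detections stands in for the policy.
    """

    def __init__(self):
        self.state = "tracked"
        self.missed = 0

    def step(self, detected, max_missed=3):
        if self.state == "inactive":
            return self.state          # terminated tracks stay terminated
        if detected:
            self.state, self.missed = "tracked", 0
        else:
            self.missed += 1
            # a lost track is kept alive briefly in case the object reappears
            self.state = "lost" if self.missed < max_missed else "inactive"
        return self.state
```

Keeping a "lost" state between "tracked" and "inactive" is what lets an online tracker survive short occlusions without spawning a new identity for the same object.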