
    CONFIGR: A Vision-Based Model for Long-Range Figure Completion

    CONFIGR (CONtour FIgure GRound) is a computational model based on principles of biological vision that completes sparse and noisy image figures. Within an integrated vision/recognition system, CONFIGR posits an initial recognition stage which identifies figure pixels from spatially local input information. The resulting, and typically incomplete, figure is fed back to the “early vision” stage for long-range completion via filling-in. The reconstructed image is then re-presented to the recognition system for global functions such as object recognition. In the CONFIGR algorithm, the smallest independent image unit is the visible pixel, whose size defines a computational spatial scale. Once pixel size is fixed, the entire algorithm is fully determined, with no additional parameter choices. Multi-scale simulations illustrate the vision/recognition system. Open-source CONFIGR code is available online, but all examples can be derived analytically, and the design principles applied at each step are transparent. The model balances filling-in as figure against complementary filling-in as ground, which blocks spurious figure completions. Lobe computations occur on a subpixel spatial scale. Originally designed to fill in missing contours in an incomplete image such as a dashed line, the same CONFIGR system connects and segments sparse dots, and unifies occluded objects from pieces locally identified as figure in the initial recognition stage. The model self-scales its completion distances, filling-in across gaps of any length, where unimpeded, while limiting connections among dense image-figure pixel groups that already have intrinsic form. Long-range image completion promises to play an important role in adaptive processors that reconstruct images from highly compressed video and still camera images.
    Air Force Office of Scientific Research (F49620-01-1-0423); National Geospatial-Intelligence Agency (NMA 201-01-1-0216); National Science Foundation (SBE-0354378); Office of Naval Research (N000014-01-1-0624)

    Anatomical curve identification

    Methods for capturing images in three dimensions are now widely available, with stereo-photogrammetry and laser scanning being two common approaches. In anatomical studies, a number of landmarks are usually identified manually from each of these images and these form the basis of subsequent statistical analysis. However, landmarks express only a very small proportion of the information available from the images. Anatomically defined curves have the advantage of providing a much richer expression of shape. This is explored in the context of identifying the boundary of breasts from an image of the female torso and the boundary of the lips from a facial image. The curves of interest are characterised by ridges or valleys. Key issues in estimation are the ability to navigate across the anatomical surface in three dimensions, the ability to recognise the relevant boundary and the need to assess the evidence for the presence of the surface feature of interest. The first issue is addressed by the use of principal curves, as an extension of principal components, the second by suitable assessment of curvature and the third by change-point detection. P-spline smoothing is used as an integral part of the methods but adaptations are made to the specific anatomical features of interest. After estimation of the boundary curves, the intermediate surfaces of the anatomical feature of interest can be characterised by surface interpolation. This allows shape variation to be explored using standard methods such as principal components. These tools are applied to a collection of images of women where one breast has been reconstructed after mastectomy and where interest lies in shape differences between the reconstructed and unreconstructed breasts. They are also applied to a collection of lip images where possible differences in shape between males and females are of interest.
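The abstract above names change-point detection as the tool for assessing whether a surface feature is present. As a rough illustration only, and not the authors' method, the sketch below finds a single shift in the mean of a 1D profile by least squares. The function name `single_change_point` and the idea of applying it to curvature values sampled along a path are assumptions made for this example.

```python
def single_change_point(values):
    """Return the split index k (1 <= k < n) that minimises the summed
    squared deviation of values[:k] and values[k:] from their own means.

    Hypothetical helper: a generic least-squares mean-shift detector,
    not the procedure used in the paper.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    best_k, best_cost = None, float("inf")
    for k in range(1, n):
        left, right = values[:k], values[k:]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        # Cost of modelling each side by its own constant mean.
        cost = (sum((v - mean_l) ** 2 for v in left)
                + sum((v - mean_r) ** 2 for v in right))
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


# Made-up curvature profile: flat at first, then a clear jump.
profile = [0.1, 0.0, 0.2, 0.1, 1.1, 0.9, 1.0]
split = single_change_point(profile)
```

In this toy profile the split falls where the values jump from about 0.1 to about 1.0. A real analysis would also need a test of whether the shift is larger than noise alone would produce.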

    Evaluating Visual Realism in Drawing Areas of Interest on UML Diagrams

    Areas of interest (AOIs) are defined as an addition to UML diagrams: groups of elements of system architecture diagrams that share some common property. Some methods have been proposed to automatically draw AOIs on UML diagrams. However, it is not clear how users perceive the results of such methods as compared to human-drawn areas of interest. We present here a process of studying and improving the perceived quality of computer-drawn AOIs. We qualitatively evaluated how users perceive the quality of computer- and human-drawn AOIs, and used these results to improve an existing algorithm for drawing AOIs. Finally, we designed a quantitative comparison for AOI drawings and used it to show that our improved renderings are closer to human drawings than the original rendering algorithm results. The combined user evaluation, algorithmic improvements, and quantitative comparison support our claim of improving the perceived quality of AOIs rendered on UML diagrams.

    Adaptive Binning of X-ray data with Weighted Voronoi Tesselations

    We present a technique to adaptively bin sparse X-ray data using weighted Voronoi tesselations (WVTs). WVT binning is a generalisation of Cappellari & Copin's (2001) Voronoi binning algorithm, developed for integral field spectroscopy. WVT binning is applicable to many types of data and creates unbiased binning structures with compact bins that do not lead the eye. We apply the algorithm to simulated data, as well as several X-ray data sets, to create adaptively binned intensity images, hardness ratio maps and temperature maps with constant signal-to-noise ratio per bin. We also illustrate the separation of diffuse gas emission from contributions of unresolved point sources in elliptical galaxies. We compare the performance of WVT binning with other adaptive binning and adaptive smoothing techniques. We find that the CIAO tool csmooth creates serious artefacts and advise against its use to interpret diffuse X-ray emission.
    Comment: 14 pages; submitted to MNRAS; code freely available at http://www.phy.ohiou.edu/~diehl/WVT/index.html with user manual, examples and high-resolution version of this paper.
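The goal of a constant signal-to-noise ratio per bin can be illustrated with a much simpler scheme than WVT binning. The sketch below is a 1D greedy accretion, not the WVT algorithm: it assumes pure Poisson noise, so a bin holding N counts has S/N = sqrt(N), and it grows each bin over consecutive pixels until the target is met. The function name `adaptive_bins_1d` and the sample counts are hypothetical.

```python
import math


def adaptive_bins_1d(counts, target_snr):
    """Greedily group consecutive pixels until each bin reaches target S/N.

    Assumes pure Poisson noise, so a bin with N counts has S/N = sqrt(N).
    Returns a list of (start, stop) index pairs with stop exclusive.
    Simplified 1D stand-in, not the weighted Voronoi method.
    """
    bins = []
    start = 0
    total = 0
    for i, c in enumerate(counts):
        total += c
        if math.sqrt(total) >= target_snr:
            bins.append((start, i + 1))
            start = i + 1
            total = 0
    # Leftover pixels that never reached the target join the last bin,
    # or form a single bin if no bin was ever completed.
    if start < len(counts):
        if bins:
            last_start, _ = bins.pop()
            bins.append((last_start, len(counts)))
        else:
            bins.append((start, len(counts)))
    return bins


# Made-up photon counts per pixel; a target S/N of 2 needs 4 counts per bin.
counts = [1, 1, 1, 1, 9, 0, 0, 4, 1]
bins = adaptive_bins_1d(counts, 2)
```

Faint regions end up in wide bins and bright pixels in narrow ones, which is the behaviour adaptive binning aims for. The 2D methods discussed in the abstract additionally have to keep bins compact, which this 1D sketch gets for free.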

    Grounding semantics in robots for Visual Question Answering

    In this thesis I describe an operational implementation of an object detection and description system that is incorporated into an end-to-end Visual Question Answering system, and evaluate it on two visual question answering datasets for compositional language and elementary visual reasoning.