
    Grounding semantics in robots for Visual Question Answering

    In this thesis I describe an operational implementation of an object detection and description system that is incorporated into an end-to-end Visual Question Answering system, and I evaluate it on two visual question answering datasets for compositional language and elementary visual reasoning.

    Fusion of monocular cues to detect man-made structures in aerial imagery

    The extraction of buildings from aerial imagery is a complex problem for automated computer vision. It requires locating regions in a scene that possess properties distinguishing them as man-made objects as opposed to naturally occurring terrain features. It is reasonable to assume that no single detection method can correctly delineate or verify buildings in every scene. A cooperative-methods paradigm is useful in approaching the building extraction problem. Using this paradigm, each extraction technique provides information which can be added or assimilated into an overall interpretation of the scene. Thus, the main objective is to explore the development of a computer vision system that integrates the results of various scene analysis techniques into an accurate and robust interpretation of the underlying three-dimensional scene. The problem of building hypothesis fusion in aerial imagery is discussed. Building extraction techniques are briefly surveyed, including four building extraction, verification, and clustering systems. A method for fusing the symbolic data generated by these systems is described and applied to monocular image and stereo image data sets. Evaluation methods for the fusion results are described, and the fusion results are analyzed using these methods.

    Dynamic spot analysis in the 2D electrophoresis gels images

    This thesis summarizes the factors and parameters that influence the results of 2D electrophoresis, focusing on image processing as one way to reduce incorrect interpretation of its outputs. The image processing draws its data mainly from images of repeated runs of the same experiment, known as multiplicates. By analyzing multiplicate images, it is possible to observe or correct for changes within one experiment and to compare them with the outputs of other experiments. The aim of the work is to support the specialist responsible for describing the properties of the structures found in electrophoretic images.

    What do we perceive in a glance of a real-world scene?

    What do we see when we glance at a natural scene and how does it change as the glance becomes longer? We asked naive subjects to report in a free-form format what they saw when looking at briefly presented real-life photographs. Our subjects received no specific information as to the content of each stimulus. Thus, our paradigm differs from previous studies where subjects were cued before a picture was presented and/or were probed with multiple-choice questions. In the first stage, 90 novel grayscale photographs were foveally shown to a group of 22 native-English-speaking subjects. The presentation time was chosen at random from a set of seven possible times (from 27 to 500 ms). A perceptual mask followed each photograph immediately. After each presentation, subjects reported what they had just seen as completely and truthfully as possible. In the second stage, another group of naive individuals was instructed to score each of the descriptions produced by the subjects in the first stage. Individual scores were assigned to more than a hundred different attributes. We show that within a single glance, much object- and scene-level information is perceived by human subjects. The richness of our perception, though, seems asymmetrical. Subjects tend to have a propensity toward perceiving natural scenes as being outdoor rather than indoor. The reporting of sensory- or feature-level information of a scene (such as shading and shape) consistently precedes the reporting of the semantic-level information. But once subjects recognize more semantic-level components of a scene, there is little evidence suggesting any bias toward either scene-level or object-level recognition

    Facial Expression Recognition
